HTML Entities, Charsets, and URL Encoding

HTML Entities

Character entities are used to display reserved characters in HTML.
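As a minimal sketch of what "reserved" means in practice, Python's standard html module can replace the characters that HTML treats as markup with entities. The sample string is made up for illustration.

```python
import html

# Hypothetical text containing characters that are reserved in HTML.
raw = 'if a < b && b > c then "ok"'

# html.escape replaces &, < and > (and, by default, quotes) with entities.
escaped = html.escape(raw)
print(escaped)

# html.unescape reverses the replacement.
restored = html.unescape(escaped)
print(restored == raw)
```

The escaped text can be placed inside an HTML page without the browser mistaking < or & for the start of a tag or entity.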






The advantage of using an entity name, instead of a number, is that the name is easier to remember.


The disadvantage is that browsers may not support all entity names, but support for entity numbers (decimal or hexadecimal) is good.


If you use an HTML entity name or number, the character is interpreted correctly regardless of the character set (encoding) your page uses, because the entity itself is written using only ASCII characters.
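The three ways of writing an entity can be compared with a short sketch using Python's html.unescape. The characters chosen here are only examples.

```python
import html

# Three spellings of the same character: entity name, decimal, hexadecimal.
named = html.unescape("&lt;")
decimal = html.unescape("&#60;")
hexadecimal = html.unescape("&#x3C;")
print(named, decimal, hexadecimal)

# A non-ASCII character written with only ASCII characters in the source.
source = "&euro; &#8364; &#x20AC;"
euro = html.unescape(source)
print(source.isascii(), euro)
```

All three forms decode to the same character, and the source text stays pure ASCII even when the decoded character is not.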


Tip: Browsers collapse runs of consecutive spaces in HTML pages. If you write 10 spaces in your text, the browser will remove 9 of them. To add more than one real space to your text, you can use the &nbsp; (non-breaking space) character entity.
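The collapsing behavior can be approximated in a few lines of Python. This is only a rough model of what a browser does, and rendered_spaces is a hypothetical helper written for this sketch.

```python
import html
import re

def rendered_spaces(text: str) -> str:
    """Rough model of browser rendering: a run of ordinary spaces shows as one."""
    return re.sub(r" +", " ", text)

ten_spaces = "a" + " " * 10 + "b"
collapsed = rendered_spaces(ten_spaces)
print(repr(collapsed))  # 'a b'

# &nbsp; decodes to U+00A0, a different character from the ordinary space,
# so the model above leaves it alone.
nbsp = html.unescape("&nbsp;")
kept = rendered_spaces("a" + nbsp * 10 + "b")
print(len(kept))  # 12
```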

Also, entity names are case sensitive!
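A small sketch of case sensitivity, again using Python's html.unescape: the two names below differ only in the case of the first letter, yet they decode to different characters.

```python
import html

upper = html.unescape("&Eacute;")  # capital E with acute accent
lower = html.unescape("&eacute;")  # small e with acute accent
print(upper, lower, upper == lower)
```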

HTML Charset

To display a web page correctly, the browser must know which character set (character encoding) to use.

ASCII was the first character encoding standard (also called a character set). It defines 128 different characters (codes 0 to 127), including the alphanumeric characters, that could be used on the internet.

ASCII supported numbers (0-9), English letters (A-Z), and some special characters like ! $ + - ( ) @ < > .
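To make the ASCII range concrete, here is a short Python sketch. The sample characters and strings are arbitrary.

```python
# Every ASCII character has a code between 0 and 127.
codes = {ch: ord(ch) for ch in "Az0@"}
print(codes)

plain = "hello!"
accented = "caf\u00e9"  # "café", contains a non-ASCII letter
print(plain.isascii(), accented.isascii())

# Encoding to ASCII fails as soon as a character falls outside the range.
try:
    accented.encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```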

ANSI (Windows-1252) was the default character set for Windows (up to Windows 95). It supported 256 different codes.

ISO-8859-1 was the default character set for HTML 4. It also supported 256 different codes.
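The two 256-code sets agree on most values but not all. The sketch below uses Python's built-in cp1252 and iso-8859-1 codecs to show one byte that they share and one that they interpret differently. The bytes were picked for illustration.

```python
# An accented letter has the same single-byte code in both sets.
same_in_both = "\u00e9".encode("cp1252") == "\u00e9".encode("iso-8859-1")  # "é"
print(same_in_both)

# Byte 0x80 is the euro sign in Windows-1252,
# but an invisible control character (U+0080) in ISO-8859-1.
byte = b"\x80"
as_cp1252 = byte.decode("cp1252")
as_latin1 = byte.decode("iso-8859-1")
print(repr(as_cp1252), repr(as_latin1))
```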

Because ANSI and ISO-8859-1 were too limited, the default character encoding was changed to UTF-8 in HTML5 (all HTML 4 processors also support UTF-8).

UTF-8 (Unicode) covers (almost) all the characters and symbols in the world.
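UTF-8 reaches that coverage by using a variable number of bytes per character. A quick sketch in Python, with sample characters chosen arbitrarily:

```python
# One character each from ASCII, Latin, CJK, and emoji ranges.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]  # A, é, 中, 😀

# Number of bytes each character occupies when encoded as UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(utf8_lengths)  # [1, 2, 3, 4]

# ASCII text is unchanged by UTF-8: the bytes are identical.
print("HTML".encode("utf-8") == "HTML".encode("ascii"))
```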


In HTML5, the character set is declared like this:

<meta charset="UTF-8">


In HTML 4, the declaration looks like this:

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
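Why the declaration matters can be sketched in Python by decoding the same bytes with the right and the wrong character set. The text is a made-up example.

```python
text = "caf\u00e9"            # "café"
data = text.encode("utf-8")   # the bytes of a page saved as UTF-8

# Decoded with the declared (correct) charset, the text round-trips.
right = data.decode("utf-8")

# Decoded as ISO-8859-1, each UTF-8 byte becomes its own character.
wrong = data.decode("iso-8859-1")

print(right, wrong)
```

The second result shows the familiar garbled output (two characters in place of the accented letter) that appears when a page's declared charset does not match its actual encoding.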

HTML URL (Uniform Resource Locator)

A URL is another name for a web address.

URL format

scheme://host.domain:port/path/filename


scheme - defines the type of Internet service (the most common types are http and https)
host - defines the domain host (default host for http is www)
domain - defines the Internet domain name, like w3schools.com
port - defines the port number at the host (default port number for http is 80)
path - defines a path at the server (if omitted, the document must be stored at the root directory of the site)
filename - defines the name of a document/resource
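The parts listed above can be pulled out of a URL with Python's urllib.parse. This is a sketch under assumptions: the URL is illustrative, and splitting the hostname at its first dot to get "host" and "domain" simply follows the naming used in these notes rather than any rule from a URL standard.

```python
import posixpath
from urllib.parse import urlsplit

url = "http://www.w3schools.com:80/html/default.asp"  # illustrative URL
parts = urlsplit(url)

scheme = parts.scheme                          # "http"
host, _, domain = parts.hostname.partition(".")  # "www", "w3schools.com"
port = parts.port                              # 80
path, filename = posixpath.split(parts.path)   # "/html", "default.asp"

print(scheme, host, domain, port, path, filename)
```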

URL Encoding

URLs can only be sent over the Internet using the ASCII character set.

Since URLs often contain characters outside the ASCII set (for example, request parameters with accented letters, as in French or Spanish), the URL has to be converted into a valid ASCII format.

URL encoding converts characters into a format that can be transmitted over the Internet.


URL encoding replaces non-ASCII characters with a "%" followed by two hexadecimal digits for each byte of the character in the page's encoding, so a character that takes several bytes becomes several %XX groups.
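A minimal sketch of percent-encoding with Python's urllib.parse.quote. The sample character is arbitrary, and the result depends on which character set is used to turn the character into bytes.

```python
from urllib.parse import quote, unquote

char = "\u00e9"  # "é"

# In UTF-8 the character is two bytes, so it becomes two %XX groups.
utf8_encoded = quote(char)
# In ISO-8859-1 it is a single byte, so it becomes one %XX group.
latin1_encoded = quote(char, encoding="iso-8859-1")
print(utf8_encoded, latin1_encoded)

# The same result built by hand: "%" plus two hex digits per byte.
by_hand = "".join(f"%{b:02X}" for b in char.encode("utf-8"))
print(by_hand == utf8_encoded)

# unquote reverses the encoding (UTF-8 is assumed by default).
decoded = unquote(utf8_encoded)
```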

URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.
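Both conventions for spaces exist in Python's urllib.parse, which makes the difference easy to see. The sample strings are made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "hello world"

percent_form = quote(text)     # space becomes %20
plus_form = quote_plus(text)   # space becomes +
print(percent_form, plus_form)

# A "+" is only read back as a space by the plus-aware decoder.
print(unquote_plus("hello+world"))  # hello world
print(unquote("hello+world"))       # hello+world

# A literal plus sign therefore has to be encoded itself.
literal_plus = quote_plus("1+1")
print(literal_plus)
```

The plus form is the one used for HTML form data in query strings, so the decoder has to match the encoder that produced the URL.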

