
The difference between htmlentities and htmlspecialchars in PHP

2016-06-11 13:56
Introduction:

<code class="hljs mel has-numbering">string htmlspecialchars ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = ini_get("default_charset") [, bool $double_encode = true ]]] )

string htmlentities ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = ini_get("default_charset") [, bool $double_encode = true ]]] )
</code>

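From the signatures of htmlentities and htmlspecialchars above, we can see that:

Both convert HTML special characters (such as < > & ' ") into the corresponding HTML entity (for example, < becomes &lt;), but there are some differences between the two.

Usage: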

<code class="hljs bash has-numbering">$str = "A 'quote' is <b>bold</b>";

// With the default flags shown above (ENT_COMPAT), single quotes are left alone.
echo htmlentities($str);
// Outputs: A 'quote' is &lt;b&gt;bold&lt;/b&gt;

echo htmlentities($str, ENT_QUOTES);
// Outputs: A &#039;quote&#039; is &lt;b&gt;bold&lt;/b&gt;
</code>

Parameters:

flags controls whether single and double quotes are converted:

ENT_COMPAT Will convert double-quotes and leave single-quotes alone.

ENT_QUOTES Will convert both double and single quotes.

ENT_NOQUOTES Will leave both double and single quotes unconverted.

encoding: the character set used for the conversion
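The three quote flags can be compared side by side. The following is a minimal sketch, assuming PHP 5.4 or later and an illustrative input string that is not from the original article; the flags and the encoding are passed explicitly so the result does not depend on the defaults of a particular PHP version.

```php
<?php
// Illustrative input containing both quote styles and some markup.
$str = "A 'quote' is <b>\"bold\"</b>";

$compat   = htmlspecialchars($str, ENT_COMPAT, 'UTF-8');   // double quotes only
$quotes   = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');   // both quote styles
$noquotes = htmlspecialchars($str, ENT_NOQUOTES, 'UTF-8'); // neither quote style

echo $compat, "\n";   // A 'quote' is &lt;b&gt;&quot;bold&quot;&lt;/b&gt;
echo $quotes, "\n";   // A &#039;quote&#039; is &lt;b&gt;&quot;bold&quot;&lt;/b&gt;
echo $noquotes, "\n"; // A 'quote' is &lt;b&gt;"bold"&lt;/b&gt;
```

In all three cases < > and & are converted; only the treatment of the quotes changes.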

Differences:

To understand the difference between the two, first look at two pieces of documentation:

1. Summaries:

html_entity_decode — Convert all HTML entities to their applicable characters
htmlentities — Convert all applicable characters to HTML entities
htmlspecialchars_decode — Convert special HTML entities back to characters
htmlspecialchars - Convert special characters to HTML entities
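How the four functions pair up can be sketched with a round trip. This is only an illustration, assuming PHP 7 or later (for the `\u{...}` escape) and a made-up input string:

```php
<?php
// Illustrative input: one accented letter (é) plus markup characters.
$text = "Caf\u{E9} <b>& co</b>";

$all     = htmlentities($text, ENT_QUOTES, 'UTF-8');
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $all, "\n";     // Caf&eacute; &lt;b&gt;&amp; co&lt;/b&gt;
echo $special, "\n"; // Café &lt;b&gt;&amp; co&lt;/b&gt;

// Each decoder reverses its own encoder.
var_dump(html_entity_decode($all, ENT_QUOTES, 'UTF-8') === $text); // bool(true)
var_dump(htmlspecialchars_decode($special, ENT_QUOTES) === $text); // bool(true)

// htmlspecialchars_decode leaves other named entities untouched.
$partial = htmlspecialchars_decode($all, ENT_QUOTES);
echo $partial, "\n"; // Caf&eacute; <b>& co</b>
```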

2. Documentation:

htmlentities:

This function is identical to htmlspecialchars() in all ways, except with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.

htmlspecialchars :

If you require all input substrings that have associated named entities to be translated, use htmlentities() instead.

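Key points:

htmlspecialchars converts only these characters: & < > ' " (whether the quotes are converted depends on flags)

htmlentities, by contrast, converts every special character that has a corresponding HTML entity, such as the currency signs for the euro and the pound, the copyright sign, and so on. For a list of other symbols, see: http://www.thesauruslex.com/typo/eng/enghtml.htm

Example: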

<code class="hljs php has-numbering">// Note the difference in how the euro sign is handled:
echo htmlentities('€ <>"')."\r\n";
// &euro; &lt;&gt;&quot;

echo htmlspecialchars('€ <>"')."\r\n";
// € &lt;&gt;&quot;
</code>
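The two character sets can also be inspected directly with get_html_translation_table(), which returns the translation table PHP uses for each function. A minimal sketch, assuming PHP 7 or later and the HTML 4.01 entity set:

```php
<?php
// Translation tables, with explicit flags and encoding.
$special  = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$entities = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// The short list handled by htmlspecialchars.
print_r($special);

// htmlentities covers that list plus many more characters.
$extra = array_diff_key($entities, $special);
printf("htmlspecialchars handles %d characters\n", count($special));
printf("htmlentities handles %d more, e.g. %s\n", count($extra), $entities["\u{20AC}"]);
```

With ENT_QUOTES the first table holds exactly the five characters listed above; the euro sign appears only in the second one.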

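A common misunderstanding online

Some people badly misunderstand the difference between these two functions. For example, this post: http://www.cnblogs.com/A-Song/archive/2011/12/20/2294599.html

says:

htmlspecialchars converts only the few HTML characters listed above, whereas htmlentities converts all HTML code, including Chinese characters that it cannot recognize.

It uses the following example: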

<code class="hljs xml has-numbering"><?php
$str='<a href="test.html">测试页面</a>';
echo htmlentities($str);

// &lt;a href=&quot;test.html&quot;&gt;&sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;&lt;/a&gt;

$str='<a href="test.html">测试页面</a>';
echo htmlspecialchars($str);
// &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;

?>
</code>

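Its conclusion is that when the text contains Chinese, it is best to use htmlspecialchars, otherwise the output may be garbled.

**In fact, that understanding is wrong, because htmlentities also has a third parameter for the encoding. Using the correct encoding eliminates the Chinese problem above, as follows:**

htmlentities has three optional parameters: quote_style, charset, and double_encode (called flags, encoding, and double_encode in the signatures above). An older version of the manual describes the charset parameter as follows: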

Defines character set used in conversion. The default character set is ISO-8859-1.

Judging from the output of the program above, $str is GB2312-encoded, and the hexadecimal bytes for the characters "测试页面" are:

B2 E2 CA D4 D2 B3 C3 E6

However, they were interpreted as ISO-8859-1:

²âÊÔÒ³Ãæ

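which correspond exactly to these HTML character entities:

&sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;

So of course htmlentities escapes them. But as long as the correct encoding is passed as a parameter, the so-called garbled-Chinese problem does not occur at all: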

<code class="hljs bash has-numbering">$str='<a href="test.html">测试页面</a>';

echo htmlentities($str, ENT_COMPAT, 'gb2312');
// &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;
</code>

A falsehood repeated often enough gets taken as truth.
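The misinterpretation described above can be reproduced from the raw bytes alone. The following is a rough sketch, assuming PHP 5.4 or later and a source file saved as UTF-8. For the correct-encoding half it uses UTF-8 text declared as UTF-8 rather than the article's GB2312 case; that substitution is an assumption made to keep the example independent of the file encoding.

```php
<?php
// "测试页面" as raw GB2312 bytes, taken from the hex dump in the text.
$gb2312Bytes = "\xB2\xE2\xCA\xD4\xD2\xB3\xC3\xE6";

// Wrong encoding: each byte is treated as one ISO-8859-1 character,
// and every one of those characters happens to have a named entity.
$wrong = htmlentities($gb2312Bytes, ENT_COMPAT, 'ISO-8859-1');
echo $wrong, "\n"; // &sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;

// Matching encoding: UTF-8 text declared as UTF-8. The Chinese characters
// have no named entity, so only the markup characters are converted.
$right = htmlentities('<a href="test.html">测试页面</a>', ENT_COMPAT, 'UTF-8');
echo $right, "\n"; // &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;
```

What matters is that the declared encoding matches the actual bytes of the string; which encoding that is depends on the application.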

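**Conclusion: the difference between htmlentities and htmlspecialchars is that htmlentities converts every character that has an HTML character entity, while htmlspecialchars converts only the few listed in the manual (the basic characters that affect HTML parsing). In general, converting the basic characters with htmlspecialchars is enough, and there is no need for htmlentities. If you really must use htmlentities, make sure to pass the correct encoding as the third parameter.**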