
Command-line tools for processing XML

2016-07-01 18:32
XMLStarlet or another XPath engine is the correct tool for this job.

For instance, with data.xml containing the following:
<root>
<item>
<title>15:54:57 - George:</title>
<description>Diane DeConn? You saw Diane DeConn!</description>
</item>
<item>
<title>15:55:17 - Jerry:</title>
<description>Something huh?</description>
</item>
</root>


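...you can extract only the first title with the following (the parentheses matter: //title[1] would match the first title child of every item, not just the first title in the document):
xmlstarlet sel -t -m '(//title)[1]' -v . -n <data.xml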
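
For readers without xmlstarlet at hand, the same selection can be sketched with Python's standard library. This is only an illustration, not part of the original answer: the contents of data.xml are inlined as a string, and the variable names are made up for the example. It also shows that a positional predicate such as [1] is evaluated per parent element.

```python
import xml.etree.ElementTree as ET

# Contents of data.xml from above, inlined so the sketch is self-contained.
DATA = """<root>
<item>
<title>15:54:57 - George:</title>
<description>Diane DeConn? You saw Diane DeConn!</description>
</item>
<item>
<title>15:55:17 - Jerry:</title>
<description>Something huh?</description>
</item>
</root>"""

root = ET.fromstring(DATA)

# find() returns the first <title> in document order.
first_title = root.find(".//title").text

# [1] is applied per parent: the first <title> child of every <item>.
first_per_item = [t.text for t in root.findall(".//title[1]")]

print(first_title)
print(first_per_item)
```

Here first_title holds only George's title, while first_per_item holds both titles, because each item has its own first title.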


Trying to use sed for this job is troublesome. For instance, regex-based approaches won't work if the title has attributes; won't handle CDATA sections; won't correctly recognize namespace mappings; can't determine whether a portion of the XML document is commented out; won't unescape entity references (such as changing Brewster &amp; Jobs to Brewster & Jobs), and so forth.
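
A few of these failure modes can be reproduced in a short sketch. It uses Python's standard library instead of sed and xmlstarlet so that it runs without extra packages; the sample document and both function names are hypothetical and chosen only to exercise the cases listed above (attributes, CDATA, comments, entity references).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample covering a commented-out title, a title with an
# attribute and an entity reference, and a title wrapped in CDATA.
SAMPLE = """<root>
  <!-- <title>commented out</title> -->
  <item><title lang="en">Brewster &amp; Jobs</title></item>
  <item><title><![CDATA[1 < 2 & 3]]></title></item>
</root>"""


def titles_with_regex(xml_text):
    # Naive pattern of the kind a sed one-liner would rely on.
    return re.findall(r"<title>(.*?)</title>", xml_text)


def titles_with_parser(xml_text):
    # A real XML parser skips comments, unescapes entities and unwraps CDATA.
    root = ET.fromstring(xml_text)
    return [t.text for t in root.iter("title")]


print(titles_with_regex(SAMPLE))
print(titles_with_parser(SAMPLE))
```

The regex version picks up the commented-out title, misses the title that carries an attribute, and returns the raw CDATA wrapper. The parser version returns the two real titles as plain text.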

、、、、、、、、、、、、、、、、、、、、、、、、、、、、、

Do you really have to use only those tools? They're not designed for XML processing, and although it's possible to get something that works OK most of the time, it will fail on edge cases, like encoding, line breaks, etc.
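
The encoding and line-break cases can be illustrated with a minimal sketch in Python's standard library. The document below is invented for the example (it is not the jobs.xml used in this answer), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: Latin-1 encoded, with the <job> start tag
# split across two lines.
RAW = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<jobs><job\n'
    '  id="1">caf\u00e9</job></jobs>'
).encode("iso-8859-1")


def job_lines_naive(raw_bytes):
    # Line-oriented match, as grep or sed would do it. The caller also
    # has to know the encoding in advance.
    text = raw_bytes.decode("iso-8859-1")
    return [line for line in text.splitlines() if "<job>" in line]


def job_texts(raw_bytes):
    # The parser reads the encoding from the XML declaration and does
    # not care where line breaks fall inside a tag.
    root = ET.fromstring(raw_bytes)
    return [job.text for job in root.iter("job")]


print(job_lines_naive(RAW))
print(job_texts(RAW))
```

The line-oriented version finds nothing because the tag never appears as a literal `<job>` on one line, while the parser returns the decoded text.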

I recommend xml_grep:
xml_grep 'job' jobs.xml --text_only


Which gives the output:
programming


On Ubuntu/Debian, xml_grep is in the xml-twig-tools package.
