
dom4j-cookbook

2016-06-24 15:06

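【0】README
1) This article is translated from http://dom4j.sourceforge.net/dom4j-1.6.1/cookbook.html
2) Intro:

2.1) dom4j is an object model that represents an XML tree in memory. It offers an easy-to-use API with a powerful set of features for processing, manipulating, and navigating XML. It works with XPath and XSLT and integrates with SAX, JAXP, and DOM.
2.2) dom4j is interface-based by design, which allows highly configurable implementation strategies. You can create your own XML tree implementation simply by providing a DocumentFactory implementation. This makes it easy to reuse most of the dom4j code while extending it with whatever features you need.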

【1】Reading XML data
1) Intro: dom4j ships with a set of builder classes that parse XML data and build a tree-like object structure. The following code reads XML data:

import java.io.File;
import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class DeployFileLoaderSample {
    /** dom4j object model representation of an XML document. Note: we use the interface, not its implementation. */
    private Document doc;

    /**
     * Loads a document from a file.
     * @param aFile the data source
     * @throws DocumentException if parsing fails
     */
    public void parseWithSAX(File aFile) throws DocumentException {
        SAXReader xmlReader = new SAXReader();
        this.doc = xmlReader.read(aFile);
    }

    /**
     * Loads a document from a URL.
     * @param aURL the data source
     * @throws DocumentException if parsing fails
     */
    public void parseWithSAX(URL aURL) throws DocumentException {
        SAXReader xmlReader = new SAXReader();
        this.doc = xmlReader.read(aURL);
    }

    public Document getDoc() {
        return doc;
    }
}

2) The code above shows how to use SAXReader to build a complete dom4j tree from a given file. The org.dom4j.io package contains a set of classes for creating and serializing XML objects. The read() method is overloaded so that you can pass objects representing different kinds of sources:

java.lang.String - a SystemId, i.e. a String that contains a URI, e.g. a URL to an XML file
java.net.URL - represents a Uniform Resource Locator; encapsulates a URL
java.io.InputStream - an open input stream that transports XML data
java.io.Reader - more compatible; lets you specify the encoding scheme
org.xml.sax.InputSource - a single input source for an XML entity

2.1) The overloaded parseWithSAX(URL) method gives DeployFileLoaderSample more flexibility; the listing is the same one shown above.
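
As an illustration of the read() overloads listed above, here is a minimal sketch that parses XML held in a String by wrapping it in a java.io.Reader. The class name ReadFromReaderSketch and the sample XML are made up for this example; only SAXReader.read(Reader) comes from dom4j.

```java
import java.io.StringReader;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

// Hypothetical helper class, not part of the cookbook.
public class ReadFromReaderSketch {

    /** Parses XML held in a String by passing a Reader to SAXReader.read(). */
    public static Document parse(String xml) throws DocumentException {
        SAXReader xmlReader = new SAXReader();
        return xmlReader.read(new StringReader(xml));
    }

    public static void main(String[] args) throws DocumentException {
        // Sample input invented for the sketch.
        Document doc = parse("<project><name>demo</name></project>");
        // Print the root element name and the text of its <name> child.
        System.out.println(doc.getRootElement().getName());
        System.out.println(doc.getRootElement().elementText("name"));
    }
}
```

Because the input is already a stream of characters, the Reader overload avoids guessing a byte encoding, which is probably why the list above calls it more compatible.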

3) Test case:

@Test
public void readXML() {
    String base = System.getProperty("user.dir") + File.separator
            + "src" + File.separator;

    DeployFileLoaderSample sample = new DeployFileLoaderSample();
    try { // via a parameter of type URL.
        sample.parseWithSAX(new URL("file:" + base + "pom.xml"));
        Document doc = sample.getDoc();
        System.out.println(doc.asXML());
    } catch (Exception e) {
        e.printStackTrace();
    }

    try { // via a parameter of type File.
        sample.parseWithSAX(new File(base + "pom.xml"));
        Document doc = sample.getDoc();
        System.out.println(doc.asXML());
    } catch (Exception e) {
        e.printStackTrace();
    }
}

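【2】Integrating dom4j with other XML APIs

1) Intro: dom4j also provides classes for integrating with the two original XML processing APIs, SAX and DOM.

2) DOMReader: lets you convert an existing DOM tree into a dom4j tree. You can convert a DOM document, a branch of DOM nodes, or a single element. The code is as follows: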

import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.dom4j.Document;
import org.dom4j.io.DOMReader;

public class DOMIntegratorSample {

    public DOMIntegratorSample() {}

    /** Parses the resource at the given URL into a W3C DOM document; returns null on failure. */
    public org.w3c.dom.Document parse(URL url) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(url.toString());
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    /** Converts a W3C DOM document into a dom4j document. */
    public Document buildDocument(org.w3c.dom.Document domDocument) {
        DOMReader xmlReader = new DOMReader();
        return xmlReader.read(domDocument);
    }
}

// Field of the test class, shared by the test cases below.
public String base = System.getProperty("user.dir") + File.separator
        + "src" + File.separator;

@Test // test case.
public void testIntegrate() {
    DOMIntegratorSample sample = new DOMIntegratorSample();
    try {
        org.w3c.dom.Document doc = sample.parse(new URL("file:" + base + "pom.xml"));
        Document doc4j = sample.buildDocument(doc);
        System.out.println(doc4j.asXML());
    } catch (Exception e) {
        e.printStackTrace();
    }
}

【3】The secret of DocumentFactory

1) Intro: creating a Document from scratch looks like this:

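import org.dom4j.Document;
import org.dom4j.DocumentFactory;

public class GranuatedDeployFileCreator {
    private DocumentFactory factory;
    private Document doc;

    public GranuatedDeployFileCreator() {
        this.factory = DocumentFactory.getInstance(); // singleton accessor.
    }

    public void generateDoc(String aRootElement) {
        doc = factory.createDocument();
        doc.addElement(aRootElement); // becomes the root element
    }

    /** Accessor used by the test case. */
    public Document getDoc() {
        return doc;
    }
}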

1.1) Test case:

@Test
public void testGenerateDoc() {
    GranuatedDeployFileCreator creator = new GranuatedDeployFileCreator();

    creator.generateDoc("project");
    Document doc = creator.getDoc();
    System.out.println(doc.asXML());
}

2) The Document and Element interfaces have many helper methods that make it simple to build an XML document dynamically:

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {

    public Foo() {}

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");
        Element author2 = root.addElement("author").addAttribute("name", "Toby").addAttribute("location", "Germany")
                .addText("Tobias Rademacher");
        Element author1 = root.addElement("author").addAttribute("name", "James").addAttribute("location", "UK")
                .addText("James Strachan");
        return document;
    }
}

2.1) Test case:

@Test
public void testCreateDocByHelper() {
    Foo foo = new Foo();

    Document doc = foo.createDocument();
    System.out.println(doc.asXML());
}

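2.2) dom4j's API is interface-based. This means that DocumentFactory and the reader classes always work with the org.dom4j interfaces rather than their implementation classes. The Collections API and the W3C DOM take the same approach.

2.3) Once you have parsed or created a document, you will want to serialize it to disk or to a plain stream. dom4j provides a set of classes that serialize your dom4j tree in four ways: XML, HTML, DOM, and SAX events.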
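
The XML and HTML targets are covered in the sections that follow. For the DOM target, a minimal sketch could use org.dom4j.io.DOMWriter to turn a dom4j tree into a W3C DOM document. The class name ToW3CDomSketch and the sample content are assumptions made for this example.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.io.DOMWriter;

// Hypothetical helper class, not part of the cookbook.
public class ToW3CDomSketch {

    /** Converts a dom4j document into a W3C DOM document. */
    public static org.w3c.dom.Document toDom(Document doc) throws DocumentException {
        return new DOMWriter().write(doc);
    }

    public static void main(String[] args) throws DocumentException {
        // Build a small dom4j tree to convert.
        Document doc = DocumentHelper.createDocument();
        doc.addElement("root").addElement("author").addText("James Strachan");

        org.w3c.dom.Document dom = toDom(doc);
        // Print the tag name of the converted root element.
        System.out.println(dom.getDocumentElement().getTagName());
    }
}
```

This is the reverse direction of the DOMReader conversion shown in section 【2】.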

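【4】Serializing to XML

1) Intro: the XMLWriter constructor accepts an OutputStream together with the required character encoding. A Writer is easier to use than an OutputStream, because a Writer is character-based and therefore has fewer encoding issues. The write() method of XMLWriter is overloaded, so you can write out dom4j objects one at a time if needed.

2) The code: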

// serialize to XML
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class DeployFileCreator3 {
    private Document doc;

    public DeployFileCreator3(Document doc) {
        this.doc = doc;
    }

    public void serializetoXML(OutputStream out, String aEncodingScheme) throws Exception {
        OutputFormat outformat = OutputFormat.createPrettyPrint();
        outformat.setEncoding(aEncodingScheme);
        XMLWriter writer = new XMLWriter(out, outformat);
        writer.write(this.doc);
        writer.flush();
        writer.close();
    }
}

3) Test case:

@Test
public void testSerializetoXML() {
    Foo foo = new Foo();

    Document doc = foo.createDocument();
    DeployFileCreator3 creator = new DeployFileCreator3(doc);
    try {
        creator.serializetoXML(new FileOutputStream(base + "serializable.xml"),
                "UTF-8");
        System.out.println("serialized successfully.");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
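
The intro to this section notes that a Writer is easier to work with than an OutputStream. A minimal sketch of that variant, assuming you want the XML as an in-memory String, passes a StringWriter to XMLWriter. The class name WriteToStringSketch and the sample document are invented for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Hypothetical helper class, not part of the cookbook.
public class WriteToStringSketch {

    /** Serializes a dom4j document into a String through a character-based Writer. */
    public static String toXml(Document doc) throws IOException {
        StringWriter buffer = new StringWriter();
        XMLWriter writer = new XMLWriter(buffer, OutputFormat.createCompactFormat());
        writer.write(doc);
        writer.flush();
        writer.close();
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        // Build a small document to serialize.
        Document doc = DocumentHelper.createDocument();
        doc.addElement("root").addElement("author")
                .addAttribute("name", "James").addText("James Strachan");
        System.out.println(toXml(doc));
    }
}
```

No bytes are produced here, so the encoding set on the OutputFormat only affects the XML declaration, not the characters collected in the String.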

// customize output format.
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class DeployFileCreator4 {
    private Document doc;
    private OutputFormat outFormat;

    public DeployFileCreator4(Document doc) {
        this.outFormat = OutputFormat.createPrettyPrint();
        this.doc = doc;
    }

    public DeployFileCreator4(Document doc, OutputFormat outFormat) {
        this.doc = doc;
        this.outFormat = outFormat;
    }

    public void writeAsXML(OutputStream out) throws Exception {
        XMLWriter writer = new XMLWriter(out, this.outFormat);
        writer.write(this.doc);
        writer.flush(); // flush explicitly so buffered output reaches the stream
    }

    public void writeAsXML(OutputStream out, String encoding) throws Exception {
        this.outFormat.setEncoding(encoding);
        this.writeAsXML(out);
    }
}

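4) An interesting feature of OutputFormat is that it lets you set the character encoding. It is good practice to set the encoding for XMLWriter this way, because the same encoding is then used both when XMLWriter writes to the OutputStream and in the XML declaration it emits.
5) Test case: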

@Test
public void testCustomizeOutputFormat() {
    Foo foo = new Foo();

    Document doc = foo.createDocument();
    OutputFormat format = OutputFormat.createCompactFormat();
    format.setEncoding("UTF-8");
    DeployFileCreator4 creator = new DeployFileCreator4(
            doc, format);
    try {
        creator.writeAsXML(new FileOutputStream(base + "customizeFormat.xml"));
        System.out.println("successful customize format");
    } catch (Exception e) {
        e.printStackTrace();
    }
}

<?xml version="1.0" encoding="UTF-8"?>
<root><author name="Toby" location="Germany">Tobias Rademacher</author><author name="James" location="UK">James Strachan</author></root>
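
【5】Printing HTML
1) Intro: HTMLWriter takes a dom4j tree and formats it to a stream as HTML. This formatter is similar to XMLWriter, but it outputs the text of CDATA and entity sections rather than their serialized XML form, and it supports the HTML elements that have no closing tag, such as <br>.
2) The code: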

import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.io.HTMLWriter;
import org.dom4j.io.OutputFormat;

public class PrintHTML {
    private Document doc;
    private OutputFormat outFormat;

    public PrintHTML(Document doc) {
        this.outFormat = OutputFormat.createPrettyPrint();
        this.doc = doc;
    }

    public PrintHTML(Document doc, OutputFormat outFormat) {
        this.doc = doc;
        this.outFormat = outFormat;
    }

    public void writeAsHTML(OutputStream out) throws Exception {
        HTMLWriter writer = new HTMLWriter(out, this.outFormat);
        writer.write(this.doc);
        writer.flush();
    }
}

3) Test case:

@Test
public void testPrintHTML() {
    Foo foo = new Foo();

    Document doc = foo.createDocument();
    PrintHTML creator = new PrintHTML(doc);
    try {
        creator.writeAsHTML(new FileOutputStream(base + "printHtml.html"));
        System.out.println("PrintHTML successfully");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
