
Converting Word 2007 to HTML with POI

2015-03-04 16:49
poi 3.9

http://poi.apache.org/

Java code


import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class word07toHtml {

    public static void canExtractImage() throws IOException {
        File f = new File("d:/test/test.docx");
        if (!f.exists()) {
            System.out.println("Sorry, the file does not exist!");
            return;
        }
        // Accept the extension in any letter case (.docx, .DOCX, .Docx, ...)
        if (!f.getName().toLowerCase().endsWith(".docx")) {
            System.out.println("Enter only MS Office 2007+ files");
            return;
        }

        // try-with-resources closes the input stream even if conversion fails
        try (InputStream in = new FileInputStream(f)) {
            // 1) Load DOCX into XWPFDocument
            XWPFDocument document = new XWPFDocument(in);

            // 2) Prepare XHTML options (here we set the IURIResolver to
            // load images from a "word/media" folder)
            File imageFolderFile = new File("d:/test/media");
            XHTMLOptions options = XHTMLOptions.create().URIResolver(
                    new FileURIResolver(imageFolderFile));
            options.setExtractor(new FileImageExtractor(imageFolderFile));
            //options.setIgnoreStylesIfUnused(false);
            //options.setFragment(true);

            // 3) Convert XWPFDocument to XHTML; the output stream is closed
            // (and flushed) when this block exits
            try (OutputStream out = new FileOutputStream(
                    new File("d:/test/test.htm"))) {
                XHTMLConverter.getInstance().convert(document, out, options);
            }
        }
    }

    public static void main(String[] args) {
        try {
            canExtractImage();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
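
The listing hard-codes both d:/test/test.docx and d:/test/test.htm. If the input path ever becomes a parameter, the output name can be derived from it. The following is only a sketch using plain string handling; the class OutputNames and the method htmlNameFor are hypothetical names, not part of POI or the converter library.

```java
final class OutputNames {
    private OutputNames() {}

    /**
     * Replaces a trailing ".docx" (any letter case) with ".htm".
     * Names without that extension are rejected rather than guessed at.
     */
    static String htmlNameFor(String docxName) {
        String suffix = ".docx";
        int cut = docxName.length() - suffix.length();
        // cut <= 0 means there is no base name in front of the extension
        if (cut <= 0
                || !docxName.regionMatches(true, cut, suffix, 0, suffix.length())) {
            throw new IllegalArgumentException("Not a .docx file name: " + docxName);
        }
        return docxName.substring(0, cut) + ".htm";
    }

    public static void main(String[] args) {
        System.out.println(htmlNameFor("d:/test/test.docx")); // d:/test/test.htm
        System.out.println(htmlNameFor("Report.DOCX"));       // Report.htm
    }
}
```

The returned name could then be passed to new File(...) in place of the literal "d:/test/test.htm".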

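The org.apache.poi.xwpf.converter packages require an extension library; they are not shipped with POI.

If your project uses Maven, the configuration below is all you need. If it does not, download the jars from the attachment to this post.

Version 1.0.4 corresponds to POI 3.9.

Version 1.0.0 corresponds to POI 3.8.

The imports that depend on the extension include:

import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;

import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;

Required jars

XML code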


<dependencies>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>org.apache.poi.xwpf.converter.core</artifactId>
        <version>1.0.4</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
        <version>1.0.4</version>
    </dependency>
</dependencies>

If you get this error:

java.lang.ClassNotFoundException: org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTSectPrImpl$1HeaderReferenceList

add ooxml-schemas-1.1.jar to the classpath.

java.lang.ClassNotFoundException: org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTBodyImpl$1TblList

This error also means that ooxml-schemas-1.1.jar is needed.

With Maven the jar is downloaded automatically. Without Maven, download ooxml-schemas-1.1.rar from the attachment to this post and extract it.
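
Both errors surface only once the conversion is already running. A quick way to find out earlier whether the schema jar is visible is to probe for one of its classes at startup. This is a minimal sketch: ClasspathCheck and isOnClasspath are hypothetical names, and the probed class name is the outer class taken from the first stack trace above.

```java
final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String probe = "org.openxmlformats.schemas.wordprocessingml"
                + ".x2006.main.impl.CTSectPrImpl";
        if (!isOnClasspath(probe)) {
            System.out.println("ooxml-schemas jar seems to be missing: " + probe);
        } else {
            System.out.println("Schema classes found.");
        }
    }
}
```

Finding the outer class does not guarantee that every nested class is present, so this only catches the case where the jar is absent altogether.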

One problem remains: tables in the converted HTML have no borders. This is not yet solved.
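
One possible workaround, not a fix in the converter itself, is to post-process the generated HTML and inject a CSS rule that draws borders. The sketch below assumes the converted page has been read into a String and that it contains a closing head tag; for a fragment without one, the rule is simply prepended. TableBorderCss, addTableBorders and the CSS rule are illustrative choices, and inline styles emitted by the converter may still override the rule.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TableBorderCss {
    // Illustrative rule; adjust width and colour as needed.
    static final String BORDER_STYLE = "<style type=\"text/css\">"
            + "table{border-collapse:collapse;}td,th{border:1px solid #000;}"
            + "</style>";

    private static final Pattern HEAD_END =
            Pattern.compile("</head\\s*>", Pattern.CASE_INSENSITIVE);

    private TableBorderCss() {}

    /** Inserts BORDER_STYLE just before the closing head tag, or prepends it. */
    static String addTableBorders(String html) {
        Matcher m = HEAD_END.matcher(html);
        if (!m.find()) {
            return BORDER_STYLE + html; // fragment without a head section
        }
        return html.substring(0, m.start()) + BORDER_STYLE + html.substring(m.start());
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head>"
                + "<body><table><tr><td>x</td></tr></table></body></html>";
        System.out.println(addTableBorders(html));
    }
}
```

Placing the rule last in the head lets it win over earlier stylesheet rules of the same specificity.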