
Using xpdf to read PDF files and rename them based on their content

2014-05-15 22:26

package com.sunlei;

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;

public class Rename {

    public static void main(String[] args) {
        File dir = new File("D:\\pdf"); // folder containing the PDF files
        // Path to pdfinfo.exe. The executable and the PDF path are passed to
        // exec() as separate arguments, so no trailing space is needed here
        // and PDF paths that contain spaces still work.
        String pdfinfo = "D:\\TDDOWNLOAD\\xpdfbin-win-3.03\\bin32\\pdfinfo.exe";
        File[] files = dir.listFiles(); // every entry in the folder
        if (files == null) {
            System.out.println(dir + " is not a readable directory");
            return;
        }
        for (File pdf : files) {
            if (!pdf.isFile()) {
                continue; // skip subdirectories
            }
            try {
                String firstLine = readFirstLine(pdfinfo, pdf);
                // pdfinfo prints the title first, as "Title:" padded with
                // spaces to 16 characters (the offset 16 was found by
                // experiment). If the PDF has no title, the first line is
                // some other field, so check the label before cutting.
                if (firstLine == null || !firstLine.startsWith("Title:")
                        || firstLine.length() < 16) {
                    System.out.println(pdf.getName()
                            + ": not renamed, the PDF has no title");
                    continue;
                }
                // Replace characters that Windows does not allow in file names.
                String subTitle = firstLine.substring(16)
                        .replaceAll("[\\\\/:*?\"<>|]", " ").trim();
                // Compare strings with equals(); != only compares references.
                if (subTitle.isEmpty() || subTitle.equals("untitled")) {
                    System.out.println(pdf.getName()
                            + ": not renamed, the PDF has no usable title");
                    continue;
                }
                File newFile = new File(dir, subTitle + ".pdf"); // add the extension
                if (pdf.renameTo(newFile)) { // rename the file
                    System.out.println(pdf.getName() + ": renamed successfully");
                } else {
                    System.out.println(pdf.getName() + ": rename failed");
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    /** Runs pdfinfo on one file and returns the first line of its output. */
    private static String readFirstLine(String pdfinfo, File pdf) throws IOException {
        Process process = Runtime.getRuntime().exec(
                new String[] { pdfinfo, pdf.getAbsolutePath() });
        BufferedReader br = new BufferedReader(
                new InputStreamReader(process.getInputStream())); // output of the exe
        try {
            return br.readLine(); // only the first line (the title) is needed
        } finally {
            br.close(); // always close the stream, even on failure
            process.destroy();
        }
    }
}
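
When launching pdfinfo from Java, it matters how the command is handed to `exec`. `Runtime.exec(String)` is documented to split the command string on whitespace with a `StringTokenizer`, without any quoting support, so a PDF path containing a space arrives as several arguments. The following is a small sketch that only illustrates that splitting; the class name `ExecTokenDemo` and the file name `annual report.pdf` are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ExecTokenDemo {

    // Splits a command line on whitespace with a plain StringTokenizer,
    // which is how Runtime.exec(String) is documented to break it up.
    static List<String> tokenize(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical PDF path with a space in the file name.
        String cmd = "pdfinfo.exe " + "D:\\pdf\\annual report.pdf";
        // prints [pdfinfo.exe, D:\pdf\annual, report.pdf]
        System.out.println(tokenize(cmd));

        // With the String[] overload of exec, each element is one argument,
        // so the path stays in one piece.
        String[] safe = { "pdfinfo.exe", "D:\\pdf\\annual report.pdf" };
        System.out.println(safe.length + " arguments");
    }
}
```

For that reason, passing the executable and the file path as separate array elements (or using `ProcessBuilder`) is the safer choice when file names are not under your control.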

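1: Preparation

I searched online for C++ and Java libraries that can read PDF files and eventually settled on xpdf [xpdf download link]. I downloaded the Windows build and tried it out step by step.

To experiment, I unpacked the archive, read the README, and went into the bin32 folder, which contains a number of executables.

I copied a PDF file into that folder, opened a command prompt there, and ran: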

pdftotext.exe 5026a001.pdf

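Sure enough, this produced 5026a001.txt, and the converted text looked right, so the tools run without any configuration.

Next I started on the program, written in Java. My first idea was to convert each PDF to text with pdftotext and then parse the text file, but then I noticed another executable, pdfinfo.exe, which looked like a tool for printing a PDF's document information.

On the command line: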

pdfinfo.exe 5026a001.pdf

The document information was printed to the screen, including the title, so the title can be read directly from pdfinfo's output.
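
Since pdfinfo prints one "Label: value" pair per line, the title can also be looked up by its label instead of by position. Below is a minimal sketch of that idea, assuming every field sits on its own line and the label ends at the first colon. The class name `PdfInfoFields` and the sample text are invented for illustration and are not real pdfinfo output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PdfInfoFields {

    // Parses "Label:   value" lines into an ordered map of label -> value.
    // The label is everything before the first colon; the value is the rest,
    // trimmed, so values that contain colons (such as dates) are kept whole.
    static Map<String, String> parse(String output) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : output.split("\\r?\\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a "Label: value" line
            }
            String label = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            fields.put(label, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample in the "Label: value" layout, for illustration only.
        String sample = "Title:          A Sample Paper\n"
                + "Producer:       Some PDF Tool\n"
                + "Pages:          8\n";
        Map<String, String> fields = parse(sample);
        System.out.println(fields.get("Title")); // prints A Sample Paper
    }
}
```

Looking the title up by label means a PDF without a title yields `null` rather than the value of whichever field happens to be printed first.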
