Java: Libraries for Reading and Writing Common File Formats
2017-04-06 18:09
How do I access the XYZ file format in Java?
Specifications for many file formats can be found at Wotsit. A large database of file extensions can be found at www.file-extensions.org and dotwhat.net
And if you don't know what type a given file is, there are various ways to determine it programmatically: http://www.rgagnon.com/javadetails/java-0487.html
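One common way to do this is to compare the first bytes of a file against known signatures ("magic numbers"). The following is a minimal sketch of that idea and is not taken from the linked page; the class name FileTypeSniffer and the small signature table are illustrative assumptions and cover only a handful of formats.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: guesses a file type from its leading bytes.
final class FileTypeSniffer {
    // Label -> leading bytes. Illustrative only, far from complete.
    private static final Map<String, int[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("PDF", new int[] {0x25, 0x50, 0x44, 0x46});             // "%PDF"
        SIGNATURES.put("PNG", new int[] {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("GIF", new int[] {0x47, 0x49, 0x46, 0x38});             // "GIF8"
        SIGNATURES.put("JPEG", new int[] {0xFF, 0xD8, 0xFF});
        SIGNATURES.put("RTF", new int[] {0x7B, 0x5C, 0x72, 0x74, 0x66});       // "{\rtf"
        // ZIP containers include JAR, ODF and Office Open XML files.
        SIGNATURES.put("ZIP", new int[] {0x50, 0x4B, 0x03, 0x04});
        // OLE2 compound documents include legacy DOC, XLS, PPT and MSG files.
        SIGNATURES.put("OLE2", new int[] {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1});
    }

    /** Returns the label of the first matching signature, or "unknown". */
    static String detect(byte[] head) {
        for (Map.Entry<String, int[]> e : SIGNATURES.entrySet()) {
            int[] sig = e.getValue();
            if (head.length < sig.length) {
                continue;
            }
            boolean match = true;
            for (int i = 0; i < sig.length; i++) {
                if ((head[i] & 0xFF) != sig[i]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return e.getKey();
            }
        }
        return "unknown";
    }

    /** Reads up to 16 bytes from the start of the file and classifies them. */
    static String detect(Path file) throws IOException {
        byte[] head = new byte[16];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < head.length
                    && (n = in.read(head, total, head.length - total)) > 0) {
                total += n;
            }
        }
        return detect(Arrays.copyOf(head, total));
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            System.out.println(name + ": " + detect(Paths.get(name)));
        }
    }
}
```

A signature only identifies the container: a result of "ZIP" or "OLE2" still has to be narrowed down by looking inside the file, which is where the libraries listed below come in.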
An interesting article about Microsoft's binary file formats, especially DOC and XLS, is "Why are the Microsoft Office file formats so complicated? (And some workarounds)". It also mentions some alternatives to dealing with those formats directly.
Access
JDBC/ODBC bridge - JDBC driver for ODBC databases; it shipped with the JDK through Java 7 and was removed in Java 8. On Linux, you'll have to get ODBC up and running first: http://www.unixodbc.org/
Jackcess - library to read and write MDB files
HXTT Access - commercial pure Java JDBC driver for MS Access
CGM
cgmva - an applet to display CGM files; comes with source code
CHM
JChm - library to read CHM files
Excel
Apache Commons CSV, Ostermiller Utils, CSVObjects, CSVBeans, opencsv, Java CSV, Super CSV - libraries to read and write CSV files. CSV is not as easy to read and write as it first looks - once all the special cases are considered, one might as well use a library.
POI - library to read and write XLS and XLSX files
JExcelAPI - library to read and write XLS (but not XLSX) files
jXLS - library for writing XLS files based on templates
Java2Excel - library for creating Excel files based on Collections
It is possible to use JDBC to read Excel files
Obba works with Excel spreadsheets on Windows
OpenXLS - "OpenXLS is the open source version of ExtenXLS - a Java spreadsheet SDK that allows you to read, modify and create Java Excel spreadsheets from your Java applications."
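To illustrate why CSV is harder than it first looks, the sketch below splits a single record while honoring quoted fields, embedded commas and doubled quotes. It is a simplified illustration (the class name CsvLine is hypothetical), not a replacement for the CSV libraries listed above: it does not handle line breaks inside quoted fields, alternative separators or character encodings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one CSV record into its fields.
final class CsvLine {
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote means a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are ordinary characters here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        // A naive line.split(",") would cut the second field in two.
        System.out.println(parse("1,\"Smith, John\",\"He said \"\"hi\"\"\","));
    }
}
```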
Gedcom
GDBI
GenJ
HDF (Hierarchical Data Format)
Java products by the HDF Group
Image and movie files
ImageJ - Java image processing application and library that has plugins for lots of image file formats
JIMI - library to read and write BMP, CUR, GIF, ICO, JPEG, PICT, PNG, PSD, Sun Raster, TGA, TIFF, XBM and XPM. There's a plugin for using JIMI with ImageJ, which also includes a couple of JIMI patches.
GIF write, TIFF, RAW, PNM and JPEG2000 read/write support for ImageIO: JAI Image I/O Tools
Reading QuickTime files in Java. Apple's QT4J library is unfortunately no longer supported.
MP4 parser
INI
ini4j "is a simple Java API for handling configuration files in Windows .ini format."
Matlab
JMatIO - Matlab's MAT-file I/O in JAVA
OpenDocument (ODF)
Basic Java code for reading ODF files is here.
ODFDOM is a Java library for accessing ODF files.
jDocument.org has an open-source library for accessing all Open Document file types.
Obba works with OpenOffice spreadsheets
Office2FO converts ODF documents to XSL-FO documents, making possible further transformations (like conversion to PDF using FOP)
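Because a packaged ODF document is an ordinary ZIP archive whose main body is stored in an entry named content.xml, a first look inside needs nothing beyond the JDK. The following minimal sketch extracts that entry as a string. The class name OdfContentReader is illustrative, this is not the code behind the link above, and UTF-8 is assumed for the XML; interpreting the markup is better left to ODFDOM or a similar library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: pulls content.xml out of an ODF package.
final class OdfContentReader {
    /** Returns the text of content.xml, or null if the archive has no such entry. */
    static String readContentXml(InputStream odf) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"content.xml".equals(entry.getName())) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                // Assumption: the XML is UTF-8 encoded.
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(readContentXml(in));
        }
    }
}
```

The method takes an InputStream rather than a file name so that it works equally well on uploads, classpath resources and in-memory test data.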
Office Open XML
These are the XML-based Microsoft Office formats (DOCX, XLSX, PPTX) introduced with Office 2007.
OpenXML4J
docx4j - create and edit docx documents using a JAXB content model matching the WordML schema
Apache POI implements these formats.
OpenOffice Java API
OpenOffice can read a number of file formats, and makes them accessible through its API. A starting point might be this article, this article and of course the OO developer site.
Some introductory information about the OO file format can be found here and here
oooview is an OO Viewer written in Java.
JODConverter is a Java library that uses the OO Java API to perform document conversions between any formats supported by OO
Outlook
The Apache POI project developed some code that can read the textual contents of Outlook's MSG files. This page talks about that.
Xena can convert multiple file formats (including MSG) to XML. Either the result of that conversion, or Xena's source code, may be helpful.
JPST can read and extract PST files.
PDF
PDF is a hard-to-read format. Often the best one can do is try to extract the text contained in a PDF file.
iText - library to create PDFs; see ItextExample for a code example. The older version iText 2 (which uses a more permissive license) is also available: jar file, javadocs
FOP - library to create PDFs (and other formats) from XML by using XSL-FO transformations
FlyingSaucer - library to convert CSS-styled XHTML to PDF
PDFBox - library that can merge, split and print PDFs, extract text, create images from PDFs, encrypt/decrypt PDFs, fill in PDF forms and more
PDF Clown - general-purpose library to read/create/modify PDF files. It features a rich multi-layered object model that allows access even to each single content stream instruction.
JPedal - library for viewing and printing PDFs, can also extract text (how to print PDFs); commercial (the LGPL version provides PDF viewing only)
PDFTextStream - commercial library to extract text from PDFs
PDF Renderer is a more up-to-date PDF viewer that renders using Java2D. Download, Examples, Printing PDFs
ICEPdf is another library that can render PDFs.
Qoppa offers numerous libraries for PDF-related tasks
Aspose.Pdf for Java is a commercial library for reading and writing PDFs
jPod is a rich PDF manipulation and rendering framework
PowerPoint
The Apache POI project developed some code that can open and (to a limited extent) edit PPT files. This page talks about it.
Project
The MPXJ library can work with several Project file formats.
PST
LibPST is a C library that could be used through JNI.
Xena can convert multiple file formats (including PST) to XML. Either the result of that conversion, or Xena's source code, may be helpful.
java-libpst is a pure Java library that can access 64-bit PST files.
QIF (used by Microsoft Money and Quicken)
Buddi and Eurobudget are Java applications that can import and export QIF files (and thus contain code you may be able to use in your application). Both are licensed under the GPL.
RTF
jRTF can create RTFs
iText 2 can create RTFs: jar file, javadocs
JavaCC - a lexer/parser generator for which an RTF grammar is available. From that an RTF reader can be constructed.
Visio
The Apache POI project developed some code that can read Visio files. This page talks about that.
Word
POI - library to read and write DOC and DOCX files. It can also be used for extracting the text of a document.
WordApi.exe is a native Windows component with a Java interface, which lets you create Word documents and alter Word templates. Some impressions about it can be found here.
Java2Word - library to create Word documents, especially reports, on the fly.
Something else?
If you encounter an obscure format for which no library is available, it may be feasible to create a reader for it if you have a file format description (which may be available on Wotsit, see link above). Several libraries, so-called lexers and parsers, are available that help in creating a reader, especially if the file format is ASCII, and not binary. You will need knowledge of regular expressions, though. Some file formats that have been tackled using this approach include RTF, CSV, HPGL and PBM/PGM/PPM. Lexers are easier to start with, but parsers can do more of the work for you. All these have ready-to-use examples on their web sites.
Lexers: JFlex (introductory article in the JavaRanch Journal)
Parsers: Antlr, SableCC, JavaCC
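As a small illustration of the regular-expression approach for an ASCII format, the sketch below tokenizes and reads a plain PGM image (magic number P2) by hand with java.util.regex. It is a simplified example under stated assumptions, not output of JFlex or any of the parser generators listed: the class name PlainPgmReader is hypothetical, and only the P2 variant with # comments is handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads a plain (ASCII, "P2") PGM image from a string.
final class PlainPgmReader {
    // Two token kinds: a comment running to the end of the line,
    // or a run of characters that are neither whitespace nor '#'.
    private static final Pattern TOKEN = Pattern.compile("#[^\\n\\r]*|[^\\s#]+");

    /** Lexer step: returns all tokens, with comments dropped. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String token = m.group();
            if (!token.startsWith("#")) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Parser step: checks the header and fills a [row][column] array of gray values. */
    static int[][] parse(String text) {
        List<String> t = tokenize(text);
        if (t.size() < 4 || !t.get(0).equals("P2")) {
            throw new IllegalArgumentException("not a plain PGM file");
        }
        int width = Integer.parseInt(t.get(1));
        int height = Integer.parseInt(t.get(2));
        int maxVal = Integer.parseInt(t.get(3));
        if (t.size() < 4 + width * height) {
            throw new IllegalArgumentException("too few pixel values");
        }
        int[][] pixels = new int[height][width];
        for (int i = 0; i < width * height; i++) {
            int value = Integer.parseInt(t.get(4 + i));
            if (value < 0 || value > maxVal) {
                throw new IllegalArgumentException("pixel out of range: " + value);
            }
            pixels[i / width][i % width] = value;
        }
        return pixels;
    }

    public static void main(String[] args) {
        int[][] image = parse("P2\n# tiny test image\n3 2\n255\n0 128 255\n10 20 30\n");
        System.out.println(image.length + " rows, " + image[0].length + " columns");
    }
}
```

Splitting the work into tokenize and parse mirrors the lexer/parser division described above: for a larger grammar, the first step is what JFlex would generate and the second is what Antlr, SableCC or JavaCC would take over.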