Antenna House PDFXML Conversion Library (English edition)
Convert PDF to XML
A product by Antenna House
Sold by ComponentSource in Japan since 2004.
Price: from ¥ 2,075,370 (tax included) Version: V2.0 MR1 Updated: Dec 6, 2017
Antenna House PDFXML Conversion Library allows you to unlock the content of your legacy PDFs. If you want to reuse content from old PDFs, you no longer need to retype it or go through the trouble of reconstructing your documents' content from the PDF binary format. Antenna House PDFXML is designed for organizations that need to convert large volumes of PDFs into XML, HTML5, XSL-FO, DocBook, or other file formats. The Antenna House PDFXML Conversion Library extracts text, tables, and images from PDFs and converts them to an XML format called "AHPDFXML". The data can then be transformed to any desired output by applying XSLT stylesheets.
Benefits and uses for XML include:
The Antenna House PDFXML Conversion Library is a C/C++ library, supplied with a command-line program, that generates a richly structured XML document from PDFs by using Antenna House's PDF Analyzer Technology.
How it works:
What is AHPDFXML
The XML format output by this conversion library is called the Antenna House PDFXML format. It is a verbose format, defined by Antenna House, that represents the content of a PDF in an intermediate XML structure. It is created by converting the contents of a PDF into XML expressions for text, tables, and images.
Antenna House PDFXML consists of multiple files:
The resulting XML can then be transformed with XSLT to any format that represents the document structure, such as XSL-FO, DocBook, HTML5, or plain text. With Antenna House PDFXML, you can make use of PDF content in a wide range of environments. Transforming PDF content to XML makes the data much easier to reuse, transform, manipulate, and search. Applying an XSLT stylesheet also gives you more flexibility in processing the data according to how it will be used.
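As a rough illustration of this "intermediate XML, then transform" step, the sketch below turns a small XML document containing text, a table, and an image into HTML. It rests on several assumptions: the element names (`document`, `page`, `text`, `table`, `row`, `cell`, `image`) are hypothetical and are not the real AHPDFXML schema, the sample data is made up, and plain Python with `xml.etree.ElementTree` stands in for an XSLT stylesheet because the Python standard library does not include an XSLT processor.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical intermediate XML for illustration only.
# This is NOT the real AHPDFXML schema; element names are assumptions.
SAMPLE = """\
<document>
  <page number="1">
    <text>Quarterly report</text>
    <table>
      <row><cell>Region</cell><cell>Sales</cell></row>
      <row><cell>East</cell><cell>120</cell></row>
    </table>
    <image src="page1-img1.png"/>
  </page>
</document>
"""


def to_html(xml_source: str) -> str:
    """Map the hypothetical text/table/image elements to simple HTML."""
    root = ET.fromstring(xml_source)
    out = ["<html><body>"]
    for page in root.iter("page"):
        for node in page:
            if node.tag == "text":
                out.append(f"<p>{escape(node.text or '')}</p>")
            elif node.tag == "table":
                out.append("<table>")
                for row in node.findall("row"):
                    cells = "".join(
                        f"<td>{escape(cell.text or '')}</td>"
                        for cell in row.findall("cell")
                    )
                    out.append(f"<tr>{cells}</tr>")
                out.append("</table>")
            elif node.tag == "image":
                src = escape(node.get("src", ""), quote=True)
                out.append(f'<img src="{src}">')
    out.append("</body></html>")
    return "\n".join(out)


if __name__ == "__main__":
    print(to_html(SAMPLE))
```

In a real workflow the same mapping would more likely be written as XSLT templates matched against the actual AHPDFXML elements, so that one set of extracted data can feed several stylesheets (HTML5, DocBook, XSL-FO) without re-running the PDF conversion.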
PDF Support
Antenna House PDFXML Conversion Library supports: