PDFlib pCOS provides a simple and elegant facility for retrieving any information from a PDF document which is not part of the page contents. For example, PDF metadata, interactive elements (links etc.), or page dimensions can easily be queried with pCOS.
With pCOS you can extract a variety of interesting items and create output for different purposes. By processing multiple PDF documents with a single call you can easily create summaries of document info entries, page formats, fonts, or any other property. Combined with tabular output this provides a powerful PDF administration tool.
There are many everyday pCOS applications for PDF practitioners, but you can also use PDFlib pCOS as a tool for learning or debugging PDF. Here are some typical scenarios:
- Check incoming documents for predefined criteria
- Check PDFs for security problems and active content (JavaScript etc.)
- Check documents for quality assurance before publication
- Identify problem files in a large collection
- Create property summaries for document management
- Learn details of PDF data structures
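A check against predefined criteria, as in the first scenarios above, might be organized along the following lines. This is a minimal sketch, not part of pCOS: it assumes the document properties have already been extracted into a dictionary, and the property names (`encrypted`, `has_javascript`, `unembedded_fonts`) and the criteria themselves are hypothetical.

```python
from typing import Callable

# Hypothetical property names; a real workflow would fill such a
# dictionary from values queried with pCOS.
Properties = dict[str, object]
Criterion = Callable[[Properties], bool]

# Each criterion returns True when the document passes the check.
CRITERIA: dict[str, Criterion] = {
    "not encrypted": lambda p: not p.get("encrypted", False),
    "no JavaScript": lambda p: not p.get("has_javascript", False),
    "all fonts embedded": lambda p: p.get("unembedded_fonts", 0) == 0,
}


def failed_criteria(props: Properties) -> list[str]:
    """Return the names of all criteria the document does not meet."""
    return [name for name, check in CRITERIA.items() if not check(props)]


if __name__ == "__main__":
    incoming = {"encrypted": False, "has_javascript": True, "unembedded_fonts": 0}
    print(failed_criteria(incoming))
```

Keeping the criteria in a table of named checks makes it simple to report which rule a problem file violated when scanning a large collection.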
Note: Please review the following Platform and Product Requirements, Limitations of the Evaluation version and Supported Programming Languages before downloading or purchasing this software.
Feature Summary
- Extract a Variety of Items from PDF Documents
Major Features in Depth
Information Retrieval
PDFlib pCOS offers a simple query interface, without the need for low-level parser programming. With PDFlib pCOS you can extract a variety of interesting items, such as:
- Document info entries and XMP metadata.
- General information: linearization and tagged PDF status, encryption details and permission settings, number of pages and fonts.
- All fonts with their name, embedding status, etc.
- Images with size, bit depth, color space, compression, etc.
- Color space details for all PDF color variations.
- Target URLs and coordinates of Web links.
- All bookmarks along with the corresponding page numbers, e.g. to create a table of contents.
- Form field data: full field names, contents, position, etc.
- Page size, CropBox, page rotation.
- Status of PDF/X and PDF/A compliant files.
- List or extract file attachments.
- Layer names, page labels, article threads.
- Annotation details.
- List all comments along with the reviewer's name.
- Digital signature details: name of signature field(s), signed/unsigned, name of signer, date and reason of signature.
- Extract ICC output intent profiles from PDF/X or PDF/A files.
- List PDFlib block properties.
- JavaScript on document, page, annotation, or field level
Supported Input
PDFlib pCOS supports all relevant flavors of PDF input:
- All PDF versions up to PDF 1.7 (Acrobat 8)
- RC4 and AES encryption (password may be required)
- Sophisticated security model: even if you don't know the password, you can query certain pieces of information as long as this doesn't violate the document author's intentions
- Damaged PDF input documents will be repaired if possible
Output Formats
PDFlib pCOS can create output for different purposes:
- Plain text output
- Tabular output for processing with a spreadsheet/database
- Binary data for reuse, e.g. ICC profiles or file attachments
- Unicode text output in UTF-8 or UTF-16 formats
- User-defined output formats for custom post-processing
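Tabular output lends itself to simple post-processing scripts. The sketch below shows one way to read such a table and summarize it. The sample data, the column names (`filename`, `pages`, `fonts`), and the tab separator are assumptions made for illustration, not the documented pCOS output layout.

```python
import csv
import io

# Hypothetical sample: column names and the tab separator are assumptions,
# not the documented pCOS output format.
SAMPLE = "filename\tpages\tfonts\nreport.pdf\t12\t4\nflyer.pdf\t2\t1\n"


def load_rows(text: str) -> list[dict[str, str]]:
    """Parse tab-separated text with a header row into dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def total_pages(rows: list[dict[str, str]]) -> int:
    """Sum the page counts over all documents in the table."""
    return sum(int(row["pages"]) for row in rows)


if __name__ == "__main__":
    rows = load_rows(SAMPLE)
    print(total_pages(rows))
```

The same rows could equally be loaded into a spreadsheet or database. The script only illustrates that a header row plus one line per document is enough for a summary across many files.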
pCOS Paths - Simple Syntax for PDF Objects
Instead of getting bogged down by complex tree structures, e.g. for bookmarks or form fields, you can easily access PDF objects by using the simple pCOS path syntax. It offers convenient shortcuts for accessing commonly used PDF objects, such as pages, fonts, bookmarks, form fields etc.
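To give a feel for the path syntax, the sketch below assembles a few path strings for pages and fonts. It only builds strings and does not call the pCOS library. The helper names are hypothetical, and the path spellings are assumptions that should be checked against the pCOS Path Reference.

```python
# Sketch only: builds pCOS path strings; it does not call the pCOS library.
# Path spellings are assumptions to verify against the pCOS Path Reference.

# Paths that do not depend on an index.
DOCUMENT_PATHS = ["length:pages", "length:fonts", "/Info/Title"]


def page_paths(page_index: int) -> list[str]:
    """Return illustrative paths for one page (zero-based index)."""
    prefix = f"pages[{page_index}]"
    return [f"{prefix}/width", f"{prefix}/height"]


def font_paths(font_index: int) -> list[str]:
    """Return illustrative paths for one font (zero-based index)."""
    prefix = f"fonts[{font_index}]"
    return [f"{prefix}/name", f"{prefix}/embedded"]


if __name__ == "__main__":
    for path in DOCUMENT_PATHS + page_paths(0) + font_paths(0):
        print(path)
```

Each string names one object directly, so a query for the width of the first page needs no traversal of the page tree in application code.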
pCOS Library or Command-Line Tool?
pCOS is available as a programming library (component) for various development environments, and as a command-line tool for batch operations. Both offer similar features, but are suitable for different deployment tasks.
Platform and Product Requirements
- Windows Server 2000/2003, Apple Mac OS X Server PPC/Intel, Linux x86/x86_64/EM64T, Sun Solaris 7-10 on x86/sparc, IBM AIX 5, HP-UX 10.20/11i on PA-RISC/IA-64, Windows 2000/XP/Vista, Apple Mac OS X PPC/Intel.
Limitations of the Evaluation Version
- The evaluation version enables all product features and produces fully valid output, but has some volume limitations.
Supported Programming Languages
- COM for use with VB, ASP, and many other languages
- C and C++
- Java, including servlets and Java Application Server
- .NET for use with C#, VB.NET, ASP.NET, etc.
- Perl
- PHP