Conversion process
About conversion
WorkZone PDF Engine converts one or more documents into a PDF document. To be converted into a PDF document, a document passes through the following stages:
- Getting the source file.
  Files can be passed into the service in the following ways:
  - Through URI (Uniform Resource Identifier) - A URI to the file must be specified.
  - By stream - The file is provided directly to the service.
  - Using JSON with a list of URIs to the files. Documents in the database can be referenced by providing a valid OData URI to the document.
- Converting the file into a PDF document.
- Updating the table of contents (TOC), if Microsoft Office Word documents are converted.
  The table of contents in the source Word document is automatically updated and used as the table of contents in the PDF document to ensure a correct TOC. The source Word document itself is not modified.
  All document fields in the headers and footers are automatically updated, except the Filename field, which is left unchanged to preserve the source document filename.
- Applying bookmarks.
  In a merged document, a bookmark can be added for every source file. The bookmark name in the document is the link to the file.
- Applying watermarks (optional).
  A watermark is added to the pages of the converted PDF document.
- Applying headers and footers (optional).
  Headers and footers are added to the pages of the converted PDF document.
- Merging.
  Two or more documents are merged into a single PDF document.
- Applying optimization and compression.
- Providing the PDF output file.
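The order of the stages above can be illustrated with a small sketch. This is only a model of the sequence described in this topic, not the WorkZone PDF Engine API: the `ConversionOptions` class, the `planned_stages` function, and the detection of Word documents by file extension are all hypothetical names and assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConversionOptions:
    """Hypothetical switches for the optional stages (not WorkZone settings)."""
    watermark: bool = False
    headers_footers: bool = False


def planned_stages(source_names: List[str], options: ConversionOptions) -> List[str]:
    """Return, in order, the stages a conversion request would pass through."""
    if not source_names:
        raise ValueError("at least one source file is required")

    stages = ["get source file", "convert to PDF"]

    # Assumption: Word sources are recognized by extension in this sketch.
    if any(name.lower().endswith((".doc", ".docx")) for name in source_names):
        stages.append("update table of contents")

    stages.append("apply bookmarks")

    # Optional stages run only when requested.
    if options.watermark:
        stages.append("apply watermarks")
    if options.headers_footers:
        stages.append("apply headers and footers")

    # Merging only applies when two or more documents are converted.
    if len(source_names) > 1:
        stages.append("merge")

    stages.extend(["optimize and compress", "provide PDF output"])
    return stages


if __name__ == "__main__":
    for stage in planned_stages(["report.docx", "appendix.pdf"],
                                ConversionOptions(watermark=True)):
        print(stage)
```

Keeping the stage list as plain data makes it easy to see which steps are conditional (TOC update, watermarks, headers and footers, merging) and which always run.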
Conversion priorities
The WorkZone PDF Crawler constantly polls the documents database for documents to convert. In each iteration, the PDF Crawler service checks for documents to convert and then converts any documents it finds.
If there are no documents to convert, the PDF Crawler service waits for a number of seconds before starting a new search iteration.
The length of the wait period between PDF Crawler iterations depends on the origin of the document conversion. User-requested and changed documents have a higher priority than documents that have previously failed, so the wait between PDF Crawler iterations is shorter for user-requested and changed documents.
The table below provides an overview of the priority of the conversion by request origin as well as how often the PDF Crawler starts its search for new documents to convert (the Iteration Start column).
The Priority column indicates which documents will be processed first by the PDF Crawler in an iteration if multiple documents are found.
| Priority | Origin | Description | Iteration Start (seconds) |
|---|---|---|---|
| 1 | User-requested documents | Documents a user requests to be displayed as a PDF document, for example in the WorkZone Client. | 5 |
| 2 | Changed documents | Documents that have been edited or changed since the last conversion. | 5 |
| 3 | Archived documents | Documents that have been archived since the last conversion. | 60 |
| 4 | Policy documents | Documents identified for conversion through PDF Crawler policies. | 60 |
| 5 | Pending documents | Documents that are in the process of being converted to PDF. | 60 |
| 6 | Retry documents | Documents where the conversion process has failed and the conversion status is set to Retry. | 60 |
| 7 | Failed documents | Documents where the conversion process has failed and the conversion status is set to Failed. | 60 |
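
As a rough illustration of how the table can be read, the following sketch orders a batch of found documents by origin priority and looks up the iteration start interval for an origin. The priorities and intervals are copied from the table; the `QueuedDocument` class, the function names, and the short origin keys are hypothetical and do not reflect how the PDF Crawler is actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# origin -> (priority, iteration start in seconds), copied from the table above.
ORIGINS: Dict[str, Tuple[int, int]] = {
    "user-requested": (1, 5),
    "changed": (2, 5),
    "archived": (3, 60),
    "policy": (4, 60),
    "pending": (5, 60),
    "retry": (6, 60),
    "failed": (7, 60),
}


@dataclass(frozen=True)
class QueuedDocument:
    """Hypothetical record for a document found in one crawler iteration."""
    document_id: str
    origin: str


def processing_order(found: Iterable[QueuedDocument]) -> List[QueuedDocument]:
    """Sort documents so that lower priority numbers are processed first.

    sorted() is stable, so documents with the same origin keep the order
    in which they were found.
    """
    return sorted(found, key=lambda doc: ORIGINS[doc.origin][0])


def iteration_start_seconds(origin: str) -> int:
    """Return the Iteration Start value (in seconds) for a request origin."""
    return ORIGINS[origin][1]


if __name__ == "__main__":
    batch = [
        QueuedDocument("D-3", "failed"),
        QueuedDocument("D-1", "changed"),
        QueuedDocument("D-2", "user-requested"),
    ]
    for doc in processing_order(batch):
        print(doc.document_id, doc.origin, iteration_start_seconds(doc.origin))
```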