PDF Crawler and Engine configuration

Prerequisite: To configure the WorkZone PDF settings, you must be assigned the CONFIGADM access code and WorkZone PDF must be installed.

About WorkZone PDF Crawler and Engine

The WorkZone PDF module is used to convert existing documents to PDF documents. The WorkZone PDF module consists of two sub-modules:

  • WorkZone PDF Engine: A stateless web service that performs real-time conversions of files to PDF documents.
  • WorkZone PDF Crawler: Used for deferred, asynchronous conversions of selected documents, either manually or with policies.

If an organization uses WorkZone PDF with policies, WorkZone PDF Crawler searches for documents that have not been converted and that match one of the defined policies. When a document that matches a policy is found, it is converted and then saved back to the WorkZone Content Server as a PDF document.
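
The policy-driven flow described above can be pictured with a small Python sketch. The `Document` and `Policy` types, the `matches` predicate, and the `convert` callback are hypothetical names introduced only for illustration; they are not part of any WorkZone PDF API, and the real Crawler may work differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Document:
    doc_id: str
    extension: str
    has_pdf: bool = False  # True once a PDF rendition has been saved


@dataclass
class Policy:
    name: str
    matches: Callable[[Document], bool]  # hypothetical policy predicate


def crawl(documents: Iterable[Document],
          policies: List[Policy],
          convert: Callable[[Document], None]) -> List[str]:
    """Convert every unconverted document that matches at least one policy."""
    converted = []
    for doc in documents:
        if doc.has_pdf:
            continue  # already converted, nothing to do
        if any(policy.matches(doc) for policy in policies):
            convert(doc)        # stand-in for the actual conversion and save
            doc.has_pdf = True
            converted.append(doc.doc_id)
    return converted
```

Documents that match no policy are left untouched, which mirrors the description: only unconverted documents that match a defined policy are picked up.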

For more detailed information, see the WorkZone PDF Administrator Guide.

Configure WorkZone PDF

  1. On the start page, click PDF.
  2. Select the Engine configuration or Crawler configuration tab depending on your needs.
  3. Apply your changes.

The settings are described below. Each entry gives the field name, followed by its description and any notes.

Conversion

   

Bypass page bounds verification for files

Specify the file types for which the check of object boundaries is bypassed during PDF conversion. The default file extension is PDF, meaning that the check for out-of-bounds objects in PDF documents is bypassed.

 

Convert documents with their attachments

Enables conversion of document attachments together with the original document.

 

Suppress content that exceeds page bounds

If disabled, the document is converted even if its content exceeds the page bounds. If enabled, content that exceeds the page bounds fails to convert.

Note: Documents with the states UL (Locked), ARK (Archived), and AFS (Closed) are not checked for out-of-bounds content, even if this setting is enabled. Documents with these states are locked in their final state regardless of content that exceeds the page bounds.
 

Document processing retries

Specifies how many times WorkZone PDF Crawler tries to convert a document.

This setting is only available for WorkZone PDF Crawler.

 

Document processing time-out

Specifies a time-out after which WorkZone PDF Crawler stops waiting for the conversion to finish. If the conversion does not finish within the specified time-out, an error message is written to the dvs_render_info table.

This setting is only available for WorkZone PDF Crawler.
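
How the retry and time-out settings interact can be sketched in Python as follows. This is a minimal illustration, not the Crawler's implementation: `convert_with_retries` and its parameters are hypothetical, the retry setting is assumed to mean the total number of attempts, and a plain list stands in for the dvs_render_info table.

```python
import concurrent.futures
import time
from typing import Callable, List


def convert_with_retries(convert: Callable[[], None],
                         max_attempts: int,
                         timeout_seconds: float,
                         error_log: List[str]) -> bool:
    """Try a conversion up to max_attempts times, waiting at most
    timeout_seconds for each attempt. Errors are appended to error_log,
    which stands in for the dvs_render_info table in this sketch."""
    for attempt in range(1, max_attempts + 1):
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = executor.submit(convert)
        try:
            future.result(timeout=timeout_seconds)  # stop waiting after the time-out
            return True
        except concurrent.futures.TimeoutError:
            error_log.append(f"attempt {attempt}: timed out after {timeout_seconds}s")
        except Exception as exc:  # the conversion itself failed
            error_log.append(f"attempt {attempt}: {exc}")
        finally:
            executor.shutdown(wait=False)  # do not block on a stuck conversion
    return False
```

Note that the sketch only stops waiting for a slow conversion; it does not cancel the work already in progress.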

For PDF documents

 

 

PDF forms:

  • Flatten
  • Show

Select whether to include PDF forms as regular content or to show them as review edits when converting PDF documents.

 

Flatten: PDF forms will be included in the final document as regular content (text and images).

Show: PDF forms will be shown in the final document as review edits (and will remain editable).

 

Annotations:

  • Flatten
  • Show
  • Hide

Select whether to include annotations as regular content, show them as review edits, or hide them completely when converting PDF documents.

 

Flatten: Annotations will be included in the final document as regular content (text and images).

Show: Annotations will be shown in the final document as review edits.

Hide: Annotations will be hidden (excluded) from the final document.

 

For Word documents

 

 

Show comments

Enable to show comments when converting Word documents.

 

Use document culture for date format fields

  • Enable to apply the culture setting of each page to all date format fields on that page when generating a PDF rendition of the document. A document with multiple culture settings then produces a PDF document with different date formats, depending on the culture setting of each page.
  • If the setting is disabled, the culture of the first page in the document is applied to all date format fields in the document. A document with multiple culture settings then produces a PDF document with the same date format on all pages.
  • The Use document culture for date format fields setting is disabled by default.
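
The difference between the two modes can be illustrated with a short Python sketch. The `render_dates` function and the date patterns in `DATE_PATTERNS` are assumptions made for this example; the formats that WorkZone PDF actually applies for a given culture may differ.

```python
from datetime import date
from typing import List

# Illustrative strftime patterns per culture. These are assumptions for
# this sketch, not the formats WorkZone PDF actually applies.
DATE_PATTERNS = {"en-US": "%m/%d/%Y", "da-DK": "%d-%m-%Y"}


def render_dates(page_cultures: List[str],
                 value: date,
                 use_document_culture: bool) -> List[str]:
    """Return the rendered date for each page of a document.

    page_cultures holds the culture setting of each page, in page order.
    """
    rendered = []
    for culture in page_cultures:
        # Enabled: each page uses its own culture.
        # Disabled: every page uses the culture of the first page.
        effective = culture if use_document_culture else page_cultures[0]
        rendered.append(value.strftime(DATE_PATTERNS[effective]))
    return rendered
```

For a two-page document with the cultures da-DK and en-US, the enabled mode yields two different date formats, while the disabled mode repeats the da-DK format on both pages.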

    Revisions:

    • Accept
    • Show
    • Reject

    Select whether to accept, show, or reject existing revisions when converting Word documents.

     

    For Excel documents

     

     

    Show comments

    Enable to show comments when converting Excel documents.

     

    Revisions:

    • Accept
    • Show
    • Reject

    Select whether to accept, show, or reject existing revisions when converting Excel documents.

     

    For PowerPoint documents

     

     

    Show annotations

    Enable to show annotations when converting PowerPoint documents.

     

    Show notes

    Enable to show notes when converting PowerPoint documents.

     

    Content settings

       

    Header

    Define the content of the header.
    You can specify the header content as normal text or use the following Microsoft field codes:
    {Title}: The document title.
    {Date}: The current date based on defined culture settings.
    {Page}: The current page number in the document.
    {NumPages}: The total number of pages in the document.

    Custom header text example:

    <setting name="Header" serializeAs="String"> <value> 'My Custom Header'</value> </setting>

    Field code header example:

    <setting name="Header" serializeAs="String">

    <value> {page}</value>

    </setting>

    Header styles

    Define the default style of the header that will be used if the style is not specified in the request. This parameter must be in JSON format. You can specify the formatting of the header, for example, bold, italic, as well as which font to use. All parameters are optional.

    Example of a header in bold using Arial as font:

    { "Bold": true, "Font": "Arial"}
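
Because the style must be valid JSON, it can help to check the string before entering it. The following Python sketch uses only the standard `json` module; the `check_style` helper is a hypothetical name introduced for this example and is not part of WorkZone PDF.

```python
import json


def check_style(style_text: str) -> dict:
    """Parse a style string and confirm that it is a JSON object."""
    style = json.loads(style_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(style, dict):
        raise ValueError("style must be a JSON object")
    return style


# The example from the text parses cleanly.
check_style('{ "Bold": true, "Font": "Arial"}')
```

Common mistakes that this catches include single quotes around keys and a capitalized `True`, neither of which is valid JSON.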

    Watermark

    Specify the text to print as a watermark on each page.

     

    Watermark styles

    Define the default style of the watermark. This parameter must be in JSON format. You can specify color, transparency, and font. All parameters are optional. You specify them as follows:

    Color: Any valid HTML color string, such as a standard color name (e.g. red), a hex value (e.g. #FF0000), or an RGB value (e.g. 255,0,0).

    Transparency: A value from 0 to 100, where 100 represents full opacity.

    Font: A font name.

    Note: You do not need to specify font size. The watermark will be sized to fit the page automatically.

    Example of a watermark with red as the font color, medium transparency, and using Verdana as the font:

    { "Color": "Red", "Transparency": "55", "Font": "Verdana"}
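
The rules above can be checked before a style is saved. The Python sketch below is a rough pre-check under stated assumptions: `check_watermark_style` is a hypothetical helper, the color patterns cover only the three forms named in the text, and a color name is accepted if it is purely alphabetic without verifying that it is a real HTML color name.

```python
import json
import re
from typing import List

HEX = re.compile(r"^#[0-9A-Fa-f]{6}$")            # e.g. #FF0000
RGB = re.compile(r"^\d{1,3},\s*\d{1,3},\s*\d{1,3}$")  # e.g. 255,0,0
NAME = re.compile(r"^[A-Za-z]+$")                 # e.g. red (name not verified)


def check_watermark_style(style_text: str) -> List[str]:
    """Return a list of problems found in a watermark style string."""
    style = json.loads(style_text)
    problems = []

    color = style.get("Color")
    if color is not None:
        text = str(color).strip()
        if RGB.match(text):
            if any(int(part) > 255 for part in text.split(",")):
                problems.append("RGB components must be between 0 and 255")
        elif not (HEX.match(text) or NAME.match(text)):
            problems.append("Color is not a name, hex value, or RGB value")

    transparency = style.get("Transparency")
    if transparency is not None:
        try:
            value = int(transparency)
        except (TypeError, ValueError):
            problems.append("Transparency is not a whole number")
        else:
            if not 0 <= value <= 100:
                problems.append("Transparency must be between 0 and 100")

    return problems
```

An empty list means that no problems were found. All keys are optional, so a missing key is never reported.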

    Footer

    Define the content of the footer. You can specify the footer content as normal text or use the following Microsoft field codes:

    {Title}: The document title.

    {Date}: The current date based on defined culture settings.

    {Page}: The current page number in the document.

    {NumPages}: The total number of pages in the document.

    Custom footer text example:

    <setting name="Footer" serializeAs="String"> <value> 'My Custom Footer'</value> </setting>

    Field code footer example:

    <setting name="Footer" serializeAs="String">

    <value> " Page {page} of {NumPages}"</value>

    </setting>
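
To picture what a field code template produces on a given page, consider the following Python sketch. It is an illustration only and not how WorkZone PDF resolves field codes: `expand_field_codes` is a hypothetical function, field code names are assumed to be case-insensitive (the examples use both {Page} and {page}), and the date is rendered in ISO format because the real format depends on the culture settings.

```python
import re
from datetime import date


def expand_field_codes(template: str, title: str, page: int,
                       num_pages: int, today: date) -> str:
    """Replace {Title}, {Date}, {Page}, and {NumPages} in a template."""
    values = {
        "title": title,
        "date": today.isoformat(),  # assumption: real output follows culture settings
        "page": str(page),
        "numpages": str(num_pages),
    }

    def replace(match: "re.Match[str]") -> str:
        key = match.group(1).lower()  # assumption: codes are case-insensitive
        return values.get(key, match.group(0))  # leave unknown codes untouched

    return re.sub(r"\{(\w+)\}", replace, template)


print(expand_field_codes("Page {page} of {NumPages}", "Report", 2, 10, date(2024, 1, 5)))
```

With the values shown, the footer template from the example above renders as "Page 2 of 10" on the second page of a ten-page document.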

    Footer styles

    Define the default style of the footer that will be used if the style is not specified in the request.

    This parameter must be in JSON format. You can specify the formatting of the footer, for example bold, italic, as well as which font to use.

    All parameters are optional.

    Example of a footer in bold using Arial as font:

    { "Bold": true, "Font": "Arial"}

    Output settings

       

    PDF format:

    • PDF
    • PDF/A (Archival)
    • PDF/UA (Universal Accessibility, supports PDF and Word files only)

    Select the default PDF format that will be used:

    • PDF: Standard PDF format for generic usage. For the best performance, we recommend using PDF as the default format.

    • PDF/A: A PDF format that is used for long-term storage.
    • PDF/UA: A PDF format that ensures accessibility for people with disabilities who use assistive technology to navigate and read electronic content. See Working with PDF/UA documents for more information.

      Note:

      • If the PDF/UA rendition fails, the document reverts to the standard PDF format.
      • For PDF reports, if PDF/UA validation fails for some of the merged documents or the summary part of the report, the report will be saved as a standard PDF file (without the "UA" badge on the document icon). See PDF/UA validation for PDF reports for more information.
     

    Compress PDF output

    Enable to reduce the size of the PDF output. This parameter is particularly important for large documents, which may take a long time to download from a server.

     

    Optimize PDF output for the web

    Enable to optimize the PDF output for the web. This parameter is particularly important for large documents, which may take a long time to download from a server.