How Content Processing Works

When a new file is added to a Titania Delivery project, it is inspected by Titania Delivery and any work deemed necessary for that file is performed.

Titania Delivery can process many different kinds of files. Some examples are:

  • Full-text search indexing of DITA, PDF, and Microsoft Office documents.
  • Flattening, metadata gathering, and HTML preview generation of DITA maps and topics.
  • Conversion of Markdown files to HTML.
  • Analyzing the relationships between content, such as hyperlinks and graphic references in HTML, DITA, and Markdown files.
  • Conversion of print-oriented graphics, such as TIFF or EPS, into more web-friendly formats like PNG.

File processing is done in the background after content is uploaded, so there is a short lag between when content is uploaded and when processing is complete. Until a file has been processed, some of its information, such as its preview, incoming/outgoing relationships, and calculated metadata, will not be available. You can see how much processing remains for an entire project by clicking the Content Processing tab when the project is selected.

Note: It is a good idea to wait until all content processing is completed before making any other changes to the project such as adding/removing metadata, changing the XML document types, or adding/deleting content. Taking such actions during content processing could cause errors that require re-processing the entire project to fix.

HTML and Markdown Processing

Markdown files will be automatically converted to HTML for processing. Then both HTML and Markdown files will be analyzed for relationships to other documents. This means that if these files have hyperlinks or references to graphics, they will function as expected when rendered in the portal.

In addition, Titania Delivery determines whether an HTML file is a full HTML document, with a <head> and <body>, or is HTML <body> content only. Portals will serve the former as standalone pages, but will serve the latter wrapped using the portal's theme configuration.

HTML <meta> tags that specify @name and @content attributes will be captured as embedded properties, and mapped to Titania Delivery metadata using the configured metadata mappings.

<meta name="mdName" content="mdValue">
<h1>Document Title</h1>
<p>This is the document content.</p>
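
As a rough illustration of the two checks described above, the following Python sketch collects <meta> name/content pairs and reports whether both a <head> and a <body> were seen. It is not Titania Delivery's implementation; the names MetaCollector and inspect_html are hypothetical, and only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects <meta name=... content=...> pairs and records every tag seen."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self.seen_tags = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        self.seen_tags.add(tag)
        if tag == "meta":
            attr_map = dict(attrs)
            name = attr_map.get("name")
            content = attr_map.get("content")
            # Only tags carrying BOTH attributes count as embedded properties.
            if name is not None and content is not None:
                self.properties[name] = content


def inspect_html(text):
    """Return (embedded_properties, is_full_document) for an HTML string."""
    collector = MetaCollector()
    collector.feed(text)
    collector.close()
    is_full_document = {"head", "body"} <= collector.seen_tags
    return collector.properties, is_full_document
```

Calling inspect_html on the three-line sample above would yield the property mdName with the value mdValue and classify the file as <body> content, since it has no <head> or <body> element.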

Markdown documents may begin with metadata in YAML format, delimited from the rest of the content by three dashes. Such metadata will be treated as embedded properties and mapped to Titania Delivery metadata using the configured metadata mappings.

---
mdName: mdValue
---

# Document Title

This is the document content.
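
To make the front-matter convention concrete, here is a minimal Python sketch that separates a leading metadata block from the Markdown body. It is an assumption-laden illustration, not the product's parser: the function name split_front_matter is hypothetical, and it understands only flat "key: value" lines, whereas real YAML allows lists, nesting, and quoting that would need a full YAML parser.

```python
def split_front_matter(text):
    """Return (properties, body) for a Markdown string.

    Only flat `key: value` lines are understood; this is not a YAML parser.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Look for the closing delimiter.
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # no closing delimiter: treat everything as body
    properties = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep and key.strip():
            properties[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return properties, body
```

Applied to the sample document above, this returns the single property mdName with the value mdValue, plus a body that starts at the "# Document Title" heading.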

For details on setting up metadata mapping configurations, refer to the Developer's Guide.

Non-DITA XML Processing

When an XML document with an associated document type is loaded into a project, its processing is as follows.

  1. The document is decorated with namespaced attributes describing the role of each element, as configured in the document type's configuration files.
  2. References to other files, like graphics, are recorded.
  3. If the document contains chunk boundaries, the document is chunked at those boundaries, and a namespaced table of contents is inserted in place of the chunks.
  4. Each chunk is then added to the search index as an independent document.

Additional details about this process are described in the Developer's Guide.
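
The chunking step can be pictured with a small Python sketch. Everything specific in it is an assumption made for illustration: the namespace urn:example:td-sketch, the role attribute, the toc and entry element names, and the chunk-N reference format are all hypothetical, and only root-level chunk boundaries are handled. The actual attribute names and behavior are defined by the document type configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and attribute names, chosen only for this sketch.
NS = "urn:example:td-sketch"
ROLE = f"{{{NS}}}role"


def chunk_document(xml_text):
    """Split root-level elements marked role="chunk" out of the document.

    Returns (root, chunks): root gets a single toc element where the first
    chunk used to be, and chunks is the list of detached chunk elements.
    """
    root = ET.fromstring(xml_text)
    chunks = [child for child in root if child.get(ROLE) == "chunk"]
    if not chunks:
        return root, []
    toc = ET.Element(f"{{{NS}}}toc")
    # Remember where the first chunk sat before anything is removed.
    position = list(root).index(chunks[0])
    for number, chunk in enumerate(chunks, start=1):
        entry = ET.SubElement(toc, f"{{{NS}}}entry")
        entry.set("ref", f"chunk-{number}")
        entry.text = chunk.findtext("title", default="")
        root.remove(chunk)
    root.insert(position, toc)
    return root, chunks
```

In this sketch each element in the returned chunks list stands for one independent document that would then be handed to the search indexer.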

DITA Processing

When a DITA map is processed, Titania Delivery generates a contextualized copy of each of the topics referenced by the map. Each contextualized copy has the map-specific details applied:

  • Cascading map and topicref metadata will be inserted.
  • Key references will be resolved.
  • Map-based related links, either from map structure or from relationship tables, will be injected.
  • Map-specified profiling will be applied.

Note: You can swap between the uncontextualized topic and its contextualized copies using the Context dropdown on the topic's preview.

So in effect, a topic exists in a Titania Delivery system N+1 times, where N is the number of <topicref> references to that topic: once as the uncontextualized topic, plus one contextualized copy per reference. When viewing a topic in a portal, which contextualized copy is displayed depends on the link used to access it. When clicking on a rendered map, you will be taken to the contextualized copy corresponding to the selected topicref. When viewing search results, all available contexts are listed, and the user is able to select the appropriate one.

For example, consider a DITA topic that is referenced by three maps.

When the maps are processed, contextualized copies of the topic will be made, with all map processing applied.

Any metadata fields on the DITA maps that are configured to cascade will then be copied to the contextualized copies of the topic.
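
The N+1 count can be worked through with a short Python sketch. It is a simplified illustration under stated assumptions: the function name count_topic_instances is hypothetical, maps are passed in as XML strings, only elements literally named topicref are counted (specialized topicrefs and key-based references are ignored), and href values are compared as written without path normalization.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_topic_instances(map_sources):
    """Return {href: N + 1}, where N is the number of topicrefs to that href."""
    references = Counter()
    for source in map_sources:
        root = ET.fromstring(source)
        for topicref in root.iter("topicref"):
            href = topicref.get("href")
            if href:
                references[href] += 1
    # One uncontextualized topic plus one contextualized copy per reference.
    return {href: count + 1 for href, count in references.items()}
```

For a topic referenced once from each of three maps, this reports four instances: the uncontextualized topic and three contextualized copies.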

When a topic referenced by a map is modified or removed, or when a formerly-missing topic is uploaded, the map is flagged as needing re-processing, so that the contextualized copy for the new version of the topic can be generated. You can check this flag on the Details tab of the map as the Children Modified entry.

Item Representation Types

After processing, each item will have one or more renditions stored in the system. Each rendition is classified as one of five item representation types. A processed rendition of an XML document may have several instances corresponding to its profiling and contextualization combinations. The item representation types are:

ORIGINAL
The original content item that was uploaded to the project. You can view the ORIGINAL rendition of text files by pressing the View Source button above the admin application project item preview display.
Note: Many browsers display XML files as HTML using a built-in stylesheet. To see the actual XML, you can either use the browser control to "View Page Source", or download the item from the project.
PROCESSED
The item has been processed according to its content type, possibly resulting in changes to the markup or content. XML documents are processed according to their doctype definition. DITA XML documents are processed according to DITA processing expectations. Non-web-friendly images are converted to PNG format.
FLATTENED
For DITA topics, the FLATTENED rendition will have remote references resolved into the content.
MONOLITH
For DITA maps, the MONOLITH rendition is an XML document containing the fully-expanded contents of all topics referenced from the map.
PREVIEW
A rendition of the original item in a content type suitable for display in a browser. For XML files, this will be an HTML transformation of the XML, using the doctype's preview.xsl stylesheet.
Not all representation types are appropriate for all item content types.

Re-Processing Content

Content is automatically re-processed every time it is uploaded. However, there may be times when you want to re-process some or all of the content in a project without re-uploading the content. For example:

  • The project's document type associations are modified, and you need to re-process XML content using the newly-available document types.
  • Preview stylesheets or metadata rules are updated.
  • Titania Delivery is upgraded with new content processing capabilities.
  • Content linked from a DITA map is updated, necessitating the regeneration of contextualized copies.

You can re-process content at the project, folder, or file level using the Re-Process Files button. At the project and folder level, this button has a menu with three options:

  • Re-Process Updated files will reprocess only files with the Children Modified flag set to 'true'.
  • Re-Process All files will reprocess all files in the project or folder unconditionally.
  • Delete Project Search index and Re-Process All files will delete all search records for the project before reprocessing all files. The search index will be rebuilt during processing.
    Note: This action will disrupt the portal display of project contents and make search results unreliable while files are being reprocessed.
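
The difference between the menu options can be summarized in a few lines of Python. This is only a sketch of the selection logic as described above: the ProjectFile class, the children_modified field, and the mode strings are hypothetical names, and the search-index deletion performed by the third option is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ProjectFile:
    path: str
    children_modified: bool = False  # mirrors the Children Modified flag


def select_for_reprocessing(files, mode):
    """Return the files a given re-process option would act on."""
    if mode == "updated":
        # Re-Process Updated files: only files flagged Children Modified.
        return [f for f in files if f.children_modified]
    if mode in ("all", "delete_index_and_all"):
        # Both remaining options re-process every file unconditionally.
        return list(files)
    raise ValueError(f"unknown mode: {mode}")
```

The third option selects the same files as the second; what differs is that the project's search records are deleted first and rebuilt during processing.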

As a general rule, it's a good idea to re-process updated files whenever you upload topics without also uploading new versions of the maps that reference them.

Note: If you notice unexpected behavior within a project or a portal, a good first step in troubleshooting is to re-process the specific project, or all projects associated with the portal. If search results include items that no longer exist in the project, deleting the project search index will resolve that problem.