Setting Metadata

The tdsync tool has an extensible mechanism for loading the metadata associated with files being synced. By default, it can load metadata from a JSON file, from a CSV file, or from metadata artifacts generated from certain Content Management System exports. Custom metadata drivers can be implemented in Java.

The metadataLoader configuration setting supports three values by default.

json
This is the default metadata loader. It loads metadata from the HARP-META/metadata.json file, if present.
csv
This loader will look for HARP-META/metadata.csv, and load metadata from that file, if present.
sdl
This loader looks for companion files that share the name of the file being synced but carry a .MET extension, as exported by some SDL content management systems.

A custom metadata loader can be specified by using its fully-qualified Java class name.

JSON Metadata Loader

If you download a zip archive from a project or folder using the Titania Delivery admin UI, you'll notice that the downloaded archive contains a file called HARP-META/metadata.json. This file contains all of the user-specified metadata for the files included in the archive. The JSON metadata loader will include the metadata specified by this file.

The structure in this file is a mapping of file paths (optionally including a fragment identifier) to arrays of metadata entries. Each entry has a name (the metadata key), a value array of strings, and a boolean cascades field indicating whether the metadata field should cascade.

If the filename includes a fragment identifier (a "#", followed by a string representing a meaningful fragment identifier for a location or region within the document), the metadata will be associated with the fragment, and will be merged with the document-level metadata for searching. Fragment metadata entries should include the special metadata name, _td.fragmentLabel, to supply a readable display label for the fragment. Fragment metadata may be used for navigating to specific locations in a document, when displayed in a portal viewer.

{
  "/file1.dita": [
    {"name": "foo", "value": ["bar"], "cascades": true},
    {"name": "multi", "value": ["a", "b", "c"], "cascades": false}
  ],
  "/path/to/file2.dita": [
    {"name": "foo", "value": ["zed"], "cascades": true},
    {"name": "multi", "value": ["x", "y", "z"], "cascades": false}
  ],
  "/path/to/file3.pdf#page=12": [
    {"name": "about", "value": ["Dopey", "Sneezy"]},
    {"name": "_td.fragmentLabel", "value": ["Page 12"]}
  ]
}

This loader takes no parameters.
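
The structure described above can also be generated by a script rather than downloaded from the admin UI. The following is a minimal sketch only: the helper names (entry, build_metadata, write_metadata) are hypothetical, and it assumes that the file is UTF-8 encoded and that HARP-META sits directly under the directory being synced, neither of which is stated here.

```python
import json
import tempfile
from pathlib import Path


def entry(name, values, cascades=None):
    """Build one metadata entry in the shape used by the example above."""
    item = {"name": name, "value": list(values)}
    if cascades is not None:
        item["cascades"] = cascades
    return item


def build_metadata():
    """Map file paths (optionally with a #fragment) to lists of entries."""
    return {
        "/file1.dita": [
            entry("foo", ["bar"], cascades=True),
            entry("multi", ["a", "b", "c"], cascades=False),
        ],
        # A fragment key; _td.fragmentLabel supplies its display label.
        "/path/to/file3.pdf#page=12": [
            entry("about", ["Dopey", "Sneezy"]),
            entry("_td.fragmentLabel", ["Page 12"]),
        ],
    }


def write_metadata(root, metadata):
    """Write metadata to <root>/HARP-META/metadata.json and return the path.

    Assumption: UTF-8 encoding and this location relative to the sync root.
    """
    target = Path(root) / "HARP-META" / "metadata.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_metadata(tmp, build_metadata())
        print(path.read_text(encoding="utf-8"))
```

Building the file through json.dumps avoids hand-editing slips such as a missing colon or comma, which would make the whole file unparseable.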

CSV Metadata Loader

This metadata loader reads file metadata from metadata.csv. The first column in the file must contain the relative path to the file. The first row names the metadata fields. A metadata name ending with an asterisk (*) indicates that its values should cascade; the asterisk is not included in the metadata name. Multiple values for a field can be represented by additional rows in the CSV structure.

Fragment metadata within a file may be indicated by appending "#" and a meaningful identifier for a location or region in the document. Each fragment should include a value for the special metadata name, _td.fragmentLabel, to supply a readable display label for the fragment.
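
To make the CSV conventions concrete, the sketch below reads a small sample and groups it by path. It is one plausible reading of the rules above, not the tdsync implementation: it assumes that additional rows for a multi-valued field repeat the file path in the first column and that empty cells contribute no value. The sample data and the function name read_csv_metadata are invented for illustration.

```python
import csv
import io

# Hypothetical sample: "foo*" cascades, "multi" has three values spread
# over three rows, and the last row describes a fragment of a PDF.
SAMPLE = """path,foo*,multi,_td.fragmentLabel
file1.dita,bar,a,
file1.dita,,b,
file1.dita,,c,
path/to/file3.pdf#page=12,,,Page 12
"""


def read_csv_metadata(text):
    """Group rows into {path: {field: {"value": [...], "cascades": bool}}}."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Column 0 is the path. A trailing "*" marks a cascading field and is
    # not part of the metadata name.
    fields = [
        (h[:-1], True) if h.endswith("*") else (h, False)
        for h in header[1:]
    ]
    result = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        path, cells = row[0], row[1:]
        per_file = result.setdefault(path, {})
        for (name, cascades), cell in zip(fields, cells):
            if cell == "":
                continue  # assumption: empty cells add no value
            slot = per_file.setdefault(
                name, {"value": [], "cascades": cascades}
            )
            slot["value"].append(cell)
    return result


if __name__ == "__main__":
    for path, metadata in read_csv_metadata(SAMPLE).items():
        print(path, metadata)
```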

This loader can take the following parameters.

metadataLoader.csvFile
The relative path to the CSV file. metadata.csv by default.
metadataLoader.csvEncoding
The character encoding of the CSV file. Uses the platform's default encoding by default.
metadataLoader.csvFormat
There is no one standard CSV format. This parameter specifies the "flavor" of CSV for the file. Valid values are as follows.
  • EXCEL (the default)
  • TDF (for tab-delimited format)
  • RFC4180

SDL Metadata Loader

SDL content management offerings can include the metadata associated with documents when they are exported from the system. The metadata is encoded in companion files that have the same name as the files to which they apply, with .MET on the end. These files contain an XML structure describing the metadata for the file. The SDL metadata loader will read those files and apply the metadata to the file when syncing.

The behavior of this loader can be configured using a separate XML file specifying, among other things, which metadata to include and exclude, mappings from SDL metadata names to Titania Delivery metadata names, and which metadata do and do not cascade.

<metadata-config>
  <!-- Omit metadata not listed in this file; cascade by default -->
  <defaults omit="false" cascades="true" includeEmptyValues="false"/>

  <field name="FOO" tdname="foo" cascades="true"/>
  <field name="MULTI" tdname="multi" cascades="false"/>
</metadata-config>

The full DTD definition for this configuration file is as follows.

<!ELEMENT metadata-config (defaults?, field*)>

<!ELEMENT defaults EMPTY>
<!ATTLIST defaults
  omit (true|false) 'false'
  cascades (true|false) 'true'
>

<!ELEMENT field EMPTY>
<!ATTLIST field
  name CDATA #REQUIRED
  tdname CDATA #IMPLIED
  omit (true|false) 'false'
  cascades (true|false) #IMPLIED
>

The configuration file is specified via the metadataLoader.configFile configuration parameter.
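
The interaction between the defaults element and individual field elements can be sketched as follows. This is an illustration of how the DTD above might be interpreted, not the loader's actual code: it assumes that a field's cascades attribute falls back to the defaults value, that a missing tdname falls back to name, and that unlisted fields are dropped only when defaults has omit="true". The INTERNAL field, the input dictionary, and the function names are hypothetical, and the real .MET format is not modeled.

```python
import xml.etree.ElementTree as ET

CONFIG = """<metadata-config>
  <defaults omit="false" cascades="true"/>
  <field name="FOO" tdname="foo" cascades="true"/>
  <field name="MULTI" tdname="multi" cascades="false"/>
  <field name="INTERNAL" omit="true"/>
</metadata-config>"""


def as_bool(text, fallback):
    """Interpret a 'true'/'false' attribute, using fallback when absent."""
    return fallback if text is None else text == "true"


def load_rules(xml_text):
    """Return (default_omit, default_cascades, per-field rules)."""
    root = ET.fromstring(xml_text)
    defaults = root.find("defaults")
    has_defaults = defaults is not None
    # DTD attribute defaults: omit='false', cascades='true'.
    d_omit = as_bool(defaults.get("omit") if has_defaults else None, False)
    d_casc = as_bool(defaults.get("cascades") if has_defaults else None, True)
    fields = {}
    for f in root.findall("field"):
        fields[f.get("name")] = {
            "tdname": f.get("tdname", f.get("name")),
            "omit": as_bool(f.get("omit"), False),
            "cascades": as_bool(f.get("cascades"), d_casc),
        }
    return d_omit, d_casc, fields


def apply_rules(rules, sdl_metadata):
    """Map {sdl_name: [values]} to a list of Titania-style entries."""
    d_omit, d_casc, fields = rules
    entries = []
    for name, values in sdl_metadata.items():
        rule = fields.get(name)
        if rule is None:
            if d_omit:
                continue  # unlisted fields are dropped
            rule = {"tdname": name, "omit": False, "cascades": d_casc}
        if rule["omit"]:
            continue
        entries.append({
            "name": rule["tdname"],
            "value": list(values),
            "cascades": rule["cascades"],
        })
    return entries


if __name__ == "__main__":
    sample = {"FOO": ["bar"], "MULTI": ["a", "b"],
              "INTERNAL": ["x"], "OTHER": ["y"]}
    for e in apply_rules(load_rules(CONFIG), sample):
        print(e)
```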

The SDL Metadata Loader does not support fragment metadata.

Custom Metadata Loaders

Custom metadata loaders can be implemented in Java by any class that implements the com.titania.harp.client.tdsync.MetadataDriver interface. See the apidocs in the Client Connector distribution for full documentation of this interface. Place the JAR file or files containing any custom implementation, along with any dependencies, into the lib directory of the Client Connector distribution to make it available.