Search Syntax

Titania Delivery provides a search component to quickly and easily find content within a Portal. While TD retrieves relevant search results using plain-text searches, it also allows for advanced metadata-based queries. The search syntax is the same whether entered in the Portal user interface or used as the query attribute of any of the search tags in the Titania Tag Library. Portal Theme developers can utilize Titania Delivery's search capabilities to customize the content of their Portal pages.

Standard search results are "scored" as a function of the frequency of the search term in a given document and the frequency of that search term within all documents in the Portal. All queries are case insensitive.
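The exact scoring formula is not given here. As a rough illustration only, the following Python sketch assumes a classic TF-IDF style calculation, in which a term counts for more the more often it appears in a document and the rarer it is across all documents. The function name and the sample documents are hypothetical and are not part of TD.

```python
import math


def tf_idf_scores(term, documents):
    """Score each document for one term (assumed TF-IDF, not TD's actual formula)."""
    term = term.lower()  # queries are case insensitive
    tokenized = {name: text.lower().split() for name, text in documents.items()}
    # Number of documents that contain the term at least once.
    containing = sum(1 for words in tokenized.values() if term in words)
    if containing == 0:
        return {name: 0.0 for name in documents}
    # Rarer terms across the collection receive a larger weight.
    idf = math.log(len(documents) / containing) + 1.0
    return {name: words.count(term) * idf for name, words in tokenized.items()}


docs = {
    "docA": "Titania Delivery delivers Titania content",
    "docB": "Delivery of content",
    "docC": "Portal themes",
}
scores = tf_idf_scores("TITANIA", docs)
```

In this sketch only docA receives a non-zero score, because it is the only document containing the term.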

Full-Text Indexing

TD can index the textual contents of the following file formats.

  • XML and HTML Content
  • PDF
  • Microsoft Word, Excel, and PowerPoint
  • Text and Markdown

All other content types, regardless of whether or not they contain textual content, only have their filename indexed.

Indexed Fields

Beyond doing a full-text index, TD indexes text into numerous different fields where possible. These fields have different weightings on search results, so that search results with "hits" in one field may be returned with a higher relevancy score than those with a hit in a different field. The indexed fields, in order of their weighting, are:

  1. title
  2. searchTitle
  3. keywords

The remaining indexed fields are all weighted the same, and all less than the above. Those include:

  • itemKey
  • projectKey
  • filename
  • id
  • contextKey
  • created
  • lastModified
  • content

For example, given two documents, if docA has the term "Titania" in its title field, and docB has that same term only in its content field, a query for "Titania" will return both documents, with docA ranked before docB.

If a document does not have a title, its filename is used for the title field. If a document does not have a search title, its title is used for the searchTitle field.
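The fallback rules above can be sketched in Python as follows. The function name and its parameters are hypothetical. The sketch also assumes that the two rules chain, so a document with neither a title nor a search title ends up with its filename in both fields.

```python
def resolve_title_fields(filename, title=None, search_title=None):
    """Return the values indexed in the title and searchTitle fields."""
    # Rule 1: a missing title falls back to the filename.
    effective_title = title or filename
    # Rule 2: a missing search title falls back to the (effective) title.
    effective_search_title = search_title or effective_title
    return {"title": effective_title, "searchTitle": effective_search_title}


untitled = resolve_title_fields("guide.pdf")
titled = resolve_title_fields("guide.xml", title="Install Guide")
```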

TD extracts the title and keywords from XML content. In addition, it extracts the searchTitle from DITA topics and maps. The title, description (for generic text), and keywords on non-text-based content, like zip archives or images, can be set in the admin application.

Single Word Searches

The simplest search in TD is a single word. The search engine will match on exact matches and also partial matches. For example, "Big" will match "Big", "Bigger", and "Biggest". The search is case insensitive.

Multi-Word Searches

Search queries containing multiple words are considered as "OR"s by the search engine. For example, a search for Titania Delivery will match documents containing either Titania or Delivery, with documents containing both terms being returned with a higher relevancy than those with only one.

To search for the exact phrase Titania Delivery, place the query in double quotes, as in "Titania Delivery".

Operators

TD supports "AND" and "OR" operators. For example, to find documents containing both the words Titania and Delivery, use the query Titania AND Delivery. Complex queries can be achieved by combining these operators. For example, the query Titania AND Delivery OR Sync will return documents that contain the term "Titania" and also either "Delivery" or "Sync".

The not operator (-) can be used to exclude documents that contain a certain term. A search for -Titania will return all documents that do not contain the term "Titania".
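When queries are assembled in code, for example before being passed to the query attribute of a search tag, small helpers can keep the syntax consistent. The Python sketch below is illustrative only. The helper names are hypothetical, and the helpers simply build query strings. They do not escape special characters or call TD.

```python
def phrase(text):
    """Wrap text in double quotes so it is searched as a phrase."""
    return '"' + text + '"'


def all_of(*terms):
    """Require every term by joining with the AND operator."""
    return " AND ".join(terms)


def any_of(*terms):
    """Accept any term by joining with the OR operator."""
    return " OR ".join(terms)


def exclude(term):
    """Exclude documents containing the term, using the not operator (-)."""
    return "-" + term


both_terms = all_of("Titania", "Delivery")
exact_phrase = phrase("Titania Delivery")
without_term = exclude("Titania")
```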

The "*" (asterisk) wildcard can be used to mean zero or more of any character. A query of * by itself will return all documents accessible to the Portal. A search for *ania will return all documents with words ending in "ania". A search for Del* will return all documents with words starting with "Del". The wildcard can also be used in the middle of a word. A search for D*y will return all documents with words starting with "D" and ending in "y", such as "Delivery". As always, these searches are case insensitive.
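To make the wildcard semantics concrete, the following Python sketch translates a wildcard pattern into a regular expression and applies it to the individual words of a text. This is only a model of the behavior described above, not how TD implements it. The function names are hypothetical, and words are assumed to be runs of letters, digits, and underscores.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '*' into 'zero or more characters'; all other text is literal."""
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(parts), re.IGNORECASE)


def matching_words(pattern, text):
    """Return the words in text that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [word for word in re.findall(r"\w+", text) if regex.fullmatch(word)]


sample = "Titania Delivery ships daily"
middle = matching_words("D*y", sample)
suffix = matching_words("*ania", sample)
prefix = matching_words("del*", sample)
```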

Proximity Searching

The "~" (tilde) operator can be used to search for words within a certain distance of each other. For example, "Titania Delivery"~2 would return documents where the terms "Titania" and "Delivery" appear with no more than two words in between. Documents with "Titania offers a Delivery solution...", "Titania does Delivery...", or "Titania Delivery" would all be returned.

A document with "Titania specializes in the Delivery of..." would not be returned.

The order of the terms is not important. The words could be found in any ordering so long as the absolute distance between them does not exceed what is specified. Therefore, the same query specified above would also return documents with "...delivery of Titania products...".
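The proximity rule can be modeled with a short Python sketch. It is a simplified illustration and not TD's implementation. The function name is hypothetical, and words are assumed to be separated by whitespace or punctuation. The distance is the number of words between the two terms, in either order.

```python
import re


def within_proximity(text, first, second, max_between):
    """True if first and second occur with at most max_between words between them."""
    words = [word.lower() for word in re.findall(r"\w+", text)]
    first_positions = [i for i, word in enumerate(words) if word == first.lower()]
    second_positions = [i for i, word in enumerate(words) if word == second.lower()]
    # abs() makes the check independent of the order of the two terms.
    return any(
        abs(i - j) - 1 <= max_between
        for i in first_positions
        for j in second_positions
        if i != j
    )


close_enough = within_proximity("Titania offers a Delivery solution", "Titania", "Delivery", 2)
too_far = within_proximity("Titania specializes in the Delivery of", "Titania", "Delivery", 2)
reversed_order = within_proximity("delivery of Titania products", "Titania", "Delivery", 2)
```

Here close_enough and reversed_order are True, while too_far is False because three words separate the terms.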

Field Searches

Searches can be executed on a specific indexed field or fields. For example, to find documents with "Titania Delivery" in the title, the query title:"Titania Delivery" could be used.

Metadata Fields

TD indexes metadata set on documents. The name of the indexed field is the name of the metadata rule followed by _md. The value of that field is an array of all of the metadata values associated with that name.

For instance, if a document has a metadata rule of extension with a value of docx, the metadata field would be called extension_md and the value would be "docx".

If a document has a metadata rule of audience with the values novice and intermediate, the metadata field name would be called audience_md and the value would be ["novice", "intermediate"].

You can test for the existence or absence of a metadata field using the special range-based query [* TO *]. To query for all documents that have a particular metadata rule:

audience_md:[* TO *]

To query for all documents that do not have a particular metadata rule:

-audience_md:[* TO *]

To query for documents that have a particular value for a metadata rule:

audience_md:novice
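The three metadata query forms above follow a regular pattern, so they can be generated from a rule name. The Python sketch below shows one way to do that. The helper names are hypothetical, and the helpers only build query strings. Values containing spaces or special characters are not handled.

```python
def metadata_field(rule_name):
    """Indexed field name: the metadata rule name followed by _md."""
    return rule_name + "_md"


def has_rule(rule_name):
    """Query for documents that have any value for the rule."""
    return metadata_field(rule_name) + ":[* TO *]"


def lacks_rule(rule_name):
    """Query for documents that have no value for the rule."""
    return "-" + has_rule(rule_name)


def rule_equals(rule_name, value):
    """Query for documents with a particular value for the rule."""
    return metadata_field(rule_name) + ":" + value


present = has_rule("audience")
absent = lacks_rule("audience")
novice_only = rule_equals("audience", "novice")
```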

Note: The search engine imposes a hard limit of 10,000 on the number of results that can be scrolled through. For instance, when paginating through a large result set and viewing records 9,990-9,999, attempting to fetch the next page will result in an error.
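Code that pages through results can check the limit before requesting a page. The following Python sketch assumes zero-based record offsets, consistent with the example in the note, where records 9,990-9,999 are the last ones that can be viewed. The constant and function names are hypothetical.

```python
MAX_SCROLLABLE_RESULTS = 10_000  # hard limit stated in the note above


def can_fetch_page(start, page_size):
    """True if records start .. start + page_size - 1 all fall within the limit."""
    return start >= 0 and page_size > 0 and start + page_size <= MAX_SCROLLABLE_RESULTS


last_allowed = can_fetch_page(9_990, 10)   # records 9,990-9,999
next_page = can_fetch_page(10_000, 10)     # would exceed the limit
```

A caller could use such a check to hide the "next page" control, or to prompt the user to narrow the query, rather than triggering the error.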