lunr (JavaScript search support)

lunr is a JavaScript library for indexing and searching text.

The lunr library is used in the offline packager to index the documents in a package and to provide offline search in the browser. Information is available at the lunr project home page. The lunr.js GitHub project has been forked into the TitaniaSoftware GitHub area.

Building a search index involves configuring a lunr.Builder, adding documents, and then building the index. (The Builder class was added in lunr v2. The lunr() convenience function is still supported, but documents must be added inside its configuration function because a built index is immutable.) lunr documents are plain JavaScript objects (hashes), where each entry represents a named search field. The field list is configurable, but must be the same for all documents indexed. Refer to the lunr documentation for details. For offline packaging, the index is built during packaging and then saved as JSON in the package. The HTML pages load the index and provide client-side searching.
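
The build, save, and load steps described above can be sketched roughly as follows. This is an illustration under assumptions, not the packager's actual code: the function names buildSerializedIndex and loadIndex are hypothetical, the lunr v2 module is passed in by the caller, and only two illustrative fields are registered.

```javascript
// Sketch only: `lunr` is the lunr v2 module supplied by the caller
// (for example, require("lunr") in Node). Function names are hypothetical.
function buildSerializedIndex(lunr, docs) {
  const builder = new lunr.Builder();

  // Unlike the lunr() convenience function, a bare Builder starts with empty
  // pipelines, so the usual English processing is registered explicitly.
  builder.pipeline.add(lunr.trimmer, lunr.stopWordFilter, lunr.stemmer);
  builder.searchPipeline.add(lunr.stemmer);

  builder.ref("id"); // the value returned with each search result
  builder.field("title");
  builder.field("body");

  docs.forEach((doc) => builder.add(doc));

  // lunr.Index implements toJSON(), so JSON.stringify gives a loadable form.
  return JSON.stringify(builder.build());
}

// In the browser, the saved JSON can be turned back into a searchable index.
function loadIndex(lunr, json) {
  return lunr.Index.load(JSON.parse(json));
}
```

Passing the module in as a parameter keeps the sketch independent of how lunr is loaded (a script tag in the browser, a module import during packaging).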

By default, the following fields are indexed when building the lunr search index:

  • id, containing the path and filename
  • title
  • body, containing the full text contents of the document
  • keywords, containing all keyword metadata associated with the document
  • lang
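
As an illustration of the document shape implied by the list above, the sketch below builds one object per page and checks that every document carries the same field list. The helper names, the `page` input object and its property names, and the "en" default for lang are all assumptions made for this example.

```javascript
// The five default fields named in the list above.
const DEFAULT_FIELDS = ["id", "title", "body", "keywords", "lang"];

// Hypothetical helper: `page` and its property names are assumptions about
// what is known for each document at packaging time.
function toIndexDocument(page) {
  return {
    id: page.path, // path and filename
    title: page.title || "",
    body: page.text || "", // full text contents of the document
    keywords: (page.keywords || []).join(" "), // all keyword metadata
    lang: page.lang || "en", // assumed default, not taken from the packager
  };
}

// Every document must expose the same field list, so fail on any mismatch.
function assertSameFields(docs, fields = DEFAULT_FIELDS) {
  const expected = [...fields].sort().join(",");
  docs.forEach((doc) => {
    const actual = Object.keys(doc).sort().join(",");
    if (actual !== expected) {
      throw new Error(`Document ${doc.id} has fields [${actual}]`);
    }
  });
  return docs;
}
```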

lunr fully supports only English. However, the default packager theme uses a lunr-languages plugin for a non-English language when one is available. As a fallback, the English tokenizing process seems to work acceptably for many non-English Western languages.
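
The fallback rule can be pictured with a small helper like the one below. It is a sketch, not the theme's actual logic: the function name and the set of bundled plugin codes are hypothetical, and the language tag is assumed to look like "fr" or "fr-CA".

```javascript
// Hypothetical list of language codes for which a plugin has been bundled.
const AVAILABLE_PLUGINS = new Set(["de", "es", "fr"]);

// Returns the plugin language to use for a document, or "en" as the fallback.
function resolveIndexLanguage(lang, available = AVAILABLE_PLUGINS) {
  // Normalize tags such as "fr-CA" or "FR" to the primary subtag "fr".
  const primary = String(lang || "").toLowerCase().split(/[-_]/)[0];
  return available.has(primary) ? primary : "en";
}
```

A language with no matching plugin, or a missing lang value, falls through to the English processing described above.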

The lunr search syntax is quite limited compared to Lucene, Solr, or Elasticsearch (the last of which supports online portal search). The following features are supported:

  • Wildcard ("*")
  • Required and prohibited presence indicators ("+" or "-" before term)
  • Field tag ("field:" before term)
  • Edit distance ("~n" after term). This will match indexed terms that can be created with no more than n edits to the search term.

Caution: Using an exclusion term on its own in a search may cause script errors. Best practice is to always combine an exclusion term with a restrictive positive term. For example: +pizza -anchovy.
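
The syntax features and the caution above can be illustrated with a few sample query strings and a guard that refuses exclusion-only searches. This is a sketch under assumptions: isExclusionOnly and safeSearch are hypothetical helpers, and `index` stands for any object with a lunr-style search(query) method.

```javascript
// Example query strings using the syntax listed above.
const EXAMPLE_QUERIES = [
  "inst*", // wildcard
  "+pizza -anchovy", // required and prohibited terms
  "title:pizza", // field tag
  "piza~1", // edit distance of 1
];

// Returns true when every term in the query is an exclusion ("-term").
function isExclusionOnly(query) {
  const terms = query.trim().split(/\s+/).filter(Boolean);
  return terms.length > 0 && terms.every((term) => term.startsWith("-"));
}

// Hypothetical wrapper: `index` is anything with a lunr-style search(query).
function safeSearch(index, query) {
  if (isExclusionOnly(query)) {
    return []; // skip the search rather than risk a script error
  }
  return index.search(query);
}
```

Returning an empty result list is only one option; a search page could instead prompt the user to add a positive term.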

Multiple terms are combined with OR. There is no support for grouping, phrase search, or proximity search. Some of these features could probably be emulated by preprocessing the search input to handle enhanced syntax features. It might also be possible to enhance the indexing to provide token metadata that could support detection of word proximity. These modifications could be made to the default offline packager theme by a JavaScript developer.
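
As one example of such preprocessing, the sketch below rewrites a query so that every term is required, which emulates AND behavior using the "+" presence indicator. The function name requireAllTerms is hypothetical, and nothing here is part of the default theme.

```javascript
// Prefix every plain term with "+" so that all terms become required.
// Terms that already carry a "+" or "-" presence indicator are left alone.
function requireAllTerms(query) {
  return query
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .map((term) => (/^[+-]/.test(term) ? term : `+${term}`))
    .join(" ");
}
```

The rewritten string would then be passed to the index search in place of the user's original input. Emulating phrases or proximity would need more than string rewriting, such as the token metadata mentioned above.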