The content indexing jobs:
- assign the previously created index zones to files/emails of the selected entities
- use Elasticsearch as the back-end server (a third-party index server) to manage and store indexes and to perform searches
- crawl the contentACCESS archive and send document text and metadata to the Elasticsearch server
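As a rough illustration of what sending document text and metadata to Elasticsearch can look like, the sketch below builds a request body in the newline-delimited JSON format accepted by the Elasticsearch `_bulk` endpoint. The function name, the `id`/`text`/`metadata` field names and the index name are assumptions made for this example; the requests that contentACCESS actually issues are not described in this section.

```python
import json


def build_bulk_payload(index_name, documents):
    """Build an NDJSON body for the Elasticsearch _bulk endpoint.

    Each document contributes two lines: an action line naming the
    target index and document id, then the document source itself.
    The field names used here are hypothetical.
    """
    lines = []
    for doc in documents:
        action = {"index": {"_index": index_name, "_id": doc["id"]}}
        source = {"text": doc["text"], "metadata": doc["metadata"]}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The _bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        {"id": "1", "text": "Quarterly report", "metadata": {"type": "file"}},
        {"id": "2", "text": "Meeting invite", "metadata": {"type": "email"}},
    ]
    print(build_bulk_payload("zone-a-index", sample), end="")
```

The resulting string would be sent as the body of a `POST` request to `/_bulk` with the `Content-Type: application/x-ndjson` header.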
On the configuration page of the given job the user is required to specify the following settings:
✓ Scheduling settings
In this step the running time(s) of the job must be selected. It is possible either to select a scheduler from the list or to create a new scheduler via the create new … option in the dropdown list. Using schedulers, the administrator can automate the running times of the given job. The mailboxes are periodically synchronized with the categories written into the queue, at the time intervals set here. For more information on how to set schedulers, refer to the section Schedules above.
✓ Resource settings
Set the value that determines how many items the job will process simultaneously. The recommended value is “2”.
✓ Filtering settings
Specify here the file types that should and should not be processed.
✓ Entities to index
Select here the entities that will be processed by the indexing job.
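The combined effect of the resource and filtering settings can be pictured with the minimal sketch below. It is not the contentACCESS implementation: the function names, the extension-based filter and the rule that an exclusion wins over an inclusion are all assumptions made for illustration. The default of two workers mirrors the recommended value above.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePath


def should_process(filename, include=None, exclude=frozenset()):
    """Return True if the file type passes the (hypothetical) filter.

    include=None means "all types allowed"; exclude always wins.
    """
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext in exclude:
        return False
    return include is None or ext in include


def run_job(filenames, index_one, max_workers=2, include=None, exclude=frozenset()):
    """Index the filtered files, at most max_workers at a time."""
    selected = [f for f in filenames if should_process(f, include, exclude)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the order of the input list.
        return list(pool.map(index_one, selected))


if __name__ == "__main__":
    files = ["report.pdf", "setup.exe", "notes.docx"]
    print(run_job(files, lambda name: f"indexed {name}", exclude={"exe"}))
```

A higher worker count processes more items in parallel but also increases the load on the archive and on the index server, which is one reason to start from a small value.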
An index zone is a set of one or more Elasticsearch indexes used for the logical and physical separation of indexed documents. Index zones were introduced to keep Elasticsearch indexes small, since smaller indexes are easier to move and to distribute across multiple Elasticsearch servers.
Index zones can be assigned to entities directly, or they can be assigned and then overwritten by indexing jobs.
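One way to read the assignment rule is that a zone set by an indexing job takes precedence over a zone assigned directly to the entity. The sketch below models only that reading; the function name, the dictionary-based data model and the sample entity and zone names are hypothetical.

```python
def resolve_index_zone(entity, direct_zones, job_zones):
    """Return the index zone that applies to an entity.

    direct_zones: zones assigned to entities directly.
    job_zones: zones assigned by an indexing job; assumed here to
    overwrite the direct assignment when both exist.
    Returns None if the entity has no zone at all.
    """
    if entity in job_zones:
        return job_zones[entity]
    return direct_zones.get(entity)


if __name__ == "__main__":
    direct = {"mailbox-a": "zone-mail", "share-b": "zone-files"}
    by_job = {"share-b": "zone-archive"}
    for name in ("mailbox-a", "share-b", "share-c"):
        print(name, "->", resolve_index_zone(name, direct, by_job))
```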