Configuring the Search Engine

WebFOCUS must know the location of the search engine, the XSLT style sheet to use for the Magnify Search user interface, and the maximum number of items that should be returned by the search results tree. These items have preset defaults.

The following procedure explains how to set the search engine parameters using the WebFOCUS Administration Console.

Procedure: How to Set Search Engine Parameters in the WebFOCUS Administration Console

To set the search engine parameters in WebFOCUS:

  1. In the Configuration tab of the WebFOCUS Administration Console, expand Application Settings and select Magnify.

    The following image shows the WebFOCUS Administration Console with the default settings for Magnify Search in the right pane.

  2. Type values (or accept the default values) for the following parameters:
    Configuration Directory (IBI_MAGNIFY_CONFIG)

    Specifies the context of the directory where the configuration files are located.

    The default value assigned to this setting is:

    {IBI_DOCUMENT_ROOT}/config/magnify

    where:

    {IBI_DOCUMENT_ROOT}

    is the default context for your installation of WebFOCUS. Typically, this is:

    drive:/ibi/WebFOCUS82/.

    Note: If your installation needs to make changes to the Magnify Search configuration directory, we recommend that you move it outside of the default location, drive:/ibi/WebFOCUS82/config/magnify. Doing so ensures that future WebFOCUS upgrades do not overwrite your customized configuration information.

    Index Directory (IBI_MAGNIFY_LUCENE_INDEX_DIRECTORY)

    Specifies the context of the directory that contains the default Lucene™ index.

    The default value assigned to this setting is:

    {IBI_DOCUMENT_ROOT}/magnify/lucene4_index

    where:

    {IBI_DOCUMENT_ROOT}

    is the default context for your installation of WebFOCUS. Typically, this is:

    drive:/ibi/WebFOCUS82/.

    Note: This directory supports only the Lucene index. Additional search directories are configured in the collections.xml file, which is located in the drive:/ibi/WebFOCUS82/config/magnify/ directory.

    Feed Cache Directory (IBI_MAGNIFY_FEED_CACHE_DIRECTORY)

    Specifies the context of the directory where Magnify Search stores records that are not yet loaded into the index. After all content is added to the Lucene index, the cached version is moved to this directory if the value assigned to the magnify_generate_deltas variable is true. If the value assigned to this variable is false, the cached version is deleted. An index created during a quiesce is loaded after the backup of the Lucene index is complete.

    The default value assigned to this setting is:

    {IBI_DOCUMENT_ROOT}/magnify/feedcache

    where:

    {IBI_DOCUMENT_ROOT}

    is the default context for your installation of WebFOCUS. Typically, this is:

    drive:/ibi/WebFOCUS82/.

    Collections File (IBI_MAGNIFY_COLLECTIONS_FILE_NAME)

    Specifies the file name where the Lucene™ indexes and collections are defined.

    The default value assigned to this setting is collections.xml, but this file does not exist, by default. Instead, the Magnify Search installation provides a collections.xmltemplate file in the drive:/ibi/WebFOCUS82/config/magnify directory that you can use to create a collections.xml file for your installation.

    The collections.xmltemplate file defines the default values for all analyzers that Magnify can use in indexing and in searching. To create a collections.xml file that conforms to your requirements, copy the collections.xmltemplate file, rename it as collections.xml, and modify it to reflect the desired settings for your Magnify Search environment.

    For more information on configuring collections of indexes, see Configuring Magnify Search Collections.

    Maximum Number of Search Results (IBI_MAGNIFY_RECORDLIMIT)

    Specifies the maximum number of search results returned by a search request. Any results beyond this number are not displayed to the user. The default value is 300 results.

    Enable Suggest Index Creation (IBI_MAGNIFY_ENABLE_SUGGEST_INDEX_CREATION)

    When this check box is selected, Magnify Search automatically creates a dictionary file for an index or a collection that makes spelling suggestions when a user is specifying a search query. This check box is cleared, by default.

    Enable Feed (IBI_MAGNIFY_ENABLE_FEEDING)

    When this check box is selected, Magnify Search can receive incoming data feeds. This is the default setting. However, when this check box is selected, efforts to monitor or update indexing operations by Magnify Search developers or administrators can affect front-end operations and potentially impact performance.

    When this check box is cleared, Magnify Search developers and administrators can monitor and update indexing operations without affecting front-end operations.

    Feed minimum file threads (IBI_MAGNIFY_POOLSIZE_FILE_PROCESSING)

    The minimum number, and initial allocation, of threads that Magnify Search can support when parsing feed files. The default value is 6 threads.

    Note: Any value below the default number of threads will result in slower response times.

    Feed maximum file threads (IBI_MAGNIFY_MAXPOOLSIZE_FILE_PROCESSING)

    The maximum number of threads that Magnify Search can support when parsing feed files. The default value is 25 threads.

    Note: Any value above the default number of threads will result in slower response times.

    Feed minimum record threads (IBI_MAGNIFY_FEED_MINIMUM_RECORD_THREADS)

    The minimum number, and initial allocation, of threads that Magnify Search can support when feeding data to the Lucene index. The default value is 6 threads.

    Note: Any value below the default number of threads will result in slower response times.

    Feed maximum record threads (IBI_MAGNIFY_FEED_MAXIMUM_RECORD_THREADS)

    The maximum number of threads that Magnify Search can support when feeding data to the Lucene index. The default value is 25 threads.

    Note: Any value above the default number of threads will result in slower response times.

    Feed keep alive duration (ms) (IBI_MAGNIFY_KEEP_ALIVE)

    Identifies the maximum number of minutes that an inactive connection to the Magnify Search provider can remain open and idle during a data feed operation. If a connection remains inactive for more than the number of minutes identified in this setting, Magnify Search closes it.

    Typically, administrators assign a value to this setting that minimizes the number of open and idle connections. For example, a connection expiration interval of fifteen minutes could potentially leave more connections open and idle than a connection expiration interval of five minutes.

    The default value is 500 minutes.

    Dynamic Partition clean up (IBI_MAGNIFY_ENABLE_CLEANUP)

    When this check box is selected, Magnify Search imposes a cleanup operation that reviews all Magnify Partition Index libraries and deletes all duplicate data. The cleanup operation takes place after the interval defined in the Dynamic Partition clean up interval (minutes) (IBI_MAGNIFY_CLEANUP_INTERVAL) setting.

    When this check box is cleared, no automated cleanup operation takes place. This is the default setting.

    Dynamic Partition clean up interval (minutes) (IBI_MAGNIFY_CLEANUP_INTERVAL)

    Identifies the number of minutes between Magnify Partition Index cleanup operations that review all Magnify Partition Index libraries and delete all duplicate data. The timer counting the number of minutes between cleanup operations is reset to zero after each cleanup operation. This setting is relevant only when the Dynamic Partition clean up (IBI_MAGNIFY_ENABLE_CLEANUP) check box is selected.

    Default partition size (GB) (IBI_MAGNIFY_DEFAULT_PARTITION_SIZE)

    The maximum number of gigabytes that a single-partitioned index library can contain.

    The default value is 10 gigabytes. Partitions are sections of an index library folder created dynamically by the Magnify Dynamic Partition feature.

    Note: In addition to configuring the database connection settings, you must run drive:/ibi/WebFOCUSxx/utilities/WFReposUtil/MagnifyCreateDDL.bat (for Windows) or MagnifyCreateDDL.sh (for UNIX) to create the Dynamic Partitioning database tables.

  3. Click Save.

Configuration of the search engine for Magnify Search is complete.
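
The default directory values and the collections.xml setup described in step 2 can be sketched in Python. This is a minimal illustration under stated assumptions, not part of WebFOCUS: the function names expand_default and ensure_collections_file are hypothetical, and the document root you pass in is whatever your installation actually uses.

```python
import shutil
from pathlib import Path

# Default values as listed in step 2; {IBI_DOCUMENT_ROOT} is a placeholder.
DEFAULTS = {
    "IBI_MAGNIFY_CONFIG": "{IBI_DOCUMENT_ROOT}/config/magnify",
    "IBI_MAGNIFY_LUCENE_INDEX_DIRECTORY": "{IBI_DOCUMENT_ROOT}/magnify/lucene4_index",
    "IBI_MAGNIFY_FEED_CACHE_DIRECTORY": "{IBI_DOCUMENT_ROOT}/magnify/feedcache",
}


def expand_default(setting: str, document_root: str) -> str:
    """Replace the {IBI_DOCUMENT_ROOT} placeholder with a concrete path."""
    root = document_root.rstrip("/")
    return DEFAULTS[setting].replace("{IBI_DOCUMENT_ROOT}", root)


def ensure_collections_file(config_dir: str) -> Path:
    """Copy collections.xmltemplate to collections.xml if it does not exist yet.

    An existing collections.xml is left untouched so that local edits survive.
    """
    config = Path(config_dir)
    target = config / "collections.xml"
    if not target.exists():
        shutil.copyfile(config / "collections.xmltemplate", target)
    return target
```

After the copy, you still need to edit collections.xml by hand to reflect the settings for your Magnify Search environment.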

Setting Timers For Feeding Data

When a large number of records is indexed, the information is not available for search until it is committed to the index. Magnify Search enables administrators to set timers that apply during prolonged indexing, so that commits can be issued while indexing is still in progress and the index library is updated. This ensures that the latest version of the index library is available when a search is performed.

Magnify Search timers regulate Magnify Search operations while transmitting feeds to the index library. This affects how and when new search content is made available to the Magnify Search-based application. This is useful when indexing large amounts of data. Timers also control Magnify Search operations such as open, close, and write. This helps tune the Magnify Search platform for various indexing activities by adjusting times to help control the frequency at which Magnify Search operations take place, thereby assisting in memory and performance allocation.

When Magnify Search receives incoming feeds, they are first held in memory for processing and then written to the physical index library. Once a feed is written, it can be made available to the Magnify Search-based application. After the search application syncs its view of the index library, the newly fed record is returned as a search result.
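
The three stages described above (held in memory, written to the index library, synced into the search view) can be pictured with a toy model. This is a conceptual sketch only: the ToyIndex class and its method names are invented for illustration and do not correspond to Magnify Search or Lucene APIs.

```python
class ToyIndex:
    """Toy model of feed visibility; not a real Magnify Search or Lucene API."""

    def __init__(self):
        self.buffer = []       # feeds held in memory for processing
        self.library = []      # records written to the physical index library
        self.reader_view = []  # the search side's cached view of the library

    def feed(self, record: str) -> None:
        """Accept an incoming feed; it is only held in memory at this point."""
        self.buffer.append(record)

    def write(self) -> None:
        """Write buffered feeds to the index library and empty the buffer."""
        self.library.extend(self.buffer)
        self.buffer.clear()

    def refresh_reader(self) -> None:
        """Sync the search side's view with the current index library."""
        self.reader_view = list(self.library)

    def search(self, term: str) -> list:
        """Search only what the reader's cached view contains."""
        return [r for r in self.reader_view if term in r]
```

In this model, a record fed with feed() is not returned by search() until both write() and refresh_reader() have run, which mirrors why the timers below affect how quickly new content becomes searchable.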

There are several timers that can be configured using the settings on the following page:

http://server_name:port_number/context_root/search/jsp/setIndexWriterTimers.jsp

Alternatively, you can access the timer settings by clicking Magnify Search Timers in the Magnify Search Console.

The Magnify Search Timers page opens, as shown in the following image.
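
As a small illustration of how the placeholders in the timer page address are filled in, the following Python sketch builds the URL from its three parts. The function name timers_page_url is hypothetical, and the sample values (localhost, 8080, ibi_apps) are assumptions used only for the example; substitute the values for your own environment.

```python
def timers_page_url(server_name: str, port_number: int, context_root: str) -> str:
    """Fill in server_name, port_number, and context_root in the timer page URL."""
    root = context_root.strip("/")  # tolerate a leading or trailing slash
    return (
        f"http://{server_name}:{port_number}/{root}"
        "/search/jsp/setIndexWriterTimers.jsp"
    )


# Sample values only; they are assumptions, not defaults taken from this page.
print(timers_page_url("localhost", 8080, "ibi_apps"))
```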

You can set the following timers for the index:

  • Flush timer. Controls how long an incoming stream of Magnify Search feeds is held in memory before it is written to the index library. Larger volumes of data should be given more time and memory. This results in less I/O usage, but it increases the latency of the feeds written to the disk. Therefore, higher values are recommended for historic, first-time, or other large batch indexing. For large index streams, a Flush timer setting of 600 seconds or more is recommended. For incremental or real-time loads, a setting of 60 seconds is recommended.
  • Commit timer. Controls how long to wait at the end of an incoming stream of Magnify Search feeds before the feeds are written to the index library. The Commit timer setting must be less than or equal to the Flush timer setting to avoid flushing empty memory to the index library. The Commit timer is activated as soon as there is a pause or a gap in the incoming stream to Magnify Search. Usually, this value should be the same as the Flush timer.

    Note: When indexing, the Magnify Search Administrator can set the Commit timer to a very low value (for example, 5 seconds) so that Magnify Search users retrieve the new search data almost immediately.

  • Close timer. Sets the length of time after the last incoming document has been processed before performing closing operations on the index library. This results in disk I/O operations. With higher times, feeds that may result in small breaks between incoming streams are not required to use resources on open operations. Higher times are also recommended for bulk indexing, although higher times decrease the time to open and close indexes. The default value is 2 minutes.
  • Inactivity timer. Specifies how often to check the duration of the Close and Commit timer cycles. It is recommended that this be the same as the Close timer setting. This setting requires very little I/O. Only in the most extreme cases is this setting different from the Close timer setting.
  • Feedcache. Sets how often Magnify Search checks the feedcache directory for incoming feeds.
  • Feedcache (Long Duration). Sets how often Magnify Search checks the feedcache directory for incoming feeds after a few short sleep intervals have passed without any incoming feed.
  • Reading Refresh Rate. Controls the interval at which the Magnify Search application refreshes its cached view of the Magnify Search index library. This is applicable when multiple application servers are used to feed and search Magnify Search. For example, when a single WebFOCUS environment is created with two application servers, one is used for WebFOCUS reporting and Magnify Search searching, while the other is used for indexing. Each application server is configured with a different port number. To sync up the searching and indexing processes, the Magnify Search readers must be refreshed.

Note: Changes made to the timer settings should be tested before being applied in a production environment, since data size, memory allocations, and CPU specifications differ between machines.
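
The relationships between the timers described above can be checked before values are entered on the Magnify Search Timers page. The following Python sketch is an illustration only: the TimerSettings class and check_timers function are hypothetical, and it assumes that all four values are expressed in seconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimerSettings:
    # Assumption: every value is in seconds (a 2-minute Close timer is 120).
    flush_seconds: int
    commit_seconds: int
    close_seconds: int
    inactivity_seconds: int


def check_timers(t: TimerSettings) -> List[str]:
    """Return messages for settings that conflict with the guidance above."""
    problems = []
    # Required: Commit must be less than or equal to Flush.
    if t.commit_seconds > t.flush_seconds:
        problems.append("Commit timer must be less than or equal to the Flush timer.")
    # Recommended: Inactivity should match Close except in extreme cases.
    if t.inactivity_seconds != t.close_seconds:
        problems.append("Inactivity timer is recommended to match the Close timer.")
    return problems
```

For example, check_timers(TimerSettings(600, 600, 120, 120)) returns an empty list, while a Commit value larger than the Flush value produces one message. An empty result does not replace testing the settings on your own hardware.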
