mHUB - Java Reference Guide

Introduction

matchIT Hub provides interfaces for use with a number of programming languages. This reference document details the interface for the Java programming language, available for use with all supported platforms. There are separate documents for the other interfaces. Note that all interfaces provide the same functionality.

Classes

Engine

The Java API for matchIT Hub provides a single class within the com.matchIT.Hub package - Engine - which comprises a number of methods.

To create an instance of the Engine class:

import com.matchIT.Hub.*;
...
Engine engine = new Engine();

or:

com.matchIT.Hub.Engine engine = new com.matchIT.Hub.Engine();

Engine Methods

Notes:

  • All string arguments in the following methods are encoded using UTF-8 (Unicode).
  • All methods throw an exception should an error occur.

Initialization

void initialize( String activationCode )

Arguments:

  • activationCode - Customer-specific license code. This is usually supplied as a text file, but can be stored within a database table or even embedded within the user's application. Note that the argument does not specify the license code's filename; the contents of the license file must instead be read into a string that's passed in via this argument.

Description:

Initializes the engine. This method must be called before any other method can be used.
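
Because initialize() expects the contents of the license file rather than its filename, the file has to be read into a string first. The following is a minimal sketch of one way to do that; the ActivationCodeLoader class and its load() method are illustrative helpers, not part of the matchIT Hub API, and the file is assumed to be UTF-8 text that is passed through unchanged.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not part of the matchIT Hub API. */
final class ActivationCodeLoader {
    private ActivationCodeLoader() {}

    /** Reads the whole license file as UTF-8 text and returns it unchanged. */
    static String load(Path licenseFile) {
        try {
            byte[] bytes = Files.readAllBytes(licenseFile);
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Wrapped so callers are not forced to handle a checked exception.
            throw new UncheckedIOException("Cannot read license file: " + licenseFile, e);
        }
    }
}
```

The returned string would then be passed to the engine, for example engine.initialize(ActivationCodeLoader.load(Paths.get("activation.txt"))), where "activation.txt" is a placeholder path.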

Settings

void applySettings( String xml )

Arguments:

  • xml - XML-formatted string containing the configuration settings.

Description:

Configures the matchIT Hub engine using the supplied settings, provided the engine is currently idle - i.e. no data has been passed in via addData().

This method can be called once with a single configuration string, or several times with different configuration strings, for example to share settings between different processes or to break a single settings file into smaller, more manageable files. Each time a new configuration is applied, the new settings override the existing ones; anything that is not specified in the new configuration is left unchanged.

Refer to the Configuration Guide for details of the configuration settings and how to create and customize them.
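
When settings are split across several files, the order in which they are applied matters, because later configurations override earlier ones. The sketch below reads each file and hands its contents to a callback in list order. SettingsFiles and SettingsSink are hypothetical names introduced for illustration; in real code the callback would typically be engine::applySettings. The files are assumed to be UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper; not part of the matchIT Hub API. */
final class SettingsFiles {
    private SettingsFiles() {}

    /** Hypothetical callback that stands in for Engine.applySettings(String). */
    interface SettingsSink {
        void apply(String xml) throws Exception;
    }

    /** Applies each settings file in list order; later files override earlier ones. */
    static void applyInOrder(List<Path> files, SettingsSink sink) throws Exception {
        for (Path file : files) {
            String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            sink.apply(xml);
        }
    }
}
```

A call might look like SettingsFiles.applyInOrder(Arrays.asList(basePath, overridePath), engine::applySettings), where basePath and overridePath are placeholder variables. Remember that settings can only be applied while the engine is idle.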

String getSettings()

No arguments.

Description:

Retrieves and returns the engine's current settings as an XML-formatted string.

Note that if multiple configuration strings were applied using applySettings(), then this method will return a single string that represents an aggregation of all applied settings.

String getAdvancedSettings()

No arguments.

Description:

Retrieves and returns the engine's current advanced settings as an XML-formatted string.

Note that if multiple configuration strings were applied using applySettings(), then this method will return a single string that represents an aggregation of all applied advanced settings.

String getMatchingMatrices()

From version 2.0.3

No arguments.

Description:

Retrieves the engine's current matching matrices settings as an XML-formatted string.

Data

void addData( int table, String data, int timeout )

Arguments:

  • table - Indicates the table (0, 1, or 2).
  • data - A delimited string (the first character is the delimiter, which must not be alphanumeric).
  • timeout - Timeout, in milliseconds.
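
The data argument format can be illustrated with a small builder that prefixes every field with the delimiter, so that the first character of the result is the delimiter itself. This is a sketch under assumptions: DelimitedRecord is a hypothetical helper, no trailing delimiter is emitted, and no escaping mechanism is assumed, so fields containing the delimiter are rejected. The field order shown in the usage note is a placeholder; the real order depends on the applied settings.

```java
import java.util.List;

/** Hypothetical helper; not part of the matchIT Hub API. */
final class DelimitedRecord {
    private DelimitedRecord() {}

    /** Builds a string such as "|1001|John|Smith" from a delimiter and a list of fields. */
    static String build(char delimiter, List<String> fields) {
        // The guide requires a non-alphanumeric delimiter.
        if (Character.isLetterOrDigit(delimiter)) {
            throw new IllegalArgumentException("Delimiter must not be alphanumeric: " + delimiter);
        }
        StringBuilder record = new StringBuilder();
        for (String field : fields) {
            // Assumption: there is no escaping, so a field may not contain the delimiter.
            if (field.indexOf(delimiter) >= 0) {
                throw new IllegalArgumentException("Field contains the delimiter: " + field);
            }
            record.append(delimiter).append(field);
        }
        return record.toString();
    }
}
```

For example, DelimitedRecord.build('|', Arrays.asList("1001", "John", "Smith")) yields "|1001|John|Smith", which could then be passed as the data argument.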

Description:

Adds data to the engine's input buffer. If the buffer is full and a nonzero timeout is specified, then the method will wait until space becomes available, or until the timeout period has elapsed (in which case an exception is thrown). If a timeout of 0 is used, then the method will wait indefinitely until space becomes available.

When the processing mode is set to Matching, then the table can be 0, 1, or 2. When 0 is specified, then the engine performs matching on a single table of data. When 1 or 2 is specified, then the engine will instead find matches that overlap the two tables.

When the processing mode is set to Lookup, then the table can be 1 or 2. This effectively causes the engine to identify records that overlap the two tables.

When the processing mode is set to Normalization, then the table must be 0. Note that in this instance, the data is discarded immediately after it's processed, and isn't retained in memory.
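
The table rules above can be summarized as a simple pre-flight check. The sketch below is illustrative only: TableRules and its Mode enum are hypothetical names that mirror the three processing modes described here, and are not types provided by the matchIT Hub API.

```java
/** Hypothetical helper; not part of the matchIT Hub API. */
final class TableRules {
    private TableRules() {}

    /** Hypothetical enum naming the three processing modes described in this guide. */
    enum Mode { MATCHING, LOOKUP, NORMALIZATION }

    /** Returns true if the table identifier is allowed for the given processing mode. */
    static boolean isValidTable(Mode mode, int table) {
        switch (mode) {
            case MATCHING:
                return table >= 0 && table <= 2; // single table (0) or overlap (1 and 2)
            case LOOKUP:
                return table == 1 || table == 2; // overlap between the two tables
            case NORMALIZATION:
                return table == 0;               // data is processed and then discarded
            default:
                return false;
        }
    }
}
```

A caller could run this check before addData() to fail fast with a clearer message than the engine's own exception.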

String getData( int table, String uniqueRef )

Arguments:

  • table - Indicates the table (0, 1, or 2).
  • uniqueRef - String that identifies the record.

Description:

Retrieves the record with the given unique ref from the specified table. Unique refs must identify individual records and thus cannot be optional.

void updateData( int table, String data, int timeout )

Arguments:

  • table - Indicates the table (0, 1, or 2).
  • data - A delimited string (the first character is the delimiter, which must not be alphanumeric).
  • timeout - Timeout, in milliseconds.

Description:

The new data is parsed so that the unique ref can be extracted. The existing representation of the record with this unique ref is deleted and the new representation added. Unique refs must identify individual records and thus cannot be optional.

void deleteData( int table, String uniqueRef, int timeout )

Arguments:

  • table - Indicates the table (0, 1, or 2).
  • uniqueRef - String that identifies the record to be deleted.
  • timeout - Timeout, in milliseconds.

Description:

Deletes the record with the given unique ref from the specified table. Unique refs must identify individual records and thus cannot be optional.

void noMoreData( int table )

Arguments:

  • table - Indicates the table (0, 1, or 2).

Description:

Informs the engine that no more data will be added for the indicated table (refer to addData() for details of the table identifier).

void allowMoreData( int table )

Arguments:

  • table - Indicates the table (0, 1, or 2).

Description:

Informs the engine that more data will be added for the indicated table (refer to addData() for details of the table identifier).

void clearData( int table )

Arguments:

  • table - Indicates the table (0, 1, or 2).

Description:

Removes and discards all data relating to the indicated table (refer to addData() for details of the table identifier). This includes buffered input data, cached records, clusters, matching pairs, and matching groups.

Processing is not interrupted, and any buffered results are left intact.

Results

int getResultCount()

No arguments.

Description:

Returns the number of buffered results in the output buffer.

Note that when the processing mode is set to Lookup, the result count is the number of buffered results that are associated with the calling thread, not the total number of buffered results. Refer to Getting Started | Processing Modes | Lookup for further details on Lookup mode.

String getNextResult( int timeout )

Arguments:

  • timeout - Timeout, in milliseconds.

Description:

Gets and returns the next buffered result and removes it from the output buffer.

If no results are available, then the method throws an exception. getResultCount() should be used beforehand to ensure that there are results available.

A nonzero timeout can be used to ensure that the call does not block indefinitely. For example, if the input buffer is full and addData() is called with a timeout of 0, then the engine will be blocked; subsequently calling getNextResult() from another thread, also with a timeout of 0, will cause that call to block until addData() completes. If both methods are used simultaneously by concurrent threads, then a timeout of 0 should not be used.

Note that when the processing mode is set to Lookup, this method will remove the next buffered result that's associated with the calling thread. Refer to Getting Started | Processing Modes | Lookup for further details on Lookup mode.
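
The recommended pattern of checking getResultCount() before calling getNextResult() can be sketched as a drain loop. ResultDrainer and ResultSource are hypothetical names; ResultSource simply mirrors the two Engine methods used, so that the loop can be shown without the library. The sketch assumes a single consumer thread, since another consumer could remove a result between the count check and the retrieval.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the matchIT Hub API. */
final class ResultDrainer {
    private ResultDrainer() {}

    /** Hypothetical view of the two Engine methods used by the loop. */
    interface ResultSource {
        int getResultCount() throws Exception;
        String getNextResult(int timeout) throws Exception;
    }

    /** Removes and returns every result that is currently buffered. */
    static List<String> drain(ResultSource source, int timeoutMillis) throws Exception {
        // A timeout of 0 is rejected because it can block while addData() is waiting.
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("Use a nonzero timeout");
        }
        List<String> results = new ArrayList<>();
        while (source.getResultCount() > 0) {
            results.add(source.getNextResult(timeoutMillis));
        }
        return results;
    }
}
```

An Engine instance could be adapted with a small anonymous class that forwards both calls to the engine.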

void clearResults()

No arguments.

Description:

The output buffer is cleared; any buffered results are lost.

Note that when the processing mode is set to Lookup, this method will remove all buffered results, not just those for the calling thread. Refer to Getting Started | Processing Modes | Lookup for further details on Lookup mode.

Processing

State getState()

No arguments.

Description:

Returns the engine's current state:

  • Uninitialized - the engine has been created but not yet initialized;
  • Initialized - the engine has been initialized but no settings applied;
  • Ready - settings have been applied but no data has been added;
  • Running - data has been added, and the engine is actively processing data or waiting for more data to process;
  • Paused - the engine's processing threads have been temporarily paused;
  • Finished - all data input to the engine has been fully processed and all results have been output;
  • Aborted - the engine's processing threads have been stopped, and no more data can be added.
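
A caller that needs to wait for a particular state, for example Finished after noMoreData() has been called, can poll getState() with a deadline. The sketch below is generic over the state type because the exact definition of State is not shown in this guide; StatePoller and StateSource are hypothetical names, and in real code the source would typically be engine::getState.

```java
/** Hypothetical helper; not part of the matchIT Hub API. */
final class StatePoller {
    private StatePoller() {}

    /** Hypothetical callback that stands in for Engine.getState(). */
    interface StateSource<S> {
        S get() throws Exception;
    }

    /**
     * Polls until the source reports the target state or the deadline passes.
     * Returns true if the target state was observed, false on timeout.
     */
    static <S> boolean await(StateSource<S> source, S target, long pollMillis, long maxWaitMillis)
            throws Exception {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (true) {
            if (target.equals(source.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

Results can still be retrieved from the output buffer while waiting, so a polling loop of this kind is usually combined with result retrieval rather than used on its own.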

int getUnprocessedCount()

No arguments.

Description:

Returns the number of records that have not yet been processed. This includes all records that are currently being processed by the processing threads, plus all records in the input buffer.

Note that when the processing mode is set to Lookup, this method returns the unprocessed count that's associated with the calling thread, rather than the total number of unprocessed records. Refer to Getting Started | Processing Modes | Lookup for further details on Lookup mode.

String getStats()

No arguments.

Description:

Produces and returns an XML-formatted string that lists useful statistical details for the current processing mode. This method can be called when processing has completed - to get final statistics - or even during processing to get extended information that supplements getUnprocessedCount().

void pause()

No arguments.

Description:

All processing threads within the engine are paused, so no more data will be processed until the engine is resumed. Data can still be added via addData(), provided there's space in the input buffer. Results can be retrieved from the output buffer.

void resume()

No arguments.

Description:

Resumes processing if the engine was paused.

void abort()

No arguments.

Description:

Aborts processing. All processing threads are stopped. No more data can be added, and any existing buffered input data is not processed. Any results in the output buffer are left intact and can still be retrieved.

void reset()

No arguments.

Description:

Resets the engine so that new data can be added and processed using the current settings. All unprocessed and buffered input data is cleared. All buffered results are cleared.

void restart() - NOT IMPLEMENTED

No arguments.

Description:

Reserved for future use.

Errors

String getNextError()

No arguments.

Description:

All errors are stored internally within a stack. This method retrieves the next error, removes it from the stack, and returns it.

When any engine method fails and throws an exception, additional information about the failure can be retrieved via this method.

Internal processing failures and warnings will also be logged to the stack. Any large clusters encountered during processing are also logged to the stack (refer to the Configuration Guide for details on large clusters).

Information

String getVersion()

No arguments.

Description:

Returns a string representing the version number of the matchIT Hub engine.

The version consists of four dot-delimited numbers (product.major.minor.patch) plus an optional pre-release indicator in parentheses, for example "1.0.3.2" or "1.1.0.0 (beta 2)".
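
A version string in this format can be split into its parts with a regular expression. The following sketch is based only on the two examples above; HubVersion is a hypothetical value class, and the assumption that the pre-release indicator always appears in parentheses after optional whitespace comes from the "(beta 2)" example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value class; not part of the matchIT Hub API. */
final class HubVersion {
    // Four dot-delimited numbers, then an optional parenthesized pre-release label.
    private static final Pattern FORMAT =
            Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)(?:\\s*\\((.+)\\))?");

    final int product;
    final int major;
    final int minor;
    final int patch;
    final String preRelease; // null when no pre-release indicator is present

    private HubVersion(int product, int major, int minor, int patch, String preRelease) {
        this.product = product;
        this.major = major;
        this.minor = minor;
        this.patch = patch;
        this.preRelease = preRelease;
    }

    /** Parses strings such as "1.0.3.2" or "1.1.0.0 (beta 2)". */
    static HubVersion parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized version string: " + text);
        }
        return new HubVersion(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)),
                m.group(5));
    }
}
```

For example, HubVersion.parse(engine.getVersion()) would expose the numeric parts as fields, with preRelease set to "beta 2" for the second example string.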

String getExpiry()

No arguments.

Description:

Returns an 8-character string of format YYYYMMDD that indicates the expiry date of the applied activation code, for example "20170531".
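
The YYYYMMDD string can be converted to a date with the standard java.time API, for instance to warn when an activation code is close to expiring. ExpiryDates is a hypothetical helper. Whether the activation code is still valid on the expiry date itself is not stated in this guide, so the sketch only reports the number of days between a given date and the expiry date.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper; not part of the matchIT Hub API. */
final class ExpiryDates {
    private ExpiryDates() {}

    /** Parses a YYYYMMDD string such as "20170531". */
    static LocalDate parse(String yyyymmdd) {
        return LocalDate.parse(yyyymmdd, DateTimeFormatter.BASIC_ISO_DATE);
    }

    /** Days from 'today' to the expiry date; negative once the date has passed. */
    static long daysRemaining(String yyyymmdd, LocalDate today) {
        return ChronoUnit.DAYS.between(today, parse(yyyymmdd));
    }
}
```

A typical call would be ExpiryDates.daysRemaining(engine.getExpiry(), LocalDate.now()).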

String getMetadata()

From version 2.0.3

No arguments.

Description:

Retrieves metadata about the settings applied.

Normalization mode metadata schema:

<metadata>
  <mappedComponents>
    <component>...</component>
    <component>...</component>
    ...
  </mappedComponents>
  <outputs>
    <output>...</output>
    <output>...</output>
    ...
  </outputs>
</metadata>

Matching mode metadata schema:

<metadata>
  <mappedComponents>
    <component>...</component>
    <component>...</component>
    ...
  </mappedComponents>
  <outputs>
    <output>...</output>
    <output>...</output>
    ...
  </outputs>
  <matchingMethods>
    <recommended>
      <method>...</method>
      <method>...</method>
      ...
    </recommended>
    <additional>
      <method>...</method>
      <method>...</method>
      ...
    </additional>
  </matchingMethods>
</metadata>
</metadata>
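
The metadata string can be read with the XML parser that ships with the JDK. The sketch below collects the text of every element with a given name, such as component, output, or method. MetadataReader is a hypothetical helper, and the element contents used in the usage note are placeholders, since the schemas above do not show actual values. Note that a lookup by "method" returns the entries under both recommended and additional.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper; not part of the matchIT Hub API. */
final class MetadataReader {
    private MetadataReader() {}

    /** Returns the trimmed text of every element named elementName, in document order. */
    static List<String> textOf(String metadataXml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(metadataXml)));
            NodeList nodes = doc.getElementsByTagName(elementName);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent().trim());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Metadata is not well-formed XML", e);
        }
    }
}
```

For example, MetadataReader.textOf(engine.getMetadata(), "output") would list the contents of each output element.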