Optimizing Geocoding

Geocoding stages perform best when the input records are sorted by postal code, because of the way the reference data is loaded into memory. Sorted input can sometimes be processed several times faster than unsorted input. Because some records will not contain data in the postal code field, sort on the following fields, in this order:

  1. PostalCode
  2. StateProvince
  3. City
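
The following is a minimal sketch of this sort order, assuming the input records are available as Python dictionaries before they are sent to the geocoding stage. The function name sort_for_geocoding and the sample records are hypothetical; the field names follow the list above. Missing or blank values are treated as empty strings, so records without a postal code are grouped together and ordered by state or province and then by city.

```python
SORT_FIELDS = ("PostalCode", "StateProvince", "City")


def sort_for_geocoding(records):
    """Return a new list ordered by PostalCode, then StateProvince, then City.

    Hypothetical helper: blank or missing values become empty strings,
    which sort ahead of any populated value.
    """
    def sort_key(record):
        # Normalize each field so that None, "" and stray whitespace or
        # case differences do not scatter otherwise identical values.
        return tuple((record.get(field) or "").strip().upper()
                     for field in SORT_FIELDS)

    return sorted(records, key=sort_key)


if __name__ == "__main__":
    # Illustrative sample data only.
    sample = [
        {"PostalCode": "60601", "StateProvince": "IL", "City": "Chicago"},
        {"PostalCode": "", "StateProvince": "NY", "City": "Troy"},
        {"PostalCode": "12180", "StateProvince": "NY", "City": "Troy"},
        {"PostalCode": None, "StateProvince": "IL", "City": "Chicago"},
    ]
    for row in sort_for_geocoding(sample):
        print(row)
```

Postal codes are compared as text here, which keeps leading zeros intact. For large inputs, the same three-field sort would more likely be done in the dataflow or the source database than in a script.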

You can also optimize geocoding stages by experimenting with different match modes. The match mode controls how the geocoding stage determines whether a geocoding result is a close match. Consider setting the match mode to Relaxed and checking whether the results meet your requirements. Relaxed mode generally performs better than the other match modes.

Optimizing Geocode US Address

The Geocode US Address stage has several options that affect performance. These options are in this file:

SpectrumLocation\server\modules\geostan\java.properties

egm.us.multimatch.max.records
  Specifies the maximum number of matches to return. A smaller number results in better performance, but at the expense of matches.

egm.us.multimatch.max.processing
  Specifies the number of searches to perform. A smaller number results in better performance, but at the expense of matches.

FileMemoryLimit
  Controls how much of the reference data is initially loaded into memory.