Point-In-Polygon Performance Configuration

When creating a point-in-polygon solution, you can make specific configuration changes to both the server and the database resources that affect the performance of the Spectrum server. How you configure Spectrum for performance depends on whether you are using the Centrus Point In Polygon stage or the Query Spatial Data stage.

Setting server resources for the Query Spatial Data stage

The Query Spatial Data stage uses the spatial remote component in the Location Intelligence Module, so the machine resources should be configured to match both the stage and the server processes. When running point-in-polygon operations with the Query Spatial Data stage, make sure that all of the cores on the server machine are available to the spatial remote component, and set the number of threads in the Query Spatial Data stage equal to the number of cores on the machine. To change the number of threads, edit the runtime parameters of the Query Spatial Data stage in Enterprise Designer. For more information on tuning the performance of Spectrum and the remote components, see Managing Memory and Threading.

Note: If the server needs to be available for other types of operations, making all the cores available for the spatial remote component may starve those processes for resources.
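The sizing rule above (one thread per core, less any cores held back for other work) can be sketched as a small calculation. This is an illustrative sketch only: the function name and the reserved_for_other_work parameter are hypothetical and are not part of Spectrum. The resulting number would still be entered manually in the stage's runtime parameters.

```python
import os


def suggested_spatial_threads(total_cores, reserved_for_other_work=0):
    """Return a thread count for the Query Spatial Data stage.

    Hypothetical helper: one thread per core, minus any cores the
    administrator wants to keep free for other server operations.
    Always returns at least 1.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    if reserved_for_other_work < 0:
        raise ValueError("reserved_for_other_work cannot be negative")
    return max(1, total_cores - reserved_for_other_work)


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs and may return None,
    # so fall back to 1 when the count cannot be determined.
    cores = os.cpu_count() or 1
    print("Dedicated server:", suggested_spatial_threads(cores))
    print("Shared server, 2 cores reserved:", suggested_spatial_threads(cores, 2))
```

On a dedicated server the suggestion equals the core count. On a shared server, reserving cores is one way to avoid the resource starvation described in the note.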

Setting Centrus database resource cache and pool size (Point In Polygon stage)

When creating your Centrus database resources in Spectrum (using the Management Console), it is important to configure the Cache Size and Pool Size fields to best match your server's configuration.

The Pool Size field defines the maximum number of concurrent requests you want this database to handle. The optimal pool size varies, so test various combinations of Pool Size and Cache Size to find the most efficient configuration. You will generally see the best results by setting the pool size between one-half and twice the number of CPUs on the server, with the optimal pool size for most modules being the same as the number of CPUs. For example, if your server has four CPUs, you may want to experiment with a pool size between 2 (one-half the number of CPUs) and 8 (twice the number of CPUs), with the optimal size possibly being 4 (the number of CPUs).

When modifying the Pool Size, you must also consider the number of runtime instances specified in the dataflow for the stages that access the database resource. Consider, for example, a dataflow with a Point In Polygon stage that is configured to use one runtime instance. If you set the pool size for the database to four, you will not see a performance improvement, because a single runtime instance sends only one request at a time to the database resource. However, if you increase the number of runtime instances to four, you might then see an improvement in performance, because four instances of Point In Polygon would access the database resource simultaneously and use the full pool.
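The pool size guidance above can be expressed as a short calculation. The sketch below is illustrative only: the function names are hypothetical and do not correspond to any Spectrum API. Treating usable concurrency as the smaller of pool size and runtime instances is a simplification of the reasoning in the text, not a measured result.

```python
def pool_size_range(cpu_count):
    """Return (low, suggested, high) pool sizes to test for a server.

    Follows the rule of thumb of one-half to twice the number of CPUs,
    with the CPU count itself as the starting point.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    low = max(1, cpu_count // 2)
    high = cpu_count * 2
    return low, cpu_count, high


def effective_concurrency(pool_size, runtime_instances):
    """Estimate how many pooled connections can be busy at once.

    Simplifying assumption: each runtime instance issues one request
    at a time, so the pool cannot be used beyond the instance count.
    """
    return min(pool_size, runtime_instances)


if __name__ == "__main__":
    low, suggested, high = pool_size_range(4)
    print(f"Pool sizes to test: {low} to {high}, starting at {suggested}")
    for instances in (1, 4):
        used = effective_concurrency(suggested, instances)
        print(f"{instances} runtime instance(s): {used} of {suggested} pool slots in use")
```

With four CPUs this gives a test range of 2 to 8. It also shows that a pool of four is only fully used once the stage runs with four runtime instances.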

The Cache Size field determines the amount of memory used to cache data. In general, the larger the cache, the better the performance.