Validate Address Loqate Engine Configuration |
To set configurations for performing the validations:
- Verbose
- Tool Info
- Output Address Format
- Log Input
- Log Output
- Log File Name
- Match Score Absolute Threshold
- Match Score Threshold Factor
- Postal Code Max Results
- Strict Reference Match
|
Validate Address Loqate Validate Configuration |
To configure these settings for the input:
- Include Standard Address
- Include Matched Address Elements
- Standardized Input Address Elements
- Return Address Data Blocks
- Output Casing
- Include Result Codes for Individual Fields
- Return Multiple Addresses
- Failed On Multi Match Found
- Multiple Address Count
- Country Format
- Default Country
- Script Alphabet
- Return Geocoded Address Fields
- Acceptance Level
- Minimum Match Score
- Format Data Using AMAS Conventions
- Is Duplicate Handling
- Single Field Duplicate Handling
- Multi Field Duplicate Handling
- Non Standard Field Duplicate Handling
- Output Field Duplicate Handling
|
Validate Address Loqate General Configuration |
To set JVM configurations:
- Maximum Idle Objects
- Minimum Idle Objects
- Maximum Active Objects
- Maximum Wait Time
- Action When Exhausted
- Test on Borrow
- Test on Return
- Test While Idle
- Time Between Eviction Runs in Milliseconds
- Number of Tests Per Eviction Run
- Min Evictable Idle Time in Milliseconds
|
Reference Data Path |
To specify the Reference Data path details.
For UAM jobs, reference data can be placed either on the local data nodes of the cluster or on
HDFS. Note: On local data nodes, the reference data must be placed as unarchived folders. On
HDFS, it must be placed as archived files in .zip format.
|
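The packaging rule in the note above can be expressed as a simple pre-flight check. This is a minimal sketch, not part of the product API: the class name, the method name, and the example paths are all hypothetical, and the check only looks at the path name, not at the file system.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of the SDK.
final class ReferenceDataPathSketch {

    // Returns true when the path name matches the rule from the note:
    // HDFS locations must point to a .zip archive, while local data node
    // locations must point to an unarchived folder (so no .zip suffix).
    static boolean followsPackagingRule(String path, boolean onHdfs) {
        boolean isZip = path.toLowerCase(Locale.ROOT).endsWith(".zip");
        return onHdfs ? isZip : !isZip;
    }

    public static void main(String[] args) {
        // Hypothetical paths, chosen only to exercise both branches.
        System.out.println(followsPackagingRule("hdfs:///example/referenceData/sample.zip", true));
        System.out.println(followsPackagingRule("/example/referenceData/sample", false));
    }
}
```

A check like this can run before the job is submitted, so that a wrongly packaged reference data set is reported early rather than at validation time.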
Job Configurations |
The Hadoop configurations for the job. For a MapReduce job, the instance must be of
type MRJobConfig. For a Spark job, the instance must be of type SparkJobConfig.
|
Input File |
For text files:
- File Path
- The path of the input text file on the Hadoop platform.
- Record Separator
- The record separator used in the input file.
- Field Separator
- The separator used between any two consecutive fields of a record in the input file.
- Text Qualifier
- The character used to surround text values in a delimited file.
- Header Row Fields
- An array of the header fields of the input file.
- Skip First Row
- Flag to indicate whether the first row must be skipped while reading the input file
records. This must be true if the first row is a header row.
Attention: Invoke the appropriate constructor of FilePath.
For ORC format files:
- ORC File Path
- The path of the input ORC format file on the Hadoop platform.
For Parquet format files:
- Parquet File Path
- The path of the input Parquet format file on the Hadoop platform.
Common parameters:
- Field Mappings
- A map of key-value pairs, with the existing column names as the keys and the desired
output column names as the values.
|
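To illustrate how the text-file parameters interact, here is a minimal sketch in plain Java. It does not use the SDK: the class DelimitedInputSketch and its methods are hypothetical, and it assumes that the Field Mappings rename columns as records are read. It shows a record separator, a field separator, a text qualifier that protects separators inside a value, a skipped header row, and a field mapping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration of the input parameters; not the SDK's reader.
final class DelimitedInputSketch {

    // Splits one record on the field separator. Separators that appear
    // between a pair of text qualifiers are kept as part of the value,
    // and the qualifier characters themselves are dropped.
    static List<String> splitRecord(String record, char fieldSeparator, char textQualifier) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean insideQualifier = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == textQualifier) {
                insideQualifier = !insideQualifier;
            } else if (c == fieldSeparator && !insideQualifier) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Reads every record, optionally skipping the first row, and keys each
    // value by its header field name after applying the field mappings
    // (existing column name -> desired output column name).
    static List<Map<String, String>> read(String content, String recordSeparator,
            char fieldSeparator, char textQualifier, String[] headerRowFields,
            boolean skipFirstRow, Map<String, String> fieldMappings) {
        List<Map<String, String>> rows = new ArrayList<>();
        String[] records = content.split(Pattern.quote(recordSeparator));
        for (int r = skipFirstRow ? 1 : 0; r < records.length; r++) {
            if (records[r].isEmpty()) {
                continue; // ignore blank records
            }
            List<String> values = splitRecord(records[r], fieldSeparator, textQualifier);
            Map<String, String> row = new LinkedHashMap<>();
            for (int f = 0; f < headerRowFields.length && f < values.size(); f++) {
                String name = fieldMappings.getOrDefault(headerRowFields[f], headerRowFields[f]);
                row.put(name, values.get(f));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // The first row is a header, so skipFirstRow is true.
        String content = "Street,City\n\"1 Main St, Apt 4\",Boston\n";
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("Street", "AddressLine1");
        System.out.println(read(content, "\n", ',', '"',
                new String[] {"Street", "City"}, true, mappings));
    }
}
```

In this sketch the comma inside the quoted street value is not treated as a field boundary, which is the reason a text qualifier is needed when address data can contain the field separator.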
Output File |
For text files:
- File Path
- The path of the output text file on the Hadoop platform.
- Field Separator
- The separator used between any two consecutive fields of a record in the output file.
Attention: Invoke the appropriate constructor of FilePath.
For ORC format files:
- ORC File Path
- The path of the output ORC format file on the Hadoop platform.
For Parquet format files:
- Parquet File Path
- The path of the output Parquet format file on the Hadoop platform.
Common parameters:
- Overwrite
- Flag to indicate whether the output file must overwrite any existing file of the same name.
- Create Output Header
- Flag to indicate whether a header file is to be created on the Hadoop server.
|
Job Name |
The name of the job. |