Using a Validate Address Spark Job
Attention: Before creating and running the first Validate Address job, ensure the Acushare service is running. For steps, see Running Acushare Service.
- Create an instance of UAMAddressingFactory using its static method getInstance().
- Provide the input and output details for the Validate Address job by creating an instance of UAMAddressingDetail, specifying the ProcessType. The instance must use the type SparkProcessType. To do this, follow these steps:
  - To configure the input settings for the job, create an instance of UniversalAddressValidateInputConfiguration.
    Set the values of the required fields of this instance, using the enums Enum PreferredCity, Enum CasingType, Enum CityNameFormat, Enum OutputCountryFormat, Enum StandardAddressFormat, Enum StandardAddressPMBLine, Enum StreetMatchingStrictness, Enum FirmMatchingStrictness, Enum DirectionalMatchingStrictness, Enum DualAddressLogic, and Enum DPVSuccessStatusCondition where applicable.
    Important: To run Validate Address in the CASS Certified™ mode, set the fields outputReport3553, outputCASSDetail, and outputReportSummary of this instance to true. The CASS reports contain valid content only when the job is run in the CASS Certified™ mode. Otherwise, blank report PDFs are generated.
  - Set the details of the Reference Data path by creating an instance of ReferenceDataPath. See Enum ReferenceDataPathLocation.
  - To configure the job run settings, create an instance of UAMUSAddressingEngineConfiguration by passing, as arguments to its constructor, the ReferenceDataPath instance created above and the COBOL Runtime path and modules directory path as String values.
    Once the UAMUSAddressingEngineConfiguration instance is created, set the values of its required fields.
  - To configure JVM settings, create an instance of UniversalAddressGeneralConfiguration.
    Use the enums Enum DPVFileType, Enum DPVMemoryModel, Enum LacsLinkMemoryModel, and Enum SuiteLinkMemoryModel.
  - Create an instance of UAMAddressingDetail by passing, as arguments to its constructor, an instance of type JobConfig and the instances of UAMUSAddressingEngineConfiguration, UniversalAddressGeneralConfiguration, and UniversalAddressValidateInputConfiguration created above.
    The JobConfig parameter must be an instance of type SparkJobConfig.
    - Set the details of the input file using the inputPath field of the UAMAddressingDetail instance.
      Note:
      - For a text input file, create an instance of FilePath with the relevant details of the input file by invoking the appropriate constructor.
      - For an ORC input file, create an instance of OrcFilePath with the path of the ORC input file as the argument.
      - For a Parquet input file, create an instance of ParquetFilePath with the path of the Parquet input file as the argument.
    - Set the details of the output file using the outputPath field of the UAMAddressingDetail instance.
      Note:
      - For a text output file, create an instance of FilePath with the relevant details of the output file by invoking the appropriate constructor.
      - For an ORC output file, create an instance of OrcFilePath with the path of the ORC output file as the argument.
      - For a Parquet output file, create an instance of ParquetFilePath with the path of the Parquet output file as the argument.
    - Set the name of the job using the jobName field of the UAMAddressingDetail instance.
    - Set the compressOutput flag of the UAMAddressingDetail instance to true to compress the output of the job.
- To create and run the Spark job, use the previously created instance of UAMAddressingFactory to invoke its method runSparkJob(), passing the above instance of UAMAddressingDetail as an argument.
  The runSparkJob() method runs the job and returns a Map of the reporting counters of the job.
- To display the reporting counters after a successful job run, use the previously created instance of UAMAddressingFactory to invoke its method getCounters(), passing the created job as an argument.
  A Map of counters is returned.
- To generate the CASS reports after a successful job run, use the previously created instance of UAMAddressingFactory to invoke the method generateCASSReport(). You can invoke any of the overloaded versions of this method.
  Depending on which generateCASSReport() signature is used, pass as arguments the Map of reporting counters obtained in the previous step, the jobName, the path where the generated CASS report must be stored, and the required reportType.
  The path must be on the cluster or on the client, depending on whether the SDK job is running in a cluster environment or on your client machine, respectively.
  Note: If the path is not specified, the new CASS report is placed in the current working directory.
  The reportType parameter must have values from the Enum UAMCASSReportType. You can specify one or more report types in this parameter.
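
The steps above state that runSparkJob() and getCounters() return a Map of reporting counters, but not the key and value types of that Map. As a minimal sketch, assuming the counters can be treated as a Map with String keys, the following shows one way to display them in a stable, sorted order. The class name CounterReport, the method format(), and the counter names in main() are hypothetical and are not part of the SDK. Only standard java.util classes are used, so the sketch runs without the SDK on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the SDK.
final class CounterReport {

    // Formats each counter as "name = value" on its own line, sorted by name.
    // The key type String is an assumption; values are printed via toString().
    public static String format(Map<String, ?> counters) {
        StringBuilder sb = new StringBuilder();
        // Copying into a TreeMap sorts the entries by counter name.
        for (Map.Entry<String, Object> e : new TreeMap<String, Object>(counters).entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up counter names and values, standing in for the Map that
        // runSparkJob() or getCounters() would return in a real job.
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("TotalRecords", 1000L);
        counters.put("FailedRecords", 12L);
        System.out.print(format(counters));
    }
}
```

In a real job, the Map returned by the factory would take the place of the hand-built Map in main(). If its key type differs from String, the method signature would need adjusting accordingly.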