Big Data Quality SDK Guide
Welcome
Getting Started
Introduction
Reporting
Workflow
Who should use the SDK?
Installation
System Requirements
Required Operating System Updates
Installing the SDK
Overview
Installer Inclusions
Installing SDK on Windows
Installing SDK on Linux
Running Acushare Service
Reference Data
Reference Data Overview
Using Reference Data: Data Normalization Module and Universal Name Module
Using Reference Data: Universal Addressing Module
Modules
Advanced Matching Module
Supported Jobs
Match Key Generator
Interflow Match
Intraflow Match
Transactional Match
Best of Breed
Duplicate Synchronization
Filter
Data Normalization Module
Supported Jobs
Table Lookup
Advanced Transformer
Universal Addressing Module
Supported Jobs
Validate Address
CASS Certified Processing
CASS 3553 Report
CASS Detailed Report
Validate Address Summary Report
Validate Address Global
Reporting Counters
Validate Address Loqate
Reporting Counters
Universal Name Module
Supported Jobs
Open Name Parser
Reporting
The Java API
Introduction
Components of the SDK Java API
Using the SDK
Using Configuration Property Files
Creating a Java Application
Common API Entities
ConjoinedRule
ConsolidationCondition
ConsolidationRule
ConsolidationAction
FilePath
JobConfig<T extends ProcessType>
MRJobConfig
SparkJobConfig
JobDetail<T extends ProcessType>
JobFactory
JobPath
OrcFilePath
ProcessType
MRProcessType
SparkProcessType
ReferenceDataPath
ReportManager
SimpleRule
Exceptions
JobException
Advanced Matching Module Jobs
Common Module API
AdvanceMatchDetail<T extends ProcessType>
AdvanceMatchFactory
GroupbyOption<T extends ProcessType>
GroupbyMROption
GroupbySparkOption
MatchKeySettings
MatchRule
ChildMatchRule
ParentMatchRule
Special Scenarios
Match Key Generator
Overview
API Entities
MatchKeyGeneratorDetail
Input Parameters
Output Columns
Using a Match Key Generator MapReduce Job
Using a Match Key Generator Spark Job
Interflow Match
Overview
API Entities
InterMatchDetail
InterMatchComparisonOption
Input Parameters
Output Columns
Using an Interflow Match MapReduce Job
Using an Interflow Match Spark Job
Intraflow Match
Overview
API Entities
IntraMatchDetail
Input Parameters
Output Columns
Using an Intraflow Match MapReduce Job
Using an Intraflow Match Spark Job
Transactional Match
Overview
API Entities
TransactionalMatchDetail
Input Parameters
Output Columns
Using a Transactional Match MapReduce Job
Using a Transactional Match Spark Job
Best of Breed
Overview
API Entities
BestOfBreedConfiguration
BestofBreedDetail
Input Parameters
Output Columns
Using a Best of Breed MapReduce Job
Using a Best of Breed Spark Job
Duplicate Synchronization
Overview
API Entities
DuplicateSynchronizationConfiguration
DuplicateSyncDetail
Input Parameters
Output Columns
Using a Duplicate Synchronization MapReduce Job
Using a Duplicate Synchronization Spark Job
Filter
Overview
API Entities
FilterConfiguration
FilterDetail
Input Parameters
Output Columns
Using a Filter MapReduce Job
Using a Filter Spark Job
Data Normalization Module Jobs
Common Module API
DataNormalizationDetail<T extends ProcessType>
DataNormalizationFactory
Table Lookup
Overview
API Entities
AbstractTableLookupRule
Categorize
Identify
Standardize
TableLookupDetail
TableLookupConfiguration
Input Parameters
Output Columns
Using a Table Lookup MapReduce Job
Using a Table Lookup Spark Job
Advanced Transformer
Overview
API Entities
AbstractAdvancedTransformerRules
AdvancedTransformerDetail
AdvancedTransformerConfiguration
RegularExpressionExtraction
RegularExpressionGroupItem
TableDataExtraction
Input Parameters
Output Columns
Using an Advanced Transformer MapReduce Job
Using an Advanced Transformer Spark Job
Universal Addressing Module Jobs
Common Module API
UniversalAddressingDetail<T extends ProcessType>
UniversalAddressingFactory
Validate Address
API Entities
UAMAddressingDetail<T extends ProcessType>
UniversalAddressEngineConfiguration
UAMAddressingFactory
UniversalAddressGeneralConfiguration
UniversalAddressValidateInputConfiguration
Input Parameters
Output Columns
Using a Validate Address MapReduce Job
Using a Validate Address Spark Job
Validate Address Global
API Entities
GlobalAddressingDetail<T extends ProcessType>
GlobalAddressingEngineConfiguration
GlobalAddressingFactory
GlobalAddressingGeneralConfiguration
GlobalAddressingInputConfiguration
Input Parameters
Output Columns
Using a Validate Address Global MapReduce Job
Using a Validate Address Global Spark Job
Validate Address Loqate
API Entities
LoqateAddressingDetail<T extends ProcessType>
LoqateAddressingEngineConfiguration
LoqateAddressingFactory
LoqateAddressingGeneralConfiguration
LoqateAddressingValidateConfiguration
Input Parameters
Output Columns
Using a Validate Address Loqate MapReduce Job
Using a Validate Address Loqate Spark Job
Universal Name Module Jobs
Common Module API
UniversalNameDetail<T extends ProcessType>
UniversalNameFactory
Open Name Parser
API Entities
OpenNameParserDetail
OpenNameParserConfiguration
Input Parameters
Output Columns
Using an Open Name Parser MapReduce Job
Using an Open Name Parser Spark Job
Hive User-Defined Functions
Introduction
Components of a Big Data Quality SDK Hive Function
Using a Hive UDF
Advanced Matching Module Functions
Match Key Generator
Sample Hive Script
Interflow Match
Sample Hive Script
Intraflow Match
Sample Hive Script
Transactional Match
Sample Hive Script
Best of Breed
Sample Hive Script
Duplicate Synchronization
Sample Hive Script
Filter
Sample Hive Script
Data Normalization Module Functions
Table Lookup
Sample Hive Script
Advanced Transformer
Sample Hive Script
Universal Addressing Module Functions
Validate Address
Sample Hive Script
Validate Address Global
Sample Hive Script
Validate Address Loqate
Sample Hive Script
Universal Name Module Functions
Open Name Parser
Sample Hive Script
Appendix
Exceptions
Exception Messages
Enums
Common Enumerations
Universal Addressing Enumerations
ISO Country Codes and Module Support