Input Parameters

Parameter	Description
Group-By Option	Specifies the field used to create the groups of records to synchronize.
	For a MapReduce job, pass these arguments:
	GroupBy Column: The name of the column by which the records are grouped.
	Number of Reducer Tasks: The number of reducer tasks required to group the records.
	For a Spark job, pass this argument:
	GroupBy Column: The name of the column by which the records are grouped.
	Note: If the input has no groups, set this parameter to null. The entire data set is then treated as a single group.
Duplicate Synchronization Configuration	The rules that determine how the fields of one record are copied to the other records of a collection.
Input File	For text files:
	File Path: The path of the input text file on the Hadoop platform.
	Record Separator: The record separator used in the input file.
	Field Separator: The separator used between any two consecutive fields of a record in the input file.
	Text Qualifier: The character used to surround text values in a delimited file.
	Header Row Fields: An array of the header fields of the input file.
	Skip First Row: Flag to indicate whether the first row must be skipped while reading the input file records. Set this to `true` if the first row is a header row.
	Attention: Invoke the appropriate constructor of `FilePath`.
	For ORC format files:
	ORC File Path: The path of the input ORC format file on the Hadoop platform.
	For Parquet format files:
	Parquet File Path: The path of the input Parquet format file on the Hadoop platform.
	Common parameters:
	Field Mappings: A map of key-value pairs, with the existing column names as the keys and the desired output column names as the values.
Output File	For text files:
	File Path: The path of the output text file on the Hadoop platform.
	Field Separator: The separator used between any two consecutive fields of a record in the output file.
	Attention: Invoke the appropriate constructor of `FilePath`.
	For ORC format files:
	ORC File Path: The path of the output ORC format file on the Hadoop platform.
	For Parquet format files:
	Parquet File Path: The path of the output Parquet format file on the Hadoop platform.
	Common parameters:
	Overwrite: Flag to indicate whether the output file must overwrite any existing file of the same name.
	Create Output Header: Flag to indicate whether a header file is to be created on the Hadoop server.
Job Name	The name of the job.
Compress Output	Flag to indicate if the output must be compressed. Set this to `true` to compress the output.
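
The semantics of two of the parameters above, Field Mappings and a null Group-By option, can be sketched in plain Java. This is an illustration only and does not use the SDK's own classes: the class `InputParameterSketch`, its methods, and the column names are all hypothetical. It also assumes that a column with no entry in the mapping keeps its original name, which the table does not state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names come from the SDK.
class InputParameterSketch {

    // Renames header fields using a Field Mappings-style map
    // (existing column name -> desired output column name).
    // Assumption: columns without a mapping keep their original name.
    static List<String> applyFieldMappings(List<String> header, Map<String, String> mappings) {
        List<String> renamed = new ArrayList<>();
        for (String column : header) {
            renamed.add(mappings.getOrDefault(column, column));
        }
        return renamed;
    }

    // Groups records by the value in the group-by column.
    // A null groupByColumn puts every record into one group,
    // mirroring the note on the Group-By Option parameter.
    static Map<String, List<Map<String, String>>> group(
            List<Map<String, String>> records, String groupByColumn) {
        Map<String, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Map<String, String> record : records) {
            String key = (groupByColumn == null)
                    ? ""
                    : String.valueOf(record.get(groupByColumn));
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(record);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical column names, used only for this sketch.
        List<String> header = List.of("CollectionNumber", "Name");
        System.out.println(applyFieldMappings(header, Map.of("Name", "FullName")));

        List<Map<String, String>> records = List.of(
                Map.of("CollectionNumber", "1", "Name", "Ann"),
                Map.of("CollectionNumber", "1", "Name", "Anne"),
                Map.of("CollectionNumber", "2", "Name", "Bob"));
        System.out.println(group(records, "CollectionNumber").size());
        System.out.println(group(records, null).size());
    }
}
```

In the actual SDK these values are passed through its own configuration objects, such as `FilePath`; the sketch only shows what the parameters mean, not how they are supplied.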