Overriding File Format

When you run a job using job executor or the Administration Utility, you can override the file layout (or schema) of the file specified in the dataflow's Read from File stage and Write to File stage.

To do this in job executor, add the following to the end of the job executor command:

StageName:schema=Protocol:SchemaFile

In the Administration Utility, use the --l argument in the job execute command:

--l StageName:schema=Protocol:SchemaFile

Where:

StageName

The stage label shown under the stage's icon in the dataflow in Enterprise Designer. For example, if the stage is labeled "Read from File" you would specify Read from File for the stage name.

To specify a stage within an embedded dataflow or a subflow, prefix the stage name with the name of the embedded dataflow or subflow, followed by a period:

EmbeddedOrSubflowName.StageName

For example, to specify a stage named Write to File in a subflow named Subflow1, you would specify:

Subflow1.Write to File

To specify a stage in an embedded dataflow that is within another embedded dataflow, list each enclosing dataflow from outermost to innermost, separating the names with periods. For example, if Embedded Dataflow 2 is inside Embedded Dataflow 1, and you want to specify the Write to File stage in Embedded Dataflow 2, you would specify this:

Embedded Dataflow 1.Embedded Dataflow 2.Write to File

Protocol
A communication protocol. One of the following:
file
Use the file protocol if the file is on the same machine as the Spectrum™ Technology Platform server. For example, on Windows specify:

"file:/C:/myfile.txt"

On Unix or Linux specify:

"file:/testfiles/myfile.txt"

esclient
Use the esclient protocol if the file is on the computer where you are executing the job and that computer is not the one running the Spectrum™ Technology Platform server. Use the following format:

esclient:ComputerName/path to file

For example,

esclient:mycomputer/testfiles/myfile.txt

Note: If you are executing the job on the server itself, you can use either the file or esclient protocol, but the file protocol is likely to perform better.
If the host name of the Spectrum™ Technology Platform server cannot be resolved, you may get the error "Error occurred accessing file". To resolve this issue, open this file on the server: SpectrumLocation/server/app/conf/spectrum-container.properties. Set the spectrum.runtime.hostname property to the IP address of the server.
esfile
Use the esfile protocol if the file is on a file server. The file server must be defined in Management Console as a resource. Use the following format:

esfile://file server/path to file

For example,

esfile://myserver/testfiles/myfile.txt

Where myserver is an FTP file server resource defined in Management Console.
webhdfs
Use the webhdfs protocol if the file is on a Hadoop Distributed File System (HDFS) server. The HDFS server must be defined in Management Console as a resource. Use the following format:

webhdfs://file server/path to file

For example,

webhdfs://myserver/testfiles/myfile.txt

Where myserver is an HDFS file server resource defined in Management Console.
SchemaFile

The full path to the file that defines the layout you want to use.

Note: You must use forward slashes (/) in file paths, not backslashes.

To create a schema file, define the layout you want in Read from File or Write to File, then click the Export button to create an XML file that defines the layout.

Note: You cannot override a field's data type in a schema file when using job executor. The value in the <Type> element, which is a child of the <FieldSchema> element, must match the field's type specified in the dataflow's Read from File or Write to File stage.
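The three parts described above combine into a single override value. The following is a minimal sketch in Python, not part of Spectrum™ Technology Platform, of how a script might assemble that value before passing it to job executor or the Administration Utility. The function names qualified_stage_name and schema_override are hypothetical, and the sketch assumes the schema location is supplied with its protocol already attached (for example, file:/Users/me/output-data.xml).

```python
def qualified_stage_name(stage_label, parents=()):
    # Enclosing embedded dataflows or subflows come first, outermost first,
    # and each name is separated from the next by a period.
    return ".".join([*parents, stage_label])


def schema_override(stage_label, schema_location, parents=()):
    # schema_location is the protocol plus the path to the schema file.
    # Backslashes are converted because file paths must use forward slashes.
    location = schema_location.replace("\\", "/")
    return f"{qualified_stage_name(stage_label, parents)}:schema={location}"


print(schema_override("Write to File", "file:/Users/me/output-data.xml"))
print(schema_override("Write to File", r"file:/C:\schemas\output-data.xml",
                      parents=["Embedded Dataflow 1", "Embedded Dataflow 2"]))
```

The second call shows a stage nested two levels deep with a Windows-style path. The sketch only formats the string; it does not check that the stage, protocol, or schema file exists.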

Example File Format Override

This example executes a job named TestJob. Instead of writing the output to the file specified in the Write to File stage, the job writes the output to outputoverride.txt. Instead of using the file schema specified in the dataflow's Write to File stage, the job uses the schema defined in output-data.xml.

job execute --j TestJob --l "Write to File=file:/Users/me/outputoverride.txt,Write to File:schema=file:/Users/me/output-data.xml"
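If you generate this command from a script, the quoted --l value can be built by joining the individual overrides with commas, as the example command does. The following Python sketch illustrates the idea; job_execute_command is a hypothetical helper, not a Spectrum™ Technology Platform API, and it only produces the command text shown above rather than running anything.

```python
def job_execute_command(job_name, overrides):
    # All overrides go into one quoted --l value, separated by commas,
    # mirroring the example command above.
    joined = ",".join(overrides)
    return f'job execute --j {job_name} --l "{joined}"'


command = job_execute_command("TestJob", [
    "Write to File=file:/Users/me/outputoverride.txt",
    "Write to File:schema=file:/Users/me/output-data.xml",
])
print(command)
```

The printed line is the same command as in the example, so it can be pasted into the Administration Utility as is.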