Creating a Batch Listener
An Interactive Training is available for a more engaging and visual learning experience. End users are required to complete their training through CSG University.
The Batch Listener ingests data from batch sources into a graph. It also captures the input required to process data from those sources at a defined frequency.
You must create a Batch Connection before you can create a Batch Listener Adaptor.
To add a Batch Listener Adaptor on any graph:
- Click the Graph Options dropdown at the top left of the window.
- Select Add Listener from the dropdown. The Listener Editor window will appear.
Listener Type: Select the Batch Listener Type from the Listener Type dropdown.
Connection: Choose the required batch Connection from the Connection dropdown.
Batch Connections created under the Batch Connection configuration will appear here.
The (Edit) button is next to the connection dropdown and available for all types of listeners.
It allows users to reconfigure, edit, or create a new connection.
Batch Listener Options
After you select the Batch Listener Type, the configuration options for the Batch Listener appear.
Data Format
Data sets can be stored in several file formats. The most popular are Parquet, Delimited, and JSON.
Parquet is an open-source, column-oriented file format.
Delimited files separate the values in each row with a configured delimiter (Comma, Tab, Pipe, or a customized delimiter).
JSON (JavaScript Object Notation) is an open data format widely used by APIs (Application Programming Interfaces, which define how systems communicate with each other) and by several databases, such as MongoDB.
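To make the difference between the formats concrete, here is a minimal Python sketch that parses the same two records from a pipe-delimited payload and from a JSON payload. It is not how the Batch Listener parses data internally; the sample payloads, field names, and the functions `parse_delimited` and `parse_json_records` are assumptions made for illustration. Parquet is left out because reading it requires a third-party library.

```python
import csv
import io
import json

# Hypothetical sample payloads; the field names are illustrative only.
DELIMITED_SAMPLE = "id|name\n1|alice\n2|bob\n"
JSON_SAMPLE = '[{"id": "1", "name": "alice"}, {"id": "2", "name": "bob"}]'


def parse_delimited(text, delimiter=","):
    """Parse delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def parse_json_records(text):
    """Parse a JSON array of objects into a list of dicts."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


pipe_rows = parse_delimited(DELIMITED_SAMPLE, delimiter="|")
json_rows = parse_json_records(JSON_SAMPLE)

# Both payloads describe the same two records.
print(pipe_rows == json_rows)  # prints True
```

Changing the `delimiter` argument to `"\t"` or `","` covers the Tab and Comma cases in the same way.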
Parsing Inputs
To parse inputs from the defined files, you must configure Compression, Encryption, and the file path.
Compression defines the method used to decompress the files, and Encryption defines the algorithm and private key used to decrypt the data so that it can be parsed.
Directory is the location where the files are stored, and Path Pattern (Regex) narrows the selection to the files whose names match the pattern.
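As an illustration of the decompression step, the sketch below reads a file from a directory and decompresses it when the file is gzip-compressed. Gzip is an assumption here, since this page does not list the supported compression methods, and the function `read_maybe_gzip` and the file name are hypothetical. Decryption is not shown.

```python
import gzip
import tempfile
from pathlib import Path


def read_maybe_gzip(path):
    """Return file contents as text, decompressing when the name ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            return handle.read()
    return path.read_text(encoding="utf-8")


# Build a small compressed file in a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as directory:
    compressed = Path(directory) / "orders.csv.gz"  # hypothetical file name
    with gzip.open(compressed, "wt", encoding="utf-8") as handle:
        handle.write("id,amount\n1,9.50\n")
    text = read_maybe_gzip(compressed)

print(text.splitlines()[0])  # prints id,amount
```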
Path Pattern (Regex)
A regular expression is a sequence of characters that defines a text pattern. Here, the pattern is matched against the names of the files in the directory, which makes the filter more specific.
A single regular expression can select one file or multiple files.
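The following Python sketch shows how one pattern can select a single file and another can select several. The file names and the function `select_files` are hypothetical, and the sketch assumes the pattern must match the whole file name; the product may apply its patterns differently.

```python
import re

# Hypothetical file names found in the configured directory.
FILE_NAMES = [
    "sales_2024-01-01.csv",
    "sales_2024-01-02.csv",
    "sales_latest.csv",
    "inventory_2024-01-01.csv",
]


def select_files(names, pattern):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# A literal pattern selects exactly one file.
one_file = select_files(FILE_NAMES, r"sales_2024-01-01\.csv")

# A pattern with digit classes selects every dated sales file.
daily_sales = select_files(FILE_NAMES, r"sales_\d{4}-\d{2}-\d{2}\.csv")

print(one_file)
print(daily_sales)
```

Note the escaped dot (`\.`): an unescaped `.` matches any character, which can select more files than intended.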
Interval
The Interval configuration defines the batch processing interval. Processing repeats according to the value set in the frequency.
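One possible reading of this setting is a fixed gap between runs that repeats a set number of times. The sketch below computes such a schedule. The function `run_times` and its parameters `start`, `interval`, and `runs` are assumptions made for illustration and are not product settings.

```python
from datetime import datetime, timedelta


def run_times(start, interval, runs):
    """Return the first `runs` scheduled times, spaced `interval` apart."""
    return [start + interval * i for i in range(runs)]


# Four runs, 15 minutes apart, starting at midnight on a hypothetical date.
schedule = run_times(datetime(2024, 1, 1, 0, 0), timedelta(minutes=15), 4)
for moment in schedule:
    print(moment.isoformat())
```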
Output
The Output configuration allows you to create, modify, and define the source schema. The source schema defines the structure and properties of the data source, which any node in the graph can read from and write to.
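To show what a source schema describes, here is a minimal Python sketch that models a schema as a list of named, typed fields and converts a raw record to match it. The class `Field`, the function `conform`, and the field names are hypothetical; the product defines its schema through the Output configuration, not through code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    kind: type  # Python type used to cast the raw value


# Hypothetical source schema: structure (names) and properties (types).
SOURCE_SCHEMA = [Field("id", int), Field("name", str), Field("amount", float)]


def conform(record, schema):
    """Cast a raw record to the schema's field types; fail on missing fields."""
    missing = [field.name for field in schema if field.name not in record]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return {field.name: field.kind(record[field.name]) for field in schema}


row = conform({"id": "7", "name": "alice", "amount": "9.5"}, SOURCE_SCHEMA)
print(row)
```

Values parsed from delimited files arrive as text, so a schema of this kind is what turns `"7"` into the integer `7` before other nodes use it.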