Managing Kemux streams
Kemux streams can be defined in two ways:
- Statically, as standalone Python scripts, adhering to the class-based structure shown previously.
- Programatically, by defining
ProcessorandSchemasubclasses for each of the new streams inputs and outputs.
Both of these methods are a part of the kemux.manager.Manager class, which is responsible for loading and running the streams.
Manager class
To initialize the manager, the init class method must be called. It takes the following arguments:
- (required)
name: The name of the manager. This is used to identify the manager in the logs. - (required)
kafka_address: The address of the Kafka broker. - (required)
persistent_data_directory: The directory in whichfaust.Appinstance stores data used by local tables, etc. (see:datadirparameter in Faust documentation) - (optional)
streams_dir: The directory where the streams are located. By default it is set toNone, which means that streams will be loaded programatically after the manager is initialized.
Loading streams
Statically
To load streams statically, a path to the directory containing the streams must be provided in the streams_dir argument of the init method.
Programatically
To load streams programatically, the add_stream method can be used. It takes the following arguments:
- (required)
name: The name of the stream. This is used to identify the stream in the logs. - (required)
stream_input_class: The input of the stream, defined similarly to the way it is done in standalone stream scripts file. - (required)
stream_outputs_class: The output of the stream, defined similarly to the way it is done in standalone stream scripts file.
Remember: Each input/output need to have Processor and Schema sublasses defined.
Running streams
After loading the streams, the start method must be called to run them. This starts the underlying Faust application and starts routing messages from the input to the output, according to the stream definition.