Topology definition
The topology is defined via JSON. It follows this structure:
- general: general information about the topology
    - heartbeat: heartbeat frequency in milliseconds
    - pass_binary_messages: Optional. If true, messages are passed in binary form from one node to the next. Otherwise they are serialized into JSON and deserialized in each subsequent node. Default is false. See the notes at the bottom.
    - initialization: Optional. List of initialization scripts. Each entry describes a single initialization script:
        - working_dir: working directory where the initialization file is located.
        - cmd: name of the file where the initialization code resides.
        - init: initialization object that is sent to the initialization code in its init() method.
        - disabled: optional flag that this step is disabled, which means that it won't be run.
    - shutdown: Optional. List of shutdown scripts. Each entry describes a single shutdown script:
        - working_dir: working directory where the shutdown file is located.
        - cmd: name of the file where the shutdown code resides.
        - disabled: optional flag that this step is disabled, which means that it won't be run.
- spouts: array of spout definitions
    - name: spout name
    - type: one of inproc (in-process), module_method (created by calling a method in the specified module), module_class (created by instantiating the specified class from the specified module), or sys (standard)
    - working_dir: working directory where the main file is located
    - telemetry_timeout: Optional. Time (in milliseconds) that elapses between two subsequent telemetry messages. Default is 1 minute.
    - disabled: optional flag that this spout is disabled, which means that it won't be instantiated.
    - cmd: name of the file where the spout is defined. If the spout runs in-process, this file is loaded using require().
    - subtype: Optional. String parameter that is passed to the factory method that creates the spout. This lets developers provide multiple spouts inside a single source file.
    - init: initialization object that is sent to the spout in its init() method
- bolts: array of bolt definitions
    - name: bolt name
    - type: one of inproc (in-process), module_method (created by calling a method in the specified module), module_class (created by instantiating the specified class from the specified module), or sys (standard)
    - working_dir: working directory where the main file is located
    - telemetry_timeout: Optional. Time (in milliseconds) that elapses between two subsequent telemetry messages. Default is 1 minute.
    - disabled: optional flag that this bolt is disabled, which means that it won't be instantiated.
    - cmd: name of the file where the bolt is defined. If the bolt runs in-process, this file is loaded using require().
    - subtype: Optional. String parameter that is passed to the factory method that creates the bolt. This lets developers provide multiple bolts inside a single source file.
    - inputs: array of input nodes (spouts and bolts) for this bolt
        - source: logical name of the input node
        - stream_id: Optional. Id of the stream that this bolt will read. Empty means the default stream.
        - disabled: optional flag that this input is disabled, which means that no data will flow through it.
    - init: initialization object that is sent to the bolt in its init() method
- variables: map of variables that can be reused when defining spout and bolt paths. Similar to environment variables in Unix.
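To make the idea of reusable variables concrete, here is a minimal sketch of how such a map could be applied to spout and bolt paths. The `${NAME}` placeholder syntax, the helper names `expandVariables` and `expandTopologyPaths`, and the `BASE_DIR` variable are all assumptions made for illustration; they are not taken from the library, so check its actual behavior before relying on this.

```javascript
"use strict";

// Hypothetical helper: expands ${NAME} placeholders in a string using the
// topology's "variables" map. The ${NAME} syntax is an assumption made for
// this sketch. Unknown names are left untouched.
function expandVariables(text, variables) {
    return text.replace(/\$\{(\w+)\}/g, (match, name) =>
        Object.prototype.hasOwnProperty.call(variables, name)
            ? String(variables[name])
            : match
    );
}

// Hypothetical helper: applies the expansion to the path-like fields
// (working_dir and cmd) of every spout and bolt, modifying them in place.
function expandTopologyPaths(topology) {
    const variables = topology.variables || {};
    const nodes = [].concat(topology.spouts || [], topology.bolts || []);
    for (const node of nodes) {
        for (const field of ["working_dir", "cmd"]) {
            if (typeof node[field] === "string") {
                node[field] = expandVariables(node[field], variables);
            }
        }
    }
    return topology;
}

const topology = expandTopologyPaths({
    spouts: [
        {
            name: "pump1",
            type: "inproc",
            working_dir: "${BASE_DIR}/spouts",
            cmd: "spout_inproc.js",
            init: {}
        }
    ],
    bolts: [],
    variables: { BASE_DIR: "/opt/my-topology" }
});

console.log(topology.spouts[0].working_dir); // prints /opt/my-topology/spouts
```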
An example:
{
    "general": {
        "heartbeat": 3200,
        "initialization": [
            { "working_dir": ".", "cmd": "init.js" }
        ],
        "shutdown": [
            { "working_dir": ".", "cmd": "shutdown.js" }
        ]
    },
    "spouts": [
        {
            "name": "pump1",
            "type": "inproc",
            "working_dir": ".",
            "cmd": "spout_inproc.js",
            "init": {}
        },
        {
            "name": "pump2",
            "type": "module_class",
            "working_dir": "some-module",
            "cmd": "TargetClass",
            "init": {}
        }
    ],
    "bolts": [
        {
            "name": "bolt1",
            "working_dir": ".",
            "type": "inproc",
            "cmd": "bolt_inproc.js",
            "subtype": "subtype1",
            "inputs": [{ "source": "pump1" }],
            "init": {}
        },
        {
            "name": "bolt2",
            "working_dir": ".",
            "type": "inproc",
            "cmd": "bolt_inproc.js",
            "inputs": [
                { "source": "pump1" },
                { "source": "pump2", "stream_id": "stream1" }
            ],
            "init": {
                "forward": true
            }
        },
        {
            "name": "bolt3",
            "working_dir": ".",
            "type": "inproc",
            "cmd": "bolt_inproc.js",
            "inputs": [{ "source": "bolt2" }],
            "init": {
                "forward": false
            }
        },
        {
            "name": "bolt4",
            "working_dir": "some-module",
            "type": "module_method",
            "cmd": "methodName",
            "subtype": "some-name",
            "inputs": [{ "source": "bolt2" }],
            "init": {
                "forward": false
            }
        }
    ],
    "variables": {}
}
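The inputs arrays are what wire the nodes together, so it can help to list the resulting data-flow edges. The sketch below is a standalone helper, not part of the library: `listEdges` and `findUnknownSources` are hypothetical names, and the topology is a trimmed-down copy of the example above that keeps only the fields needed here.

```javascript
"use strict";

// Returns one { from, to, stream_id } record per enabled input of every
// enabled bolt. Field names follow the example topology.
function listEdges(topology) {
    const edges = [];
    for (const bolt of topology.bolts || []) {
        if (bolt.disabled) continue;
        for (const input of bolt.inputs || []) {
            if (input.disabled) continue;
            edges.push({
                from: input.source,
                to: bolt.name,
                stream_id: input.stream_id || null
            });
        }
    }
    return edges;
}

// Returns the names of input sources that match no spout or bolt name.
function findUnknownSources(topology) {
    const nodes = [].concat(topology.spouts || [], topology.bolts || []);
    const known = new Set(nodes.map((node) => node.name));
    return listEdges(topology)
        .map((edge) => edge.from)
        .filter((name) => !known.has(name));
}

// Trimmed-down copy of the example topology.
const example = {
    spouts: [{ name: "pump1" }, { name: "pump2" }],
    bolts: [
        { name: "bolt1", inputs: [{ source: "pump1" }] },
        {
            name: "bolt2",
            inputs: [
                { source: "pump1" },
                { source: "pump2", stream_id: "stream1" }
            ]
        },
        { name: "bolt3", inputs: [{ source: "bolt2" }] }
    ]
};

for (const edge of listEdges(example)) {
    const suffix = edge.stream_id ? ` (stream ${edge.stream_id})` : "";
    console.log(`${edge.from} -> ${edge.to}${suffix}`);
}
// prints:
// pump1 -> bolt1
// pump1 -> bolt2
// pump2 -> bolt2 (stream stream1)
// bolt2 -> bolt3
```

A check like `findUnknownSources` only catches dangling input names; it does not replace schema validation.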
A topology can be validated using the built-in validator:
"use strict";
let config = {....};
// Run validator and abort process if topology doesn't meet the schema.
const validator = require("qtopology").validation;
validator.validate({ config: config, exitOnError: true });
Notes
Passing binary messages or not?
Use this option at your own risk. You have to explicitly enable it.
If messages are passed in binary form, one of the subsequent nodes may change a message, and this change will be visible to all other nodes. There is no isolation between siblings. Also, there is no guaranteed order of execution between siblings and their children.
So, when should we use it? When all of the following assumptions are true:
- You want performance, or you want to pass fields that are binary classes (e.g. the Date type).
- Data fields of the message never change. Only new fields are added.
- There is no expectation of the order of execution between peer nodes and their children. Any dependency is purely upstream.
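The trade-off can be illustrated with plain JavaScript. The sketch below only models the two behaviors described in these notes: `passAsJson` and `passAsBinary` are hypothetical stand-ins for how a message reaches two sibling nodes, not the library's internal code.

```javascript
"use strict";

// Models the default mode: each receiver gets its own copy that has gone
// through JSON serialization and deserialization.
function passAsJson(message) {
    return JSON.parse(JSON.stringify(message));
}

// Models pass_binary_messages = true: every receiver gets the same object.
function passAsBinary(message) {
    return message;
}

const original = { ts: new Date(0), value: 1 };

// JSON mode: siblings are isolated, but the Date field becomes a string.
const jsonA = passAsJson(original);
const jsonB = passAsJson(original);
jsonA.value = 100;
console.log(jsonB.value);              // 1
console.log(typeof jsonB.ts);          // string

// Binary mode: the Date survives, but a change made by one sibling is
// visible to the other (and to the sender).
const binA = passAsBinary(original);
const binB = passAsBinary(original);
binA.value = 100;
console.log(binB.value);               // 100
console.log(binB.ts instanceof Date);  // true
```

This is why the option is safest when nodes only add new fields and never modify existing ones.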