Dataflow Specification

Dataflows are specified through a YAML file. This section presents our current draft of the file format. It only includes basic functionality for now; we will extend it as we introduce more advanced features.

Dataflow

Dataflows are specified through the following format:

nodes:
    - id: foo
      # ... (see below)
    - id: bar
      # ... (see below)

Inputs and Outputs

Each operator or custom node has a separate namespace for its outputs. To refer to outputs, the `<operator>/<output>` syntax is used. This way, there are no name conflicts between operators.

Input operands are specified using the `<input>: <operator>/<output>` syntax, where `<input>` is the internal name that should be used for the operand. The main advantage of this name mapping is that the same operator executable can be reused multiple times on different inputs.
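
For example, the same operator executable can be instantiated twice, with each instance mapping its internal input name to a different upstream output. A minimal sketch with illustrative identifiers (the full operator format is described below):

- id: filter-left
  # ... same implementation as filter-right (see below)
  inputs:
    signal: camera-left/image
- id: filter-right
  # ... same implementation as filter-left (see below)
  inputs:
    signal: camera-right/image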

Nodes

Nodes are defined using the following format:

nodes:
  - id: some-unique-id
    # For nodes with multiple operators
    operators:
      - id: operator-1
        # ... (see below)
      - id: operator-2
        # ... (see below)

  - id: some-unique-id-2
    custom:
      source: path/to/timestamp
      env:
        - ENVIRONMENT_VARIABLE_1: true
      working-directory: some/path

      inputs:
        input_1: operator_2/output_4
        input_2: custom_node_2/output_4
      outputs:
        - output_1
 
  # Unique operator
  - id: some-unique-id-3
    operator:
        # ... (see below)

Nodes must provide either an operators field or a custom field, but not both. Nodes with an operators field run a dora runtime process, which runs and manages the specified operators. Nodes with a custom field run a custom executable.

Custom Nodes

Custom nodes specify the executable that should be started, similar to a normal shell command, through the source field. Through the optional env field, it is possible to set environment variables for the process. The optional working-directory field allows overriding the directory in which the program is started.

To integrate with the rest of the dora dataflow, custom nodes must specify their inputs and outputs, similar to operators. They can reference outputs of both operators and other custom nodes.
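
A minimal sketch of a custom node combining these fields; all paths and identifiers are illustrative:

- id: recorder
  custom:
    source: path/to/recorder          # executable to start
    env:
      - OUTPUT_DIR: /tmp/recordings   # environment variable for the process
    working-directory: some/path      # directory in which the executable is started
    inputs:
      status: some-operator/status    # output of an operator
      log: other-node/log             # output of another custom node
    outputs:
      - summary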

Operators

Operators are defined through the following format:

- id: unique-operator-id
  name: Human-Readable Operator Name
  description: An optional description of the operator's purpose.

  inputs:
    input_1: source_operator_2/output_1
    input_2: custom_node_1/output_1
  outputs:
    - output_1

  ## ONE OF:
  shared-library: "path/to/shared_lib" # file extension and `lib` prefix are added automatically
  python: "path/to/python_file.py"
  wasm: "path/to/wasm_file.wasm"

Operators must list all their inputs and outputs. Inputs can be linked to arbitrary outputs of other operators or custom nodes.

There are multiple ways to implement an operator:

  • as a C-compatible shared library
  • as a Python object
  • as a WebAssembly (WASM) module

Each operator must specify exactly one implementation. The implementation must follow a format defined by dora.
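
For instance, an operator implemented in Python would set only the python field; the path and identifiers here are illustrative:

- id: object-detector
  python: path/to/detector.py
  inputs:
    image: camera-node/image
  outputs:
    - detections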

Example

nodes:
  - id: rust-node
    custom:
      build: cargo build -p rust-dataflow-example-node
      source: ../../target/debug/rust-dataflow-example-node
      inputs:
        tick: dora/timer/millis/10
      outputs:
        - random
  - id: runtime-node
    operators:
      - id: rust-operator
        build: cargo build -p rust-dataflow-example-operator
        shared-library: ../../target/debug/rust_dataflow_example_operator
        inputs:
          tick: dora/timer/millis/100
          random: rust-node/random
        outputs:
          - status
  - id: rust-sink
    custom:
      build: cargo build -p rust-dataflow-example-sink
      source: ../../target/debug/rust-dataflow-example-sink
      inputs:
        message: runtime-node/rust-operator/status
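
The dora/timer/millis/10 and dora/timer/millis/100 inputs in this example reference timers provided by dora itself, which emit a tick event every 10 and 100 milliseconds, respectively. The build fields hold the commands used to build the referenced executables and library.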

TODO: Integration with ROS 1/2

To integrate dora-rs operators with ROS 1 or ROS 2 operators, we plan to provide special bridge operators. These operators act as a sink in one dataflow framework and push all messages to a different dataflow framework, where they act as a source.

For example, we plan to provide a to_ros_2 operator, which takes a single data input and publishes it to a specified ROS 2 topic.
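
Such a bridge might be declared as follows; this is a purely illustrative sketch, since neither the operator nor its configuration fields exist yet:

- id: ros-2-bridge
  operator:
    shared-library: path/to/to_ros_2   # hypothetical bridge operator
    inputs:
      data: some-operator/output_1     # single data input forwarded to ROS 2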