Data Processing Pipeline

This project is a distributed system built to simulate, collect, process, store, and serve sensor data. Designed as an Elixir umbrella project, it comprises multiple applications, each handling a specific part of the data pipeline:

  • Data Generator: Simulates sensor readings and publishes them.
  • Data Compactor: Subscribes to sensor data, compacts it, and publishes compacted data.
  • Data Server: Consumes compacted data, stores it, and exposes it via a JSON API.
  • Schema: Contains the Protobuf schema definitions for the data model, as well as encoding/decoding helpers.

It’s primarily a learning project, aimed at exploring and gaining hands-on experience with Elixir's capabilities.

High-level overview

Key technologies used

  • Messaging

    • Kafka

      • Producer
      • Consumer
    • MQTT

      • Publisher
      • Subscriber
  • Storage

    • MongoDB

      • Find
      • Insert
      • Aggregate
  • Data formats

    • JSON:

      • Decoding
      • Encoding
    • Protobuf:

      • Decoding
      • Encoding
  • HTTP

    • Server

      • Authentication
        • JWT
    • Client

  • Continuous Integration:

  • Elixir-specific

    • Custom Mix task for Protobuf schema compilation
    • DynamicSupervisor for GenServers
    • Broadway for data ingestion abstraction
    • Custom Storage and Producer behaviours as an abstraction layer
    • Interoperability with Erlang
  • Misc:

    • Docker
    • Contextive for Ubiquitous Language
    • Structured logging (JSON)

Running locally

Requires Elixir 1.17+ and Docker Compose for running Kafka, MongoDB, and Mosquitto (MQTT broker).

# Start Kafka, MongoDB, MQTT.
docker compose -f docker/docker-compose.yml up -d

# Export secrets for Kafka and Mongo.
export $(cat .env.local | xargs)

# Install and compile dependencies.
mix deps.get && mix deps.compile

# Start the apps.
iex -S mix
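
The `export $(cat .env.local | xargs)` line relies on word splitting, so it can break on values that contain spaces or quotes. A minimal alternative sketch is shown here, assuming `.env.local` holds only trusted `KEY=value` lines in shell syntax; the `load_env_file` helper is hypothetical and not part of the project.

```shell
# Export every variable assigned in a dotenv-style file into the current shell.
# Assumption: the file contains only trusted KEY=value lines, because it is sourced.
load_env_file() {
  local env_file="${1:-.env.local}"

  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 1
  fi

  # The dot command searches PATH for names without a slash, so make the path explicit.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac

  set -a          # auto-export every variable assigned while the file is read
  . "$env_file"
  set +a
}
```

Calling `load_env_file .env.local` before `iex -S mix` would then take the place of the `export` line.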

Now you can navigate to http://localhost:8080/api/sensors/facility_1 to verify it's up and running.
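
The same check can be done from a terminal. The sketch below wraps `curl` in a hypothetical `http_status` helper that prints only the HTTP status code.

```shell
# Print the HTTP status code returned by the given URL.
# curl reports 000 when no HTTP response was received (for example, connection refused).
http_status() {
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1" || true
}
```

For example, `http_status http://localhost:8080/api/sensors/facility_1` prints `000` while the server is still down. A `200` once it is up assumes the endpoint is reachable without a token. Since the project lists JWT authentication, a `401` would suggest that a token is required.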

Running in Docker

If you don't have Elixir installed, the demo can be run in a Docker container.

# Build the image: the release (build artefact) will include all three applications.
docker build --no-cache --tag data-processing-pipeline -f docker/Dockerfile . --progress=plain

# Start the app along with Kafka, MongoDB, MQTT.
docker compose -f docker/docker-compose.yml -f docker/docker-compose.override.yml up

The Kafka container takes some time to stabilize, so errconnrefused errors shortly after startup are normal.
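
When scripting against the stack, one way to ride out this startup window is to retry a probe command until it succeeds. The `retry` helper below is a hypothetical sketch, not something the project ships.

```shell
# Run a command until it succeeds, trying at most ATTEMPTS times
# and sleeping DELAY seconds between tries.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2

  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}
```

For instance, `retry 30 2 curl --silent --fail --output /dev/null http://localhost:8080/api/sensors/facility_1` would wait up to about a minute for the API to answer, assuming the Docker setup exposes the same URL as the local one.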