Agglo: A Process-Anywhere Framework for Event Stream Processing

Agglo is an experimental event stream processing framework that enables lightweight, reliable and scalable stream processing alongside persistent object storage, key-value storage or stream processing platforms.

Binge (BINary-at-the-edGE) is the main artifact of the framework. As the name implies, it is a single compiled binary that can run as a stateless daemon, a persistent daemon, or a stand-alone command. This allows the same binary to be deployed in edge gateways, as Kubernetes deployments, in load balancers, as cloud (lambda) functions, or anywhere else you need to perform stream processing. The deployed artifact is simply the binary and a single JSON config that defines the event processing pipelines and the configuration for connecting to external object stores, key-value stores, HTTP/RPC endpoints, and stream processing systems.
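
The exact configuration schema is defined by the project; the sketch below is only meant to convey the general shape of such a config, and the field names and values are illustrative assumptions rather than the actual schema:

```json
{
  "pipelines": [
    {
      "name": "example-pipeline",
      "processes": ["annotate", "aggregate", "persist"]
    }
  ],
  "externalSystems": {
    "objectStore": { "type": "s3", "bucket": "example-bucket" },
    "keyValueStore": { "type": "dynamodb", "table": "example-table" },
    "streamOutput": { "type": "kafka", "topic": "example-topic" }
  }
}
```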

See the examples to get an idea of how you can use Agglo.

Processes

See the processes page for examples of the stream processors currently supported.

Entwine: Immutable, Partial Ordering of Events

Agglo contains a special processor, called Entwine, that allows you to scalably entwine two or more event histories.

Entwine is a special process that provides the ability to “entwine” event timelines. It is similar to existing blockchain technologies in that it relies on hash chains for immutability and zero-knowledge proofs for secure verification of individual event timelines. There is no consensus mechanism. The assumption is that each event timeline is generated by an individual, a single organization, a binge binary, or a collection of binge binaries.
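
To make the hash-chain idea concrete, here is a minimal Go sketch of an append-only hash chain. It is not Entwine's implementation, only an illustration of why chained digests make history tamper-evident:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Link is a minimal hash-chain entry: each link commits to the previous
// link's digest, so altering any earlier payload changes every later digest.
type Link struct {
	PrevDigest [32]byte
	Payload    []byte
	Digest     [32]byte
}

// Append creates a new link whose digest covers the previous digest and the payload.
func Append(prev *Link, payload []byte) *Link {
	link := &Link{Payload: payload}
	if prev != nil {
		link.PrevDigest = prev.Digest
	}
	link.Digest = sha256.Sum256(append(link.PrevDigest[:], payload...))
	return link
}

func main() {
	genesis := Append(nil, []byte("genesis"))
	head := Append(genesis, []byte("event-1"))
	fmt.Printf("chain head: %x\n", head.Digest)
}
```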

An entwined stream of events is a collection of independent substreams that anchor to a ticker stream. Currently, the ticker stream is a simple hash chain that “ticks” at an interval and allows substreams to anchor. Each anchor request must contain a signed proof that is consistent with the ticker’s view of that substream’s history. If the proof verifies, the anchor request succeeds. Today, the ticker process does not have a consensus mechanism, but one can be added by implementing the Ticker interface. For example, one could build a ticker using traditional consensus algorithms, such as Raft or Paxos, or build a ticker on top of a distributed ledger, such as Bitcoin or Ethereum.
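
The actual Ticker interface lives in the Agglo repository; the Go sketch below only illustrates the kind of contract a ticker implementation would need to satisfy, and the method names and signatures are assumptions for the sake of illustration:

```go
package entwine

// Ticker sketches the contract a ticker implementation might expose.
// The real interface in the repository may differ; names and signatures
// here are illustrative assumptions only.
type Ticker interface {
	// Tick appends the next entry to the ticker's hash chain at each interval.
	Tick() error

	// Anchor checks a substream's signed proof against the ticker's view of
	// that substream's history and records the anchor if it is consistent.
	Anchor(subStreamID string, proof []byte) error
}
```

A consensus-backed ticker (Raft, Paxos, or a public ledger) would satisfy the same kind of contract, replacing the single local hash chain with a replicated or externally anchored log.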

The HappenedBefore relation can be applied to all events on an entwined stream. This means we can define a partial ordering over all events in the stream, even though the individual substreams are managed independently.
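
One conservative way to derive a happened-before relation from anchors is sketched below; the field names and the exact rule are assumptions for illustration, not Entwine's actual implementation. The intuition is that an event is known to precede another when it was already anchored to the ticker no later than the anchor the other event's substream made before that event was appended:

```go
package entwine

// AnchoredEvent is an illustrative view of an entwined event: the ticker
// index of the first anchor that covers it, and the ticker index of the
// anchor its substream made just before it. Field names are assumptions.
type AnchoredEvent struct {
	CoveringAnchorTick  int // first ticker tick whose anchor includes this event
	PrecedingAnchorTick int // tick of the substream's last anchor before this event
}

// HappenedBefore is a conservative check: a is known to precede b when a was
// already anchored no later than the anchor b's substream made before b was
// appended. Pairs of events that cannot be separated this way remain
// incomparable, which is why the ordering is partial rather than total.
func HappenedBefore(a, b AnchoredEvent) bool {
	return a.CoveringAnchorTick <= b.PrecedingAnchorTick
}
```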

A basic example of this could be binge agents deployed to every news outlet in the world. Each agent would process every news story from its outlet, and each outlet would maintain a local ordering in its own private substream. As long as each substream anchors to the ticker stream at regular intervals, we can determine the partial order in which stories broke and maintain a sort-of truth index for news outlets (e.g. an outlet cannot rewrite its published history).