WallarooLabs/wallaroo
Distributed Stream Processing
repo name | WallarooLabs/wallaroo |
repo link | https://github.com/WallarooLabs/wallaroo |
homepage | https://www.wallaroolabs.com |
language | Pony |
size (curr.) | 281327 kB |
stars (curr.) | 1374 |
created | 2015-12-30 |
license | Apache License 2.0 |
What is Wallaroo?
Wallaroo is a fast stream-processing framework. Wallaroo makes it easy to react to data in real-time. By eliminating infrastructure complexity, going from prototype to production has never been simpler.
When we set out to build Wallaroo, we had several high-level goals in mind:
- Create a dependable and resilient distributed computing framework
- Take care of the complexities of distributed computing “plumbing,” allowing developers to focus on their business logic
- Provide high-performance & low-latency data processing
- Be portable and deploy easily (i.e., run on-prem or any cloud)
- Manage in-memory state for the application
- Allow applications to scale as needed, even when they are live and up-and-running
You can learn more about Wallaroo from our “Hello Wallaroo!” blog post.
Getting Started
Wallaroo can be installed via our handy Wallaroo Up command. Check out our installation page to learn more.
APIs
The primary API for Wallaroo is written in Pony. If you are interested in writing Wallaroo applications in other high-performance languages such as C, C++ or Rust, drop us a line; we’d be happy to engage on a commercial basis in creating language bindings that meet your needs.
Usage
Once you’ve installed Wallaroo, Take a look at some of our examples. A great place to start are our word_count or market spread examples in Pony.
"""
Word Count App
"""
use "assert"
use "buffered"
use "collections"
use "net"
use "serialise"
use "wallaroo_labs/bytes"
use "wallaroo"
use "wallaroo_labs/logging"
use "wallaroo_labs/mort"
use "wallaroo_labs/time"
use "wallaroo/core/common"
use "wallaroo/core/metrics"
use "wallaroo/core/sink/tcp_sink"
use "wallaroo/core/source"
use "wallaroo/core/source/tcp_source"
use "wallaroo/core/state"
use "wallaroo/core/topology"
actor Main
new create(env: Env) =>
Log.set_defaults()
try
let pipeline = recover val
let lines = Wallaroo.source[String]("Word Count",
TCPSourceConfig[String].from_options(StringFrameHandler,
TCPSourceConfigCLIParser("Word Count", env.args)?, 1))
lines
.to[String](Split)
.key_by(ExtractWord)
.to[RunningTotal](AddCount)
.to_sink(TCPSinkConfig[RunningTotal].from_options(
RunningTotalEncoder, TCPSinkConfigCLIParser(env.args)?(0)?))
end
Wallaroo.build_application(env, "Word Count", pipeline)
else
env.err.print("Couldn't build topology")
end
primitive Split is StatelessComputation[String, String]
fun name(): String => "Split"
fun apply(s: String): Array[String] val =>
let punctuation = """ !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ """
let words = recover trn Array[String] end
for line in s.split("\n").values() do
let cleaned =
recover val s.clone().>lower().>lstrip(punctuation)
.>rstrip(punctuation) end
for word in cleaned.split(punctuation).values() do
words.push(word)
end
end
consume words
class val RunningTotal
let word: String
let count: U64
new val create(w: String, c: U64) =>
word = w
count = c
class WordTotal is State
var count: U64
new create(c: U64) =>
count = c
primitive AddCount is StateComputation[String, RunningTotal, WordTotal]
fun name(): String => "Add Count"
fun apply(word: String, state: WordTotal): RunningTotal =>
state.count = state.count + 1
RunningTotal(word, state.count)
fun initial_state(): WordTotal =>
WordTotal(0)
primitive StringFrameHandler is FramedSourceHandler[String]
fun header_length(): USize =>
4
fun payload_length(data: Array[U8] iso): USize ? =>
Bytes.to_u32(data(0)?, data(1)?, data(2)?, data(3)?).usize()
fun decode(data: Array[U8] val): String =>
String.from_array(data)
primitive ExtractWord
fun apply(input: String): Key =>
input
primitive RunningTotalEncoder
fun apply(t: RunningTotal, wb: Writer = Writer): Array[ByteSeq] val =>
let result =
recover val
String().>append(t.word).>append(", ").>append(t.count.string())
.>append("\n")
end
wb.write(result)
wb.done()
Documentation
Are you the sort who just wants to get going? Dive right into our documentation then! It will get you up and running with Wallaroo.
More information is also on our blog. There you can find more insight into what we are working on and industry use-cases.
Wallaroo currently exists as a mono-repo. All the source that is Wallaroo is located in this repo. See repo directory structure for more information.
You can also take a look at our FAQ.
Need Help?
Trying to figure out how to get started? Drop us a line:
Contributing
We welcome contributions. Please see our Contribution Guide
For your pull request to be accepted you will need to accept our Contributor License Agreement
License
Wallaroo is licensed under the Apache version 2 license.