# Hurricane Real-time Processing

## Brief Introduction

Hurricane is a C++-based distributed real-time processing system. Unlike batch processing systems such as Apache Hadoop, Hurricane uses a stream model to process data. It also supports multi-language interfaces, such as Python, JavaScript, Java and Swift.

We imitate the interface of Apache Storm and simplify it, so developers familiar with Storm can learn Hurricane easily.

## Basic Concepts

### Topology

The logic for a real-time application is packaged into a Hurricane topology. A Hurricane topology is analogous to a MapReduce job. One key difference is that a MapReduce job eventually finishes, whereas a topology runs forever.

### Stream

The stream is an important abstraction in Hurricane. A stream is an unbounded sequence of tuples that is processed and created in parallel in a distributed fashion. Streams are defined with a schema that names the fields in the stream's tuples.

### Tuple

A tuple is the unit of data transferred in a stream. Spouts and bolts use tuples to organize their data. Tuples can contain integers, longs, shorts, characters, floats, doubles and strings.
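To make the idea of a schema-described tuple concrete, here is a minimal C++17 sketch. The names `Value`, `SketchTuple`, `MakeTuple` and `Get` are hypothetical and chosen for illustration only; they are not Hurricane's actual tuple API, which is documented in docs/introduction.md.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical value type covering the field kinds listed above
// (integer, long, short, character, float, double, string).
using Value = std::variant<std::int32_t, std::int64_t, std::int16_t, char,
                           float, double, std::string>;

// A tuple pairs a schema (field names) with one value per field.
struct SketchTuple {
    std::vector<std::string> fields;
    std::vector<Value> values;

    // Look up a value by field name; returns nullptr if the name is unknown.
    const Value* Get(const std::string& name) const {
        for (std::size_t i = 0; i < fields.size(); ++i) {
            if (fields[i] == name) return &values[i];
        }
        return nullptr;
    }
};

// Build a tuple, checking that the schema and the values line up.
inline SketchTuple MakeTuple(std::vector<std::string> fields,
                             std::vector<Value> values) {
    assert(fields.size() == values.size());
    return SketchTuple{std::move(fields), std::move(values)};
}
```

A caller could build a two-field tuple with `MakeTuple({"word", "count"}, {Value{std::string("storm")}, Value{std::int32_t{3}}})` and read a field back with `std::get<std::string>(*tuple.Get("word"))`.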

### Spout

A spout is a source of streams in a topology. Generally spouts will read tuples from an external source and emit them into the topology. Spouts can either be reliable or unreliable. A reliable spout is capable of replaying a tuple if it failed to be processed by Hurricane, whereas an unreliable spout forgets about the tuple as soon as it is emitted.
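The difference between reliable and unreliable spouts can be sketched in a few lines of C++17. The class below is a hypothetical illustration, not Hurricane's spout interface: the names `ReplayingSpout`, `NextTuple`, `Ack` and `Fail` are assumptions. It keeps every emitted message until it is acknowledged, so a failed message can be emitted again; an unreliable spout would simply skip the bookkeeping.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical reliable spout: remembers each emitted message until it is
// acknowledged, and re-queues it when processing is reported as failed.
class ReplayingSpout {
public:
    // Stand-in for reading from an external source.
    void Enqueue(std::string message) { queue_.push_back(std::move(message)); }

    // Emit the next message tagged with an id; empty when nothing is queued.
    std::optional<std::pair<std::uint64_t, std::string>> NextTuple() {
        if (queue_.empty()) return std::nullopt;
        std::string message = std::move(queue_.front());
        queue_.pop_front();
        const std::uint64_t id = nextId_++;
        pending_[id] = message;  // keep a copy until Ack or Fail
        return std::make_pair(id, std::move(message));
    }

    // Processing succeeded: forget the message.
    void Ack(std::uint64_t id) { pending_.erase(id); }

    // Processing failed: put the message back so it is emitted again.
    void Fail(std::uint64_t id) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return;
        queue_.push_back(std::move(it->second));
        pending_.erase(it);
    }

    std::size_t PendingCount() const { return pending_.size(); }

private:
    std::deque<std::string> queue_;
    std::unordered_map<std::uint64_t, std::string> pending_;
    std::uint64_t nextId_ = 0;
};
```

In this sketch a replayed message receives a fresh id, which keeps the bookkeeping simple at the cost of losing the link to the original attempt.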

### Bolt

All processing in topologies is done in bolts. Bolts can do anything: filtering, functions, aggregations, joins, talking to databases, and more.
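As a rough illustration of three kinds of bolt work (a function, a filter and an aggregation), the following C++ sketch chains them into a word count. All names here (`SplitSentence`, `FilterShort`, `WordCounter`) are hypothetical, and the stages are plain functions and a class rather than Hurricane's real bolt interface.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical "function" bolt: splits a sentence into whitespace-separated words.
inline std::vector<std::string> SplitSentence(const std::string& sentence) {
    std::istringstream in(sentence);
    std::vector<std::string> words;
    for (std::string word; in >> word;) words.push_back(word);
    return words;
}

// Hypothetical filtering bolt: drops words shorter than minLength.
inline std::vector<std::string> FilterShort(const std::vector<std::string>& words,
                                            std::size_t minLength) {
    std::vector<std::string> kept;
    for (const auto& word : words) {
        if (word.size() >= minLength) kept.push_back(word);
    }
    return kept;
}

// Hypothetical aggregating bolt: keeps a running count per word.
class WordCounter {
public:
    void Execute(const std::string& word) { ++counts_[word]; }

    // Returns 0 for words that were never seen.
    int CountOf(const std::string& word) const {
        auto it = counts_.find(word);
        return it == counts_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> counts_;
};
```

Feeding each word from `FilterShort(SplitSentence(line), 2)` into `WordCounter::Execute` mirrors how tuples would flow from one bolt to the next in a topology, although a real topology would run these stages in parallel across processes.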

## Installation

### Dependencies

Hurricane depends on the Meshy network library (libmeshy, a transport layer library designed for Hurricane real-time processing). You can find Meshy in the deps folder; build Meshy before you start building Hurricane. Hurricane can be built with either a Makefile (gmake) or a Kakefile (Kake). Refer to the section "Build Hurricane using Kake" for more details on how to build Hurricane with Kake.

### Build Hurricane using Makefile

For the convenience of Linux users, we provide a Makefile to build Hurricane. To build with it, simply run:

    make

### Build Hurricane using Kake

#### Brief Introduction to Kake

Kake is a build system that follows the "convention over configuration" paradigm.

#### Dependencies

Like some other build systems (e.g. SCons), Kake is written in Python 3. It uses the PyYAML library, so you need to install Python 3, libyaml and PyYAML.

Install the following 3rd-party packages if you are using Ubuntu:

    sudo apt-get install libyaml-0-2 libyaml-dev
    sudo apt-get install python3 libpython3-dev python3-pip

Then you may use pip to install PyYAML:

    sudo pip3 install PyYAML

#### Install Kake

Installing Kake is simple: you only need to clone it. Assuming you want to install Kake into ~/apps:

    cd ~/apps
    git clone https://git.oschina.net/kinuxroot/kake.git

Then add one line in ~/.bashrc with your favorite text editor:

    export KAKE_HOME=~/apps/kake

Finally, append the kake/bin directory to PATH:

    export PATH="${PATH}:${KAKE_HOME}/bin"

#### Verify Installation

If you run kake in a folder without a Kakefile, it prints:

    Project file not exists.

This indicates that Kake is ready to use in your environment. To build a project, run the following in a folder that contains a Kakefile:

    kake

## Get Started

After installation, you can write a simple topology as described in docs/introduction.md, then submit the resulting shared library to Hurricane.
