forked from microsoft/rDSN

Robust Distributed System Nucleus (rDSN) is an open framework for quickly building and managing high performance and robust distributed systems.


lshmouse/rDSN

 
 


Please send all pull requests to https://github.com/imzhenyu/rdsn for automatic integration with the latest version. We will periodically update this repo. Thank you.


Robust Distributed System Nucleus (rDSN) is a framework for quickly building robust distributed systems. It has a microkernel for pluggable components, including applications, distributed frameworks, devops tools, and local runtime/resource providers, enabling their independent development and seamless integration. The project was originally developed for Microsoft Bing and has since been adopted in production both inside and outside Microsoft.

Top Links

  • [Case] RocksDB made replicated using rDSN!
  • [Tutorial] Build a counter service with built-in tools (e.g., codegen, auto-test, fault injection, bug replay, tracing)
  • [Tutorial] Build a scalable and reliable counter service with built-in replication support
  • [Tutorial] Build a perfect failure detector with progressively added system complexity
  • [Tutorial] Plug in my own network implementation for higher performance
  • Latest documentation
  • Installation
  • turn legacy local components (e.g., a local storage engine or a microservice) into highly available and reliable services with service frameworks
  • develop services with an event-driven service library similar to libevent, Thrift, and gRPC, with these additions:
    • a rich set of service APIs in addition to RPC
    • built-in tools for testing, debugging, monitoring, and operation
    • built-in service frameworks for scalability, availability, and reliability
  • build new frameworks with strong tooling support, benefiting the service applications immediately
  • build new tools with dedicated Tool API, benefiting the services and frameworks transparently
  • more as you can imagine.
  • reduced system complexity via microkernel architecture: applications, frameworks (e.g., replication, scale-out, fail-over), local runtime libraries (e.g., network libraries, locks), and tools are all pluggable modules into a microkernel to enable independent development and seamless integration (therefore modules are reusable and transparently benefit each other)

rDSN Architecture
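
The pluggable-module idea above can be illustrated with a small provider registry. This is only a sketch of the general pattern, not rDSN's actual API: `Logger`, `ScreenLogger`, `SimpleLogger`, `LoggerRegistry`, and `make_default_registry` are hypothetical names chosen for illustration. It assumes modules are registered under a string name and picked by that name at deploy time.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical interface for one kind of pluggable module (a logger).
struct Logger {
    virtual ~Logger() = default;
    virtual std::string format(const std::string& msg) const = 0;
};

// Two interchangeable providers of the same interface.
struct ScreenLogger : Logger {
    std::string format(const std::string& msg) const override { return "[screen] " + msg; }
};

struct SimpleLogger : Logger {
    std::string format(const std::string& msg) const override { return "[simple] " + msg; }
};

// Hypothetical registry: maps a provider name to a factory so that the
// concrete module can be chosen by name (for example, from a config file).
class LoggerRegistry {
public:
    using Factory = std::function<std::unique_ptr<Logger>()>;

    // Returns false if a provider with this name is already registered.
    bool register_provider(const std::string& name, Factory factory) {
        return factories_.emplace(name, std::move(factory)).second;
    }

    // Creates a fresh instance of the named provider, or throws if unknown.
    std::unique_ptr<Logger> create(const std::string& name) const {
        auto it = factories_.find(name);
        if (it == factories_.end()) throw std::invalid_argument("unknown provider: " + name);
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Registers the two example providers under illustrative names.
inline LoggerRegistry make_default_registry() {
    LoggerRegistry r;
    r.register_provider("screen", [] { return std::make_unique<ScreenLogger>(); });
    r.register_provider("simple", [] { return std::make_unique<SimpleLogger>(); });
    return r;
}
```

Code that calls `create("screen")` depends only on the `Logger` interface, so swapping in another provider requires no change to the caller.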

  • flexible configuration with global deploy-time view: tailor the module instances and their connections on demand with configurable system complexity and resource allocation (e.g., run all nodes in one simulator for testing, allocate CPU resources appropriately for avoiding resource contention, debug with progressively added system complexity)

rDSN Configuration

  • transparent tooling support: dedicated tool API for tool development; built-in plugged tools for understanding, testing, debugging, and monitoring the upper applications and frameworks

rDSN Architecture
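
To make "transparent tooling" concrete, here is a rough sketch of how a tool could attach to task lifecycle hooks without the application knowing about it. The names (`TaskEvent`, `ToolHooks`, `Profiler`, `TaskStats`) and the microsecond timestamps are assumptions for illustration and are not taken from rDSN's Tool API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one finished task, with timestamps in microseconds.
struct TaskEvent {
    std::string task_code;     // illustrative task name, e.g. "RPC_ADD"
    std::uint64_t enqueue_us;  // when the task entered the queue
    std::uint64_t begin_us;    // when execution started
    std::uint64_t end_us;      // when execution finished
};

// Hypothetical hook point: the runtime fires it, tools subscribe to it.
class ToolHooks {
public:
    using Handler = std::function<void(const TaskEvent&)>;
    void on_task_end(Handler h) { handlers_.push_back(std::move(h)); }
    void fire_task_end(const TaskEvent& e) const {
        for (const auto& h : handlers_) h(e);
    }

private:
    std::vector<Handler> handlers_;
};

// Accumulated per-task-code totals.
struct TaskStats {
    std::uint64_t count = 0;
    std::uint64_t queue_us = 0;  // total time spent waiting in the queue
    std::uint64_t exec_us = 0;   // total time spent executing
};

// A profiler tool that only needs the hook; it must outlive the ToolHooks
// object it is installed on, because the handler captures `this`.
class Profiler {
public:
    void install(ToolHooks& hooks) {
        hooks.on_task_end([this](const TaskEvent& e) {
            TaskStats& s = stats_[e.task_code];
            ++s.count;
            s.queue_us += e.begin_us - e.enqueue_us;
            s.exec_us += e.end_us - e.begin_us;
        });
    }
    const TaskStats& stats(const std::string& code) const { return stats_.at(code); }

private:
    std::map<std::string, TaskStats> stats_;
};
```

Because the profiler sees only `TaskEvent` values, the same tool would work for any application or framework that runs on the hooked runtime.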

  • auto-handled distributed system challenges: built-in frameworks that provide scalability, reliability, availability, consistency, etc. for applications

rDSN service model
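
As a rough illustration of what such a framework takes off the application's hands, the sketch below wraps a local key-value store with hash partitioning and naive write-to-all replication. It is deliberately simplified: it is not Paxos, it has no recovery or reconfiguration, and `LocalStore` and `ReplicatedStore` are hypothetical names rather than rDSN types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// A local, single-node component: a plain in-memory key-value store.
class LocalStore {
public:
    void put(const std::string& k, const std::string& v) { data_[k] = v; }
    bool get(const std::string& k, std::string& out) const {
        auto it = data_.find(k);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<std::string, std::string> data_;
};

// Hypothetical wrapper: keys are hash-partitioned, and each partition keeps
// several replicas. Writes go to every live replica; reads are served by the
// first live replica. A failed replica is never brought back in this sketch.
class ReplicatedStore {
public:
    ReplicatedStore(std::size_t partitions, std::size_t replicas)
        : parts_(partitions, std::vector<LocalStore>(replicas)),
          alive_(partitions, std::vector<bool>(replicas, true)) {
        assert(partitions > 0 && replicas > 0);
    }

    std::size_t partition_of(const std::string& k) const {
        return std::hash<std::string>{}(k) % parts_.size();
    }

    void put(const std::string& k, const std::string& v) {
        const std::size_t p = partition_of(k);
        for (std::size_t r = 0; r < parts_[p].size(); ++r)
            if (alive_[p][r]) parts_[p][r].put(k, v);
    }

    // Returns false if the key is absent or no replica of its partition is alive.
    bool get(const std::string& k, std::string& out) const {
        const std::size_t p = partition_of(k);
        for (std::size_t r = 0; r < parts_[p].size(); ++r)
            if (alive_[p][r]) return parts_[p][r].get(k, out);
        return false;
    }

    // Simulates the loss of one replica.
    void fail_replica(std::size_t partition, std::size_t replica) {
        alive_.at(partition).at(replica) = false;
    }

private:
    std::vector<std::vector<LocalStore>> parts_;
    std::vector<std::vector<bool>> alive_;
};
```

The caller keeps using `put` and `get` as with the local store; surviving the loss of a replica is handled by the wrapper.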

  • dist.service.stateful.type1: a production Paxos framework to quickly turn a local component (e.g., RocksDB) into an online service with support for replication, partitioning, failure recovery, and reconfiguration
  • dist.service.stateless: a scale-out and fail-over framework for stateless services such as Memcached
  • tools.common
    • network libraries on Linux/Windows supporting rDSN/Thrift/HTTP messages at the same time
    • asynchronous disk IO on Linux/Windows
    • locks, rwlocks, semaphores
    • task queues
    • timer services
    • performance counters
    • loggers (screen, simple)
  • tools.hpc: high performance counterparts for the above modules
  • tools.common
    • simulator debugs multiple nodes in a single process without worrying about timeouts
    • tracer dumps logs for how requests are processed across tasks/nodes
    • profiler shows detailed task-level performance data (e.g., queue-time, exec-time)
    • fault-injector mimics data center failures to expose bugs early
    • global-checker enables cross-node assertion
    • replayer reproduces the bugs for easier root cause analysis
  • tools.explorer: extracts task-level dependencies automatically
  • Web studio to visualize task-level performance and dependency information live Demo
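
One way a simulator can run several nodes in one process without real timeouts getting in the way is to replace wall-clock time with a virtual clock. The sketch below shows that general technique with a discrete-event loop; `Simulator` and `demo_trace` are hypothetical names, and this is not a description of how rDSN's simulator is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Minimal discrete-event simulator: tasks run in virtual-time order, so a
// 30-second timeout costs no real time and pausing in a debugger cannot
// cause spurious timeouts.
class Simulator {
public:
    using Task = std::function<void()>;

    std::uint64_t now() const { return now_; }

    // Schedules `task` to run `delay_ms` of virtual time from now.
    void schedule(std::uint64_t delay_ms, Task task) {
        queue_.push(Event{now_ + delay_ms, next_seq_++, std::move(task)});
    }

    // Runs all events in timestamp order (FIFO among equal timestamps) and
    // returns how many were executed.
    std::size_t run() {
        std::size_t executed = 0;
        while (!queue_.empty()) {
            Event e = queue_.top();  // copy out; the task may schedule more events
            queue_.pop();
            now_ = e.at;
            e.task();
            ++executed;
        }
        return executed;
    }

private:
    struct Event {
        std::uint64_t at;
        std::uint64_t seq;
        Task task;
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            return a.at != b.at ? a.at > b.at : a.seq > b.seq;
        }
    };
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
    std::uint64_t now_ = 0;
    std::uint64_t next_seq_ = 0;
};

// Two simulated nodes: A sends a request and arms a 30 s timeout; B replies
// after 5 s of virtual time, so the timeout finds the reply already there.
inline std::vector<std::string> demo_trace() {
    Simulator sim;
    std::vector<std::string> trace;
    bool replied = false;
    sim.schedule(0, [&] {
        trace.push_back("A: send request");
        sim.schedule(5000, [&] { replied = true; trace.push_back("B: reply"); });
        sim.schedule(30000, [&] {
            trace.push_back(replied ? "A: timeout ignored" : "A: timed out");
        });
    });
    sim.run();
    return trace;
}
```

Because event order depends only on virtual timestamps and insertion order, repeated runs produce the same trace, which is also the property a replay tool relies on.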
Other distributed providers, libraries, and tools

rDSN borrows ideas from many research works, both our own and others', and tries to make them real in production in a coherent way; we greatly appreciate the researchers who did this work.

License and Support

rDSN is provided on Windows and Linux under the MIT open source license. You can use the "issues" tab in GitHub to report bugs.

