forked from ssssam/iris
Message passing, Work-Stealing, Lock-Free, Dynamic Scheduling, and other buzzwords for GLib
License
crnt/iris
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
--------------------------------------------------------------------------- Iris - Message Passing and Asynchronous Toolkit for GLib --------------------------------------------------------------------------- Last Updated: Saturday April 22nd, 2009 Iris is an asynchronous toolkit for GLib to ease development of concurrent applications through message passing. Ontop of the message passing core, is task management for building asynchronous workflows. I. Introduction II. API Overview III. Message Delivery Cycle IV. Concurrent vs Exclusive V. Schedulers VI. Notes --------------------------------------------------------------------------- I. Introduction --------------------------------------------------------------------------- Right now there are a couple constructs you need to be familiar with. At the most basic, is the IrisMessage. It encapsulates your data or operation that is to be performed. The messages must be delivered somewhere. That is what IrisPort is for. It is the post office you post your messages to. IrisPort's are lightweight delivery mechanisms for your messages. Port's will attempt to send the message to a receiver, which is repsonsible for turning a message into an actionable work item. The receiver can push the work-item onto a scheduler who will actually manage executing the action. Another neat feature is the Scheduler Manager. This system allows you to have many well-purposed schedulers based on what work you are trying to perform. The scheduler manager will move threads between schedulers based on the scheduler's requirements to attempt to balance the active threads to the schedulers who need it. --------------------------------------------------------------------------- II. API Overview --------------------------------------------------------------------------- Iris can be split up into three areas. It has reusable lock-free data structures, message passing, and high-level constructs to make writing concurrent applications easier. Currently, Iris has the following lock-free data structures. * IrisStack - A lock free Stack * IrisLFQueue - A lock free queue * IrisFreeList - A lock free free-list * IrisRRobin - A lock free round-robin Also, not lock-free, however the fast path is, is the * IrisWSQueue which is used by the work-stealing scheduler for balancing work items efficiently between multiple cores. As previously mentioned, Iris is built upon message passing. The message passing system is built upon some of the previous data structures. Lets look at the data structures involved in the message passing. You wont neccessarily be using all of them all the time. * IrisMessage - A message encapsulation * IrisPort - A post-office so to speak for your messages * IrisReceiver - Converts messages delivered on ports to executable work * IrisArbiter - Arbitrates when and how receivers can execute * IrisScheduler - Manages scheduling of work items on cpus You can see examples of how to use these constructs in following chapters. High Level Constructs: * IrisTask - Single shot execution like Twisted Deferreds * IrisService - Helper for writing concurrent services As for higher level constructs, currently there is IrisTask and IrisService. IrisTask is a single shot work item that can have callbacks and errbacks just like you do with Python Twisted. However, Tasks are implemented as a class similar to that of Apple's NSOperation. So if it is easier for you to maintain your tasks as self contained classes, go right ahead. The goal of IrisService is to help in building application services that need to perform concurrent operations. Like the serving of HTTP, or data storage, or whatever really. IrisService is also a GObject, like IrisTask, and provides some simple methods you can override. More detail on these will be provided in their respective documentation. --------------------------------------------------------------------------- III. Message Delivery Cycle --------------------------------------------------------------------------- Lets create a basic example that sets up a port for delivering messages and a simple callback executed when messages are received. #define MSG_ID (1) static gboolean done = FALSE; static void callback (IrisMessage *message, gpointer data) { // all of this is superfulous g_debug ("Hello, World!"); gboolean *a = user_data; *a = TRUE; } static void setup (void) { IrisPort *port = iris_port_new (); /* first param is our scheduler, null for default */ IrisReceiver *recv = iris_arbiter_receive (NULL, port, callback, &done); iris_port_post (port, iris_message_new (MSG_ID); } What this does is create a new delivery port (IrisPort) that we can send messages to. When a message is delivered (from the port to the receiver), which is done for you by the port, the callback will be executed. You can learn more about how to control this throug the use if IrisArbiter by reading the docs. The IrisReceiver turns the callback into a work item and queues it into a scheduler. You can use the default scheduler or pass the scheduler as the first parameter. I recommend the IrisWSScheduler which does work-stealing. Lets look at the example in Vala quick, might make more sense. Also note, this should feel a bit familiar to CCR. It's good stuff. using GLib; using Iris; static void setup () { var port = new Port (); Arbiter.receive (null, port, m => { debug ("Hello, World"); }); } --------------------------------------------------------------------------- IV. Concurrent vs Exclusive --------------------------------------------------------------------------- One of the major problems in developing concurrent systems is managing the state changing operations while the actual concurrent service is being provided. Typically, complex locking scenarios are implemented which are both painful to read and maintain. This is where the arbiter can help us greatly. With an arbiter, we can use message passing into two ports (or sets of ports) and let the arbiter manage how they run. This means the highly concurrent portions of our code can continue to be concurrent without the complex locking schemes for dealing with state changes. The reason for this, is that the arbiter can bleed off the concurrent messages when a state change message is received so that it happens exclusively. This means we can optimize the locking patterns in a single place (the arbiter/receiver/port code-paths) and use the concept to develop our services around handling messages efficently (and ideally lock-free in our own code). I will introduce a new GObject soon that makes this easier, IrisService. It should feel similar in nature to Erlang's gen_server behaviour in how you handle messages. --------------------------------------------------------------------------- V. Schedulers --------------------------------------------------------------------------- The schedulers are responsible for the actual execution of work items within iris. Message handlers that are responding to dispatched messages occur on the scheduler in addition to real work-items for things such as IrisTask's. Due to the variance in work-loads for specific problem sets, the concept of multiple schedulers was introduced. This allows for fine-grained custom schedulers for a specific work-load (or a library perhaps). This problem had the potential to wreak havoc on the number of threads that were created within an application. To avoid this, we created a scheduler manager, which can dynamically move threads around between schedulers as they become free or under-utilized. Currently, there are a few schedulers that are available. The default scheduler is typical to what you see in most basic thread pools. It has a global work queue that all the threads share. This is a cess-pool for high contention on heavy work-loads. However, its a nice, safe scheduler for basic work to be performed. Especially when the lifetime of the work items grows (thus lowering the locking contention overhead of the queue). The IrisWSScheduler, a work-stealing scheduler, performs much better under heavy work-loads. It is designed to push items to a global queue only if the thread pushing the work item is not already part of the scheduler. It turns out that this is not the common case in concurrent systems. Typically, threads performing work recursively create more work for themselves based on the nature of the work-item. This is where the work-stealing really shines, because these recursive work items are pushed onto a local queue for the thread. They are placed in the queue in a position that allows the recently yielded work item to be taken soon in the scheduler which provides a good chance for a cpu cache hit. When a thread runs out of local work items, it will check back with the global queue then steal from a neighbor thread. The IrisLFScheduler, a lock-free scheduler, is not really an ideal scheduler for anything. It was simply a test to see how well it could perform for tasks created outside of a scheduler thread. It does okay for this; I believe the basic example runs fastest on this scheduler. Note that it is very hard on the cpu's and spins a lot. Definitely the wrong answer for low-power cpus and embedded devices. --------------------------------------------------------------------------- VI. Notes --------------------------------------------------------------------------- The GSlice allocator is used for the most part throughout iris. This could cause a problem as we start scaling out threads due to the thread-local caches of allocations that are used. If the creator of these allocations is not the destroyer of the the allocations, many extra locks can be taken. Pulse audio has worked around this to a degree using their own free-lists for allocations. After the API is in good shape we should look at tuning this as well. Right now both ports and receivers could be heavy consumers of locks. We need a good lock-free queue to utilize here to reduce the time spent in the kernel. Right now, in the early API scoping phase, that is pretty significantly high. Again, Pulse audio has a lock-free queue we should look at for use instead of the GAsyncQueue used currently. Perhaps this can also be the base queue for our work-stealing scheduler implementation.
About
Message passing, Work-Stealing, Lock-Free, Dynamic Scheduling, and other buzzwords for GLib
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published