[Stackless] Gsoc 2009:Asynchronous IO support
richard.m.tew at gmail.com
Mon Mar 23 15:54:54 CET 2009
On Mon, Mar 23, 2009 at 3:03 AM, kartik rustagi <kashes911 at gmail.com> wrote:
> Hi, I am interested in implementing Asynchronous IO support in
> stackless. As it has been mentioned on the wiki that what you are
> looking for is something similar to the way sockets do IO. I have some
> experience in Asynchronous servers (asyncore module), and asyncore
> uses a queue along with a dictionary mapping to implement Asynchronous
> IO. In the case of stackless microthreads, a similar queue can be
> maintained, along with a dictionary mapping of microthreads. Am I
> thinking in the right direction? Kindly suggest how you think this
> should be implemented.
I think that perhaps the wiki is not as clear as it needs to be with
regard to the requirements of this project. Read this and take it
* Language usage:
This project will not be implemented in Python. It will be
implemented in C or C++ and involve using the best low-level
asynchronous IO resources for a specific goal.
* Some notes on asynchronous IO resources:
We already have a cross-platform implementation written in Python
called "stacklesssocket", which can be found here:
The problem is that it uses the lowest common denominator in
asynchronous IO resources, "select", which you can read about at the
There is a wealth of articles about it and its limitations to be found
by searching the internet. "stacklesssocket" uses "select" by way of
"asyncore".
The problem with "select" is that it is not an optimal solution for a
variety of reasons, which web searching should elaborate on. On
non-Windows platforms, people tend to use another low level resource
which asyncore exposes, "poll". This is also touched on in the
Wikipedia article, but again you should be able to find better ones by
searching. "poll", however, is of course not available on Windows.
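
As a rough illustration of the two resources just mentioned, here is a
minimal Python sketch that waits for readable sockets with "poll" where
the platform provides it and falls back to "select" otherwise. The
function name wait_readable and the socketpair demo are assumptions made
for this example only; they are not taken from "stacklesssocket" or
asyncore.

```python
import select
import socket


def wait_readable(socks, timeout=1.0):
    """Return the sockets in `socks` that have data ready to read.

    Hypothetical helper: uses poll() when the platform offers it and
    falls back to select() otherwise (for example on Windows).
    """
    if hasattr(select, "poll"):
        poller = select.poll()
        by_fd = {}
        for sock in socks:
            poller.register(sock.fileno(), select.POLLIN)
            by_fd[sock.fileno()] = sock
        # poll() takes its timeout in milliseconds.
        events = poller.poll(int(timeout * 1000))
        return [by_fd[fd] for fd, event in events if event & select.POLLIN]
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


# Demo: only the end that has been written to becomes readable.
left, right = socket.socketpair()
left.sendall(b"ping")
ready = wait_readable([left, right])
data = ready[0].recv(4)
left.close()
right.close()
```

Both calls report readiness only; the caller still performs the read
itself, which is one way they differ from a completion-based resource
such as IO completion ports.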
On Windows, the standard resource used is called "IO completion
ports". You can read an article about them at the following link;
again, web search for more.
* Solution API
If you look at the "stacklesssocket" module mentioned above, you will
see that it has a way of installing itself in place of the normal
socket module. This means that tasklets which use the socket module
no longer block the whole interpreter, but instead magically only
block the tasklet which does the socket operation.
The core concept of this is that the stackless.tasklet blocks on a
unique stackless.channel and the channel reference is handed to
whatever wrapper is written for the chosen asynchronous IO resource.
When the event comes in with the result of an asynchronous IO
operation, it is sent through the channel to the tasklet, which
awakens it with the result of its socket operation.
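
The hand-off described above can be sketched in plain Python. This is
only an emulation under loud assumptions: Channel is a stand-in for
stackless.channel built on queue.Queue, an ordinary thread plays the
part of the tasklet, and Dispatcher is a hypothetical wrapper around
"select". None of these names come from "stacklesssocket" or
"Stackless IO".

```python
import queue
import select
import socket
import threading
import time


class Channel:
    """Hypothetical stand-in for stackless.channel (not the real API)."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, value):
        self._q.put(value)

    def receive(self):
        return self._q.get()  # blocks the caller until a value arrives


class Dispatcher:
    """Hypothetical wrapper around the chosen IO resource (here, select)."""

    def __init__(self):
        self._waiting = {}  # socket -> (channel, nbytes)
        self._lock = threading.Lock()

    def recv(self, sock, nbytes):
        # The caller gets its own channel, hands it over, then blocks on it.
        channel = Channel()
        with self._lock:
            self._waiting[sock] = (channel, nbytes)
        return channel.receive()

    def pump_once(self, timeout=0.05):
        # Wait for IO events and deliver each result through its channel.
        with self._lock:
            socks = list(self._waiting)
        if not socks:
            time.sleep(timeout)
            return 0
        readable, _, _ = select.select(socks, [], [], timeout)
        for sock in readable:
            with self._lock:
                channel, nbytes = self._waiting.pop(sock)
            channel.send(sock.recv(nbytes))
        return len(readable)


dispatcher = Dispatcher()
left, right = socket.socketpair()
results = []
# The thread stands in for a tasklet doing a blocking socket read.
worker = threading.Thread(
    target=lambda: results.append(dispatcher.recv(right, 4)))
worker.start()
left.sendall(b"pong")
while worker.is_alive():
    dispatcher.pump_once()
worker.join()
left.close()
right.close()
```

Only the caller of recv waits; the loop driving pump_once stays free to
service other channels, which is the property the real tasklet and
channel pairing provides without needing threads.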
Now, besides being written in the easiest, slowest cross-platform way,
"stacklesssocket" has two other flaws:
- There is no set of unit tests which run against both it and the
original socket module, verifying that the behaviour is the same.
- It doesn't implement all the functionality that socket does.
This project is aimed at providing a 100% compatible replacement for
the original socket module, which works in the same manner as
"stacklesssocket" but better.
* Language usage again:
The goal is that a student would take the CCP project "Stackless IO"
mentioned in and linked from the wiki, which is currently Windows
based, and enhance and extend it to make it work on non-Windows
platforms. "Stackless IO" is written in C++, therefore the
cross-platform support will need to also be in C++.
- Unit tests will need to be provided to verify that the use of this
framework from Python behaves the same as use of the existing socket
module, taking into account that the former blocks a calling tasklet
and the latter blocks a calling thread.
- Unit tests will need to be provided to verify that the behaviour of
the resulting cross-platform framework is consistent across platforms.
- Example code will need to be provided to illustrate the scalability
of this solution.
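
One possible shape for the first kind of unit test above is a test case
that is parameterised on the module under test, so the identical
assertions run against the standard socket module and against any
replacement. The sketch below is an assumption about how that could be
arranged; make_echo_test, run_against and MODULES_UNDER_TEST are
hypothetical names, and only the standard socket module is listed
because no replacement module is available here.

```python
import io
import socket
import unittest


def make_echo_test(socket_module):
    """Build a TestCase bound to one socket-like module (hypothetical)."""

    class EchoBehaviour(unittest.TestCase):
        def test_roundtrip(self):
            m = socket_module
            server = m.socket(m.AF_INET, m.SOCK_STREAM)
            self.addCleanup(server.close)
            server.bind(("127.0.0.1", 0))  # port 0: let the OS choose
            server.listen(1)

            client = m.socket(m.AF_INET, m.SOCK_STREAM)
            self.addCleanup(client.close)
            client.connect(server.getsockname())

            conn, _ = server.accept()
            self.addCleanup(conn.close)

            client.sendall(b"hello")
            self.assertEqual(conn.recv(5), b"hello")

    return EchoBehaviour


def run_against(socket_module):
    """Run the shared tests against one module; True if all passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        make_echo_test(socket_module))
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite).wasSuccessful()


# Assumption: a replacement module would be appended to this list so
# that the same tests run against it and the original socket module.
MODULES_UNDER_TEST = [socket]
outcomes = {mod.__name__: run_against(mod) for mod in MODULES_UNDER_TEST}
```

A test for a tasklet-blocking replacement would also need to run inside
a tasklet and drive the scheduler, which this sketch leaves out.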
Using restructured text:
- Clear documentation will need to be provided with code snippets and
potentially "programmer art" illustrating the workings, benefits and
usage of the resulting "Stackless IO" framework.
Why restructured text? This is what the Python language documentation
is written in, it can be used to generate HTML, CHMs and other
formats. Future Stackless documentation will also be written in this,
and in the eventuality that it proves clean and elegant to incorporate
this framework into the Stackless distribution so that IO just works
with minimal user effort, the documentation for this framework will be
included with the Stackless documentation.
* Final thoughts:
"Stackless IO" was designed to provide a low-latency scalable
implementation. During its implementation it was benchmarked and
analysed to ensure it was as optimal in these two areas as it was
possible to make it. The cross-platform changes to the
framework in general and support will need to be done in such a way as to
both emulate this approach in the new code and maintain it in the
This project is listed as hard on the wiki page. A skilled programmer
with a moderate breadth of knowledge and familiarity with C++ should
be able to complete it within a month (given Google's required 40-hour
working weeks), which would leave time for working on other non-socket
related support (file IO, subprocess, etc) so that as much blocking or
callback-based IO in the runtime as possible can be supported. A
programmer without these things might take the whole three months for
just the cross-platform support and the other supporting work required.
This is a challenging and complex project but it is also a rewarding
one, in terms of the experience and knowledge it will give in
low-level networking details.
Let me know if you have any further questions :-)