[Stackless] The case for Twisted
Joachim König-Baltes
joachim.koenig-baltes at emesgarten.de
Mon Aug 13 09:40:00 CEST 2007
Christopher Armstrong wrote:
> This design looks nice on the surface, but I can't think of any
> efficient way to implement it. Existing mechanisms for asynchronous
> I/O on all operating systems only work efficiently if you pass all
> event sources to one blocking function; select() and poll() and so
> forth need to get all of the file descriptors at once to have any
> chance of being efficient. If you had one tasklet per event source
> that's responsible for telling the scheduler whether there was data on
> that event source, you would have to call select() in each tasklet, or
> something, which would ruin performance. There's no way you'd be able
> to get ten thousand concurrent connections from that. Is there some
> other implementation strategy you had in mind?
>
I have implemented a prototype of it based on greenlets. It works the
following way:
- The main task(let) is the scheduler.
- New tasks can be created anywhere. The scheduler (main task) is
  informed about each one and adds it to the list of tasks it manages.
- Whenever a task wants to make a possibly blocking call (or wants to
  wait for one of a number of resources), it informs the scheduler by
  calling a function
      resultEventList = event(eventList, timeout)
  that does an implicit switch to the scheduler. When the scheduler
  continues the task, resultEventList tells it on which of the
  resources it is interested in events have occurred.
  Of course, these event() calls can be hidden in custom read(),
  write(), send(), receive() ... calls for the cases where the task
  only waits for a single resource.
- The scheduler then adds the events of that task to the combined list
  of events for all tasks, analyzes it, and continues tasks as
  resources become available. The task can then perform its call on
  the resource without blocking (but see below for restrictions).
- If no resource is immediately available, the scheduler issues a
  select/poll/kqueue call (my implementation currently uses only
  kqueue, but could easily be changed to use pyevent).
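
To make the steps above concrete, here is a minimal sketch. It is not
the prototype itself: it uses plain generators instead of greenlets
(so that it runs with only the standard library) and select() instead
of kqueue. The names Scheduler, spawn, READ, WRITE and demo are made up
for the illustration, and a "yield" of (eventList, timeout) stands in
for the event() call with its implicit switch to the scheduler.

```python
import select
import socket
import time

READ, WRITE = "read", "write"  # hypothetical event kinds


def read(sock, n):
    """Custom read() that hides the event request for a single resource."""
    yield [(READ, sock)], None      # stands in for event(eventList, timeout)
    return sock.recv(n)             # socket was reported readable


def write(sock, data):
    """Custom write() that hides the event request for a single resource."""
    yield [(WRITE, sock)], None
    return sock.send(data)


class Scheduler:
    """Collects the events of all tasks and issues one select() for them."""

    def __init__(self):
        self.ready = []    # (task, resultEventList) pairs that can continue
        self.waiting = {}  # task -> (eventList, deadline or None)

    def spawn(self, task):
        self.ready.append((task, None))

    def run(self):
        while self.ready or self.waiting:
            # Continue each runnable task until its next event request.
            while self.ready:
                task, result = self.ready.pop(0)
                try:
                    events, timeout = task.send(result)
                except StopIteration:
                    continue  # task finished
                deadline = None if timeout is None else time.monotonic() + timeout
                self.waiting[task] = (events, deadline)
            if self.waiting:
                self._poll()

    def _poll(self):
        # One blocking call over the combined event list of all tasks.
        pending = self.waiting.values()
        rlist = [f for evs, _ in pending for kind, f in evs if kind == READ]
        wlist = [f for evs, _ in pending for kind, f in evs if kind == WRITE]
        deadlines = [d for _, d in pending if d is not None]
        timeout = max(0.0, min(deadlines) - time.monotonic()) if deadlines else None
        r, w, _ = select.select(rlist, wlist, [], timeout)
        now = time.monotonic()
        for task, (evs, deadline) in list(self.waiting.items()):
            fired = [(kind, f) for kind, f in evs
                     if (kind == READ and f in r) or (kind == WRITE and f in w)]
            if fired or (deadline is not None and deadline <= now):
                del self.waiting[task]
                self.ready.append((task, fired))


def demo():
    """Two tasks talk over a socket pair; returns what the client received."""
    a, b = socket.socketpair()
    log = []

    def server(sock):
        data = yield from read(sock, 1024)
        yield from write(sock, data.upper())

    def client(sock):
        yield from write(sock, b"ping")
        log.append((yield from read(sock, 1024)))

    sched = Scheduler()
    sched.spawn(server(a))
    sched.spawn(client(b))
    sched.run()
    a.close()
    b.close()
    return log
```

However many tasks are waiting, the scheduler makes a single select()
call per round over all of their descriptors, so no task ever calls
select() on its own.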
There are of course some fundamental problems in the underlying OS
which can lead to unwanted blocking, among them:
- When reading from a socket, the number of bytes that can be read
  without blocking can be determined in advance. Unfortunately this is
  not possible for ordinary files, so even if the file descriptor is
  readable, a read of a given number of bytes could block. This is not
  a big issue for local files, but files hosted on NFS or Samba may
  block inside the task that does the read.
- The scheduler only works inside one thread. If one wants to combine
  it with threads, a natural extension to the event class would be an
  event for joining a thread (pthread_join()) for synchronization
  purposes. But there is no select/poll/kqueue equivalent to check
  whether the join would succeed, so the join cannot be done without
  blocking. Inter-thread synchronization is therefore a problem.
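
The first point can be illustrated with a small Unix-only sketch. It
assumes that the FIONREAD ioctl is the way to ask a socket how many
bytes are queued (the prototype may do it differently), and the helper
names bytes_readable and reported_readable are made up for the
example.

```python
import array
import fcntl
import select
import socket
import termios


def bytes_readable(sock):
    """Ask the kernel how many bytes can be read from sock without blocking."""
    buf = array.array("i", [0])
    # FIONREAD stores the number of queued bytes in the buffer (Unix only).
    fcntl.ioctl(sock.fileno(), termios.FIONREAD, buf)
    return buf[0]


def reported_readable(fileobj):
    """True if select() reports fileobj as readable right now (zero timeout)."""
    r, _, _ = select.select([fileobj], [], [], 0)
    return bool(r)
```

For a socket, bytes_readable() gives the task a safe upper bound for
its next read. For an ordinary file, select() reports the descriptor
as readable regardless of whether the data is already local, so
reported_readable() says nothing about whether the read itself will
have to wait for a slow NFS or Samba server.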
For me, reading the GNU Pth sources was a good way to understand the
difficulties in emulating "parallelism" in a single thread/task
without blocking.
Joachim