[Stackless] The case for Twisted

Joachim König-Baltes joachim.koenig-baltes at emesgarten.de
Mon Aug 13 09:40:00 CEST 2007


Christopher Armstrong wrote:
> This design looks nice on the surface, but I can't think of any
> efficient way to implement it. Existing mechanisms for asynchronous
> I/O on all operating systems only work efficiently if you pass all
> event sources to one blocking function; select() and poll() and so
> forth need to get all of the file descriptors at once to have any
> chance of being efficient. If you had one tasklet per event source
> that's responsible for telling the scheduler whether there was data on
> that event source, you would have to call select() in each tasklet, or
> something, which would ruin performance. There's no way you'd be able
> to get ten thousand concurrent connections from that. Is there some
> other implementation strategy you had in mind?
>   

I have implemented a prototype of it based on greenlets. It works as
follows:

- The main task(let) is the scheduler.
- New tasks can be created anywhere. The scheduler (main task) is
  informed about each one and adds it to the list of tasks it manages.
- Whenever a task wants to make a possibly blocking call (or wants to
  wait for one of a number of resources), it informs the scheduler by
  calling a function
      resultEventList = event(eventList, timeout)
  that does an implicit switch to the scheduler. When the scheduler
  continues the task, resultEventList tells it which of the resources
  it is interested in have had events occur on them.
  Of course, these event() calls can be hidden in custom read(),
  write(), send(), receive() ... calls for the cases where the task
  only waits for a single resource.
- The scheduler then adds the events of that task to the combined list
  of events for all tasks, analyzes it, and continues tasks as their
  resources become available. A continued task can then perform its
  call on the resource without blocking (but see below for
  restrictions).
- If no resource is immediately available, the scheduler issues a
  select/poll/kqueue call (my implementation currently uses only
  kqueue, but could easily be changed to use pyevent).
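
The steps above can be sketched in a few lines of Python. This is not
the prototype described in this post: as assumptions made only to keep
the example self-contained, it uses generators ("yield from") where the
prototype uses greenlets, and select.select() where the prototype uses
kqueue. The names Scheduler, spawn, READ, WRITE, echo_server, client
and demo are hypothetical; event(), read() and write() follow the names
used above.

```python
import select
import socket
import time
from collections import deque

READ, WRITE = "read", "write"   # hypothetical event kinds


class Scheduler:
    """Main task: owns all other tasks and the single select() call."""

    def __init__(self):
        self.ready = deque()    # (task, value to send into it)
        self.waiting = {}       # task -> (event_list, deadline or None)

    def spawn(self, task):
        # A fresh generator must be started by sending None.
        self.ready.append((task, None))

    def run(self):
        while self.ready or self.waiting:
            if not self.ready:
                self._wait_for_events()
                continue
            task, result = self.ready.popleft()
            try:
                event_list, timeout = task.send(result)
            except StopIteration:
                continue        # task finished
            deadline = None if timeout is None else time.monotonic() + timeout
            self.waiting[task] = (event_list, deadline)

    def _wait_for_events(self):
        # Combine the event lists of all waiting tasks into one select().
        rlist, wlist, deadlines = [], [], []
        for event_list, deadline in self.waiting.values():
            for kind, resource in event_list:
                (rlist if kind == READ else wlist).append(resource)
            if deadline is not None:
                deadlines.append(deadline)
        timeout = None
        if deadlines:
            timeout = max(0.0, min(deadlines) - time.monotonic())
        readable, writable, _ = select.select(rlist, wlist, [], timeout)
        fired = {(READ, r) for r in readable} | {(WRITE, w) for w in writable}
        now = time.monotonic()
        for task, (event_list, deadline) in list(self.waiting.items()):
            hits = [ev for ev in event_list if ev in fired]
            if hits or (deadline is not None and deadline <= now):
                del self.waiting[task]
                self.ready.append((task, hits))


def event(event_list, timeout=None):
    """Switch to the scheduler; resumes with the events that fired."""
    result = yield (event_list, timeout)
    return result


def read(sock, nbytes):
    # event() hidden inside a read() for the single-resource case.
    yield from event([(READ, sock)])
    return sock.recv(nbytes)


def write(sock, data):
    yield from event([(WRITE, sock)])
    return sock.send(data)


def echo_server(sock, log):
    data = yield from read(sock, 1024)
    log.append(("server got", data))
    yield from write(sock, data.upper())


def client(sock, log):
    yield from write(sock, b"ping")
    reply = yield from read(sock, 1024)
    log.append(("client got", reply))


def demo():
    a, b = socket.socketpair()
    log = []
    scheduler = Scheduler()
    scheduler.spawn(echo_server(a, log))
    scheduler.spawn(client(b, log))
    scheduler.run()
    a.close()
    b.close()
    return log
```

Calling demo() runs both tasks to completion in one thread and returns
the log of what each side received. The point of the sketch is that
select() is called in exactly one place, with the file descriptors of
all waiting tasks at once, and never inside an individual task.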

There are of course some fundamental problems of the underlying OS
which can lead to unwanted blocking, among them:

- When reading from a socket, the number of bytes that can be read
  without blocking can be determined in advance. This is unfortunately
  not possible for ordinary files, so even if the file descriptor is
  readable, a read of a number of bytes could block. Not a big issue
  for local files, but files hosted on NFS or Samba may lead to
  blocking inside the task that does the read.

- The scheduler only works inside one thread. If one wants to combine
  it with threads, a natural extension to the event class would be an
  event for joining a thread (pthread_join()) for synchronization
  purposes. But there is no kind of select/poll/kqueue to check whether
  the join would succeed, so the join cannot be done without blocking.
  So inter-thread synchronization is a problem.
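
The first restriction (sockets versus ordinary files) can be made
concrete with a small Unix-only sketch. It assumes the FIONREAD ioctl
as the way to ask how many bytes are queued on a socket; the post does
not say which mechanism the prototype uses, and kqueue reports a
similar count by itself. The helper names bytes_readable and
file_reports_readable are hypothetical.

```python
import fcntl
import select
import socket
import struct
import tempfile
import termios


def bytes_readable(sock):
    """Ask the kernel how many bytes are queued on a socket (FIONREAD)."""
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def file_reports_readable(path):
    """select() on an ordinary file: it is reported as readable,
    which says nothing about whether read() will have to wait on
    the disk or on a network file system."""
    with open(path, "rb") as f:
        readable, _, _ = select.select([f], [], [], 0)
        return bool(readable)


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.sendall(b"hello")
    print("bytes queued on socket:", bytes_readable(b))
    a.close()
    b.close()

    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 10)
        tmp.flush()
        print("file reported readable:", file_reports_readable(tmp.name))
```

For the socket, the scheduler can size the read so that it will not
block. For the file, the readiness answer carries no such guarantee,
which is the case described above for NFS or Samba.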

For me, reading the GNU pth sources was a good way to understand the
difficulties in emulating "parallelism" in a single thread/task
without blocking.

Joachim








