[Stackless] verify my performance numbers?
Santiago Gala
santiago.gala at gmail.com
Tue Jan 30 23:47:38 CET 2007
On Tue, 2007-01-30 at 02:37 +0000, Andrew Dalke wrote:
> On Jan 29, 2007, at 12:13 AM, Santiago Gala wrote:
> > My version substitutes the bloated and hyperlocked Queue class by a
> > deque + simple locking, using with for cleaner code.
>
> Interesting. I didn't think about that because the deque class
> doesn't have built-in guarantees that it would be usable across
> multiple threads. It works in CPython because of the implementation
> guarantees, but I don't know if that can be considered portable.
>
> Because the CPython implementation uses a C type and none of the
> append or pop calls are re-entrant it works.
>
No need for it to be re-entrant: it works because both the producer and
consumer calls that use it are wrapped with a shared lock, i.e., marked
as critical sections in a fairly standard way. Those are the

    with lock:
        <code>

blocks in the producer and the consumer. Basically only one thread
touches it at a time.
The Queue class is supposed to do the same, but at a quick glance it
looks over-optimized, which makes it slower. I could be wrong, I'm not
an expert pythoneer, but it looks like too much Python code for what it
is supposed to do.
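For reference, here is a minimal sketch of the deque + shared-lock
pattern I mean (the names, the sentinel object, and the item count are
illustrative, not taken from my actual test code):

```python
import threading
from collections import deque

# Sketch of the deque + shared-lock producer-consumer: every access to
# the deque is a "with lock:" critical section, so only one thread
# touches it at a time. DONE is an illustrative end-of-stream sentinel.
ITEMS = 1000
queue = deque()
lock = threading.Lock()
DONE = object()
results = []

def produce():
    for i in range(ITEMS):
        with lock:                    # critical section around append
            queue.append(i)
    with lock:
        queue.append(DONE)

def consume():
    while True:
        item = None
        with lock:                    # critical section around popleft
            if len(queue):            # the len() test avoids popping an empty deque
                item = queue.popleft()
        if item is DONE:
            break
        if item is None:
            continue                  # queue was empty, try again
        results.append(item)

producer = threading.Thread(target=produce)
consumer = threading.Thread(target=consume)
producer.start(); consumer.start()
producer.join(); consumer.join()
print(len(results))  # 1000
```

With a single producer and a single consumer the FIFO order of the deque
is preserved, so the consumer sees the items in the order they were
pushed.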
> > A substantial part of the overhead, thus, is due to the Queue class,
> > though
> > stackless is still 50%+ faster than a simple deque+lock solution, plus
> > cleaner code
>
> In checking the analysis I did realize that the thread doing
> the pushes onto the Queue/deque (the producer side) can be faster
> than the consumer. I tested it and found that the Queue can have
> several thousand pending messages while the stackless version of
> course only has one.
>
> Using Queue.Queue(1) nearly doubled the run-time but I need
> to verify that. I got the exception
>
> Unhandled exception in thread started by
> Error in sys.excepthook:
>
> Original exception was:
>
> and need to figure out what that means.
>
I guess it is the same one I got before adding the test for len(queue):
the extracting thread arrives before the pushing one has put anything. I
let the consumer sleep for 0.01 seconds while the queue was empty.
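For comparison, the bounded Queue that Andrew mentions sidesteps both
problems: put() blocks while the queue is full, so at most one message
is ever pending, and get() blocks while it is empty, so no sleep loop is
needed. A sketch (the module is named queue in Python 3; the names here
are illustrative):

```python
import threading
import queue  # the Queue module of Python 2 is named "queue" in Python 3

# Queue(maxsize=1) makes the producer block until the consumer has taken
# the previous item, much like a stackless channel, and get() blocks on
# an empty queue, so the consumer needs no sleep-and-retry loop.
ITEMS = 100
q = queue.Queue(maxsize=1)
DONE = object()  # illustrative end-of-stream sentinel
results = []

def produce():
    for i in range(ITEMS):
        q.put(i)          # blocks while the queue is full
    q.put(DONE)

def consume():
    while True:
        item = q.get()    # blocks while the queue is empty
        if item is DONE:
            break
        results.append(item)

producer = threading.Thread(target=produce)
consumer = threading.Thread(target=consume)
producer.start(); consumer.start()
producer.join(); consumer.join()
print(len(results))  # 100
```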
But, as a Java programmer, and leaving aside potential problems in the
Queue class, I really like how simple the stackless approach is for this
kind of producer-consumer, and it is 50% faster than a much trickier and
more bug-prone implementation like the shared-lock one.
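To show what I mean by simple, here is a sketch of the channel-based
version (it requires Stackless Python, so it will not run under plain
CPython; the names are illustrative):

```python
import stackless  # available only under Stackless Python

def produce(channel, n):
    for i in range(n):
        channel.send(i)   # blocks until a receiver is ready: no locks needed
    channel.send(None)    # illustrative end-of-stream sentinel

def consume(channel, results):
    while True:
        item = channel.receive()  # blocks until a sender is ready
        if item is None:
            break
        results.append(item)

results = []
ch = stackless.channel()
stackless.tasklet(produce)(ch, 1000)
stackless.tasklet(consume)(ch, results)
stackless.run()
```

No locks, no sentinels in a shared structure, no sleep loops: the
channel rendezvous does all the synchronization.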
> Also, the iTunes XML has a reference to a DTD and the expat
> parser interface tries to resolve it, so part of the overhead
> is doing a network lookup of
> "http://www.apple.com/DTDs/PropertyList-1.0.dtd"
>
>
> Andrew
> dalke at dalkescientific.com
>
>
> _______________________________________________
> Stackless mailing list
> Stackless at stackless.com
> http://www.stackless.com/mailman/listinfo/stackless