[Stackless] stackless python in a multicore environment

Cosmin Stejerean cstejerean at gmail.com
Fri Aug 24 18:56:22 CEST 2007

IMHO the problem with Python is its inability to take advantage of
multicore/multiprocessor CPUs out of the box. Multiple threads in Python
cannot execute Python code concurrently due to the global interpreter
lock (GIL).

Stackless Python gives you lightweight threads (tasklets) that can be
switched with less overhead than OS threads, and it supports cooperative
multitasking with the intention of making async programming easier. This
helps with IO-bound applications, where you can have tens of thousands of
tasklets running at the same time (try doing that with threads). It does
not, however, allow you to take advantage of multi-core or multi-processor
resources. It simply lets you squeeze the most work out of a
single-threaded process. AFAIK attempting to use threads in Stackless makes
things worse (from what I could gather from the mailing lists).
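
To make the cooperative part concrete, here is a rough sketch that uses
plain Python generators rather than the Stackless API (so it runs on a
stock interpreter). The names worker and run_round_robin are made up for
this illustration. Each task does a bit of work and then yields control,
and everything runs in one thread on one core, which is the same
limitation tasklets have.

```python
from collections import deque

def worker(name, n_steps, log):
    """A cooperative task: do one step of work, then give up control."""
    for step in range(n_steps):
        log.append((name, step))
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)      # still has work left; go to the back

if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)
```

The tasks interleave in a fixed order because each one only gives up
control at its own yield. Nothing here ever runs in parallel.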

Also, Psyco might not provide many gains for your code if you're using
NumPy to do most of the processing. Psyco optimizes Python code, so if a
lot of your time is spent in Python itself it might provide some benefit.
You can give it a try, but I would be surprised if it provides significant
gains.

In order to perform your simulations concurrently you will likely need to
use multiple Python processes, which can run on the same box or on several
boxes. Depending on your problem domain you might be able to split the data
into chunks and have each node process one chunk, or you might need to
share all the data between all the processes. If you don't need to do any
IPC you can simply spawn multiple Python processes. Otherwise you might
attempt this with MPI, or you can take a look at Pyro.
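
For the simple case with no IPC between workers, something along these
lines might be a starting point. It is only a sketch: the 1-D random walk
in WORKER is a made-up stand-in for the real particle simulation, and the
names split and run_simulation are hypothetical. Each chunk of particles
gets its own Python process (and its own seed), so the OS is free to put
them on different cores. The only communication is each child printing
its result when it is done.

```python
import subprocess
import sys

# Script run by each child process. The random walk is a placeholder for
# the real physics; it prints how many particles ended up at x > 0.
WORKER = """
import random, sys
seed, n_particles, n_steps = (int(a) for a in sys.argv[1:4])
rng = random.Random(seed)
count = 0
for _ in range(n_particles):
    x = 0.0
    for _ in range(n_steps):
        x += rng.uniform(-1.0, 1.0)
    if x > 0.0:
        count += 1
print(count)
"""

def split(total, parts):
    """Split `total` items into `parts` chunk sizes differing by at most one."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]

def run_simulation(n_particles, n_workers, n_steps=20):
    """Start one Python process per chunk and sum the per-chunk counts."""
    procs = []
    for seed, chunk in enumerate(split(n_particles, n_workers)):
        cmd = [sys.executable, "-c", WORKER,
               str(seed), str(chunk), str(n_steps)]
        procs.append(subprocess.Popen(cmd, stdout=subprocess.PIPE))
    # Every child has been started by now; collect the results in turn.
    total = 0
    for proc in procs:
        out, _ = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError("worker exited with code %d" % proc.returncode)
        total += int(out.decode().strip())
    return total

if __name__ == "__main__":
    print(run_simulation(n_particles=10000, n_workers=4))
```

Giving each chunk a different seed keeps the workers from generating the
same random paths. If the workers need to exchange data while running,
this approach stops being enough and you are back to MPI or Pyro.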

Hope this helps.

Cosmin Stejerean

On 8/24/07, seun.osewa at gmail.com <seun.osewa at gmail.com> wrote:
> Hello John,
> Python doesn't take advantage of multiple CPUs unless you use os.fork()
> on UNIX. The deadlock you experienced is probably just the two threads
> struggling for the Global Interpreter Lock.  Another (painful) way to
> achieve concurrency is to store all your data in a mmaped file
> accessible from multiple Python processes.
> But there's one other way to speed up your software on a single
> core:  Psyco!
>   http://psyco.sf.net
> Regards,
> Seun Osewa
> On 8/24/07, Chris Lee <c.j.lee at tnw.utwente.nl> wrote:
> > Hi Everyone,
> >
> > I suspect that this question has come up before, but a search of the
> > archive revealed nothing so please forgive me if you are tired of
> > answering.
> > Basically I need some hints on using python (or stackless python) on a
> > multi core CPU.
> > Let me give some details of the project:
> > I am running simulations on light traveling through a very disordered
> > medium. I do this using a ballistic approximation, which essentially
> > means that I assume the light consists of particles and then generate a
> > bunch of random numbers for each particle. I use the random numbers to
> > determine the path taken by the particle.
> >
> > In practice, I simulate between 1e4 and 1e6 particles at a time taking
> > advantage of numpy and python to keep the code clean while still getting
> > good speed from a single CPU core. However, the simulations are becoming
> > more sophisticated and I would like to be able to take advantage of
> > multiple core CPUs and maybe even multiple computers.
> >
> > My first attempt was a disaster. I used the threading module and simply
> > split the task into two threads. The interpreter put each thread on a
> > separate CPU and bus deadlock ensued (each CPU was trying to access
> > 136*5e5 bytes of memory simultaneously).
> >
> > I realized that I would need finer grained control over how the data was
> > apportioned between threads, but doing this using the python queue and
> > events starts to look a bit messy again. That was when I happened upon
> > stackless python and tasklets. With tasklets I get the fine-grained
> > control over data access that allows me to ease the bus contention
> > ... but all the tasklets run on a single core. Even if I take the
> > trouble to spawn python threads and each thread's run method invokes a
> > tasklet, they all run on the same core.
> >
> > Can someone give me some advice towards making use of multicores,
> > preferably with stackless--the code is sooo much nicer--but I'll take
> > any python based solution at this point.
> >
> > Cheers
> > Chris
> >
> > --
> > **********************************************
> > *  Chris Lee                                 *
> > *  Laser physics and nonlinear optics group  *
> > *  MESA+ Institute                           *
> > *  University of Twente                      *
> > *  Phone: ++31 (0)53 489 3968                *
> > *  fax: ++31 (0) 53 489 1102                 *
> > **********************************************
> >
> >
> > _______________________________________________
> > Stackless mailing list
> > Stackless at stackless.com
> > http://stackless.com/cgi-bin/mailman/listinfo/stackless
> >
> --
> Seun Osewa
> http://www.nairaland.com [vast Nigerian forum]