[Stackless] Multi-CPU Actor Based Python
senn at maya.com
Wed Nov 19 16:50:08 CET 2008
On Nov 19, 2008, at 9:49 AM, Timothy Baldridge wrote:
> What I'm more looking to reduce, is the overhead of transferring data
> from one Python VM to another. In current implementations transferring
> data from one VM to another requires pickling data (and that requires
> traversing the entire object being transmitted, pickling each part
> along the way), transmitting it across the wire, then unpickling it
> at the other end. So we're talking thousands of cycles.
> In the method I'm proposing, you could have multiple "VMs" in the same
> process, with a unified GC; these VMs would share nothing. If
> all messages are immutable, then all that is required is to copy a
> pointer from one VM to the other and increment the GC ref count on the
> message. That's what, 100-200 cycles or so (yes I did just pull that
> out of the air).
> My core idea here is that multitasking in modern languages isn't as
> pervasive because of the overhead/risks involved. In C you have shared
> memory issues. In Erlang, well, many people can't stand the Erlang
> syntax. And in Python you can't have to pass messages via
> So does anyone else see this being possible, or am I off my rocker?
Hm. I'll defer judgment on the "off the rocker" bit... :-)
However there does seem to be a fundamental issue here that probably
goes to the basis of how the universe works.
Locality is scarce. You make things fast by making them fit in a
small space so that the speed of light does not matter.
You decouple their behavior from other things that are "far".
You make things robust and architectural ("componentized") by making
them "big"... with well-defined boundaries that take up space
and well-defined interactions that require synchronous coupling
at the edges.
So you want-your-cake-and-to-eat-it-too... you're not the first one...
and perhaps you shouldn't be discouraged by naysayers... you might
just invent something wonderful... However there are many issues
you are not considering (even in your simple example):
-- notice that both incrementing and decrementing the refcnt
have to involve some sort of interlock. (Not to mention GC
and heap structure management!)
-- notice that you are starting to change the very nature of
Python. If, for example, I want several processes co-operating
to add results to a search list, I can't just pop them into
the same object, I now need to invent a whole structure to
"re-combine" things again. How much more memory am I going
to use to do that? How "pythonic" is it going to look when
I'm done? Or will it look more like an Erlang program? :-)
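
To make the second point concrete, here is a minimal sketch of what the
"re-combine" structure tends to look like once workers share nothing.
It is only an illustration: threads plus a standard-library queue.Queue
stand in for the separate VMs, and the names search_worker,
collect_results and parallel_search are invented for this example (they
are not part of Stackless or of the proposal).

```python
import queue
import threading


def search_worker(worker_id, haystack, needle, outbox):
    """Scan a private chunk and send each hit out as an immutable tuple."""
    for position, item in enumerate(haystack):
        if needle in item:
            outbox.put((worker_id, position, item))
    outbox.put(None)  # sentinel: this worker has finished


def collect_results(outbox, worker_count):
    """The extra 're-combine' step: drain messages into one list."""
    results = []
    finished = 0
    while finished < worker_count:
        message = outbox.get()
        if message is None:
            finished += 1
        else:
            results.append(message)
    return sorted(results)  # arrival order is arbitrary, so sort


def parallel_search(chunks, needle):
    """Start one worker per chunk and merge everything they report."""
    outbox = queue.Queue()
    workers = [
        threading.Thread(target=search_worker, args=(i, chunk, needle, outbox))
        for i, chunk in enumerate(chunks)
    ]
    for worker in workers:
        worker.start()
    hits = collect_results(outbox, len(workers))
    for worker in workers:
        worker.join()
    return hits


chunks = [["spam", "eggs"], ["ham", "spam and eggs"], ["bacon"]]
hits = parallel_search(chunks, "spam")
```

Compare that with the shared-object version, which is a single
results.append(hit) inside each worker: the sentinel, the collector and
the final sort are all machinery that exists only because nothing is
shared.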
So it could be a fundamental trade-off:
"fast", "safe", "nice-looking"**; choose 2!
Erlang is fast/safe but non-nice-looking (ugly?).
Python is nice/safe but slow.***
C is fast/nice but unsafe.
** I almost said "understandable" rather than "nice-looking"...
I'm not sure exactly what the right word is.
*** Python would be slower if it were safer for multiple threads;
i.e. the GIL is a hack to keep Python safe by trading
multiple-thread utilization for single-thread speed.
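
The interlock mentioned in the first point, and the cost this footnote
alludes to, can be sketched with a toy model. This is not how CPython
stores reference counts; SharedMessage, incref, decref and
borrow_many_times are hypothetical names, and a threading.Lock stands in
for whatever atomic operation a real GIL-free VM would use. The point is
only that every increment and decrement becomes a guarded
read-modify-write.

```python
import threading


class SharedMessage:
    """Toy stand-in for an immutable message shared between VMs."""

    def __init__(self, payload):
        self.payload = payload
        self.refcount = 1              # the creator holds one reference
        self._lock = threading.Lock()  # the "interlock"

    def incref(self):
        with self._lock:               # read-modify-write must be atomic
            self.refcount += 1

    def decref(self):
        with self._lock:
            self.refcount -= 1
            return self.refcount == 0  # True means "safe to free"


def borrow_many_times(message, times):
    """Simulate a VM that repeatedly receives and then drops the message."""
    for _ in range(times):
        message.incref()
        message.decref()


message = SharedMessage(("search-hit", 42))
vms = [
    threading.Thread(target=borrow_many_times, args=(message, 10000))
    for _ in range(4)
]
for vm in vms:
    vm.start()
for vm in vms:
    vm.join()
final_count = message.refcount
```

With the lock in place the count returns to 1 after all four "VMs" have
finished. The price is one lock acquisition per reference operation,
paid even when only a single thread ever touches the object.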
Now... Criticism aside: you are probably on the right track:
I believe the future is in architectures that specifically divide
computing into asynchronously communicating components. But IMHO
the interesting question (currently) is probably more along the
lines of how to get human-programmers to do the dividing well
(hence Michael's "consciously partitioning" comment) rather
than how to have an environment that abstracts the problem away.
BTW: I'd love to have a Stackless Python w/o a GIL... I just
can't afford to do the (ton of) work! I gather the "posh" thing
Michael mentioned is a hack to put separate locks (LILs? :-) )
around pieces of memory that contain Python objects -- seems like
a lot of hoops to jump through, for questionable benefit...
I don't immediately see performance data, but my
too-complicated-solution intuition bell rings a little...