[Stackless] Is Stackless single core by nature?

Richard Tew richard.m.tew at gmail.com
Fri Jul 10 05:27:22 CEST 2009


On Fri, Jul 10, 2009 at 3:20 AM, Henning
Diedrich<hd at authentic-internet.de> wrote:
> coming back to the thread, I meanwhile found the treasure trove to dive into
> for my questions: http://wiki.python.org/moin/ParallelProcessing
>
> That's a list of efforts to nudge Python towards parallel/multicore/ etc.
> Did you work with any of them?  Is there any specific or general rule, as to
> which will work with Stackless?
>
> Does MPI4Py?

Stackless Python is a superset of Python.  Anything that works in
Python should work in Stackless.

But over and above that, the general rule has to do with blocking.  If
one of these solutions blocks the current thread when invoking some
resource, then the scheduler on that thread is blocked and no
microthreads get scheduled for the duration.  So in order to have a
naturally usable framework, asynchronous IO needs to be used to
prevent this from happening.  This is what my stackless socket module
is intended to facilitate: it allows standard Python modules which use
blocking sockets to just work with Stackless by moving the blocking up
from the thread level to the microthread level, preventing them from
blocking the scheduler.

Any solutions which have an asynchronous API can probably be adapted
with minimal effort.
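To make the thread-level versus microthread-level distinction concrete, here is a minimal sketch using plain-Python generators to stand in for tasklets and a toy `Channel` class to stand in for `stackless.channel` (the real Stackless API differs; every name here is illustrative, not the actual module).  The point it shows is that "blocking" done by yielding to the scheduler lets the other microthreads keep running:

```python
from collections import deque

class Channel:
    """Toy stand-in for a Stackless channel: receive() suspends the
    calling microthread (by yielding) until a value has been sent."""
    def __init__(self):
        self.values = deque()

    def send(self, value):
        self.values.append(value)

    def receive(self):
        # "Blocking" at the microthread level: yield control back to
        # the scheduler until a value is available, so the OS thread
        # and the other microthreads keep running.
        while not self.values:
            yield
        return self.values.popleft()

def scheduler(tasks):
    # Round-robin cooperative scheduler: run each microthread until
    # it yields, then move on to the next one.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)
        except StopIteration:
            pass

log = []
ch = Channel()

def producer():
    for i in range(3):
        ch.send(i)
        yield            # cooperate: let other microthreads run

def consumer():
    for _ in range(3):
        value = yield from ch.receive()
        log.append(value)

scheduler([producer(), consumer()])
# log is now [0, 1, 2]
```

Had the consumer instead made a call that blocks the OS thread (a blocking socket read, say), the scheduler loop itself would have stalled and the producer would never have run; that is exactly the situation the stackless socket approach avoids.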

> I don't understand what you are asking here.  The ability to provide a
> function that blocks in a synchronous way wrapping asynchronous IO is
> a benefit that comes with any real coroutine-like solution.  And it
> can be applied as a building block in any framework you build, whether
> one core/process or multiple cores/process per core.
>
>
> But in a SMP environment you run into concurrent resource access, as one
> effect of blocking, issues that you are completely isolated from when
> staying one-core, protected by the guaranteed sequentiality that this
> yields.

There is no such guarantee, if we are talking in terms of Stackless.

> In that sense I had referred to "a programmer needs to be aware ... but in
> practice this is rarely much of a concern": I wondered if this may get way
> more complicated as soon as you'd have multi-core concurrency and the need
> to protect resources from contention.

Of course it would get more complicated; it's an extended paradigm.

> The hiding of asynchronous operations - in its simplest incarnation in a
> physically sequential environment - is not called in question. But the
> implications of such a layout in a distributed or multi-core environment,
> across multiple blades even, at best 'transparently distributed'.

The implications are irrelevant IMO.  There is no hiding.  You still
need to know when things block and be prepared for any changes which
might happen during the duration of that block.  The "hiding of
asynchronous operations" is merely a tool that affects the clarity of
the framework structure.  If a programmer does not understand when
something might block or what they need to handle, then this is on
them.  They have not taken the time to understand the framework they
are working with.

> But all this, regarding "architecture decision", is what you avoid when you
> stay on one core; and basically stay protected by underlying ensured
> sequentiality of (micro-)threads, while creating parallelly formulated
> code.

Not at all.

Microthreads still need to block, and they need to handle whatever may
have happened when they block.  I see it as same problem on a smaller
scale.  Whether you have a framework based around callbacks or
synchronous blocking enabled using Stackless channels is completely
tangential.  You still have this problem.  It doesn't even matter if
you use preemptive scheduling or cooperative scheduling, you still
have it.  Of course, it's a much larger problem if you use preemptive
scheduling.

Sure you can have no shared state if you use Erlang and that may get
rid of this.  You could also build a framework on Stackless to have no
shared state.
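As a sketch of what such a no-shared-state framework might look like, here is an Erlang-style message-passing layer built from standard-library pieces.  Everything here is invented for illustration (the `Actor` class, the message format); it uses OS threads purely so the sketch runs anywhere, where a Stackless version would use tasklets and channels instead.  Each actor owns its state, and other code interacts with it only by sending copied messages:

```python
import copy
import queue
import threading

class Actor:
    """Owns its state; the outside world interacts only via messages."""
    def __init__(self):
        self.mailbox = queue.Queue()
        self.state = {"total": 0}          # private, never shared
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def send(self, message):
        # Deep-copy so sender and receiver never alias the same object.
        self.mailbox.put(copy.deepcopy(message))

    def _run(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:                # shutdown sentinel
                break
            if msg["op"] == "add":
                self.state["total"] += msg["value"]
            elif msg["op"] == "report":
                msg["reply_to"].put(self.state["total"])

adder = Actor()
for n in (1, 2, 3):
    adder.send({"op": "add", "value": n})

reply = queue.Queue()
# Bypass send() here: a Queue holds locks and cannot be deep-copied.
adder.mailbox.put({"op": "report", "reply_to": reply})
result = reply.get()                       # 6

adder.send(None)                           # shut the actor down
adder.thread.join()
```

Because nothing outside the actor ever holds a reference into `adder.state`, there is no contention over it to protect against; that is the discipline the message-passing model buys you.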

> Is there a rule of thumb, or a list, of what modules and libraries run with
> Stackless?

I covered the main consideration at the top of this email.

> I am still hoping to find the stackless-compatible concurrency support I am
> looking for. But otherwise, would Stackless then stay close, and 'on top' of
> the main Python branch, which in turn will likely not implement
> multi-threading as that would be obstructed by the GIL-philosophy? (see (2)
> above)

I don't understand this question.

> Even if Stackless is not originally about multi-core or distributed
> processing: just as it is not a *language* issue that CPython has the GIL,
> but an implementation issue of CPython (as discussed at (2))-  would not the
> Stackless syntax be just what one wanted to use multi-cores and distribute
> calculations to multiple computers? Potentially extended (or reduced!) to
> deal with shared resources? It just seems to lend itself to that
> exceptionally well and would not have to pass in the last hurdle, as Java
> does (see (6) at the bottom). The last hurdle being microthreads, which make
> Erlang and Stackless seem very close.

I don't fully understand what you are describing here, especially with
respect to whatever it may lend itself to; you'd need to elaborate in
more detail for me to comment.

> Would not even EVE have to expect, in the future, that Blades will become
> faster on a much slower pace, measured per core, but offering more cores
> instead as today's proposition of speed improvement? Growth by hardware
> should get harder to realize, staying with one core. But maybe you fork out
> different stuff to keep cores busy in a different way.

I don't work for CCP any more.  I cannot know or say much with
respect to EVE in the future.

> Erlang, got multi-threaded only quite recently, in 2007
> (3) -
> http://www.ericsson.com/technology/opensource/erlang/news/archive/erlang_goes_multi_core.shtml
> As would be expected with no language changes, only the VM was adapted,
> which people at Ericsson were rightfully proud of. I imagine the Erlang
> hype of 2007/8 was fired up by this fact. I had initially thought Stackless
> was just as destined for that feat.

Well, given Erlang's very constrained model, this isn't very
surprising.  I say yawn for the fact that it only required VM changes
and hurray for the contribution to the greater good of Erlang.

> This may neatly clarify similarities and differences between Stackless and
> Erlang (Joe, Armstrong, quote from (3)):
>
> "The Erlang VM is written in C and run as one process on the host operating
> system (OS). Within the Erlang VM an internal scheduler is responsible for
> running the Erlang processes (which can be many thousands). In the SMP
> version of the Erlang VM, there can be many such schedulers running in
> separate OS threads. As default there will be as many schedulers as there
> are processors or processor cores on the system.
>
> "The SMP support is totally transparent for the Erlang programs. That is,
> there is no need to change or recompile existing programs. Programs with
> built-in assumptions about sequential execution must be rewritten in order
> to take advantage of the SMP support, however."
>
> That this worked was because of the way that Erlang had focused on making
> distributed computations possible: again, the paradigm of no shared state.
> As this is inherent in Erlang, Erlang could transparently be made to use
> multi-cores.
>
> Even if Stackless cannot follow that leap (sic, p), my impression was that
> it may be the natural starting point for Python to get there, if probably
> with syntactic modification needed. It's coming from a different approach of
> (not) dealing with state in concurrency but seems as microprocess centered
> by design as Erlang.

Stackless cannot follow this leap and will not.  Stackless has a
defined purpose and has implemented that purpose completely for
several years.

The leap can be followed by a framework built on Stackless though.

> I should note that while at CCP, I wrote part of a framework that ran
> an agent on each machine involved.  There was a master program and it
> would communicate with each running agent telling it to start
> sub-applications to farm off work to.  All programs, whether agents,
> master and sub-applications were specialisations of the CCP Stackless
> Python based application.  There was no pickling involved, however.
> Unless I am mistaken, this sort of arbitrary ability to start up
> instances of the interpreter on involved machines is as close as you
> would be able to get to "or even blades", no matter the language (and
> framework) used.
>
>
> Yes.
>
> Plus what you did with no pickling is probably close to Erlang philosophy:
> (if not literally because you can send all sort of things with an Erlang
> message, but:) if you didn't pickle you probably also did not expect state
> sent back as immediate answers, except for basic 'ok's. Which is close to
> the Erlang's 'return-less' (Actor model) messages.

I was not explicit enough.  I meant I did not pickle microthreads and
send them across the wire.  The basic premise of the model was that
data was pickled and sent across the wire.

> Reminds of the raison d'etre for StacklessIO. Could it yield equal rewards?
> You had mentioned a drop in you wrote with the same functionality as
> StacklessIO. How difficult could it be to shift the blocking to the level of
> the tasklet, away from the system thread, with Pyro?

Beats me.

> Shouldn't that be rather painless, given that Pyro is native Python
> (http://pyro.sourceforge.net/manual/1-intro.html)?
>
> Or am I missing something there?

You might have to completely rewrite it.  You might be able to do it
by wrapping its asynchronous API (if it has one) with channels.  I
don't see it as a very interesting question, or one worth spending
much of my time on.  You could ask it for any one of many Python
modules which block the system thread.
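The "wrap its asynchronous API with channels" route can be sketched in a few lines.  Nothing below is Pyro's actual API: `async_fetch` is an invented stand-in for whatever callback-style asynchronous call a library might offer, and a `queue.Queue` plays the role of a channel (the real Stackless channel would block only the calling tasklet, not the OS thread):

```python
import queue
import threading

def async_fetch(url, on_done):
    # Invented async API: does its work elsewhere and invokes the
    # callback from a worker thread when finished.
    def work():
        on_done(f"contents of {url}")
    threading.Thread(target=work).start()

def fetch(url):
    """Synchronous-looking wrapper: start the async call, then block
    on a channel-like queue until the callback delivers the result."""
    ch = queue.Queue(maxsize=1)
    async_fetch(url, ch.put)   # the callback just "sends" on the channel
    return ch.get()            # "receive" blocks until that send happens

result = fetch("http://example.com/data")
```

With a real channel in place of the queue, `fetch` would read like ordinary blocking code to the calling tasklet while leaving the scheduler free to run everything else.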

> It's along the lines that Stackless introduced a whole new concept, making
> possible a whole different way to formulate solutions. But very much by
> changes 'under the hood'. Will it be possible to bring Erlang's model to
> Python without loss of style?

Yes, with sufficient work.  But then, nothing is free.  If you choose
Stackless, you pay a price.  If you choose Erlang, you pay a different
one.

> If I could kindle your interest, I found the following post by Slava
> Akhmechet a rewarding read, both for humor and enlightenment.
>
> It plays through the thought of how Java could be extended in the direction
> of Erlang and why and what for. It also stops exactly at an unsurmountable
> hurdle for Java, which happens to be Stackless' specialty: microprocesses.
> (6) - http://www.defmacro.org/ramblings/concurrency.html

Does he ever describe some real projects where he has used Erlang?
Not that I recall.  I don't have the time to follow fads, and without
substance, his post is just fad propagation.

I've heard about Erlang, message passing and the advantages it gives,
but so what?  This post handwaves away the "what".

> This is a commendable article by Bruce Tate of IBM, which is looking at
> Erlang from the Java angle, too:
> (7) - http://www.ibm.com/developerworks/java/library/j-cb04186.html

My eyes glazed over reading this one, can't remember much about what I read.

> Ralph Johnson explains why Erlang processes are objects, even if this should
> send Joe Armstrong, Erlang's creator kicking and screaming: (8) -
> http://www.cincomsmalltalk.com/userblogs/ralph/blogView?entry=3364027251

He says, "Erlang, the next Java" and "I do not believe that other
languages can catch up with Erlang anytime soon."

This guy is also drinking the kool-aid.  IMO his first quote would be
more correct if Lisp were substituted for Java.  Maybe all the Lisp
programmers can get together with the Erlang programmers and talk
about when all the things that have been prophesied about their
languages are going to happen.

If I ever go around trumpeting "everyone should be using Stackless"
rather than "this is what Stackless does, use it if it seems handy",
someone please let me know.

> Where I currently got to is Candygram (
> http://candygram.sourceforge.net/overview.html ), explicitly an Erlang
> epigone, looking quite quiet since 2004. Probably suffering from the fact,
> too, that it can't have many threads, not near the numbers of Erlang and
> Stackless. I can't tell why it couldn't run with Stackless, you surely can?
>
> If you would have an answer to that it would be very much appreciated.

Unfortunately, I do not have the time or interest to look into it.  I
don't necessarily believe that the limitations of OS threads as a
resource are the reason for its lack of take up.  I would be more
inclined to categorise it as just another second party library.  Sure
it may have the right keywords to describe it (i.e. "message
passing"), but it is a second class solution where Erlang is a first
class one.

In Erlang, message passing is a fundamental part of the design and the
language.  You don't have an option, it is just there.  It black boxes
away a lot of systems which just work.

In Candygram, you have a framework someone else has written.  You
don't understand the code that makes it up; you didn't write it.  You
implicitly take on the burdens that come with adopting unofficial
dependencies that are not a part of the language.  If the original
project changes, you have the burden of merging in those changes, and
they might not be compatible with your own.

And this, to me, highlights the huge advantage of anything that comes
with the language (whether part of the language itself or a standard
library that comes with it).  The next best thing is what you have
written and fully understand yourself.

> There is a post from 2006 on
> http://mail.python.org/pipermail/python-3000/2006-September/003718.html ,
> Bob Ippolito answering Ivan Krstic:
>
> "Candygram is heavyweight by trade-off, not because it has to be. Candygram
> could absolutely be implemented efficiently in current
> Python if a Twisted-like style was used. An API that exploits Python 2.5's
> with blocks and enhanced iterators would make it less verbose
> than a traditional twisted app and potentially easier to learn. Stackless or
> greenlets could be used for an even lighter weight API,
> though not as portably."
> . . .
>
> "> * Introduce microthreads, declare that Python endorses Erlang's
> no-sharing approach to concurrency, and incorporate something like
> "> candygram into the stdlib.
>
> "We have cooperatively scheduled microthreads with ugly syntax (yield), or
> more platform-specific and much less debuggable microthreads with stackless
> or greenlets.
>
> "The missing part is the async message passing API and the libraries to go
> with it."
>
> End of quote.

I agree with Bob in what you have quoted.

> What puzzles me is how you seem rather unfazed about these multi-core
> issues. Isn't Stackless *the* place from where this should come to CPython?
> Is the potential in this irrelevant for some reason I am missing out on? Or
> for some reason uninteresting for CCP?

Again: Stackless has a defined purpose and has implemented that
purpose completely for several years.

Threading, whether multi-core or not, is a standard Python usage
issue.  IMO it is tangential to Stackless in the same way that the use
of any other module or resource that comes with the standard library
is.  Sure, if you use Stackless and you are writing threading
solutions you need to take into account what Stackless does and what
it provides, but that is neither here nor there.  This again goes for
a lot of standard library modules.

Cheers,
Richard.



