[Stackless-sprint] On greenlets

Christian Tismer tismer at stackless.com
Tue Mar 16 02:36:28 CET 2004


Dear friends,

first of all, I wish to thank you all for this great sprint!!!

I would like to repeat this ASAP, given that I will stand my
current monetary disaster.
But the sprint was very promising, very productive, and was
missing for at least three years. I won't make this mistake
again. :-)

Also, I'm wondering if we should keep the sprint list and
rename it to stackless-dev???

Tomorrow, I will probably purge the 2.3/src trunk and symlink
it to 2.3/sprint, but it will be better for you to check it out,
again, when this is done. I will keep you posted.


Now on the new concept.
-----------------------

Armin has given us a very special, wonderful gift, completely
unexpected. (Well, not really, if you know Armin...)
He implemented greenlets, which are revolutionary better than
what I did so far. He didn't know *how* much better they are.
Sunday night, we were sitting in the Luise and talked more about
it. Armin was not aware of the full power of his implementation,
but we figured out that it can be the answer to all questions,
namely it is like 42**42.

If you want to know more, check out the sprint trunk and read
sprint/Greenlet. There is a copy of my/our hard switching
platform stuff, and a tiny extension module named greenlet.c,
which is able to do hardware switching as an extension module
to standard Python!!!

We had several sessions at the blackboared, and I tried to
explain him how things work. It was especially hard to explain
why this initial_stub needs to be recreated on every "entry"
to Stackless, it must be a versioned resource, since its history
is changing all the time, when Stackless is entered from different
contexts.
It took him quite some time to understand. We then discussed
the implementation more, and I explained how I make sure that
all stack slices start on the same top address, and that it
is probably possible to go without this restriction, but that
my brain was not able to figure out the consequences, and it
seemed to be an optimum between simplicity and efficiency,
anyway.
For the rest of the time, I didn't hear so much from Armin,
but some more questions. I appreciated that he was going to
swallow all that hackish code to get an understanding.

On Sunday evening, he presented me something that changed
my life, completely...

Now really on the new concept.
------------------------------

What Armin did, is most remarkable. He removed the need to have
a common stack base. Instead, he finally found the right model
to describe stacks, how they behave and what they are.
The breakthrough is to declare that a stack consists of two
pieces: One upper part that is really living on the real stack,
to some limit, or for the "main" greenlet upto infinity.
And there is one lower part which is saved in heap memory.
The latter may or may not exist, this is dependant of when
other greenlets are created.

For further reference, please read and try to understand his
implementation. Please don't feel bad if you find it hard to
understand. This *is* hard, and I don't believe that there
exist so many implementations of this elegance and consequence.

There are some crucial, new ideas, and how far the consequences
reach, rushed like a ball lightning though my brains.

In Armin's design, it is possible to create a greenlet in any
context, at any time. There is no need to free up stack space
to some initially defined common stack base. Such a thing does
not exist. Instead, the machinery always knows which stack patches
to save and restore if any greenlet has to be activated, and it does
this always right, whatever stack situation might be there.
This is the total generalisation of my initial, restricted concept.

I'm not jealous because I didn't get this idea in all
the years. Instead, I'm more than happy that it has been
detected, after all, thanks to Armin.

Here the initial advantage that Armin presented to me (but there is
more to come):
There is no need to create an initial (green|task)let at all.
A main greenlet can be described by a static structure. It is the
owner of the complete stack, in a way from the topmost possible
location (somehow infinite) to the current position. It does not
posess a heap based part.
Whenever a new greenlet is created, its stack base is defined
at the point where we currently are. And this is the great concept:
We can create a greenlet wherever we are, there is no need to free up
some space occupied by someone else.
Later on, greenlets can be activated in any order. The system keeps
track of greenlets in chains, which tell which stack pieces need
to be saved when activating a greenlet (this only happens when the
greenlet to be activated is living above the current one), and which
stack pieces have to be restored for this greenlet.
Even if the main greenlet is activated and executes some arbitrary
code, this concept holds. The initial_stub, which had to be
recreated all the time in my concept, is always implicitly associated
with the main greenlet, its creation can be avoided, since the
main greenlet always *is* this stub, there is no need to create
something like a versioned resource. This is simply accomplished by
definition. Well, this is hard to understand, but believe me,
it is the truth.

Now, by this great new input, my brain was reactivated and delivered
from former wrong assumptions, and all in a sudden I realised
that this was the missing concept over years, which now can
get us a very relevant bit further.
(And it makes me somewhat happy that Armin didn't see this in the
first place, so I don't need to leave the dunce hat on...)

How to make greenlets the absolute winner concept
-------------------------------------------------

If you look into the implementation, greenlets are created by
a g_start function, which defines and claims an area on the
stack at the moment the greenlet is created.

With a bit of imagination, you will realize that it is not really
necessary to claim this initial stack position at the moment the
greenlet is created. This can be deferred until the greenlet
is actually executed. By doing this, we allow to start a new
greenlet regardless from the former stack position, just where
it happens to be started. This means, a first activation of
a greenlet never needs to cause any stack copying at all.
It simply extends the current stack. I'd like to call such
a greenlet "placable".

For the time the greenlet is executing, it has a claim on the
stack. It may decide at some time to jump to a different greenlet.
In Armin's first implementation, this will cause some stack saving,
at least for the current greenlet.

But here the non-recursive scheme chimes into the play:
What happens if the greenlet is able to perform a switch
without claiming any piece of stack space to be preserved?
You guessed it! This is the software switching, integrated into
greenlets, seamlessly. If a greenlet is able to jump off without
leaving any stack space to be saved, this is a soft switch, and the
greenlet stays "placable".

This has a *big*, *big* advantage.
The idea of placable greenlets says, that we can run soft-switched
greenlets on top of any existing stack, we don't need to demand that
this stack is "innocent" or have any other restrictions.
This was not possible with current Stackless, because I forced
"innocent" tasklets to be run at the same zero stack level, all
the time. In a sense, this makes placable greenlets as movable
and flexible as Python's generators, which also don't care in
which context they are called, despite the fact that greenlets
are *way* more versatile.

So here is my addition to the greenlet concept:
- allow greenlets to mark their stack_start when they are
   *called*, not when they are created.
- provide an extra field in the greenlet structure, which points
   to a memory area where the greenlet can save its current state.
- allow the greenlet to jump off after declaring its stack state
   as being "clean": there is nothing to save, "nothing to
   declare, I'm innocent". In that case, they *release* their
   stack_start, staying placable.

The consequences are, as far as I can see it right now:
- greenlets can always be run for the first time without causing
   any stack copy operations.
- greenlets that have existing heap memory are bound to a specific
   stack area, and they need to save and restore some stack when
   they are activated.
- greenlets that manage to jump off without leaving state on the
   stack, but storing it in their extra memory, stay in a placable
   mode and can be run from any other stack position.
- this concept can be implemented without any relationship to
   the Python language. It may make sense to make it use Python
   structures, but the concept does not depend on Python at all.
- In the case of Python, the extra structure to keep program state
   off of the C stack is the Python frame or the cframe/baseframe.
   This could be anything else for other host languages, but the
   principle remains the same:
   If you manage to save state in an extra structure, you are not
   stack based, you can be switched without involving assembly,
   and it is possible to pickle program state.

And how can we make use of all of this?
---------------------------------------

I have the deep confirmedness that this new concept will supersede
all of what I did with Stackless so far, and that it can be the
basis for a general purpose, language independent, most efficient
and most versatile implementation of green threads, which has never
been designed with this completeness, consequence, elegance, speed
and power anywhere and anytime.

I am sure that we will be able to create a spin-off of this as
a general greenlet package for other compiler languages, and I
do hope that we will create *the* best possible green-threading
package available, better than everything else, before.

Yes, these are big words, but I do believe in it.
I think we should use the concept for Python and keep it free
software. At the same time, I encourage everybody to think of
a commercial spin-off, providing a lightweight task-switching
library for Python-less languages and applications, which has
no real competitors, and which probably should *not* be
free software, but be paid work supporting other non-free
software, possibly giving revenues for all of us.

Please help to make this take place. I think we are at the
bleeding edge here, and we have a great chance if we can
do it just right.

Apologies about my lengthy brain ejaculation, but I think it
was appropriate and necessary. Please let me know about your
thoughts, and let me explain things which were hard to understand.

enlightened-ly y'rs -- chris

-- 
Christian Tismer             :^)   <mailto:tismer at stackless.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/



_______________________________________________
Stackless-sprint mailing list
Stackless-sprint at stackless.com
http://www.stackless.com/mailman/listinfo/stackless-sprint



More information about the Stackless-dev mailing list