[Stackless] LLVM coroutine support [was: [LLVMdev] Proposal: stack/context switching within a thread]

Wed Apr 14 13:01:19 CEST 2010

Hello there.  Thanks for taking these thoughts into consideration.
On the surface, it sounds reasonable to separate the stack copying and the context switching.
There may be problems, however.
Take the case of 'swapcontext'.
We cannot save the source stack until after the swapcontext() call, because only after this call do we have the context state to query for the stack position.  But any function call we perform after the swapcontext() may trample the unsaved stack, if the source and destination stack positions overlap.

It would be better to query the "current" frame, so that we could do something like:

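extern char *base;                          /* fixed base of the shared C stack */
extern char *sourcestackpos, *sourcestack;  /* live stack top / heap save area */
extern char *deststackpos, *deststack;
extern context_t *sourcectxt, *destctxt;
extern size_t sourcesize, destsize;
void switch_context(void) {
	/* Save the active slice before switching; assumes a downward-growing stack. */
	sourcestackpos = llvm.getstacktop();
	sourcesize = base - sourcestackpos;
	memcpy(sourcestack, sourcestackpos, sourcesize);
	llvm.swapcontext(sourcectxt, destctxt);
	/* Now running in the destination context: restore its saved slice. */
	deststackpos = llvm.getstacktop();
	memcpy(deststackpos, deststack, destsize);
}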

I'll run this by Christian Tismer.  These POSIX "makecontext" functions sound useful, but a way to query the current stack pointer is missing (as is the stack direction).  If we had those in POSIX, we could probably implement the hard switching / stack slicing using those primitives alone :)
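
For reference, here is a minimal sketch of how the POSIX calls under discussion are typically used, assuming a platform that still ships <ucontext.h> (glibc on Linux does).  The names run_demo, coroutine, steps and STACK_SIZE are made up for illustration.  Note that the coroutine's stack is a fixed block allocated up front, and that uc_link is the "linked" context resumed when the coroutine returns.

```c
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static int steps;                      /* how far the coroutine has run */

static void coroutine(void) {
    steps = 1;
    swapcontext(&co_ctx, &main_ctx);   /* yield back to the caller */
    steps = 2;
    /* Returning here resumes the context named by uc_link (main_ctx). */
}

/* Runs the coroutine in two steps; returns 12 on success, -1 on failure. */
int run_demo(void) {
    enum { STACK_SIZE = 64 * 1024 };   /* must be chosen in advance */
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL || getcontext(&co_ctx) != 0) {
        free(stack);
        return -1;
    }
    steps = 0;
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = STACK_SIZE;
    co_ctx.uc_link = &main_ctx;        /* the "linked" context */
    makecontext(&co_ctx, coroutine, 0);

    swapcontext(&main_ctx, &co_ctx);   /* run until the first yield */
    int after_first = steps;
    swapcontext(&main_ctx, &co_ctx);   /* resume; the coroutine finishes */
    int after_second = steps;

    free(stack);
    return after_first * 10 + after_second;
}
```

Nothing in this interface reports where the current stack top is, which is the missing piece for stack slicing.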

Of course, you can always "guess"  the current stack position simply by taking the address of an automatic variable.  You then hope that the call return address and other frame information are in the stack position 'above' that address.
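
A rough sketch of that guess in plain C follows.  It is a heuristic only: the helper names (callee_is_below, probe, stack_grows_down, slice_size) are invented for this example, comparing addresses of unrelated objects is implementation-defined, and the compiler is free to lay out a frame so that the marker is not the lowest address in it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compares a local in this frame with one in the caller's. */
static int callee_is_below(volatile char *caller_marker) {
    volatile char callee_marker = 0;
    assert(caller_marker != NULL);
    return (uintptr_t)&callee_marker < (uintptr_t)caller_marker;
}

/* Called through a volatile pointer so the compiler cannot inline the probe. */
static int (*volatile probe)(volatile char *) = callee_is_below;

/* Returns 1 if nested calls get lower addresses (stack grows down), else 0. */
int stack_grows_down(void) {
    volatile char marker = 0;
    return probe(&marker);
}

/* Size in bytes of the region between a recorded base and a guessed top. */
size_t slice_size(const void *base, const void *top) {
    uintptr_t b = (uintptr_t)base, t = (uintptr_t)top;
    return (size_t)(b > t ? b - t : t - b);
}
```

A scheduler could record the address of a local at its base point, take another at switch time, and pass both to slice_size() to estimate how many bytes to copy to the heap.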

Cheers,

Kristján

> -----Original Message-----
> From: stackless-bounces at stackless.com [mailto:stackless-
> bounces at stackless.com] On Behalf Of Jeffrey Yasskin
> Sent: 14. apríl 2010 01:04
> To: stackless at stackless.com
> Subject: Re: [Stackless] LLVM coroutine support [was: [LLVMdev]
> Proposal: stack/context switching within a thread]
> 
> Kristján Valur Jónsson <kristjan at ...> writes:
> 
> >
> > Hello Jeffrey.
> > This is very interesting.  I am not familiar with the C apis
> > that this appears to emulate.  The concept of a
> > "linked" context is novel to me.
> >
> > One thing that is not immediately apparent to me is whether this
> > system supports "stack slicing".
> > I see the "makecontext" call.  This has some problems:
> > 1) You have to know beforehand how much stack you will use
> > 2) You therefore have to allocate conservatively.
> > 3) This generous amount of preallocated memory that remains fixed
> > for the lifetime of the context causes memory fragmentation, similar
> > to what you see with regular threads.  Virtual memory thus limits
> > the number of contexts that can be alive, in the same way as it
> > limits the number of threads.
> >
> > In stackless python, we use "stack slicing".  If you are not
> > familiar with the concept, it involves always using the same C
> > stack, which therefore can grow, and storing the "active" part of
> > the stack away into heap memory when contexts are switched.  An
> > inactive context therefore has not only cpu registers associated
> > with it, but also a slice of the stack (as little as required to
> > represent that particular context) tucked away into a heap memory
> > block.
> >
> > It is unclear to me if a context created by "getcontext" could be
> > used as a base point for stack slicing.  Could one create such a
> > base point (ctxt A), and then descend deeper into the stack, then
> > "swapcontext" from a new context B back to the previous point on the
> > stack?  Will the stack data between points A and B on the
> > stack be "tucked away", to be restored when returning to context B?
> > When doing stack slicing, one has to define a "base", and in the
> > example above, context A would be the base, and all other contexts
> > would have to be from deeper on the stack.  I don't see a provision
> > for identifying such a base.  And indeed, if some of the contexts
> > come from a separate "makecontext" area, they would have a different
> > base.
> >
> > If stack slicing is not supported, as I suspect, it would be
> > relatively simple to add it by being able to specify a "base
> > context" to "getcontext" and "swapcontext", which would serve as the
> > base point on the stack; the memory between it and the current stack
> > position would need to be saved.  The contexts thus
> > generated would have an associated stack slice with them.  Adding
> > such a "base context" argument to getcontext() and swapcontext()
> > would enable us to build current stackless behaviour on top of such
> > an API.
> 
> Thanks for the detailed analysis. I forwarded it to the author of
> the proposal, and we discussed it some in the thread at
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-April/030887.html. He's
> uploaded a new version to
> http://code.google.com/p/llvm-stack-switch/wiki/Proposal,
> although it doesn't yet include all of the comments I asked for.
> 
> To try to answer some of your questions above:
> 
> The C functions are documented, though not very well, at
> http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html.
> 
> The basic mechanism Kenneth is proposing only swaps out the
> context's registers, not its actual stack space. It would be up
> to Stackless's scheduler to copy the stack data back and forth
> from the heap. He thinks it'll be enough for your use to add a
> way to query the top of the stack. Then, before calling anything
> you'd have to hard-switch, you would call getcontext() once and
> save the stack top. When switching, you'd call getcontext() again
> and save out the data between the two tops. I guess that matches
> your suggestion for a "base" context? The idea would be to give
> schedulers enough portable primitives to do what you need,
> without making LLVM know how to allocate memory.
> 
> In the llvmdev thread, I suggested that Stackless might want to
> use makecontext(new_space) any time it entered a C function, on
> the assumption that C frames are short-lived, and you could
> re-use the space as soon as the C frame returned, but I think
> that was wrong.
> 
> Jeffrey