[Stackless] st.serial --> st.serial_last_jump patch

Kristján Valur Jónsson kristjan at ccpgames.com
Tue Nov 24 21:43:32 CET 2009


Yes, what you describe sounds like a fairly normal scenario, and it is precisely what the cstack's "serial" member is supposed to make work.

Now, in the first "main", that main tasklet returns.  You then enter a second "main", and when that one returns, it somehow performs an incorrect slp_transfer_return.

It is this last bit that is hard to understand, because it is difficult to see how it can be wrong.  The code is simple:

	PyTaskletObject *task = ts->st.current;
	int ismain = task == ts->st.main;

	...

	if (ismain) {
		if (ts->st.serial_last_jump != ts->st.serial) {
			slp_transfer_return(task->cstate);

So, the slp_transfer_return is only triggered if indeed it is _the_ main tasklet (the last "main" tasklet) that is returning.  And an slp_transfer_return() should be safe, because it will switch to the place where this tasklet created its initial stub, even if we are already on the correct stack.  If you follow the code through the debugger right past the actual stack switch, you should find yourself in make_initial_stub.
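
To make the condition concrete, here is a minimal stand-alone sketch of that check as a pure function.  The struct sched_model and the function needs_transfer_return are hypothetical names made up for illustration, not Stackless identifiers; only the comparison itself is taken from the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters kept in ts->st. */
typedef struct {
    int serial;            /* serial of the most recently created "main" */
    int serial_last_jump;  /* serial recorded by the last stack switch */
} sched_model;

/* Returns 1 when a returning tasklet would take the slp_transfer_return
   path: it must be the main tasklet, and the serial recorded by the last
   jump must differ from the current serial. */
int needs_transfer_return(const sched_model *st, int is_main)
{
    assert(st != NULL);
    return is_main && st->serial_last_jump != st->serial;
}
```

With matching serials the main tasklet simply returns; with a mismatch the stack switch is attempted, and a tasklet that is not main never takes this path.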

Are you working on Windows?
Usually if you are in the wrong stack, you can tell because the traceback in the debugger will have a gap in it.  Is the traceback "whole" prior to the slp_transfer_return()?  And if you follow the debugger, how does it change once you switch the stack (set a breakpoint just after "return slp_switch()" in slp_transfer.c)?

Also, when you create your second "main", can you note the serial number that is created for it in stacklesseval.c, line 232 (ts->st.serial_last_jump = ++ts->st.serial;)?  Does this match the logic in scheduling, line 1103, where it decides to slp_transfer_return()?
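
As a way of reasoning about which serial numbers to expect, here is a rough toy model of the scenario.  It is only a sketch: sched_model, enter_main, hard_switch and second_main_mismatch are hypothetical names, and the assumption that a hard switch records the serial of the stack it jumps to is taken from the description in the quoted message, not from the Stackless source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters kept in ts->st. */
typedef struct {
    int serial;
    int serial_last_jump;
} sched_model;

/* Entering a new "main"; mirrors ts->st.serial_last_jump = ++ts->st.serial; */
int enter_main(sched_model *st)
{
    assert(st != NULL);
    st->serial_last_jump = ++st->serial;
    return st->serial;
}

/* ASSUMPTION: a hard switch records the serial of the target C stack. */
void hard_switch(sched_model *st, int stack_serial)
{
    assert(st != NULL);
    st->serial_last_jump = stack_serial;
}

/* Replays the scenario and returns 1 if the second main's return would
   see serial_last_jump != serial.  When remark_on_return is nonzero, the
   switch back to the second main records that main's serial again. */
int second_main_mismatch(int remark_on_return)
{
    sched_model st = {0, 0};
    int first = enter_main(&st);    /* first main; tasklets are created here */
    int second = enter_main(&st);   /* first main has returned; second main */
    hard_switch(&st, first);        /* scheduler jumps into an old tasklet */
    if (remark_on_return)
        hard_switch(&st, second);   /* jump back records the new serial */
    return st.serial_last_jump != st.serial;
}
```

In this model the mismatch only appears when nothing records the new main's serial on the way back, so comparing the value noted at creation with the values seen at return time should show which case you are in.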

Now, I'm quite curious as to why your bug is happening, and we should fix it, but if all else fails, there is a workaround.  This is the workaround that we have employed in EVE for years, since before multiple "main" tasklets became possible: at the start of your program, enter a "main" context and stay there by calling PyStackless_Call_Main, having exposed your program's stackless_main function to Python as a C function.  Then you stay within a main tasklet.  Something like:
PyObject *inner_main(PyObject *self, PyObject *args){
	/* all my program logic goes here */
	Py_RETURN_NONE;
}

int main(int argc, char *argv[]) {
	/* static: PyCFunction_New keeps a pointer to the method definition */
	static PyMethodDef def = {"inner_main", inner_main, METH_VARARGS, ""};
	PyObject *func = PyCFunction_New(&def, NULL);
	PyObject *args = Py_BuildValue("....
	PyObject *r = PyStackless_Call_Main(func, args, NULL);
...
	return 0;
}




Kristján

> -----Original Message-----
> From: Jeff Senn [mailto:senn at maya.com]
> Sent: 24. nóvember 2009 19:31
> To: Kristján Valur Jónsson
> Cc: stackless list
> Subject: Re: [Stackless] st.serial --> st.serial_last_jump patch
> 
> Ok ... it turns out (not surprisingly) to be more complicated than I
> thought.
> 
> I'm using preemptive scheduling (stackless.run(interval)) to run some
> tasklets for a while, return out of the embedded interpreter, and come
> back later and run some more.  When I "come back later"... I wind up
> with a different "main" tasklet (i.e. the first main tasklet has
> previously returned out of its stack).
> 
> When I start scheduling, this causes a hard-switch into the previously
> created tasklets because they were on a "different stack" (even though
> they won't ever use anything on the cstack)
> [Note: I'd like to avoid this... so maybe I'll reorganize... but
> meanwhile...]
> the slp_transfer notes the serial of the *old* main tasklet in
> last_jump...
> 
> and when the return after the stackless.run(i) happens it attempts to
> slp_transfer_return to a previous main tasklet (which is now gone)
> 
> So...either:
>  1) something is failing to mark serial_last_jump in the switch back to
>     the *new* main tasklet (when stackless.run(i) returns)
> or
>  2) something I'm doing is wrong-headed
> 
> I think it's #1... because I believe what should happen is that the
> *new* main tasklet should just return (out of *its* stack) -- nothing
> should ever return (again) from the prior stack (it should just be left
> around so the tasklets can run).
> 
> I hope this is lucid...it's been quite a while since I looked at
> stackless internals...and I'm trying to do too much at once today...
> 
> Thoughts?
> 
> 




