[Stackless] slp_kill_tasks_with_stacks is broken

Kristján Valur Jónsson kristjan at ccpgames.com
Thu Sep 22 00:39:10 CEST 2011


So, I have been working with multithreaded stuff, even done some fixes.
There is this function, slp_kill_tasks_with_stacks, which is supposed to kill tasklets with stacks.  It turns out, that for all threads but the main thread, it kills _all_ tasklets, because such tasklets each have their own cstate object.  Yes, all tasklets on worker threads are linked into the cstate circular list, in order to have them killed when a thread goes away.  This is achieved through the function tasklet_ensure_linkage().  Whether this is required, is another topic.

Now, the last thing this function does to a tasklet thus killed is to change its cstate->tstate to the initial tstate.  That's right, it makes it so that it appears to have come from the main thread.   I have been experimenting with setting this to NULl instead, to indicate that it comes from a dead thread.

Anyway, this is when I found out that slp_kill_tasks_with_stacks actually does not work at all.  When this function is entered, when the thtread state is being cleared, tstate->ts.main (and tstate->ts.current) is already NULL, having been cleared here, in tasklet_end():
/* capture all exceptions */
    if (ismain) {
        /*
         * Main wants to exit. We clean up, but leave the
         * runnables chain intact.
         */
        ts->st.main = NULL;
        Py_DECREF(retval);
        retval = schedule_task_destruct(task, task);
        Py_DECREF(task);
        return retval;
    }

So, all the gymnastics that slp_kill_tasks_with_stacks does with trying to place tasklets before the "current" tasklet and so on are senseless since thre is no current tasklet.  In parcicular:  When a tasklet that is to be killed is linked into the current queue, it becomes the "main" tasklet and also the "current" tasklet.  PyTasklet_Kill then attempts to kill the current tasklet.  The kill function attempts to get the current frame, but then this code kicks in:

/*
     * silently do nothing if the tasklet is dead.
     * simple raising would kill ourself in this case.
     */
    if (slp_get_frame(task) == NULL)
        Py_RETURN_NONE;
And
/* CAUTION: This function returns a borrowed reference */
PyFrameObject *
slp_get_frame(PyTaskletObject *task)
{
    PyThreadState *ts = PyThreadState_GET();

    return ts->st.current == task ? ts->frame : task->f.frame;
}

So, slp_get_frame returns the ts->frame which already is NULL.

In effect, it leaves the killed tasklet then in a strange state, where its next and prev members point to itself, but it has a NULL frame.

So, how can we fix tasklet killing on exit? It seems necessary, at least for threads with real C state in order to release any special C state that may be pendin on the C stack.
First, off, the code in tasklet_end, which clears the main tasklet, is incompatible with the kill_task_with_cstate.
Secondly, the latter function does appear very fragile.  What happens if a tasklet that is being killed refuses to die?  Or creates new tasklets?  I wonder if we should have a super exception, one that is not catchable as a system_exit, to ensure that they die....  Or maybe just require programs not to handle tasklet exit or do any magic tasklet stuff during such a handle clause.  But that's a different story.

What I think should happen is probably something like this.  Assuming that kill_tasks_with_cstate can be called with the man and current tasklets still valid:

1)      All runnable tasklets are removed from the runnable queue and decrefed, possibly resulting in them dying on their own or being killed, leaving only the main tasklet in the queue.

2)      Then proceed with killing each tasklet.  Now that there is only one in the queue left, there is no need to worry about the queue ordering.

Anyway, enough ranting.  I'm not sure what to do here.  It is annoying that the main tasklet is cleared like this.  But perhaps it is necessary.  Perhaps it is necessary to create a new temporary main tasklet...  Who knows.

K
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.stackless.com/pipermail/stackless/attachments/20110921/6a79c718/attachment.html>


More information about the Stackless mailing list