[Stackless] A new crash bug

Kristján Valur Jónsson kristjan at ccpgames.com
Sat Oct 20 12:11:11 CEST 2007


I have had some success with this.
After intensive debugging I have mostly figured out what is happening, but not necessarily why.

1) the tasklet is dormant on the channel when the program exits.
2) In PyHandleSystemExit() (or something like that; I'm writing this from home, from memory) a gc.collect() is initiated.
3) Both the channel and the frame object (the sleeping one) end up in the 'unreachable' list during collection.  They are linked into a list whose header is a local (stack) variable in the collect() function.
4) to collect the list, a gc_clear() call is made.  The first one is made for the channel (it is first in the list).
5) The channel seeks to awaken the sleeping tasklet.  I am not sure why this is a good idea.
6) At this point (I didn't single-step far enough), when the channel tries to switch to the tasklet, it somehow longjumps out of the gc.collect() call, and we suddenly find ourselves on the line after the call to PyHandleSystemExit().
7) PyErr_Clear() is called.  This causes a release of the frame object. The frame object is in the current traceback.  But it is still linked to 'unreachable' in a stackframe that no longer exists, and unlinking it causes a stack corruption in the current frame, eventually causing a crash.
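
As a rough illustration of steps 3 to 7, here is a toy model in plain Python. It is not the actual collector code: Node, link_after, unlink, collect, AbandonFrame and simulate are made-up names, and an exception stands in for the longjump.  The point is only that unlinking a node from a doubly linked list writes into its neighbours, so a node left linked to a list header that lived in an abandoned stack frame will write into that dead frame when it is later released.

```python
class Node:
    """Doubly linked list node, loosely modelled on a GC list entry (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self


def link_after(head, node):
    """Insert node directly after head."""
    node.prev = head
    node.next = head.next
    head.next.prev = node
    head.next = node


def unlink(node):
    """Remove node from its list; this writes into both neighbours."""
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = node


class AbandonFrame(Exception):
    """Stands in for the non-local jump out of collect()."""


def collect(frame_obj, channel_obj):
    # The list header is local to this call, like the stack variable in step 3.
    unreachable = Node("unreachable")
    link_after(unreachable, frame_obj)
    link_after(unreachable, channel_obj)  # the channel ends up first in the list
    # Clearing the channel wakes the tasklet, and control never returns here.
    # The header is carried out only so the later write can be observed.
    raise AbandonFrame(unreachable)


def simulate():
    frame, channel = Node("frame"), Node("channel")
    try:
        collect(frame, channel)
    except AbandonFrame as jump:
        stale_head = jump.args[0]  # in C this would be reused stack memory
    before = stale_head.prev.name
    unlink(frame)  # what releasing the frame does in step 7
    after = stale_head.prev.name
    return before, after
```

In this model simulate() shows the header's prev pointer being changed after collect() has already been abandoned; in C the same write would land in whatever now occupies that stack slot.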

Some thoughts:
1) Why do we need to awaken tasklets when channels are collected?
2) Why does awakening the tasklet cause this longjump out of the function?
3) Why do we find the frame in the exception state then? (I must investigate.)  Was it already there when gc.collect() was entered, or was it put there as part of awakening the tasklet?  If the former, then there must be a reference counting bug, since in that case the frame shouldn't have ended up in the 'unreachable' list in the first place.
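
The reasoning in point 3 can be checked against ordinary CPython behaviour (this sketch does not use Stackless, and Cell, make_cycle and survives_collect are hypothetical names): with correct reference counts, an object that is still referenced from outside its cycle, as a frame held by the exception state would be, is not treated as unreachable by gc.collect().

```python
import gc
import weakref


class Cell:
    """Minimal container object that can take part in a reference cycle."""
    def __init__(self):
        self.other = None


def make_cycle():
    """Build two objects that reference each other."""
    a, b = Cell(), Cell()
    a.other, b.other = b, a
    return a, b


def survives_collect(keep_external_ref):
    """Return True if the cycle is still alive after gc.collect()."""
    a, b = make_cycle()
    probe = weakref.ref(a)
    # The external reference plays the role of the exception state.
    holder = a if keep_external_ref else None
    del a, b
    gc.collect()
    alive = probe() is not None
    del holder
    return alive
```

If an externally referenced object does show up in the 'unreachable' list, that would suggest a reference that was not counted somewhere, which is the case suspected above.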

I will investigate further on Monday, but some insight into points 1 and 2 above would be useful.  Christian?

K

> -----Original Message-----
> From: stackless-bounces at stackless.com [mailto:stackless-
> bounces at stackless.com] On Behalf Of Richard Tew
> Sent: 12. október 2007 20:15
> To: Stackless mailing list
> Subject: [Stackless] A new crash bug
>
> Hi,
>
> I've been working on a comprehensive test suite for Stackless and came
> across this crash bug.  I have not yet fixed it myself, but it looks
> like a pretty straightforward problem related to garbage collection of
> objects when the frame is exited because of an exception.
>
> I am using the release Windows Stackless build.  Experience dictates
> that this should happen consistently on all platforms.
>
> If anyone wishes to get more familiar with Stackless internals and the
> Python virtual machine/implementation, this should be a good bug to
> start with.
>
>
> import unittest, stackless
>
> if True:
>     class CrashBugBehaviour(unittest.TestCase):
>         "Tests for current and past reproduction cases for crash bugs."
>
>         def testChannelFailureCrash(self):
>             """ Related to channel blocking and garbage collection order
>                 for objects freed when this frame exits.  More detail
>                 should be added once it has been fixed. """
>
>             def channelBlockingFunction(channel, crashCausingList):
>                 # Condition 3: The tasklet must block on the channel.
>                 channel.send(1)
>
>             # Condition 1: The channel must be created before the list.
>             c = stackless.channel()
>
>             # Condition 2: The list must be passed into the created tasklet.
>             v = []
>             channelBlockingTasklet = stackless.tasklet(channelBlockingFunction)(c, v)
>             channelBlockingTasklet.run()
>
>             # This intentionally fails.  It will trigger the crash bug.
>             self.failUnless(False)
>
>
> Cheers,
> Richard.
>
> _______________________________________________
> Stackless mailing list
> Stackless at stackless.com
> http://www.stackless.com/mailman/listinfo/stackless



