[Stackless] need help on debugging a deadlock
andrewfr_ice at yahoo.com
Sat Jan 10 17:58:53 CET 2009
> Message: 1
> Date: Fri, 09 Jan 2009 16:50:44 +0100
> From: Paul Sijben <sijben at eemvalley.com>
> Subject: [Stackless] need help on debugging a deadlock
> To: stackless at stackless.com
> Message-ID: <49677254.1040804 at eemvalley.com>
> Content-Type: text/plain; charset=UTF-8
> I am creating a deadlock somewhere.
> Is there a way to ensure that a tasklet does not receive more than a certain number of milliseconds and is then stopped with some message on the console so I can see which one it was?
> Or are there even better ways to find out what is going on?
Gee whiz, my first deadlock programme was in December 2005!
Does Stackless complain about deadlock, or is the programme simply hanging?
I feel the best way to find out what is going on with a deadlock is to understand how deadlock occurs. Deadlock occurs when four conditions hold simultaneously: mutual exclusion, no preemption, hold-and-wait, and circular wait.
Stackless in non-preemptive mode essentially gives you the first two conditions. I'd argue hold-and-wait looks like:
ch1.receive() # I am waiting on a resource
ch2.send() # I am holding a resource
However, circular wait is harder to detect, because waiting is transitive. A wait-for graph is needed. I believe this is the way it works.
Tasklet channel operations are nodes. Channels are resources. If a tasklet does an operation on a channel owned by a particular node, draw a directed edge to that node. When the transaction is finished, remove the edge. If there is a cycle, there is deadlock.
In the following case, a line would be drawn from tasklet1 to tasklet2.
In turn, there is a line drawn from tasklet2 to tasklet1. Deadlock.
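A minimal sketch of the cycle check on such a wait-for graph, in plain Python. The function name find_cycle and the dictionary layout (tasklet name mapped to the set of tasklet names it is waiting on) are assumptions made for illustration; nothing here is part of the Stackless API.

```python
def find_cycle(wait_for):
    """Return a list of nodes that form a cycle, or None if there is none.

    wait_for is a hypothetical mapping: tasklet name -> set of tasklet
    names it is currently waiting on (the directed edges of the graph).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(wait_for.get(node, ())):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the current path, so the
                # path from nxt to here plus this edge closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in sorted(wait_for):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# tasklet1 waits on tasklet2 and tasklet2 waits on tasklet1.
edges = {"tasklet1": {"tasklet2"}, "tasklet2": {"tasklet1"}}
cycle = find_cycle(edges)
```

Removing an edge when a transaction finishes amounts to discarding the entry from the corresponding set before the next check.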
As for debugging, I often find the following helpful:
1. Create a function, say makeChannel(). As part of the function, store the channel in some list.
2. In my main loop, I use the construct
while (stackless.getruncount() > 1):
(I guess stackless.run() would work too)
Depending on your programme, if all the tasklets are blocked (i.e., due to deadlock), control will fall out of the loop.
3. Iterate over the list and call __reduce__() on each channel. You will see the channel balances and the associated tasklets. This will give you insight into what is happening.
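Steps 1 and 3 might be sketched roughly as follows. The names channel_registry, make_channel and describe_channels are hypothetical, and the channel constructor is passed in as a parameter (an assumption made so the sketch does not depend on Stackless being installed); under Stackless you would pass stackless.channel.

```python
channel_registry = []  # hypothetical list of (label, channel) pairs


def make_channel(factory, label):
    """Create a channel with `factory` and remember it under `label`."""
    channel = factory()
    channel_registry.append((label, channel))
    return channel


def describe_channels(registry):
    """Return one line per channel: its balance and its reduced state."""
    lines = []
    for label, channel in registry:
        # Stackless channels expose a `balance` attribute; fall back to
        # None for objects that do not have one.
        balance = getattr(channel, "balance", None)
        lines.append("%s: balance=%r state=%r"
                     % (label, balance, channel.__reduce__()))
    return lines
```

Once control falls out of the scheduling loop, printing each line of describe_channels(channel_registry) shows which channels still have tasklets blocked on them. In Stackless a positive balance means senders are waiting and a negative balance means receivers are waiting.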
Looking back at the December 2005 posts, perhaps with decorators, we could get enough information to build a graph walker/cycle detector? In 2009, we should be much better at solving this :-|
Anyhow, if you can, post code.
P.S. Shameless plug: I explain deadlock in my PyCon 2008 talk "Adventures in Stackless Python Twisted Integration".