[Stackless] Connecting To A Database

Jeff Senn senn at maya.com
Sat Jun 9 02:09:18 CEST 2007


On Jun 8, 2007, at 6:20 PM, Christopher Armstrong wrote:

> No, adbapi does use threads. That's the only way to take advantage of
> the existing blocking database client libraries.
>
> It would be nice to do it without threads, but nobody's written decent
> asynchronous implementations of the database protocols. (There's one
> in the postgres client library, but I haven't seen anything good come
> of it).

As a general statement, anyone considering this whole topic (using a
database with some sort of scheduled computation engine) should
think carefully about it.

In terms of performance, there is no "magic" here. There is only one way
you are going to complete a task in less wall-clock time than the most
easy-to-understand, synchronous, single-threaded process would take:
you have to overlap I/O with CPU.  That is: you have to get some CPU
work done during a DMA request (or other asynchronous request to a
device with another CPU).

[Actually, I lied: you can also have multiple CPUs, but because of the
Python GIL you should just consider that a special case of the above.
That is: with the current implementation of CPython, multiple CPUs only
help with I/O or possibly with the execution of non-Python code.]

So... generally you are not going to get this "pure performance"
benefit unless you use another thread (other than the one that is
running Python), or your OS supports some kind of asynchronous I/O
directly and you have Python support for it (unlikely).
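
To make the overlap concrete, here is a minimal sketch using only the
standard library. Everything in it is an assumption for illustration:
fake_query is a hypothetical stand-in for a blocking database call (it
just sleeps, which releases the GIL much as a blocking read inside a C
client library typically would), and cpu_work is an arbitrary
pure-Python computation. Done one after the other, the two steps cost
roughly t_io + t_cpu; overlapped, they cost roughly max(t_io, t_cpu).

```python
import threading
import time


def fake_query(delay, out):
    # Hypothetical stand-in for a blocking database call.
    # time.sleep releases the GIL, so the main thread keeps running.
    time.sleep(delay)
    out["rows"] = [("alice", 1), ("bob", 2)]


def cpu_work(n):
    # Pure-Python computation done in the main thread in the meantime.
    return sum(i * i for i in range(n))


def overlapped(delay=0.2, n=200_000):
    # Start the blocking call in a worker thread, compute while it
    # waits, then join to collect the result.
    out = {}
    worker = threading.Thread(target=fake_query, args=(delay, out))
    start = time.perf_counter()
    worker.start()
    total = cpu_work(n)
    worker.join()
    elapsed = time.perf_counter() - start
    return out["rows"], total, elapsed


if __name__ == "__main__":
    rows, total, elapsed = overlapped()
    print(rows, total, "%.3fs" % elapsed)
```

Note that the gain comes only from the waiting: if fake_query were
itself pure-Python computation, the GIL would serialize the two threads
and nothing would be saved.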

So (for whoever said they don't want to use another thread at all) I
hope you have a different reason to do asynchronous scheduling.  There
*are* other reasons (and what your goals are affects which particular
methods you might want to use).

The two main other reasons are:

-- you might want to chop up a task (or execute many *independent*
    tasks) in a way that either reports back progress or is somehow
    "fair scheduling".  There is, of course, some cost to this
    (scheduling overhead), but it can usually be kept small.

-- you have many *interdependent* tasks and it's too hard to think about
    the dependencies (or they are somehow dynamic) to express them
    directly in your code.  That is: it is easier to write your code in
    "seemingly independent" chunks that are assembled and executed
    dynamically by some sort of scheduler.  This often results in
    problems with deadlocks (which might require significant analysis),
    and you are almost certainly accepting *worse* performance
    (scheduler overhead) in exchange for simpler code.
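
As an illustration of the first reason (chopping up independent tasks
for progress reports and fairness), here is a minimal sketch of a
round-robin scheduler. It uses plain Python generators rather than
Stackless tasklets so that it runs on any Python 3; the names
chunked_sum and run_round_robin are made up for this example. Each task
does one chunk of work, yields a progress report, and goes to the back
of the queue.

```python
from collections import deque


def chunked_sum(name, numbers, chunk_size):
    # A task written as a generator: do one chunk, report progress,
    # and hand control back to the scheduler.
    total = 0
    for start in range(0, len(numbers), chunk_size):
        total += sum(numbers[start:start + chunk_size])
        done = min(start + chunk_size, len(numbers))
        yield (name, done, len(numbers))
    return total


def run_round_robin(tasks):
    # Give every task one step per turn until all have finished.
    queue = deque(tasks.items())
    progress, results = [], {}
    while queue:
        name, task = queue.popleft()
        try:
            progress.append(next(task))
        except StopIteration as stop:
            # The generator's return value is the task's result.
            results[name] = stop.value
        else:
            queue.append((name, task))
    return progress, results


if __name__ == "__main__":
    tasks = {
        "a": chunked_sum("a", list(range(10)), 5),
        "b": chunked_sum("b", list(range(4)), 2),
    }
    progress, results = run_round_robin(tasks)
    print(progress)
    print(results)
```

The scheduling overhead here is one deque operation and one generator
resume per chunk, so the chunk size is the knob that trades fairness
against overhead. Nothing in this sketch overlaps I/O with CPU; it only
interleaves the work.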

Anyway... my point is, it may help to discuss what your goals are more
specifically than "use a database with stackless python".

I can imagine many situations where "you should try Twisted" is great
advice, and also many others where only a deep re-architecting of how
relational database engines operate is going to help...

-Jas

_______________________________________________
Stackless mailing list
Stackless at stackless.com
http://www.stackless.com/mailman/listinfo/stackless


