Hi list,<br><br>I have observed a slowdown when running NumPy under Stackless Python compared with CPython. The slowdown becomes significant as the problem size grows. I hope someone on this mailing list can give me some hints on this issue. My performance results and source code are included below.<br>
<br>My platform is a 12-core dual-socket SMP machine running Ubuntu (Linux kernel 3.2.0). On this platform, I have CPython 2.7.3, NumPy 1.6.1, and Stackless Python 2.7.2 installed. <div><br>To test which Python implementation runs NumPy faster, I wrote a short script that multiplies two square matrices with numpy.dot. The source code is as follows:<br>
<br><i>import sys<br>
import time<br>
<br>
import numpy<br>
<br>
if __name__ == "__main__":<br>
&nbsp;&nbsp;&nbsp;&nbsp;size = int(sys.argv[1])  # For simplicity, we only test square matrices<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;matrix_a = numpy.matrix(numpy.random.randn(size, size))<br>
&nbsp;&nbsp;&nbsp;&nbsp;matrix_b = numpy.matrix(numpy.random.randn(size, size))<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;# Time only the multiplication, not the matrix setup<br>
&nbsp;&nbsp;&nbsp;&nbsp;start_time = time.time()<br>
&nbsp;&nbsp;&nbsp;&nbsp;result = numpy.dot(matrix_a, matrix_b)<br>
&nbsp;&nbsp;&nbsp;&nbsp;print '%0.3f ms' % ((time.time() - start_time) * 1000.0)</i><br><div><i><br></i></div><div>I ran this code with both CPython and Stackless Python. Here are the performance numbers I got:</div>
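<div><br></div><table>
<tr><th>Problem size</th><th>CPython</th><th>Stackless</th><th>Slowdown</th></tr>
<tr><td>256x256</td><td>14.112 ms</td><td>40.003 ms</td><td>~2.8X</td></tr>
<tr><td>512x512</td><td>109.965 ms</td><td>353.292 ms</td><td>~3.2X</td></tr>
<tr><td>1024x1024</td><td>871.251 ms</td><td>8771.153 ms</td><td>~10X</td></tr>
<tr><td>2048x2048</td><td>8214.799 ms</td><td>86479.872 ms</td><td>~10.5X</td></tr>
<tr><td>4096x4096</td><td>69790.130 ms</td><td>822476.506 ms</td><td>~11.8X</td></tr>
</table><div><br></div><div>Any ideas? Thanks in advance.</div>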
<div><br></div><div>Bin</div>
</div>
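<div><br></div><div>P.S. The script above times a single call with time.time(). To rule out one-off noise, the call could be repeated and the fastest run kept. Below is a minimal sketch of such a helper. The names best_of_ms and repeats are made up for illustration and are not part of NumPy or Stackless, and the workload in the demo is a stand-in so the sketch runs without NumPy.</div>

```python
import time


def best_of_ms(func, repeats=5):
    """Call func() `repeats` times; return the fastest wall-clock time in ms."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = float("inf")
    for _ in range(repeats):
        start = time.time()
        func()
        elapsed_ms = (time.time() - start) * 1000.0
        # Keep the minimum: it is the run least disturbed by other activity.
        best = min(best, elapsed_ms)
    return best


if __name__ == "__main__":
    # Stand-in workload (hypothetical); replace with the call to be measured.
    print('%0.3f ms' % best_of_ms(lambda: sum(range(100000))))
```

<div>For the benchmark above, the call would presumably look like best_of_ms(lambda: numpy.dot(matrix_a, matrix_b)), run once under each interpreter.</div>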