tungwaiyip.info

 


 

ctypes performance benchmark

I have done some performance benchmarking of Python's ctypes library. I am planning to use ctypes as an alternative to writing a C extension module for performance enhancement. My use case is therefore slightly different from the typical one of accessing existing third-party C libraries: here I am both the user and the implementer of the C library.

To determine the right granularity for switching between Python and C, I mainly want to measure the function call overhead. The test functions are therefore trivial, such as returning the first character of a string. I compare a pure Python function, a C extension module function, and a ctypes function. The tests were run under Python 2.6 on Windows XP with a 2.33 GHz Intel Core Duo.

First, I compare ways to get the first character of a string. The most basic case is to reference it as the 0th element of a sequence without calling any function. This produces the fastest result at 0.0659 usec per loop.

  $ timeit "'abc'[0]"

  10000000 loops, best of 3: 0.0659 usec per loop

As soon as I build a function around it, the cost goes up substantially. The pure Python function and the C extension function show similar performance at around 0.5 usec. The ctypes function takes about 2.5 times as long as the C extension function, at 1.37 usec.

  $ timeit -s "f=lambda s: s[0]"  "f('abc')"

  1000000 loops, best of 3: 0.506 usec per loop

  $ timeit -s "import mylib" "mylib.py_first('abc')"

  1000000 loops, best of 3: 0.545 usec per loop

  $ timeit -s "import ctypes; dll = ctypes.CDLL('mylib.pyd')"
              "dll.first('abc')"

  1000000 loops, best of 3: 1.37 usec per loop
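The command-line measurements above can also be taken from a script with the standard `timeit` module. The following is a minimal sketch, assuming Python 3; the helper name `usec_per_loop` is made up for illustration, and only the two pure Python cases are covered because `mylib` is not available here. Note that a modern CPython may fold the constant expression `'abc'[0]` at compile time, so the first figure can be closer to bare loop overhead than it was under Python 2.6.

```python
import timeit


def usec_per_loop(stmt, setup="pass", number=100000, repeat=3):
    """Return the best-of-`repeat` time for one execution of `stmt`, in usec."""
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    # Each total covers `number` executions; the minimum is the least noisy.
    return min(totals) / number * 1e6


# Index the string directly, with no function call involved.
inline = usec_per_loop("'abc'[0]")

# Wrap the same indexing in a Python function.
wrapped = usec_per_loop("f('abc')", setup="f = lambda s: s[0]")

print("inline index: %.4f usec per loop" % inline)
print("lambda call:  %.4f usec per loop" % wrapped)
```

Taking the minimum of several repeats mirrors the "best of 3" that the `timeit` command line reports.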

I repeated the test with a long string (1 MB). There is not much difference in performance, so I can be quite confident that the parameter is passed by reference (to the internal buffer) rather than copied.

  $ timeit -s "f=lambda s: s[0]; lstr='abcde'*200000"
              "f(lstr)"

  1000000 loops, best of 3: 0.465 usec per loop

  $ timeit -s "import mylib; lstr='abcde'*200000"
              "mylib.py_first(lstr)"

  1000000 loops, best of 3: 0.539 usec per loop

  $ timeit -s "import ctypes; dll = ctypes.CDLL('mylib.pyd')"
           -s "lstr='abcde'*200000"
              "dll.first(lstr)"

  1000000 loops, best of 3: 1.4 usec per loop
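The pass-by-reference conclusion can also be checked without timing anything. The sketch below assumes Python 3 on CPython, where byte strings (`bytes`) take the place of the Python 2 `str` used in the tests, and relies on CPython implementation behaviour rather than a documented guarantee. It builds two `c_char_p` values from the same 1 MB object, keeps both alive, and compares the addresses that would be handed to C.

```python
import ctypes

big = b"abcde" * 200000  # about 1 MB, the same size as in the test above

# Two separate ctypes pointers built from the same Python object.
p1 = ctypes.c_char_p(big)
p2 = ctypes.c_char_p(big)

# Read the raw addresses that a C function would receive.
addr1 = ctypes.cast(p1, ctypes.c_void_p).value
addr2 = ctypes.cast(p2, ctypes.c_void_p).value

# If each conversion copied the data, two live copies would need two addresses.
same_buffer = (addr1 == addr2)

# Read one byte back through the address, as a C `first` function might.
first = ctypes.string_at(addr1, 1)

print(same_buffer, first)
```

Identical addresses for two live pointers are consistent with the timing result: the cost of the call does not depend on the length of the string.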

Next I made some attempts to speed up ctypes. A measurable improvement can be attained by eliminating the attribute lookup for the function. Curiously, the same change shows no improvement for the C extension.

  $ timeit -s "import ctypes; dll = ctypes.CDLL('mylib.pyd')"
           -s "f=dll.first"
              "f('abcde')"

  1000000 loops, best of 3: 1.18 usec per loop
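The same comparison can be scripted as follows. This is only a sketch under stated assumptions: it targets Python 3, and because `mylib.pyd` is not available it uses `strlen` from the C runtime as a stand-in for `first`. The helper names `load_c_runtime` and `best_usec` are invented for the example, and the numbers it prints will differ from the ones above.

```python
import ctypes
import sys
import timeit


def load_c_runtime():
    """Load the C runtime library as a stand-in for mylib.pyd (assumption)."""
    if sys.platform.startswith("win"):
        return ctypes.CDLL("msvcrt")
    return ctypes.CDLL(None)  # POSIX: symbols already loaded in the process


def best_usec(stmt, names, number=100000, repeat=3):
    """Best-of-`repeat` time for one execution of `stmt`, in usec."""
    totals = timeit.repeat(stmt, globals=names, number=number, repeat=repeat)
    return min(totals) / number * 1e6


libc = load_c_runtime()
strlen = libc.strlen  # look the function up once, outside the timed loop

names = {"libc": libc, "strlen": strlen}
via_attribute = best_usec("libc.strlen(b'abcde')", names)
via_local = best_usec("strlen(b'abcde')", names)

print("attribute lookup each call: %.3f usec" % via_attribute)
print("pre-bound function:         %.3f usec" % via_local)
```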

Second, I tried specifying the ctypes function prototype. This actually decreases performance significantly.

  $ timeit -s "import ctypes; dll = ctypes.CDLL('mylib.pyd')"
           -s "f=dll.first"
           -s "f.argtypes=[ctypes.c_char_p]"
           -s "f.restype=ctypes.c_int"
              "f('abcde')"

  1000000 loops, best of 3: 1.57 usec per loop
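For context on what the prototype buys in exchange for the extra time, here is a small sketch, again assuming Python 3 and using the C runtime's `strlen` as a stand-in for `first` (the loader function is hypothetical). With `argtypes` set, ctypes checks each argument before the call and raises `ctypes.ArgumentError` for a mismatched type instead of passing it to C unchecked.

```python
import ctypes
import sys


def load_c_runtime():
    """Load the C runtime library as a stand-in for mylib.pyd (assumption)."""
    if sys.platform.startswith("win"):
        return ctypes.CDLL("msvcrt")
    return ctypes.CDLL(None)


strlen = load_c_runtime().strlen

# Declare the prototype: size_t strlen(const char *s)
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"abcde")

# An int is not acceptable for a c_char_p parameter, so ctypes refuses the call.
try:
    strlen(12345)
    rejected = False
except ctypes.ArgumentError:
    rejected = True

print(length, rejected)
```

Whether this checking is worth the slowdown measured above depends on how fine-grained the calls are.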

Finally, I tested passing multiple parameters into the function. One of the parameters is passed by reference in order to return a value. Performance decreases as the number of parameters increases.

  $ timeit -s "charAt = lambda s, size, pos: s[pos]"
           -s "s='this is a test'"
              "charAt(s, len(s), 1)"

  1000000 loops, best of 3: 0.758 usec per loop

  $ timeit -s "import mylib; s='this is a test'"
              "mylib.py_charAt(s, len(s), 1)"

  1000000 loops, best of 3: 0.929 usec per loop

  $ timeit -s "import ctypes"
           -s "dll = ctypes.CDLL('mylib.pyd')"
           -s "s='this is a test'"
           -s "ch = ctypes.c_char()"
              "dll.charAt(s, len(s), 1, ctypes.byref(ch))"

  100000 loops, best of 3: 2.5 usec per loop
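The `ctypes.byref` pattern used for the output parameter can be tried without `mylib`. The sketch below assumes Python 3 and substitutes the standard C function `strtol`, which also returns a second result through a pointer argument; it is not the `charAt` function from the test, and the loader function is hypothetical.

```python
import ctypes
import sys


def load_c_runtime():
    """Load the C runtime library as a stand-in for mylib.pyd (assumption)."""
    if sys.platform.startswith("win"):
        return ctypes.CDLL("msvcrt")
    return ctypes.CDLL(None)


strtol = load_c_runtime().strtol

# long strtol(const char *str, char **endptr, int base)
strtol.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p), ctypes.c_int]
strtol.restype = ctypes.c_long

text = b"42abc"          # keep a reference: `end` will point into this buffer
end = ctypes.c_char_p()  # output parameter, filled in by the C function

value = strtol(text, ctypes.byref(end), 10)
rest = end.value         # the unparsed remainder, read back through the pointer

print(value, rest)
```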

One coding style that improves the performance somewhat is to build a C struct to hold all the parameters.

  $ timeit -s "from test_mylib import dll, charAt_param"
           -s "s='this is a test'"
           -s "obj = charAt_param(s=s, size=len(s), pos=3, ch='')"
              "dll.charAt_struct(obj)"

  1000000 loops, best of 3: 1.71 usec per loop

This may work because most of the fields in the charAt_param struct are invariant in the loop. Keeping them in the same struct object saves them from being rebuilt on each call.
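The definition of `charAt_param` is not shown in this post, so the following is only a guess at its shape, written for Python 3. The class name `CharAtParam`, the field types, and the pure Python function `char_at_struct` (standing in for the C function `charAt_struct`) are all assumptions made for illustration. The point of the sketch is that the struct is built once and only the field that changes is updated between calls.

```python
import ctypes


class CharAtParam(ctypes.Structure):
    """Hypothetical layout for the parameter struct; field types are guesses."""
    _fields_ = [
        ("s", ctypes.c_char_p),   # input string
        ("size", ctypes.c_int),   # length of the string
        ("pos", ctypes.c_int),    # index to read
        ("ch", ctypes.c_char),    # output: the character found
    ]


def char_at_struct(param):
    """Pure Python stand-in for the C function; writes the result into param.ch."""
    if 0 <= param.pos < param.size:
        param.ch = param.s[param.pos:param.pos + 1]
        return 0
    return -1


text = b"this is a test"
obj = CharAtParam(s=text, size=len(text), pos=3)  # built once, outside the loop

results = []
for pos in (0, 3, 5):
    obj.pos = pos          # only the varying field is touched per call
    char_at_struct(obj)
    results.append(obj.ch)

print(results)
```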

My overall observation is that a ctypes function call has 2 to 3 times the overhead of a similar C extension function call. This may become a limiting factor if the function calls are fine-grained. Using ctypes for performance enhancement is a lot more productive if the interface can be made medium- or coarse-grained.
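A rough way to see why granularity matters is to ask what share of each call is spent on overhead. The sketch below treats the 1.37 usec measured for the trivial ctypes call as pure call overhead, which is an approximation, and uses made-up amounts of useful work per call; the function name `overhead_share` is invented for the example.

```python
def overhead_share(call_overhead_usec, work_usec):
    """Fraction of one call's total time that is spent on call overhead."""
    return call_overhead_usec / (call_overhead_usec + work_usec)


CTYPES_CALL_USEC = 1.37  # measured above for a trivial ctypes call

# Hypothetical amounts of real work done inside the C function per call.
for work_usec in (0.1, 1.0, 10.0, 100.0):
    share = overhead_share(CTYPES_CALL_USEC, work_usec)
    print("work %6.1f usec -> overhead is %5.1f%% of the call"
          % (work_usec, share * 100))
```

Under these assumptions the overhead dominates when a call does well under a microsecond of work and becomes a small fraction once a call does tens of microseconds of work.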

A snapshot of the source code used for testing is available for download. It is also useful if you want boilerplate for building your own ctypes library.

2009.07.16

 

 

Comments (3)

Could you give some more rationale about why you wouldn't use other approaches like cython, SWIG, psyco etc.?

Posted by Paul at Mon Dec 21 20:44:54 2009



psyco is easy to use and gives an instant performance boost. However, in my experience the performance gain is typically around 2x. This falls far short of what we can achieve with C.

cython, pyrex etc. are good candidates to explore. I just haven't got around to learning their syntax yet.

Posted by Wai Yip Tung at Tue Dec 22 07:15:08 2009



Interesting article - you examine an important topic. I agree with the conclusion - indeed calls shouldn't be too fine-grained.

Posted by Eli at Tue Dec 22 20:04:02 2009


