tungwaiyip.info

home

about me

links

Blog

< December 2013 >
SuMoTuWeThFrSa
1 2 3 4 5 6 7
8 91011121314
15161718192021
22232425262728
293031    

past articles »

Click for San Francisco, California Forecast

San Francisco, USA

 

Cython for the win, 177x speed increase!

Putting Cython into use, I have great success in speeding up a computation algorithm. Previously I have great success of getting 10x speed increase just by swapping in pypy. This time, with a little bit of work, I get 177x speed improvement using Cython. This vastly exceed my expectation.

The code is to solve a string alignment problem in bioinformatics. With a string s and t, the algorithm has complexity of O(|s||t|). The first step of writing Cython is to identify the performance critical region of code. In this case it is quite obvious the bottleneck is in the inner loop with complexity of O(n^2). No profiling is needed. The origin Python code inner loop is like below. Note that M and B are numpy arrays.

def overlap_alignment_inner(s, t, sigma, M, B):
  for i in range(1,M.shape[0]):
    for j in range(1,M.shape[1]):
      match_score = M[i-1, j-1] + (1 if (s[i-1] == t[j-1]) else -2)
      M[i,j], B[i,j] = max([
                             (M[i-1, j] - sigma, (i-1, j  )),
                             (M[i, j-1] - sigma, (i  , j-1)),
                             (match_score,       (i-1, j-1)),
                           ])

After a few iterations, I have arrived with the optimized code below. It looks somewhat different from the first glance. But I would walk through the chances I have made. I have taken a lot of clues from the tutorial Working with NumPy.

import numpy as np
cimport numpy as np
DTYPE = np.int
ctypedef np.int_t DTYPE_t

def overlap_alignment_inner(s, t, int sigma, np.ndarray[DTYPE_t, ndim=2] M, np.ndarray[DTYPE_t, ndim=3] B):
    cdef int i, j
    cdef int s10, s01, s11
    for i in range(1,M.shape[0]):
        for j in range(1,M.shape[1]):
            s01 = M[i, j-1] - sigma
            s10 = M[i-1, j] - sigma
            s11 = M[i-1, j-1] + (1 if (s[i-1] == t[j-1]) else -2)
            if s11 >= s01 and s11 >= s10:
                M[i,j] = s11
                B[i,j,0] = i-1
                B[i,j,1] = j-1
            elif s01 >= s10 and s01 >= s11:
                M[i,j] = s01
                B[i,j,0] = i
                B[i,j,1] = j-1
            else:
                M[i,j] = s10
                B[i,j,0] = i-1
                B[i,j,1] = j

The first thing to do is to add type to certain variable for early binding. The simple int declaration below increase speed by about 10%.

cdef int i, j

The next thing to do is to type numpy ndarray objects like below. This allow faster indexing compares to normal Python operations. This speed things up a few times.

... np.ndarray[DTYPE_t, ndim=2] M, np.ndarray[DTYPE_t, ndim=3] B ...

The most significant problem turn out to be the use of the max function. In the original code it is a concise and stylish way to pick the best score and simultaneously assign two values to M and B. But this keep the statement as a costly Python function call. By unwinding the function into if statements, it allow the code to be fully optimized and attained the 177x speed improvement! This brings the performance to the league of C.

The unwind code, while longer, is actually quite straight forward. So I backported it to the pure Python code. This also result in a 3x improvement! Turns out use of max here is rather costly.

My first use of Cython is very successful. By identifying and optimizing only a few lines of critical code, it dramatically speed up the performance to the level of C. The rest of the code are not performance critical. They remain easy to write and debug using Python.

2013.12.19 [] - comments

 

 

blog comments powered by Disqus

past articles »

 

BBC News

 

Trump and Putin held another, undisclosed meeting at G20 (19 Jul 2017)

 

Justine Damond: Australian PM calls shooting 'inexplicable' (19 Jul 2017)

 

Let Obamacare fail - Trump's new plan (18 Jul 2017)

 

First child to undergo double hand transplant can now do this (18 Jul 2017)

 

Despacito breaks global streaming record (19 Jul 2017)

 

Doris Payne, 86-year-old jewel thief, arrested again (19 Jul 2017)

 

How did dogs become our best friends? New evidence (18 Jul 2017)

 

Briton and Italian die in beach rescue near Brindisi (18 Jul 2017)

 

Is WhatsApp being censored in China? (19 Jul 2017)

 

Antonio Conte: Chelsea manager signs improved two-year deal at champions (18 Jul 2017)

more »

 

SF Gate

 

Bay Area News (7 Jan 2012)

 

City Insider (11 Feb 2012)

 

Crime Scene (13 Feb 2012)

 

C.W Newius Column (10 Jan 2012)

 

C.W. Nevius Blog (11 Feb 2012)

 

Education News (10 Jan 2012)

 

KALW (11 Feb 2012)

 

Matier and Ross Blog (11 Feb 2012)

 

Cap-and-trade survives razor-thin votes in California Legislature (18 Jul 2017)

 

Ann Coulter battles Delta; Gillette goofs; Titleist sues (17 Jul 2017)

 

Business News Roundup, July 18 (17 Jul 2017)

 

Kitchen gadget maker Juicero feels squeeze, cuts 25% of staff (17 Jul 2017)

 

Judge limits Google pay, job data in Labor Department audit (17 Jul 2017)

 

Penguin pulls book by Doerr, tech CEO amid Duggan harassment suit (17 Jul 2017)

more »

 


Site feed Updated: 2017-Jul-19 00:00