tungwaiyip.info

Blog


Tokyo - City of Alleys

My impression of Tokyo, a metropolis of 13 million inhabitants, was of a bustling city of glamor, high energy and endless activity. When I took a Google Street View tour, I was surprised to find that Tokyo is predominantly a city of low-rises and narrow alleys.

Tokyo Street

I used Google Maps to drop into some random, mid-block locations in Tokyo. These places are not in the peripheral areas. They are fairly central and close to rail lines and shopping districts. I see low-rises crisscrossed by narrow alleys. Across the vast Tokyo metropolitan area is the repeated pattern of alleys and small houses.

2014.01.31 comments

 

Crazy Math Homework

I spent all night working on the Information Theory homework. This is getting crazy.

Information Theory homework

2014.01.30 comments

 

Python 3 is not better, I am moving back to Python 2

After Python 3.3 was released, I began to embrace Python 3 in a big way last year. I use it in my personal projects whenever possible, when I do not have to worry about compatibility with an existing code base. I use it as my default REPL every day.

While I am not an early adopter, I am still ahead of many other Python programmers in actually making production use of it. Unfortunately, my experience so far has not been favorable (note that I have not used unicode strings much). I ran into miscellaneous problems. At first I thought it was a learning curve I had to overcome. Slowly I came to realize that some design decisions are really problematic.

Recently there has been more discussion of the problems related to porting to Python 3, notably from Armin Ronacher of Flask. I want to add my experience to the discussion. Much has been written about the core issue of unicode string handling or the death of the binary string. My grievance is from a different area: the systematic replacement of list results by iterators.

Iterator Blues

Iterators and generators are some of the best features of Python. In many cases, a list is interchangeable with its corresponding iterator. When the result is large, an iterator is certainly more memory efficient. If you want to loop 100 million times in Python 2, you probably don't want to build a list of 100 million items using range, but rather use the iterator version, xrange.
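To make the memory argument concrete, here is a minimal sketch, assuming Python 3 on CPython, where range behaves like the old xrange. The variable names are only for illustration and the exact byte counts vary by platform.

```python
import sys

# In Python 3, range is a lazy sequence: no 100 million integers are allocated.
big = range(100_000_000)
lazy_size = sys.getsizeof(big)

# Materializing even one million items into a list costs far more memory.
small_list = list(range(1_000_000))
list_size = sys.getsizeof(small_list)

print(lazy_size, list_size)
```

The lazy object stays small no matter how long the range is, which is the case where an iterator clearly wins.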

Since iterators are so great, the argument goes, from now on Python 3 should only provide iterator results.

This turns out to be a huge annoyance for me.

For a start, when I run a function in the interactive console, I don't see the result anymore. I get an iterator object. No big deal, the Python 3 people say, you can render the result by wrapping it in the list function. Fine, it is an inconvenience, a minor inconvenience you may argue. But you just cannot spin it as a positive change. When I first started using Python, coming from a Java background, I was delighted to find how easy things are in Python. Why bother to build a chain of stream handlers to read a file, as in Java? The Pythonic way is to build the entire input in memory. 95% of the time the data is so small it does not matter. This was the Python magic, and it is beginning to be lost.

Using list would be a tolerable workaround if it were exceptional, if it were only needed in a small number of cases. But I have bumped into the wall so many times that I begin to think the reverse is closer to the truth.
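A small sketch of what the console shows, assuming Python 3 on CPython (the exact repr text, including the address, will differ from run to run):

```python
# In Python 3, map returns a lazy iterator, so its repr shows the object itself.
result = map(int, ["1", "2", "3"])
shown = repr(result)        # looks like "<map object at 0x...>"

# Wrapping in list() renders the values, but it also consumes the iterator.
values = list(result)
leftover = list(result)     # nothing is left the second time

print(shown, values, leftover)
```

Note the second call: inspecting the result with list uses it up, so the object cannot simply be looked at and then passed on.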

Other than wanting to see the result in the REPL, there is actually a long list of use cases for a list that cannot be conveniently expressed with an iterator, like

  • building nested data structures
  • processes that need the length of the collection
  • taking just one element, say the first, from the collection
  • passing data to the many libraries that anticipate a list rather than an iterator (like numpy)

For example, I want to parse a CSV-like input into a nested data structure.
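The length and single-element cases from the list above can be sketched like this, assuming Python 3. The squares name is only illustrative.

```python
from itertools import islice

squares = map(lambda x: x * x, [1, 2, 3, 4])

# len() is not defined on an iterator.
try:
    len(squares)
    has_len = True
except TypeError:
    has_len = False

# Indexing is not defined either.
try:
    squares[0]
    indexable = True
except TypeError:
    indexable = False

# next() and islice() do work, but each call consumes what it returns.
first = next(squares)
next_two = list(islice(squares, 2))
remaining = list(squares)

print(has_len, indexable, first, next_two, remaining)
```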

In [22]: INPUT = """\
   ....: 1,2
   ....: 3,4
   ....: """.splitlines()

In [24]: [map(int, line.split(',')) for line in INPUT]
Out[24]: [<builtins.map at 0x321f270>, <builtins.map at 0x321f090>]

Ouch, I got a list of iterators instead. It used to be easy in Python 2 like this.

In [2]: [map(int, line.split(',')) for line in INPUT]
Out[2]: [[1, 2], [3, 4]]

The problem is that you almost never want a nested data structure with iterators inside. When I accidentally build one, it usually causes a bug a few lines down. I have to dig hard into the data structure to find out what has gone wrong.
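For comparison, here are two ways to get the Python 2 result back in Python 3, as a minimal sketch using the same INPUT as above. The variable names are only for illustration.

```python
INPUT = """\
1,2
3,4
""".splitlines()

# Option 1: keep map, but wrap each inner result in list().
rows_with_list = [list(map(int, line.split(","))) for line in INPUT]

# Option 2: drop map and use a nested list comprehension.
rows_nested = [[int(field) for field in line.split(",")] for line in INPUT]

print(rows_with_list, rows_nested)
```

Both give the nested lists, at the cost of the extra wrapping discussed here.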

Trying to pull a value from a dictionary adds further insult. Sometimes I want to inspect a value in a dictionary. Which one does not matter, I just need one. With Python 2, it is d.items()[0]. It would be dumb to write a for loop in Python 3 to do this. As an experienced programmer, I know I can use next(). But this gives me an exception?! How about d.items().next()? Fail. How about d.items().__next__()? It fails too. I spent hours before I found out that in Python 3, d.keys() corresponds not to iterkeys() but to the unfamiliar viewkeys() of Python 2 (and likewise d.items() to viewitems()). To get any value, I have to first turn the view into an iterator; only then can I apply next. When you have to apply an extra function like list once, it is an inconvenience. When you have to do it twice or more, it becomes big clutter and a big annoyance.

Python 3 renders the map function nearly useless because of the extra list needed. In Python 2, we often have two alternatives to express a similar construct: the map function or a list comprehension. Usually I choose map when there is a function readily available, like int in my example above. But because of the extra clutter of list needed with map, the balance has tipped decisively toward list comprehensions in Python 3. I should be thankful, because they could have removed list comprehensions too and forced me to use generator expressions and list.

The bottom line is that this change is strictly feature removal. With Unicode, it is a necessary pain to go through, and we gain predictable unicode handling in the bargain. With iterators, there is no new feature to be gained. Existing code is broken for nothing. All the Python 3 people tell you is to just wrap your function with a list, no big deal.

Suffice it to say I am not convinced. To me this is torture.
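The dictionary case can be sketched as follows, assuming Python 3. The dictionary d and its contents are made up for illustration.

```python
d = {"a": 1, "b": 2}
items = d.items()    # a dict view, not an iterator and not a list

# next() on the view itself raises TypeError.
try:
    next(items)
    next_works = True
except TypeError:
    next_works = False

# The view has neither a Python 2 style next() nor a __next__() method.
has_next_method = hasattr(items, "next") or hasattr(items, "__next__")

# Two wrappers are needed: iter() to get an iterator, then next().
first_item = next(iter(items))

# Or build the whole list just to index it.
first_via_list = list(items)[0]

print(next_works, has_next_method, first_item, first_via_list)
```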

Feature Removal Pain

Just a few days ago I was bitten by another feature removal issue. The sort method used to have a cmp parameter that was removed in Python 3. Oh, it is dumb to use cmp anyway because the implementation using key is faster. Except in my case, I was working on a bioinformatics problem that required sorting all suffixes of a long string. With a string that is millions of characters long, generating millions of substrings quickly exhausts all memory. The trick is to use cmp to generate and compare the substrings on demand. This may be slower, but it works. Removing cmp not only causes inconvenience; the algorithm breaks with no easy workaround.
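For reference, Python 3's standard library does provide functools.cmp_to_key, which adapts a comparison function into a key. Below is a minimal sketch of on-demand suffix comparison built on it. This is not the original code from the bioinformatics problem: the suffix_array and compare names are hypothetical, and the character-by-character loop is an assumption chosen so that no substring is ever created.

```python
from functools import cmp_to_key

def suffix_array(text):
    """Return suffix start indices, ordered by the suffix each one denotes."""
    n = len(text)

    def compare(i, j):
        # Walk both suffixes in step without slicing, so no substring
        # is materialized in memory.
        while i < n and j < n:
            if text[i] != text[j]:
                return -1 if text[i] < text[j] else 1
            i += 1
            j += 1
        # One suffix is a prefix of the other: the shorter one sorts first.
        return (i < n) - (j < n)

    # cmp_to_key turns the old-style comparison function into a sort key.
    return sorted(range(n), key=cmp_to_key(compare))

order = suffix_array("banana")
print(order)
```

Only integer indices are sorted, so memory stays proportional to the length of the string. The pure Python comparison loop will be slow on long or highly repetitive input.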

I solved the problem by going back to Python 2.

Python 3 is the dead end

The official story line is that Python 2 is a dead end and Python 3 is the future. I begin to see it differently. Python 2 is actually alive and well. The development of the language and the interpreter has stalled, but innovation continues in third-party libraries and tools. For example, Pandas is big progress for Python in the data analysis space.

It is Python 3 we should worry about. I fear it will become a de facto dead end because of lack of adoption. Outside of my personal use, there is absolutely no proposal at my workplace about moving to Python 3. The two companies and hundreds of programmers I have worked with recently are cranking out Python 2 code every day, not Python 3. At various times I considered championing Python 3 at work. I am not considering this anymore for the near future.

Sorry for the critical opinion. I just wish to open up some honest discussion about the merits of Python 3.

2014.01.23 [] - comments

 

Family Snapshot (infrared)

Snapshot of us in infrared. Taken at the Exploratorium.

2014.01.13 comments

 

Homeless in San Francisco

The Chronicle ran another story on the homeless congregating around City Hall, a familiar story that has perhaps intensified, dashing the hope that the recent economic boom could alleviate the situation.

Photo: Michael Short, The Chronicle

The booming tech industry, which I am a part of, has been accused of not doing its part to solve social issues. Some people see tech workers as rich brats, aloof, disconnected from the general public, with little awareness of the city's social problems. I am not sure there is much factual basis for such a stereotype. I may be biased. My guess is that, compared to other mainstream businesses, say insurance companies, commercial contractors, law firms, or fashion retailers, tech companies are probably doing more work, not less, to directly address some of these social issues.

At my last company, we organized a monthly volunteer event at a homeless shelter. I participated one month. We helped out in the cafeteria. The shelter housed three hundred residents, who are served a meal each day. With just a handful of cooks and workers, our help in the cafeteria was very welcome. A few of my colleagues were at the counter filling and handing out food trays. I was a general helper on the floor. For people with physical difficulties, I brought food to their tables so that they did not have to get in line. Other times I cleaned tables to make them ready for others. After dining hours, we stayed behind with the cooks to clean up. We stacked the tables and chairs by the wall and mopped the floor. The place was clean and tidy, ready for the next day. I took pride in the work.

Now that I have read the Chronicle article, I have thought more about that experience. It strikes me that something is wrong there. Why don't the residents volunteer to help themselves? We work hard at our day jobs, take care of our own chores at home, and still find time to help others. Why don't the residents, who get free food and free housing, work to help themselves? It is true that some of them are old or have physical problems. They should be excused. But the other three quarters are able-bodied. The most difficult population, those who have mental illness or drug addiction, are not in the shelter anyway. Not that I am not willing to help. I just wonder what stops them from working for themselves. Wouldn't it be great if they could help themselves? Wouldn't it be great if they could help others too?

I agree it would be fairly extraordinary for a shelter resident to be motivated enough to go and help others. But perhaps this is what they really need to get themselves out of long-term dependency. More than food and shelter, some coaching and some extra motivation are perhaps what they need the most.

2014.01.12 comments

 


Site feed Updated: 2017-May-26 12:00