tungwaiyip.info


Blog


 

Tokyo - City of Alleys

My impression of Tokyo, a metropolis of 13 million inhabitants, was of a bustling city of glamor, high energy and endless activity. When I took a Google Street View tour, I was surprised to find that Tokyo is predominantly a city of low-rises and narrow alleys.

Tokyo Street

I used Google Maps to drop into some random mid-block locations in Tokyo. These places are not in the peripheral areas. They are fairly central and close to rail lines and shopping districts. I saw low-rises crisscrossed by narrow alleys. Across the vast Tokyo metropolitan area, the pattern of alleys and small houses repeats.

2014.01.31 comments

 

Crazy Math Homework

I spent all night working on the Information Theory homework. This is getting crazy.

Information Theory homework

2014.01.30 comments

 

Python 3 is not better, I am moving back to Python 2

After Python 3.3 was released, I began to embrace Python 3 in a big way last year. I use it in my personal projects whenever possible, when I do not have to worry about compatibility with an existing code base. I use it as my default REPL every day.

While I am not an early adopter, I am still ahead of many other Python programmers in actually making production use of it. Unfortunately, my experience so far has not been favorable (note that I have not used unicode strings much). I ran into miscellaneous problems. At first I thought it was a learning curve I had to overcome. Slowly I came to realize that some design decisions are really problematic.

Recently there has been more discussion of the problems of porting to Python 3, notably from Armin Ronacher of Flask. I want to add my experience to the discussion. Much has been written about the core issue of unicode string handling and the death of the binary string. My grievance comes from a different area: the systematic replacement of list results by iterators.

Iterator Blues

Iterators and generators are some of the best features of Python. In many cases, a list is interchangeable with its corresponding iterator. When the result is large, an iterator is certainly more memory efficient. If you want to loop 100 million times, you probably don't want to build a list of 100 million items using range, but rather use the lazy version, xrange.

Since iterators are so great, the argument goes, from now on Python 3 should only provide iterator results.

This turns out to be a huge annoyance for me.

For a start, when I run a function in the interactive console, I don't see the result anymore. I get an iterator object. No big deal, the Python 3 people say, you can render the result by wrapping it in the list function. Fine, it is an inconvenience, a minor inconvenience you may argue. But you just cannot spin it as a positive change. When I first started using Python, coming from a Java background, I was delighted to find how easy things are in Python. Why bother building a chain of stream handlers to read a file, as in Java? The Pythonic way is to read the entire input into memory. 95% of the time the data is so small it does not matter. This was the Python magic, and it is beginning to be lost.

Using list would be a tolerable workaround if it were exceptional, if it were only needed in a small number of cases. But I have bumped into the wall so many times I begin to think the reverse is closer to the truth.

Other than wanting to see the result in the REPL, there is actually a long list of use cases for a list that cannot be conveniently expressed with an iterator, like

  • building nested data structures
  • processes that need the length of the collection
  • taking just one element, say the first, from the collection
  • passing data to the many libraries that anticipate a list rather than an iterator (like numpy)

For example, I want to parse a CSV-like input into a nested data structure.

In [22]: INPUT = """\
   ....: 1,2
   ....: 3,4
   ....: """.splitlines()

In [24]: [map(int, line.split(',')) for line in INPUT]
Out[24]: [<builtins.map at 0x321f270>, <builtins.map at 0x321f090>]

Ouch, I got a list of iterators instead. It used to be this easy in Python 2.

In [2]: [map(int, line.split(',')) for line in INPUT]
Out[2]: [[1, 2], [3, 4]]

The problem is that you almost never want a nested data structure with iterators inside. When I accidentally build one, it usually causes a bug a few lines down. I have to dig hard into the data structure to find out what has gone wrong.
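A minimal sketch of the two usual Python 3 spellings of this parse, assuming the same INPUT as above; the variable names are made up for illustration. It also shows why a stray map object tends to cause a bug a few lines later: the iterator is empty after its first pass.

```python
# Python 3: map() returns a lazy iterator, so each row has to be
# materialized explicitly to get the nested lists Python 2 produced.
INPUT = """\
1,2
3,4
""".splitlines()

# Option 1: wrap each map() call in list().
rows_via_map = [list(map(int, line.split(','))) for line in INPUT]

# Option 2: a nested list comprehension, no map() at all.
rows_via_comprehension = [[int(field) for field in line.split(',')]
                          for line in INPUT]

# The hidden trap: a map object can only be consumed once.
lazy_row = map(int, "1,2".split(','))
first_pass = list(lazy_row)    # consumes the iterator
second_pass = list(lazy_row)   # nothing left to consume

print(rows_via_map)
print(first_pass, second_pass)
```

Both options build the whole structure in memory, which is the Python 2 behavior the example relied on.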

Trying to pull a value from a dictionary adds further insult. Sometimes I want to inspect a value in a dictionary. Which one does not matter, I just need one. With Python 2, it is d.items()[0]. It would be dumb to write a for loop in Python 3 to do this. As an experienced programmer, I know I can use next(). But this gives me an exception?! How about d.items().next()? Fail. How about d.items().__next__()? It fails too. I spent hours before I found out that in Python 3, d.keys() corresponds not to iterkeys() but to the unfamiliar viewkeys() of Python 2. To get any value, I first have to turn the view into an iterator; only then can I apply next. When you have to apply an extra function like list once, it is an inconvenience. When you have to do it twice or more, it becomes big clutter and a big annoyance.
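A small sketch of what does and does not work in Python 3, assuming a made-up dictionary d. Which entry counts as "first" depends on the Python version: insertion order is only guaranteed from Python 3.7 onward, so treat the result as "some item" on older interpreters.

```python
d = {"a": 1, "b": 2}   # hypothetical example data

# A dictionary view is iterable, but it is not itself an iterator,
# so calling next() on it directly raises TypeError.
try:
    next(d.items())
    next_on_view_failed = False
except TypeError:
    next_on_view_failed = True

# Turn the view into an iterator first, then take one element.
some_item = next(iter(d.items()))
some_key = next(iter(d))            # iterating a dict yields its keys
some_value = next(iter(d.values()))

# The closest spelling to Python 2's d.items()[0]; it copies every item.
some_item_via_list = list(d.items())[0]

print(some_item, some_key, some_value)
```

The next(iter(...)) form avoids copying the whole dictionary, at the cost of the double wrapping the paragraph complains about.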

Python 3 renders the map function nearly useless because of the extra list needed. In Python 2, we often have two alternatives for expressing a similar construct: the map function or a list comprehension. Usually I choose map when there is a function readily available, like int in my example above. But because of the extra clutter of list needed with map, the balance has tipped decisively toward list comprehensions in Python 3. I should be thankful, because they could have removed list comprehensions too and forced me to use generator expressions and list.

The bottom line is that this change is strictly feature removal. With Unicode, it is a necessary pain to go through, and we gain predictable unicode handling in the bargain. With iterators, there is no new feature to be gained. Existing code is broken for nothing. All the Python 3 people tell you is to just wrap your function call in a list, no big deal.

Suffice it to say I am not convinced. To me this is torture.

Feature Removal Pain

Just a few days ago I was bitten by another feature removal issue. The sort method used to have a cmp parameter that has been removed in Python 3. Oh, it is dumb to use cmp anyway, because the implementation using key is faster. Except that in my case, I was working on a bioinformatics problem that required sorting all suffixes of a long string. With a string that is millions of characters long, generating millions of substrings quickly exhausts all memory. The trick is to use cmp to generate and compare the substrings on demand. This may be slower, but it works. Removing cmp not only causes inconvenience, the algorithm breaks with no easy workaround.
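For readers stuck on the same problem, the standard library's functools.cmp_to_key can adapt an old-style comparison function to the key parameter in Python 3. The following is only a sketch under that assumption, not the original code: suffix_array and compare are hypothetical names, and the comparison walks the string by index rather than slicing, so no substrings are stored. It is likely to be slow on very long inputs.

```python
from functools import cmp_to_key


def suffix_array(text):
    """Return the start indices of all suffixes of text, in sorted order."""
    n = len(text)

    def compare(i, j):
        # Compare the suffixes starting at i and j character by character,
        # without ever building the substrings themselves.
        while i < n and j < n:
            if text[i] != text[j]:
                return -1 if text[i] < text[j] else 1
            i += 1
            j += 1
        # One suffix ran out first: it is a prefix of the other, so it
        # sorts earlier. Both run out together only when i == j.
        return (i < n) - (j < n)

    # Sort the indices, not the suffixes; cmp_to_key wraps compare so it
    # can be passed where Python 3 expects a key function.
    return sorted(range(n), key=cmp_to_key(compare))


print(suffix_array("banana"))
```

Only a list of integer indices is held in memory, which is the property the cmp-based trick was after.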

I solved the problem by going back to Python 2.

Python 3 is the dead end

The official story line is that Python 2 is a dead end and Python 3 is the future. I begin to see it differently. Python 2 is actually alive and well. The development of the language and the interpreter has stalled, but innovation continues in third-party libraries and tools. For example, Pandas is big progress for Python in the data analysis space.

It is Python 3 we should worry about. I fear it will become a de facto dead end because of lack of adoption. Outside of my personal use, there is absolutely no proposal at my workplace about moving to Python 3. The two companies and hundreds of programmers I have worked with recently are cranking out Python 2 code every day, not Python 3. At various times I considered championing Python 3 at work. I am not considering this anymore for the near future.

Sorry for the critical opinion. I just wish to open up some honest discussion about the merits of Python 3.

2014.01.23 comments

 

Family Snapshot (infrared)

Snapshot of us in infrared. Taken at the Exploratorium.

2014.01.13 comments

 

Homeless in San Francisco

The Chronicle ran another story on the homeless congregating around City Hall, a familiar story that has perhaps intensified, dashing the hope that the recent economic boom could alleviate the situation.

Photo: Michael Short, The Chronicle

The booming tech industry, which I am a part of, has been accused of not doing its part to solve social issues. Some people see tech workers as rich brats: aloof, disconnected from the general public, and with little awareness of the city's social problems. I am not sure there is much factual basis for such a stereotype. I may be biased. My guess is that, compared to other mainstream businesses, say insurance companies, commercial contractors, law firms, or fashion retailers, tech companies are probably doing more work, not less, to directly address some of these social issues.

At my last company, we organized a monthly volunteer event at a homeless shelter. I participated one month. We helped out in the cafeteria. The shelter housed three hundred residents, and they were served a meal each day. With just a handful of cooks and workers, our help in the cafeteria was very welcome. A few of my colleagues were at the counter filling and handing out food trays. I was a general helper on the floor. When there were people with physical difficulties, I brought food to their tables so that they did not have to get in line. At other times I cleaned the tables to make them ready for others. After the dining hour, we stayed behind with the cooks to clean up. We stacked the tables and chairs by the wall and mopped the floor. The place was clean and tidy, ready for the next day. I took pride in the work.

Now that I have read the Chronicle article, I have thought more about that experience. It strikes me that something is wrong there. Why don't the residents volunteer to help themselves? We work hard at our day jobs, take care of our own chores at home, and still find time to help others. Why don't the residents, who get free food and free housing, work to help themselves? It is true that some of them are old or have physical problems. They should be excused. But the other three quarters are able-bodied. The most difficult population, those who have mental illness or drug addictions, are not in the shelter anyway. Not that I am unwilling to help. I just wonder what stops them from working for themselves. Wouldn't it be great if they could help themselves? Wouldn't it be great if they could help others too?

I agree it would be fairly extraordinary for a shelter resident to be motivated enough to go and help others. But perhaps this is what they really need to get themselves out of long-term dependency. More than food and shelter, some coaching and some extra motivation is perhaps what they need the most.

2014.01.12 comments

 
