Make Useful Charts
Having been with Bay Area Bike Share for a few months, I am happy with their green bikes available around downtown. I also appreciate Bay Area Bike Share's openness in sharing their system metrics. However, the system chart on their web page does not provide much useful information.
The slowly rising bar chart shows cumulative trips taken since launch. The chart may look boring, but at least the upward trend looks comforting, right? Wrong! A cumulative chart by construction can only go up, so the comforting feeling is misleading. A more useful way is to chart the number of trips taken per week. Also, data from the 5 cities are available individually, so why not plot them side by side for comparison? I did a little work to make the improved chart below.
Some information is immediately obvious from the chart. First, nearly all trips were made in San Francisco, even though it has only half of the resources deployed. Clearly the bike share program is not working out on the Peninsula at all, given the negligible usage there. Second, there are ups and downs in usage that cannot be observed from the upward-looking bar chart. After the first few months, usage peaked at about 6,000 to 7,000 trips per week, then fell sharply in December due to the holidays. The January numbers have yet to recover to last year's level.
This is not meant to be critical. It is just to demonstrate how a useful chart can inform us.
This is the link to the simple source code, written with Pandas.
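The step from a cumulative series to weekly counts is just a difference between consecutive totals. Here is a minimal sketch of that idea only. It uses the Python standard library rather than Pandas, the function name is mine, and the numbers are made up for illustration; they are not Bay Area Bike Share data.

```python
from datetime import date


def weekly_from_cumulative(cumulative):
    """Turn (week_start, cumulative_total) pairs into (week_start, trips_that_week).

    Assumes the series starts at launch, so the first week's count is
    simply its cumulative total.
    """
    weekly = []
    previous_total = 0
    for week_start, total in sorted(cumulative):
        weekly.append((week_start, total - previous_total))
        previous_total = total
    return weekly


# Hypothetical cumulative totals for one city (illustrative numbers only).
sample = [
    (date(2013, 12, 2), 1000),
    (date(2013, 12, 9), 1600),
    (date(2013, 12, 16), 1900),
    (date(2013, 12, 23), 2000),
]

for week_start, trips in weekly_from_cumulative(sample):
    print(week_start.isoformat(), trips)
```

In this made-up series the cumulative total rises every week, yet the weekly counts fall from 1000 to 100, which is exactly the kind of decline a cumulative chart hides.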
Protovis Pivots into D3
I have used Protovis to create some really interesting charts, like the San Francisco Ranked Choice Voting and SFUSD School Assignment by Ethnicity charts. I have just found out that the development of Protovis has pivoted into a new library, D3.
Although I have not learned the details of D3 yet, I am quite hopeful.
The biggest difference seems to be that the chart is rendered directly into SVG elements rather than through an intermediate library, so we will have full access to the underlying features of SVG. I'm a strong believer in SVG and I find this an excellent approach. SVG is like the best-kept secret on the web. It provides such powerful vector graphics functionality, yet so few people use it directly or even understand it. Documentation is scarce, however, despite it having been standardized so many years ago.
Protovis and D3 use a declarative and functional approach for building charts. I must say this is not intuitive to a programmer like me, and I have struggled a lot to get things right. Yet it holds such great promise that I am going to spend more time digging into it.
I've learned some nuggets today regarding face perception. These are some hacks that exploit humans' ability to recognize faces.
Robohash is a cool web service that generates a unique robot image from any text. The robots look funny, but it was actually created for an excellent application: helping to quickly identify a caller from a large pool.
The source code is available on GitHub.
The project is inspired by Identicon, which has a similar idea but uses geometric shapes to generate icons.
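To give a feel for the general idea of turning any text into a stable, recognizable picture, here is a small sketch. It is not Robohash's or Identicon's actual algorithm. It is just one simple way to do it, assuming we hash the text and use the hash bytes to fill a mirrored grid. The function names are made up for this example.

```python
import hashlib


def identicon_grid(text, size=5):
    """Build a left-right symmetric size x size grid of booleans from a text hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    half = (size + 1) // 2  # columns we actually draw; the rest are mirrored
    grid = []
    for row in range(size):
        left = [
            digest[(row * half + col) % len(digest)] % 2 == 1
            for col in range(half)
        ]
        # Mirror the left part (without repeating the middle column when size is odd).
        grid.append(left + left[: size - half][::-1])
    return grid


def render(grid):
    """Render the grid as text, using '#' for filled cells."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)


print(render(identicon_grid("caller-42")))
```

The same text always gives the same picture, and the mirror symmetry is there because symmetric shapes are easier for us to recognize at a glance.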
Chernoff faces, a visualization technique invented by Herman Chernoff, display multivariate data in the shape of a human face.
SFUSD School Assignment by Ethnicity
I have received data for the 2011 SFUSD school assignment. The table in the appendix breaks down assignment by ethnicity. I have been interested in the racial distribution among the San Francisco schools, so I created the chart below to observe the pattern.
Each school is shown as a ring. The length of each colored arc is proportional to the percentage of the corresponding ethnic group in the school, and the size of the ring is proportional to the size of the school. Schools with a similar racial makeup are placed close to each other.
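The arc lengths are a simple proportion of the full circle. As a rough sketch of that one step (not the chart's actual source code, and with a made-up school rather than real SFUSD numbers), each group's share can be turned into a start and end angle like this:

```python
def arc_spans(shares):
    """Return (group, start_degrees, end_degrees) tuples laid end to end around a ring.

    `shares` maps a group name to a count or percentage; only the ratios matter.
    """
    total = sum(shares.values())
    spans = []
    start = 0.0
    for group, value in shares.items():
        sweep = 360.0 * value / total  # arc length proportional to the group's share
        spans.append((group, start, start + sweep))
        start += sweep
    return spans


# Hypothetical school where four groups each make up a quarter of the offers.
example_school = {"White": 25, "Chinese": 25, "Latino": 25, "Other": 25}

for group, start, end in arc_spans(example_school):
    print(f"{group}: {start:.0f} to {end:.0f} degrees")
```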
See the full chart
San Francisco Unified School District is a very diverse district. The three main groups (white, Chinese, and Latino) each represent about a quarter of the population. The schools listed below are the most well mixed (i.e. most similar to the overall distribution):
- Starr King
On the other hand, there are also a number of schools that have a dominant racial group. You can easily identify them as Chinese schools, Latino schools, etc.
P.S. -- Updated chart to include 6th and 9th grade data. There isn't a strong pattern like in Kindergarten. The big story is really that white, the largest group in Kindergarten assignment, is only 13% here. Asians constitute 50% in 6th and 9th grade. (3/27)
SFUSD STUDENT ASSIGNMENT
MARCH 2011 SCHOOL ASSIGNMENT OFFERS
March 18, 2011
Appendix E: School Offers
Five Year Comparison of Round 1 Demand
March 12, 2010
SF Montessori opening is estimated
Charting San Francisco Ranked Choice Voting
Last Tuesday, 5 of San Francisco's city supervisor seats were up for election. San Francisco uses ranked choice voting. Each voter can mark multiple candidates in order of preference. In the initial round, if the top candidate does not get a majority of votes, the election goes into an instant run-off. The candidate with the fewest votes is eliminated, and that candidate's votes are transferred to the remaining candidates according to the next preference on each ballot. This process repeats until one candidate obtains a majority of votes among the remaining candidates.
This year, 4 out of the 5 district elections have headed for an instant run-off. The most crowded fields are the 14 candidates competing for the district 6 seat and the 21 candidates competing for the district 10 seat. The preliminary election results released (Nov 6) show an epic battle of over 10 rounds of instant run-off before a candidate wins. To visualize how the process plays out, I have charted the election data below.
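The counting process described above can be sketched in a few lines. This is a simplified illustration, not the election department's actual procedure: it eliminates one candidate per round, breaks ties by candidate name (an arbitrary assumption of mine), and the ballots are invented.

```python
from collections import Counter


def instant_runoff(ballots):
    """Run an instant run-off count.

    Each ballot is a list of candidates in order of preference.
    Returns (winner, rounds), where rounds holds the tally of every round.
    """
    remaining = {candidate for ballot in ballots for candidate in ballot}
    rounds = []
    while remaining:
        tally = Counter({candidate: 0 for candidate in remaining})
        for ballot in ballots:
            # A ballot counts for its highest-ranked candidate still in the race.
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        rounds.append(dict(tally))
        counted = sum(tally.values())  # ballots with no remaining choice drop out
        leader, leader_votes = max(tally.items(), key=lambda kv: (kv[1], kv[0]))
        if leader_votes * 2 > counted or len(remaining) == 1:
            return leader, rounds
        # Eliminate the candidate with the fewest votes (ties broken by name).
        loser = min(tally.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.remove(loser)
    return None, rounds


# Invented ballots: A leads the first round but has no majority.
ballots = (
    [["A", "B"]] * 2 + [["A"]] + [["A", "C"]]
    + [["B", "C"]] * 2 + [["B", "A"]]
    + [["C", "B"]] * 2
)

winner, rounds = instant_runoff(ballots)
for number, tally in enumerate(rounds, start=1):
    print("round", number, sorted(tally.items()))
print("winner:", winner)
```

In this example A leads the first round 4 to 3 to 2, C is eliminated, and both of C's ballots transfer to B, who then wins 5 to 4. This is the same kind of "lift" that the chart shows after each elimination.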
The colored lines connect the candidates in each round. A circle marks the front-runner and a cross marks the candidates eliminated. Each time a candidate is eliminated, those votes are transferred to the remaining candidates. This "lifts" up the lines in the next round. In district 6 we can see that the relative positions of the top 3 candidates are unchanged throughout the process, with Jane Kim leading all the way. District 10 is more dramatic, with the front-runner status passing around the top 4 candidates. We see that Marlene Tran was propelled to front-runner in round 15 after Teresa Duque was eliminated in round 14 (marked by the red cross). Interestingly, no one but Tran received much of Duque's vote. In a similar fashion, Dewitt Lacy's elimination lifted up Malia Cohen, and Steve Moss' votes passed mainly to Tony Kelly. Finally, Lynette Sweet's votes went mostly to Malia Cohen, making her the winner under the preliminary result.
The election department is expected to release the final results in a few weeks. They could possibly change the outcome in a close race such as district 2. I will update the chart once the data is available.
I have also made a first pass chart that I find not as informative.
Viewing the election chart requires a standards-compliant web browser with SVG support, e.g. Firefox, Chrome, Safari or Opera. IE 8 or below is not supported. Thank you.
Kindle 3 - Book Pricing and DRM
Part 5/5 of my Kindle 3 Review
In my previous posts I have talked about the good and the bad of the Kindle. In this last post, however, I am going to talk about the ugly part: first, E-book pricing, and then the fact that you don't own the book you pay for due to Amazon's DRM.
Amazon touts that at the price of $9.99, many E-books are a bargain compared to physical books. Customers should expect a discount on E-books. After all, an electronic download's marginal cost is nearly 0, compared to a physical book with real cost. But after a quick scan, I find that in many cases an E-book actually costs more than the paperback. Very few publishers discount an E-book to below $10. Yet a brand new paperback can be had for less, for example at $6. Used books, which used to constitute half of my purchases, can be had for as little as $1 plus shipping. The E-book is no bargain, and I am loath to pay more for it.
And then I cannot borrow E-books from libraries. I read a good number of books borrowed from the library. In most cases I can afford to pay for the books, and I'm very willing to pay for books I like. I often buy them for my collection rather than for reading, because I've already finished reading the library's copy. The thing is, I have also bought a stack of books that are sitting and collecting dust without being read. Often they are not as interesting as I first thought. They wasted my money, they take up space, and I cannot get myself to throw them away. So the biggest value of borrowing library books is actually to ensure I only buy worthy books and avoid unnecessary clutter.
Kindle books resolve this to some extent. If I buy the wrong book, it is only a few MB of electronic document, unlike a physical object that I have to deal with from time to time. Wasting $10 is not as big a deal. And I think Amazon provides a 7-day refund period. Perhaps I need to adjust my spending habits, like setting aside a budget of $200 a year for E-books. If I have to splurge my hard-earned money, there is no better thing to splurge on than culture and literacy.
Far worse than the price is the Kindle book's DRM. It is Amazon's copy protection scheme that restricts you to using the E-book you've paid for only on a registered Kindle device. You cannot read it on a different device, like Sony's eReader or the Nook. Nor can you purchase a book from another vendor and read it on the Kindle (unless it is non-DRM).
If you read my posts, it is clear that I have a mixed review of the Kindle. I expect it to be replaced by something quite different soon. For me, committing to one vendor to build my collection is a major issue. An electronic gadget like the Kindle has a lifespan of only a few years. After that, it either breaks or becomes functionally obsolete. Can I count on Amazon to make a compatible replacement at a reasonable cost in the future? Can I even count on Amazon as a commercial entity to live on? And if it folds, a totally legitimate concern of mine, what is going to happen to my book collection?
All this highlights one major difference between buying a physical book and an E-book. The E-book purchase doesn't buy you anything but a long-term lease to be used on some designated device. And I'm not happy to pay $10 to rent a book that I may not be able to renew.
There is one good guy in this market. O'Reilly, a major publisher of computer and technology books, offers its E-books free of DRM. You can keep them forever and read them on any existing or future device. I hope more publishers will do the same. Ironically, this is what Amazon has done in the music space. Its music store offers DRM-free downloads against incumbent Apple's DRMed downloads. That was the reason I preferred Amazon over Apple.
Kindle 3 - Web Browser and other features
Part 4/5 of my Kindle 3 Review
When I first looked at the Kindle 3's specs, it was the web browser with unlimited 3G that pushed me to order it immediately. The Kindle has a much larger screen than my smart phone. Maybe it can replace the smart phone as my mobile browser! And 3G is free too!
Unfortunately, once I started to use it, it became immediately clear that the Kindle's web browser is far inferior to a smart phone's. I am not expecting to watch Flash video or run Ajax web applications. All I want to do is to zoom into a part of the article and have it formatted at a readable font size and line length. In many cases the Kindle's browser fails to do that. The zoom is set at 50% increments, meaning the text is either too large or too small in most cases. Scrolling using the direction pad is slow and inconvenient. And the lack of a touch screen is another deficiency for general web browsing. The Kindle is pretty much a last-resort choice for web browsing. The good thing is that it has an Article mode that strips off all the unnecessary stuff and shows the main content in a readable format. I always use Article mode when possible.
This does not mean the web browser is a superfluous feature. One thing that distinguishes an e-Reader from a paper book is that it can have live links to the web. I am the kind of person who often follows footnotes for some extra information. The URL is the footnote of the web era. In the short time of using the Kindle I have already benefited a lot from this capability.
I think the main issue with the Kindle's web browser is a software one. It needs better zooming and text flowing to take advantage of the screen. Perhaps an Opera Mini for Kindle could do this better?
I am really desperate to find a good mobile writing and note-taking device, so much so that every time I see a handheld device with a keyboard, I see it as the note-taking device I'm looking for. But every time I'm disappointed. It seems such a basic function, but few devices can really do it well.
First of all, the Kindle does not have a note-taking app at all. The closest thing is that you can attach a note to a book, which appears as a number on a page. It is conceivable that the Kindle might ship a note-taking app someday. However, my experience with the device says the hardware does not lend itself to being a good typing device.
The Kindle has all the sins of bad keyboard layout I have found in other mobile devices. In addition, its third row of alphabet keys lacks a shift key on the left, so the Kindle designers just shifted the whole row of keys to the left to fill its place. Unfortunately this makes the third row offset by one key compared to a regular keyboard. For example, my finger is trained that the 'N' key is directly below the 'U' key. On the Kindle, I get the 'M' instead. This causes a whole lot of mistyping.
The Kindle's bigger screen holds so much promise, yet it is another disappointment to use it as a writing device. Sadly, for me the best note-taking device is still the first-generation Sidekick.
Kindle 3 Navigation
Part 3/5 of my Kindle 3 Review
Besides reading, the next most important thing to do is navigation. At least it is important for non-fiction. I want to find out where I currently am, and I want to jump to other parts of the book as quickly and painlessly as possible. Here I look at the details.
The first curious thing about the Kindle is its use of a "location" number to indicate your position in the book. Page numbers are what we usually use for real books. It is easy to see that page numbers cannot be easily translated to E-books, since each page shows a different number of words depending on the screen size and font size. Still, being used to physical books, I find the page number a meaningful measurement. 1000 pages is a very long book, 200 pages is a short book, I can read 50 pages of a novel a day, etc. The 4-digit location number seems quite meaningless to me.
Maybe I'll understand the location number better as I become acclimatized to the Kindle. For now I offer you a quick rule of thumb: dividing the location number by 20 will give you a rough page number. For example, the book below has a length of 8558, which is just over 400 pages. My location is at 4174, which is about 210 pages into the book.
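The rule of thumb is easy to write down. This is only my rough estimate, not an official Kindle conversion, and the constant and function name are made up for this sketch:

```python
LOCATIONS_PER_PAGE = 20  # rule-of-thumb ratio from the text above, not an official figure


def approx_page(location):
    """Estimate a paper page number from a Kindle location number."""
    return round(location / LOCATIONS_PER_PAGE)


print(approx_page(8558))  # total length of the example book
print(approx_page(4174))  # my current position in it
```

For the example book this gives 428 pages in total and page 209 for my current position, in line with the rough figures quoted above.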
But it is not just the page number that I need. Often I like to know where I am in the book. Which chapter? Which section? The Kindle does not offer any easy way to show me. The web has a good navigation guide called the breadcrumb. Can I have breadcrumbs on the Kindle? I think it would be a lot more useful than the location line.
The primary navigation controls on the Kindle are Page Up and Page Down, accessible with the two designated buttons on both sides of the device. This is one of the best parts of the Kindle. Since jumping to the next page is the most frequent movement you will need, it maps to a large page down button. I find it superior to other controls available on a PC, like the scroll bar and scroll wheel. I also dislike the iPhone's flicking gesture. It takes time for the slippery screen movement to stabilize, and it either scrolls too much or too little, necessitating more finger control to correct the movement. Page up and page down are precise, no more, no less. It is just what I need.
This also applies to other functions on the Kindle, for example when browsing the web. You use the same page up and page down to go through the document.
Besides page up and page down, other navigation on the Kindle is not as easy. On a computer, I'm used to pressing ctrl-Home or ctrl-End to move to the beginning or the end of a document. There is no such key mapping on the Kindle. Going to the table of contents requires a sequence of 4 keys: Menu, Select, Down, Select. And this is already one of the easier tasks. On a computer, people can often train themselves to remember the key sequence for some frequently used task so that they can press the keys really quickly. It is unlikely I can similarly train myself on the Kindle. If you can test this on a Kindle, you may find that the third key in the sequence, Down, is a leap of faith. In most cases navigation requires me to call up the menu and carefully pick the right option among the many on the screen. It is not something people can do speedily.
Kindle 3 First Impression
I am not really a gadget person. When the Amazon Kindle first came out, I
shrugged it off. It may be a nice gadget, but what does it really do for me to
justify the $300+ price tag? So I ignored it for 3 years. It was a surprise
that when the Kindle 3 was announced, I latched onto it at once. The affordable
$189 price point was certainly a factor. And a web browser with unlimited 3G
wireless? Perhaps I could ditch my smart phone and save big without the
expensive data plan? So I placed my order immediately, and was told it was sold
out and the lead time to shipping would be about one month. The anticipation was
overwhelming. I started to check the order status and other users' reviews
obsessively. I even ordered a few E-books to be ready to load on the device. It
finally arrived at my doorstep yesterday, one week before the promised date.
The Fall of Hong Kong Entertainment
The Hong Kong film industry, once a vibrant and dominant player in Chinese and Asian cinema, is in steep decline. I happened to come across a web entry on Jin Yong (金庸), the most popular and prolific martial arts novelist in Asia. In the entry is a chronological list of all the screen adaptations. Since Jin Yong's novels have been made into TV series and movies regularly, the list serves as a proxy for the activity of the entertainment industry.
My New Desktop Background
This is the picture I've used as my new desktop background. What is it? I think it is just some random picture my 4-year-old son took. You know, he has learned to aim a camera at something, press the shutter button and get his own pictures. He makes outrageous mistakes by photographers' standards. A lot of times the lens was obscured by his finger, the picture was blurry, or he framed some unobvious object. But sometimes his pictures also have a certain abstract, artistic quality, a certain intricacy of light, something totally original that you would never know how to make. So instead of just deleting his pictures, I have kept some of them.
Visualization Using Variable Width Bar Chart
I was plotting a chart to visualize various development projects in my area. The primary concern is development density, but the project footprint itself is also a factor. We want to focus on the big projects that matter. We also want to identify outliers that have very high or very low density, but otherwise have a small overall size and less relevance.