tungwaiyip.info

Traffic Data Analysis

I have been analyzing traffic data based on vehicle location data pulled from the SF Muni website. It is an extremely interesting project, and I have picked up a whole lot of new skills along the way, most notably data analysis and computational geometry. It has also turned out to be a challenging one. A few months of part-time work have gone by, and I am still nowhere near my original vision.

Still, I think I am getting better at this work, so I'm sharing some interesting data I'm working with. On one bus route I collected 20,000 location readings in a single day. The goal is to organize them into individual vehicle trips. After a first pass, the data reduces to 336 trips. But how good is the quality of the data? And how good is my trip segregation algorithm? To find out, I plotted each trip's distance against its duration on a chart. The graph quickly helped me evaluate the quality of the result.
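To make the distance-against-duration idea concrete, here is a minimal sketch of how one trip could be reduced to a single point on such a chart. It is not the code behind the analysis: the `(timestamp, lat, lon)` reading format, the function names `haversine_km` and `trip_summary`, and the sample coordinates are all assumptions made for illustration.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_summary(readings):
    """Return (distance_km, duration_min) for one trip.

    `readings` is a time-ordered list of (datetime, lat, lon) tuples
    (a hypothetical format, not the actual feed layout).
    """
    # Path length: sum of distances between consecutive readings.
    distance = sum(
        haversine_km(a[1], a[2], b[1], b[2])
        for a, b in zip(readings, readings[1:])
    )
    # Duration: time between the first and the last reading.
    duration = (readings[-1][0] - readings[0][0]).total_seconds() / 60.0
    return distance, duration


# Made-up readings somewhere in San Francisco, for illustration only.
trip = [
    (datetime(2010, 9, 1, 8, 0), 37.7795, -122.5093),
    (datetime(2010, 9, 1, 8, 20), 37.7810, -122.4640),
    (datetime(2010, 9, 1, 8, 45), 37.7880, -122.4010),
]
distance_km, duration_min = trip_summary(trip)
print(f"{distance_km:.1f} km in {duration_min:.0f} min")
```

Running `trip_summary` over every trip gives one (distance, duration) pair per trip, which is all a scatter plot needs. With only a few readings per trip the summed distance underestimates the true path, so denser readings give a better figure.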

Figure: Bus #38 trip characteristics (trip distance against duration)

My first impression is that the result looks fairly good. A dense band of points shows that most trips are about 10 km long and take about 40-75 minutes. This seems to match real-world experience. In the lower left-hand corner are a number of trips that last a very short time and cover very little distance. I'll probably treat them as noise and discard them. A curiosity is that some trips seem to cover a long distance, from 15 to 20 km, much longer than the official route. Where did they go?
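The groups visible on the chart could be separated with a few simple thresholds. The sketch below is only an illustration: the 40-75 minute band and the 15 km mark come from the chart, while the noise cutoffs (2 km, 10 minutes), the 8-12 km band standing in for "about 10 km", and the function name `classify_trip` are assumptions I made up.

```python
def classify_trip(distance_km, duration_min,
                  noise_max_km=2.0,       # assumed cutoff, not from the data
                  noise_max_min=10.0,     # assumed cutoff, not from the data
                  overlong_min_km=15.0):  # long trips start around 15 km
    """Label a trip by where it falls on the distance/duration chart."""
    # Very short and very brief: likely noise.
    if distance_km < noise_max_km and duration_min < noise_max_min:
        return "noise"
    # Much longer than the official route.
    if distance_km >= overlong_min_km:
        return "overlong"
    # The dense band: about 10 km (assumed 8-12 km) in 40-75 minutes.
    if 8.0 <= distance_km <= 12.0 and 40.0 <= duration_min <= 75.0:
        return "typical"
    # Everything else needs a closer look.
    return "inspect"


for distance, duration in [(0.5, 3), (10, 55), (17, 90), (5, 25)]:
    print(distance, duration, classify_trip(distance, duration))
```

The cutoffs are keyword arguments so they can be tuned after looking at the actual scatter rather than being baked in.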

A closer examination of their tracks reveals the problem. These are all legitimate eastbound trips. However, the bus starts from a depot in the middle of the city and goes all the way to the terminal at the west end. Only from there does the real eastbound trip begin. The challenge for me is that I'm only interested in the eastbound portion. The leg from the depot to the west end is effectively data pollution. I need to find a way to handle this extraneous data.
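One possible way to drop the depot-to-terminal leg, sketched under strong assumptions: the route runs roughly east-west, so the west terminal is taken to be the reading with the smallest longitude, and everything before it is discarded. The function name `trim_to_eastbound`, the `(timestamp, lat, lon)` reading format, and the sample track are hypothetical, and this is not a method I have actually settled on.

```python
def trim_to_eastbound(readings):
    """Keep only the part of a track from its westernmost reading onward.

    `readings` is a time-ordered list of (timestamp, lat, lon) tuples.
    Assumes the westernmost point is the west terminal where the
    eastbound trip begins.
    """
    if not readings:
        return []
    # In San Francisco longitudes are negative, so the westernmost
    # reading is the one with the minimum longitude.
    west_idx = min(range(len(readings)), key=lambda i: readings[i][2])
    return readings[west_idx:]


# Made-up track: starts mid-city, heads west, then runs east.
track = [
    (0, 37.781, -122.450),   # depot in the middle of the city
    (10, 37.780, -122.480),
    (20, 37.779, -122.510),  # west terminal
    (35, 37.781, -122.460),
    (60, 37.788, -122.400),  # east end
]
eastbound = trim_to_eastbound(track)
print(len(eastbound), "readings kept, starting at", eastbound[0])
```

A trip that already starts at the west terminal passes through unchanged, since its first reading is also its westernmost. If the bus waits at the terminal and reports the same position several times, this keeps the waiting time too, so the duration would still need a correction.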

This still leaves a scatter of points around 5 km long, lasting 15 to 35 minutes, to inspect. Meanwhile, I have difficulty examining the tracks because Google Earth keeps crashing on me. I guess this is a good time to turn away from technical work and write a blog post instead.

2010.09.02

 

 
