Monday, September 16, 2013

My first try at Public Data Visualization

I am a bit of a data enthusiast (and I want to be a good storyteller). People call it by many names, and I'm not too sure where I fit in right now, but the things that fascinate me are data, exploring data, and deriving meaning from data. If only it were as easy as it sounds. Hell, it doesn't even sound easy to me anymore.

I do have a tendency to ramble on, even in writing, so I'm going to cut to the chase and write what I really wanted to in the first place.

So, I have been on the lookout for open data sources that I could derive meaning from and visualize in a way that helps tell a story. Despite the huge amounts of data available, you'd be surprised at how little of it is clean. I think an even bigger problem, one I really need to find a solution to, is this: okay, so I have the data, 100,000 rows for example. What do I really want to use it for, and how am I going to use it?

Over the past week and a half, I have been playing with ESPN's cricket data on Sachin Tendulkar. This is what I have been able to build:

Things I learnt:
  • In ODIs, 1998 was the year when SRT was at his peak: he scored a lot of runs and India won
  • Despite his great record in terms of strike rate and number of 100s, I couldn't help but notice that the last 3 years before he retired from ODIs were the ones in which he made the fewest runs of his career.
UPDATE [Sept 22, 2013]
After a lot of drilling, slicing and whatnot, I decided to update this post with a few more of my analyses. I have included Test data in addition to ODIs :)

Sachin v/s other players - 1998
  • In the matches that India played with Sachin in the playing XI, against Australia, Sri Lanka, Pakistan, South Africa and England, he wasn't really the deciding factor of the match, in the sense that he did not single-handedly win the game for India. In fact, if you look at the tab titled "Sachin v/s other players" below, you will notice that in 1998 he was hardly ever responsible for more than half the runs that India scored.
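
The "more than half the runs" check can be sketched outside the workbook too. This is only a rough illustration, assuming each innings is available as a pair of (player runs, team runs). The function names and the sample numbers are made up for the example and are not taken from the actual data.

```python
def run_share(player_runs, team_runs):
    """Return the fraction of the team's runs scored by one player.

    Returns None when the team total is zero, since the share is undefined.
    """
    if team_runs <= 0:
        return None
    return player_runs / team_runs


def innings_above_half(innings):
    """Keep only the (player_runs, team_runs) pairs where the share exceeds 0.5."""
    result = []
    for player_runs, team_runs in innings:
        share = run_share(player_runs, team_runs)
        if share is not None and share > 0.5:
            result.append((player_runs, team_runs))
    return result


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real scorecards.
    sample = [(40, 250), (120, 230), (10, 180)]
    for player_runs, team_runs in sample:
        print(player_runs, team_runs, round(run_share(player_runs, team_runs), 2))
    print(innings_above_half(sample))
```

If the filtered list comes back nearly empty for a given year, that matches the "hardly ever more than half" reading.
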
Sachin - timeline
  • Well, I must say, this took me a while to build, but being a newbie, I am still not certain if it adds value. I tried to show SRT's performances (number of runs) against various teams throughout his career. It's a motion chart, and the play button works only if you download the workbook and open it locally.
Last - Ground & Tournament
  • For the last tab, I began wondering what percentage of SRT's total runs were scored in big, important matches, and resorted to the funnel chart. This confirmed my initial hypothesis that he scored most of his runs in the preliminary matches. The table shows the exact number of matches played in major tournaments: he has a great strike rate and seems to like playing against the Aussies (ignoring the match result, of course).
  • This leads to the final bit: the grounds where SRT has got his runs. In ODIs he scored the most runs in Asia, but in the case of Test matches it's more or less balanced.
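
The percentage breakdown behind the funnel chart (and the same idea for grounds) boils down to grouping runs by a category and dividing by the grand total. Here is a minimal sketch, assuming the data is a list of (category, runs) rows. The function name, the stage labels and the numbers are placeholders for illustration, not the real figures.

```python
from collections import defaultdict


def percent_by_category(rows):
    """Sum runs per category and return each category's share of the total.

    rows is an iterable of (category, runs) pairs. The result is a dict of
    category -> percentage, ordered from the largest share to the smallest.
    """
    totals = defaultdict(int)
    for category, runs in rows:
        totals[category] += runs

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {category: 100.0 * runs / grand_total for category, runs in ordered}


if __name__ == "__main__":
    # Placeholder rows for illustration only, not real match data.
    rows = [("Preliminary", 60), ("Final", 15), ("Preliminary", 15), ("Semi-final", 10)]
    for category, pct in percent_by_category(rows).items():
        print(category, pct)
```

Swapping the stage label for a continent or ground name gives the same breakdown for the "where did he score" question.
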

2 comments:

  1. These are some nice facts that you have brought to light... but here is a small suggestion: please check the information again regarding SRT's peak year, and please also post some good facts about SRT.

  2. Harshad, don't get me wrong, I am a big fan of SRT too. Everyone knows the good records; I just wanted to present my findings. One of the main reasons why I felt 1998 was when he was at his peak is that when I drilled down further, I found that during this year, not only did he score a lot of runs, but it resulted in India winning. Check this out:
    http://public.tableausoftware.com/views/Cricket_test/SR-PeaksandwhenIndiawon?:embed=y&:display_count=no

    In conclusion, I'm not trying to say he wasn't a great player. All I'm trying to figure out is how much his individual performance influenced the match result in India's favor.
