I now have two years' worth of data in the Oakland Crime API. I began scraping data from CrimeWatch because I believe that offering the public only a rolling 30 days of crime reports is not enough to understand larger trends. I also noticed that CrimeWatch data lags by several weeks, which means that recent crimes are under-reported.

I stepped away from the data to focus on my (then) new job and my (more recently) new baby girl. Occasionally, I'd make some changes here and there. Fix a bug. Duplicate the crime data to another storage server. Once I had enough data, the plan was (and is) to create a daily dashboard of Oakland crime data. Even if I don't have time to do deep analysis, people who aren't pulling and analyzing the data themselves can at least look at regularly updated charts.

The first step towards the dashboard was collecting the data. The next step would be to build a robust and extensible front end. For this, I decided to go with HighCharts. I'm using a slightly modified version that allows me to create a common charting template and keep the data in external CSV files. The template also makes heavy use of query parameters; I can change the chart title, type, axis labels, and more from the URL of the chart. I also created a sub-domain called viz.unionandgrant.com to host the data and visualization. That is where the dashboard will be.
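To give a rough idea of how a URL-driven template can work, here is a minimal Python sketch that reads chart settings from a query string and falls back to defaults. The parameter names (`data`, `type`, `title`, `xlabel`, `ylabel`) and the `viz.example.com` address are placeholders I made up for illustration; they are not necessarily the names the actual template uses.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical defaults; the real template's parameter names may differ.
DEFAULTS = {
    "data": "example.csv",
    "type": "line",
    "title": "",
    "xlabel": "",
    "ylabel": "",
}


def chart_settings(url: str) -> dict:
    """Merge the query parameters of a chart URL over the defaults."""
    query = parse_qs(urlsplit(url).query)
    settings = dict(DEFAULTS)
    for key in DEFAULTS:
        if key in query:
            # parse_qs returns a list per key; keep the first value.
            settings[key] = query[key][0]
    return settings


settings = chart_settings(
    "https://viz.example.com/chart.html?data=crimes.csv&type=scatter&title=Weekly+crime"
)
print(settings["type"], settings["title"])
```

Anything not set in the URL keeps its default, which is what lets one template serve many charts.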

Here's an example of a chart:

Let's unpack the URL for the visual:


This takes data from a .csv file in viz.unionandgrant.com/data/ and applies a template to it.


The radius of the data points.



Title and subtitle of the chart, respectively.



Labels for the x- and y-axis.


Generates an x-axis tick every 27 data points (roughly every month for this data set).

You can change any of these values. For example, you can change the title of the chart to say something like 'Serious crime has not meaningfully changed in 2016'. And if you want to share the chart or download a PDF or PNG, you're free to do so.
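If you'd rather not edit the URL by hand, changing a value can be scripted. This is only a sketch: the `title` and `data` parameter names and the `viz.example.com` address are assumptions for illustration, so substitute whatever appears in the real chart URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url: str, **changes: str) -> str:
    """Return a copy of url with the given query parameters set or replaced."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params.update(changes)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical chart URL; the parameter names are placeholders.
original = "https://viz.example.com/chart.html?data=crimes.csv&title=Serious+crime"
shared = with_params(
    original, title="Serious crime has not meaningfully changed in 2016"
)
print(shared)
```

The result is an ordinary link, so it can be pasted anywhere the original chart URL could.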

A couple of things to note. I would like to get this to the point where you can generate charts directly. For now, the CSV files come from a SQL database that the crime data is piped into. Because the data comes from that database rather than directly from the API, I haven't worked out how to show the queries I use to generate it. I may just export a .sql file so you can at least "read" how the data was generated.
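As a rough illustration of the SQL-to-CSV step, here is a self-contained Python sketch using an in-memory SQLite database. The `crimes` table, its columns, and the sample rows are all invented for the example; the real database and queries are not shown in this post and may look quite different.

```python
import csv
import io
import sqlite3

# Hypothetical query; the actual schema isn't described in this post.
QUERY = """
    SELECT date(reported_at) AS day, COUNT(*) AS reports
    FROM crimes
    GROUP BY day
    ORDER BY day
"""


def export_csv(conn, query, out):
    """Run query and write the result to out as CSV with a header row."""
    cursor = conn.execute(query)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


# Made-up sample rows, only so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crimes (reported_at TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO crimes VALUES (?, ?)",
    [
        ("2016-05-01 08:30:00", "BURGLARY"),
        ("2016-05-01 21:10:00", "ROBBERY"),
        ("2016-05-02 13:45:00", "BURGLARY"),
    ],
)

buffer = io.StringIO()
export_csv(conn, QUERY, buffer)
print(buffer.getvalue())
```

Keeping the query as plain text next to the export is one way the SQL behind each CSV file could be published alongside it.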

