(Part II) Using NYC OpenData to Allow New Yorkers to Map Safer Biking and Walking Routes

Zeke Bergida
4 min read · Jan 19, 2018


In Part I of this blog I went through the steps involved in developing SAFERout, the app that combines the NYPD Motor Vehicle Collisions API from NYC OpenData with Google’s directions service to let users choose safer biking and walking routes in NYC.

As you may guess, the biking route through the park is less accident-prone

I also mentioned that there are some serious defects in how the data is currently being assessed, and here I will try to plot a course towards resolving those issues.

As a brief review, the basic approach the app takes is to offer the user Google’s regular 2–3 suggested biking or walking routes and then compare the safety of the different routes using the collision location data offered by the collisions API.

To locate the collisions that took place along each route, the app used the series of points that make up the overview_path in the Directions API JSON response. The overview_path is a series of LatLong points that trace the path being offered as the route. Sampling two consecutive points from one JSON response showed that they were about 30 meters apart.

Since the average street in NYC is no more than 50 ft (about 15 meters) wide, I went ahead and queried the Collisions API for all collisions within 15 meters of each point on the overview_path where a biker or pedestrian was injured or killed, and that gave us our data!

query function for the collisions API
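For readers who want to try something similar, here is a minimal Python sketch of such a query (not the app's actual code). It only builds the request URL. The endpoint, the `location` column, and the injury/fatality column names are my assumptions about how the dataset is published through Socrata, so check them against the dataset's API docs. `within_circle` is the SoQL function for a radius search in meters.

```python
from urllib.parse import urlencode

# ASSUMPTION: Socrata endpoint for the NYPD Motor Vehicle Collisions dataset.
# The resource id may differ from the one the app used.
COLLISIONS_URL = "https://data.cityofnewyork.us/resource/h9gi-nx95.json"

# ASSUMPTION: column names for pedestrian/cyclist injuries and deaths.
INJURY_FILTER = (
    "(number_of_pedestrians_injured > 0 OR number_of_pedestrians_killed > 0 "
    "OR number_of_cyclist_injured > 0 OR number_of_cyclist_killed > 0)"
)


def build_collision_query(lat, lng, radius_m=15):
    """Return a URL asking for pedestrian/cyclist collisions near one point."""
    # within_circle(column, latitude, longitude, radius_in_meters)
    where = f"within_circle(location, {lat}, {lng}, {radius_m}) AND {INJURY_FILTER}"
    return f"{COLLISIONS_URL}?{urlencode({'$where': where})}"


# Hypothetical point, used only to show the shape of the URL.
print(build_collision_query(40.7812, -73.9665))
```

One URL like this would be requested for every point on the route, and the results tallied per route.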

The problem is that the 30 meter distance between the two points I sampled turns out not to be representative of the whole path! It all depends on how straight the route is. The points on the overview_path are placed mostly at corners and turns, but along straighter portions of the route the points can be quite far apart. This of course leaves large portions of the route unaccounted for and tilts the scale against routes that are twistier and have more turns (and that’s not fair!).
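A quick way to see the uneven spacing for yourself is to measure the gap between each pair of consecutive points. The sketch below uses the standard haversine formula on a spherical Earth. The sample coordinates are made up for illustration and are not taken from a real Directions response.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lng) pairs in degrees."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlng / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def segment_gaps(path):
    """Distance in meters between each pair of consecutive points on a path."""
    return [haversine_m(p, q) for p, q in zip(path, path[1:])]


# Hypothetical path: two points close together at a turn, then a long straight stretch.
SAMPLE_PATH = [(40.7680, -73.9819), (40.7682, -73.9818), (40.7760, -73.9760)]

for i, gap in enumerate(segment_gaps(SAMPLE_PATH)):
    flag = "  <-- wider than 30 m" if gap > 30 else ""
    print(f"segment {i}: {gap:.0f} m{flag}")
```

Any segment flagged here is a stretch of road where a 15 meter radius query around the endpoints would miss collisions in the middle.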

Here is an example of a directions route from Google with markers placed on each point on the overview_path.

It’s very obvious that the points on the overview_path are not set equally distant from each other! Clearly there is quite a bit more than 30 meters between some of the markers. Now what?

Well, the problem holds the solution. As you can see from the map above, the points on the overview_path are chosen so that a series of straight lines can be drawn from one to the next to create the route.

Since any LatLong point on the straight line between two consecutive points on the path is also on the route, we can use formulas that calculate the LatLong of a point every 30 meters between each pair of points on the overview_path. So far my research has shown that it is definitely possible to make the needed calculations (here is one Stack Overflow link on the topic). Now I need to go and apply them.

Update: The bearing between any two LatLong points can be calculated. Consecutive points between those initial two points can then be calculated given the bearing and a starting point. This allows us to plot consecutive points along a Google Directions route at a consistent distance from each other, as visualized with markers in the image below. We can now query the collisions API for data consistently across our routes :)

Markers placed every 15 meters along a Google Directions route
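The bearing-and-destination approach described in the update can be sketched in Python as follows. This is an illustration using the standard spherical-Earth formulas, not the app's actual code. The function names, the `densify` helper, and the choice to restart the spacing at every original point are my own assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lng) pairs in degrees."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def initial_bearing(a, b):
    """Initial bearing from a to b, in radians clockwise from north."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    dlng = lng2 - lng1
    y = math.sin(dlng) * math.cos(lat2)
    x = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(dlng)
    return math.atan2(y, x)


def destination(start, bearing, distance_m):
    """Point reached by travelling distance_m from start along bearing (radians)."""
    lat1, lng1 = map(math.radians, start)
    d = distance_m / EARTH_RADIUS_M  # angular distance
    lat2 = math.asin(math.sin(lat1) * math.cos(d)
                     + math.cos(lat1) * math.sin(d) * math.cos(bearing))
    lng2 = lng1 + math.atan2(math.sin(bearing) * math.sin(d) * math.cos(lat1),
                             math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return (math.degrees(lat2), math.degrees(lng2))


def densify(path, spacing_m=15):
    """Insert points so no gap along the path exceeds spacing_m.

    Original points are kept, and spacing restarts at each of them, so the
    last gap in a segment may be shorter than spacing_m.
    """
    if not path:
        return []
    points = [path[0]]
    for a, b in zip(path, path[1:]):
        length = haversine_m(a, b)
        bearing = initial_bearing(a, b)
        for i in range(1, int(length // spacing_m) + 1):
            if i * spacing_m < length:  # skip a point that would land on b itself
                points.append(destination(a, bearing, i * spacing_m))
        points.append(b)
    return points


# Hypothetical two-point segment of roughly 111 m heading due north.
for point in densify([(40.0, -74.0), (40.001, -74.0)]):
    print(point)
```

Each point returned by `densify` can then be used as the center of a collisions query, so straight stretches are covered as thoroughly as twisty ones.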

That takes care of the data that was being left out of the calculations. Next we can address the problem of the data that is being counted twice!
