Data used

The data (1) used for this paper was provided by Waze. It consists of alert data from February 2016 that was created by users of the app and then stored by Waze.

All Road Hazards

graph

Figure 1: Road hazard graph

The graph shows that the correlation between the level of traffic and the number of road hazards that occurred in the area (within 500 feet) is small. This is odd because a road hazard is, by definition, something on the road that prevents or slows down the movement of cars. Factors that might have altered the data are covered in the discussion part of this paper.
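The pairing of road hazards with traffic points can be sketched as follows. This is a minimal illustration and not the code used for this paper: the function names, the haversine distance, and the sample coordinates are all assumptions, and only the 500-foot radius comes from the text.

```python
import math

FEET_PER_METER = 3.28084
EARTH_RADIUS_M = 6_371_000  # mean Earth radius, an approximation


def distance_feet(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) * FEET_PER_METER


def hazards_within(point, hazards, radius_feet=500):
    """Count hazards no farther than radius_feet from a traffic point."""
    lat, lon = point
    return sum(1 for h_lat, h_lon in hazards
               if distance_feet(lat, lon, h_lat, h_lon) <= radius_feet)


# Hypothetical sample data: one traffic point and three hazard reports.
traffic_point = (42.3601, -71.0589)
hazard_reports = [
    (42.3602, -71.0590),  # a few dozen feet away
    (42.3610, -71.0589),  # roughly 330 feet north
    (42.3700, -71.0589),  # more than half a mile away
]
count = hazards_within(traffic_point, hazard_reports)
print(count)
```

Each traffic point then contributes one pair (hazard count, traffic index) to a scatter plot such as Figure 1. For a city-sized data set, a spatial index would likely be preferable to this pairwise loop.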

A look at the outliers

Since the data showed little to no correlation, the focus shifts to the outliers in order to explore why some areas might have worse traffic conditions.

Point 1: Boston City Hall

The highest point on the y-axis, which is the point with the most traffic (513,589), is located at coordinates 42°21'36.3"N 71°03'32.0"W, right outside Boston City Hall.
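Coordinates given in degrees, minutes and seconds usually have to be converted to signed decimal degrees before they can be plotted or compared with other records. The sketch below shows the standard conversion applied to Point 1; the function name is hypothetical and is not part of the project.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Point 1: 42 deg 21 min 36.3 sec N, 71 deg 03 min 32.0 sec W
lat = dms_to_decimal(42, 21, 36.3, "N")
lon = dms_to_decimal(71, 3, 32.0, "W")
print(round(lat, 5), round(lon, 5))
```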

map

Figure 2: Point 1 map

The blue dots represent potholes and the red outline represents an area with a larger number of car-on-car collisions. The road hazards included with this data point are those on parts of Court Street, Cambridge Street and Congress Street. Judging solely from the map, there is, ironically, a cluster of potholes on Congress Street, one of the streets on which Boston City Hall stands. Other areas with a high density of potholes include the exit from Storrow Drive to Fenway, Charlesgate near its intersection with Commonwealth Avenue, the Charlestown Bridge and its exit, and the area north of the Boston Convention Center in South Boston. These can be seen on the data visualization page of the project by enabling the pothole view. Figure 3 plots the number of potholes against the traffic index.

graph

Figure 3: Potholes graph

On the graph above, the point (513,589) is the highest on the y-axis and the second highest on the x-axis. Whilst this alone is not enough to say that potholes have a considerable impact on traffic, it does provide interesting anecdotal evidence. The trend does not continue, however, as the points with the second and third highest traffic index are more central on the x-axis.

Point 2: Massachusetts Turnpike

The point with the second most traffic, located at coordinates 42°20'51.8"N 71°05'57.7"W, is on the Massachusetts Turnpike in the vicinity of Beacon Street.

map

Figure 4: Point 2 map

This time, there seem to be far fewer road hazards present, which suggests that factors other than road hazards affect the traffic in that location. One possible explanation for the smaller number of road hazards is that the Mass Pike is a toll road, so money to maintain it can come directly from toll revenue rather than from the City of Boston, on which the rest of Boston’s roads depend. This is supported by the fact that far fewer potholes appear on the Mass Pike on the map than on most other roads. Whilst Point 1 has a greater number of potholes, Point 2 also seems to have a few accidents. Figure 5 plots the number of accidents against the traffic index.

graph

Figure 5: Accidents graph

In this case, Point 2 is again close to the median on the x-axis, as it was in Figure 1, which makes its status as a traffic outlier unexpected. Possible reasons for it are covered in the discussion part of this paper.

Findings

The results of the correlation tests suggest that traffic conditions are not dictated by the road hazards that appear. Other factors are likely to play larger roles.
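The text does not state which correlation test was used. As an illustration only, the sketch below computes the Pearson correlation coefficient, a common choice for this kind of comparison; the function name and the sample values are made up and are not taken from the Waze data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only, not the Waze data.
hazard_counts = [0, 1, 2, 3, 4, 5]
traffic_index = [12, 30, 8, 25, 10, 28]
r = pearson_r(hazard_counts, traffic_index)
print(f"r = {r:.2f}")
```

A coefficient near 0 indicates little linear relationship, while values near 1 or -1 indicate a strong one. A rank-based measure such as Spearman's coefficient might suit count data with outliers better.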

Discussion

One reason why the correlation is weak could simply be that road hazards do not affect traffic as significantly as the infrastructure itself. A traffic jam or a red light can cause more traffic than a pothole or a construction site: whilst the latter slow cars down, they usually do not bring them to a halt. Hazards such as potholes can usually still be driven over or around, and construction is planned and can therefore be avoided ahead of time if information about the location of construction sites is relayed to the public.

One limitation of this paper arises from the data itself. It is not possible to create a perfectly accurate representation of road hazards and road conditions with data that covers only a limited amount of time. Whilst driving patterns are likely to stay the same throughout most of the year, weather conditions can affect the way people drive, especially during Bostonian winters.

Another hypothesis for why road hazards do not seem to affect traffic much is that drivers avoid them with the help of the very company that provided the data. Whilst no claim is being made that Waze altered any of its data to create biased information, it is possible that a large share of the drivers in the city use apps like Waze, which allow them to avoid obstacles by taking other roads. Because of this, fewer drivers would take roads that have hazards on them, reducing the traffic in those locations and evening out the levels of traffic overall. Further research could account for this factor by analyzing traffic levels with taxi data, so as to estimate how many cars usually pass on each road and how that number differs when there is a hazard on the road, which would counteract the effect described above.
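The comparison proposed for further research could look roughly like the sketch below. It assumes a hypothetical data set of per-road vehicle counts (for example, derived from taxi trips), each flagged with whether a hazard was present at the time; no such data set was used in this paper, and the numbers are invented.

```python
from collections import defaultdict
from statistics import mean


def hazard_effect(observations):
    """Return, per road, mean count with a hazard minus mean count without.

    observations: iterable of (road, vehicle_count, hazard_present) tuples.
    Roads lacking observations in either condition are skipped.
    """
    with_hazard = defaultdict(list)
    without_hazard = defaultdict(list)
    for road, count, hazard_present in observations:
        (with_hazard if hazard_present else without_hazard)[road].append(count)
    return {
        road: mean(with_hazard[road]) - mean(without_hazard[road])
        for road in with_hazard
        if road in without_hazard
    }


# Hypothetical hourly vehicle counts; the numbers are invented.
sample = [
    ("Congress Street", 120, False),
    ("Congress Street", 110, False),
    ("Congress Street", 90, True),
    ("Court Street", 80, False),
]
effects = hazard_effect(sample)
print(effects)
```

A negative difference would be consistent with drivers being routed away from a hazard. A real analysis would also need to control for time of day and day of week.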

The last factor that could have influenced the data is that not all roads are equally used and maintained. Roads in downtown Boston such as Court Street or Congress Street are right in the city center, which makes heavy construction or many potholes unlikely, as it would make sense for the City of Boston to minimize dangers on such heavily used roads. As discussed earlier in the section about the point on the Mass Pike, toll roads are also more likely to be better maintained, as they provide a continuous stream of revenue to their shareholders for as long as they are in operation. Additionally, roads farther from the city center are less likely to be serviced quickly, as there are fewer people to report a road hazard. Then again, those roads are also used less, as fewer people need them to commute.

A problem arises when one attempts to look at the data without more context, as there will always be more information about specific roads and areas in Boston that can influence traffic. No matter how well the city works to minimize transportation delays, there will always be congestion during rush hours. The question that remains is what the best approach to improving traffic is. Another factor to consider is how much construction can affect traffic, and whether constant maintenance is a good option in the short run and sustainable in the long run. What the City of Boston can do, however, is look at the trends in where and when road hazards appear, so as to service the roads before they deteriorate and make transportation considerably less efficient. The tool provided with this project allows for a simple look at where such hazards cluster, which can answer the question of where. All that remains is the question of when, which warrants further research.