Note:
Throughout this paper, ‘taxis’ refers to green and yellow taxis, and
‘other FHVs’ refers to all for-hire vehicles (FHVs) other than
Uber vehicles and taxis.
Abstract:
For Hire Vehicles play an important role in New York City’s
transportation. As more platforms offer these services, the number of
actors in the city’s transportation network has increased, raising a
wide range of concerns, including the role they play in the city’s
traffic congestion.
This project was chosen in light of the debate
between Uber and Mayor de Blasio. For my analysis, I used the ‘Aggregate FHV
Data’ available on FiveThirtyEight’s GitHub account; FiveThirtyEight has
analyzed the same data for the same purpose. The data contains the
number of pick ups per day by yellow and green taxis, Uber, Lyft and the other
‘For Hire Vehicles’ in Midtown Manhattan.
To address the research question (does Uber have an impact on the
traffic congestion of the city?), I performed a test of means, a Z test,
comparing the mean of daily pick ups made by Uber vehicles to the mean
of daily pick ups made by the other FHVs in Midtown Manhattan. Based on
the calculated Z statistic, I rejected the Null Hypothesis at a
significance level of 0.05, which suggests that Uber vehicles did not
lead to traffic congestion in the city.
Introduction:
According to an article published in the blog ‘Hot Air’ in August 2016, when
Uber launched in New York City in 2011, the taxi business in the
city was booming and the number of medallion licenses being issued was rising.
This led to an increase in the number of vehicles on the streets, resulting in
traffic congestion.
In summer 2015, New York City Mayor Bill de
Blasio raised concerns about rising traffic congestion due to the
growing number of ride-hailing apps, the most popular being Uber. Mayor
de Blasio sought to cap the number of Uber vehicles on the city’s streets,
arguing that an uncapped number of these vehicles, together with the taxis
already on the streets, could lead to ‘urban gridlock’ (FiveThirtyEight, October
2015).
To study this further, in January 2016 the
de Blasio administration released the ‘For-Hire Vehicle Transportation
Study’, which found that even though the number of Uber vehicles in the
city has increased, they are not responsible for the increasing traffic
congestion because they are replacing the yellow cabs.
Similarly, FiveThirtyEight (a website that publishes
poll analysis in the fields of politics, economics, sports, etc.)
performed a statistical test and reached the same conclusion as the
report by the Mayor’s administration.
Based on the above mentioned studies, I have
attempted to answer the following research question:
Does Uber have an impact on the traffic patterns
in New York City?
Null Hypothesis:
The average number of Uber pick ups in a day on
the streets in Midtown Manhattan is more than the average number of ‘For Hire
Vehicles’ and taxi pick ups in a day on the streets in Midtown Manhattan.
Alternate Hypothesis:
The average number of Uber pick ups in a day on
the streets in Midtown Manhattan is less than or equal to the average number of
‘For Hire Vehicles’ and taxi pick ups in a day on the streets in Midtown
Manhattan.
The significance level for this analysis is
0.05.
To answer this question, I first specified my
null and alternate hypotheses and then the significance level. I
decided to perform a Z test. A Z test for means compares the observed
difference between two sample means with the standard error of that
difference. The resulting Z statistic tells us how many standard errors the
observed difference lies from the value expected under the null hypothesis,
under the assumption of normality. The logic behind using this test is
detailed in the following sections.
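As a rough illustration, a two-sample Z statistic for a difference in means can be sketched as follows. This is not the code used for the analysis: the function name `two_sample_z`, the variable names and the pick-up counts are hypothetical, the sample variances stand in for the population variances, and the left-tailed p-value is an assumption based on the direction of the alternate hypothesis stated above.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_z(sample_a, sample_b):
    """Return (z, left_tail_p) for the difference in means of two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    # Left-tailed p-value: probability of a Z at least this small
    # under a standard normal distribution.
    p_left = NormalDist().cdf(z)
    return z, p_left


# Made-up daily pick-up counts, for illustration only (not the real data).
uber_pickups = [10, 12, 11, 13, 14]
other_pickups = [20, 22, 21, 23, 24]

z, p_left = two_sample_z(uber_pickups, other_pickups)
reject_null = p_left < 0.05  # significance level used in this paper
```

With real data, `uber_pickups` and `other_pickups` would be the two columns of daily counts, and `reject_null` would indicate whether the null hypothesis is rejected at the 0.05 level.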
Data:
To answer my research question, I needed the following information:
1. Number of Uber pick ups
2. Number of Taxi pick ups
3. Number of other For Hire Vehicles pick ups
Since Midtown Manhattan is one of the Central
Business Districts of New York City, I realized that it would be a good area to
analyze traffic in.
I obtained data for the months of July, August
and September 2014.
I got all of this information from
FiveThirtyEight’s GitHub account, which also included other
figures such as:
· Average trips per Hour and day of week (Uber, Lyft and the other FHVs)
· Uber, Lyft pick ups per day within Manhattan core, LaGuardia airport and JFK (2014)
· Taxi pick ups per day within Manhattan core, LaGuardia airport and JFK (2013 and 2014)
· Change in daily Uber and Lyft trips in Manhattan Core (Sept 2014)
· Change in daily yellow taxi trips in Manhattan Core (September 2013 compared with September 2014)
However, I did not need this additional information to
answer my research question, so I dropped these columns during data wrangling.
Since the file had many empty
columns and rows, I dropped them all to get a clean dataframe
that could be processed quickly.
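The clean-up step can be sketched with pandas as below. This is a minimal example under assumptions: the column names and values are hypothetical, and I assume that "unfilled" means entirely empty, so only columns and rows with no values at all are removed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the raw file: one empty column, one empty row.
raw = pd.DataFrame({
    "date": ["2014-07-01", "2014-07-02", None],
    "uber": [100.0, 120.0, np.nan],
    "unused": [np.nan, np.nan, np.nan],
})

# Drop columns that contain no values, then rows that contain no values.
clean = raw.dropna(axis="columns", how="all").dropna(axis="index", how="all")
```

Using `how="all"` keeps rows that are only partly filled; `how="any"` would be the stricter choice if every remaining row must be complete.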
Methodology:
To make the dataframe easier to work with, I added a new column containing
the total number of pick ups by other FHVs (that is, excluding Uber vehicles
and yellow and green taxis). Similarly, I combined the green and yellow taxi
counts into a single column.
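A sketch of how such combined columns might be built with pandas is shown here. All column names (`lyft`, `other_base`, `yellow_taxi`, `green_taxi`, `other_fhv`, `taxi`) and all numbers are hypothetical placeholders, not the names or values in the actual file.

```python
import pandas as pd

# Made-up daily pick-up counts, for illustration only.
trips = pd.DataFrame({
    "uber": [100, 120],
    "lyft": [10, 12],
    "other_base": [30, 25],
    "yellow_taxi": [400, 390],
    "green_taxi": [50, 55],
})

# Other FHVs: every for-hire vehicle that is neither Uber nor a taxi.
trips["other_fhv"] = trips[["lyft", "other_base"]].sum(axis=1)

# Taxis: green and yellow taxis combined.
trips["taxi"] = trips[["yellow_taxi", "green_taxi"]].sum(axis=1)
```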
I also grouped the data by month (using groupby)
for July, August and September, which
gave me a clear look at the traffic patterns across the three months of summer
2014.
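The monthly grouping might look like the following sketch. The column names and values are hypothetical, and taking the monthly mean is an assumption; a sum or another aggregate may have been used instead.

```python
import pandas as pd

# Made-up daily pick-up counts, for illustration only.
daily = pd.DataFrame({
    "date": pd.to_datetime(
        ["2014-07-01", "2014-07-02", "2014-08-01", "2014-09-01"]
    ),
    "uber": [100, 120, 130, 150],
    "taxi": [450, 445, 440, 430],
})

# Group the daily rows by calendar month and average each vehicle type.
monthly_mean = daily.groupby(daily["date"].dt.month)[["uber", "taxi"]].mean()
```

The result has one row per month (7, 8 and 9 for July, August and September), which makes the month-to-month comparison straightforward.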
I also plotted the total daily trips made by Uber, taxis and other For Hire Vehicles for each day from July 1, 2014 to September 30, 2014 to look at the trends.