This report examines the impact of the COVID-19 pandemic on the aviation industry.
Use the package manager pip to install the packages needed to run the Python code:
pip install pandas numpy statsmodels matplotlib scikit-learn geopandas
Some raw datasets are too large to upload to Canvas, so we host them on Google Drive instead. Place all datasets in a directory named "./dataset", located in the same directory as the Python file.
The datasets can then be read in Python as follows:
import pandas as pd
df = pd.read_csv('./dataset/airline.csv')
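Because "./dataset" is resolved relative to the current working directory, the read above can fail when the code is launched from another folder. A minimal sketch of one way around this, assuming two hypothetical helpers (`dataset_path` and `load_dataset`) that are not part of this project:

```python
from pathlib import Path

import pandas as pd


def dataset_path(filename, base_dir="."):
    """Return the path of `filename` inside the "dataset" folder under `base_dir`.

    Hypothetical helper for illustration; in a script, pass
    Path(__file__).parent as `base_dir`.
    """
    return Path(base_dir) / "dataset" / filename


def load_dataset(filename, base_dir=".", **read_csv_kwargs):
    """Read one CSV from the dataset folder, failing early with a clear message."""
    path = dataset_path(filename, base_dir)
    if not path.is_file():
        raise FileNotFoundError(f"Expected dataset at {path}; download it first.")
    return pd.read_csv(path, **read_csv_kwargs)
```

With the default `base_dir`, `load_dataset('airline.csv')` behaves like the direct `pd.read_csv` call, but reports a missing download explicitly.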
- cb_2017_us_county_5m: A file containing information about counties in the United States, including their boundaries and other relevant data.
- cb_2017_us_state_5m: A file containing information about states in the United States, including their boundaries and other relevant data.
- US_counties_COVID19_health_weather_data.csv: A CSV file containing COVID-19 cases, health, and weather data for counties in the United States.
- COVID-19_Vaccinations_in_the_United_States_County.csv: A CSV file containing information about COVID-19 vaccinations in counties in the United States.
- alljoined_airlines.csv: A CSV file containing information about all airline flights from 2018 to 2022, including origin airport, destination, cancellation, cancellation reason, and other relevant data.
- airline_key.csv: A CSV file containing an identification code for each airport in the alljoined_airlines.csv file.
- airport_info.csv: A CSV file containing information about airports, including their names, identification codes, locations, and other relevant data.
- passenger_data.csv: A CSV file containing the number of passengers of each airline company from 2002 to 2022, both international and domestic.
- covid_airline.csv: A CSV file produced by merging the COVID-19 and airline datasets, containing airline data such as the county name, number of flights, and number of cancellations.
- covid_geo.csv: A CSV file containing geographical information related to COVID-19, such as location and population data.
- airline_county_gdf.csv: A CSV file containing information about airlines and counties, including their locations, names, and other relevant data.
- covid_county_week.csv: A CSV file containing the number of COVID-19 cases in each county, organized by week.
- covid_clean_df_date.csv: A cleaned version of the COVID-19 data, organized by date.
- geo_airports_week.csv: A CSV file containing information about airports in different locations, organized by week.
- us-counties.csv: A CSV file containing information about counties in the United States, including the state and county codes.
- covid_gdf.csv: A CSV file containing information about COVID-19 cases, organized by location.
- geo_airports.csv: A CSV file containing information about airports, including their locations and names.
Because some of the datasets are larger than 1 GB, we host them on Google Drive.
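Files over 1 GB may not fit comfortably in memory when loaded in one call. One possible approach, sketched here under the assumption that a per-chunk aggregation is sufficient, is the `chunksize` parameter of `pd.read_csv`. The column name `CANCELLED` is a placeholder and not a confirmed column of alljoined_airlines.csv.

```python
import io

import pandas as pd


def count_cancellations(csv_source, column="CANCELLED", chunksize=100_000):
    """Sum a 0/1 cancellation column without loading the whole file.

    `column` is a placeholder name; adjust it to the real schema.
    """
    total = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for chunk in pd.read_csv(csv_source, usecols=[column], chunksize=chunksize):
        total += int(chunk[column].sum())
    return total


# Tiny in-memory stand-in for a large file on disk.
sample = io.StringIO("CANCELLED\n0\n1\n1\n0\n1\n")
print(count_cancellations(sample, chunksize=2))  # prints 3
```

Passing `usecols` keeps only the needed column in each chunk, which further reduces memory use.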
Required libraries:
- pandas
- numpy
- matplotlib
- sklearn
- geopandas
Raw datasets:
- passenger_data.csv
- COVID-19_Vaccinations_in_the_United_States_County.csv
- alljoined_airlines.csv
- airline_key.csv
- airport_info.csv
- cb_2017_us_county_5m
- cb_2017_us_state_5m
Data cleaning and merging:
- Execute AirportFilter.ipynb and CovidFilter.ipynb
- Handle missing values
- Convert data types
- Remove unnecessary columns
- Generate geo_airports_week.csv
- Generate airline_county_gdf.csv
- Execute Covid_Airline_Merge.ipynb
- Generate covid_airline.csv
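The cleaning and merge steps above could look roughly like the following. This is a sketch on toy data, not the contents of the notebooks: every column name is illustrative, and joining on county and week is an assumption suggested by the file names.

```python
import pandas as pd


def clean(df, numeric_cols, drop_cols):
    """Apply the three cleaning steps to a copy of `df`."""
    out = df.drop(columns=drop_cols, errors="ignore")  # remove unnecessary columns
    for col in numeric_cols:
        # Convert data types; unparseable entries become NaN.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Handle missing values: here, drop rows with a missing numeric field.
    return out.dropna(subset=numeric_cols).reset_index(drop=True)


# Toy stand-ins for the airport and COVID-19 tables.
airports = pd.DataFrame({
    "county": ["A", "A", "B"],
    "week": [1, 2, 1],
    "flights": ["100", "80", "n/a"],
    "notes": ["x", "y", "z"],
})
covid = pd.DataFrame({
    "county": ["A", "A", "B"],
    "week": [1, 2, 1],
    "cases": [5, 50, 7],
})

airports_clean = clean(airports, numeric_cols=["flights"], drop_cols=["notes"])
# Join the two tables on county and week, keeping only matching rows.
covid_airline = airports_clean.merge(covid, on=["county", "week"], how="inner")
print(covid_airline)
```

Dropping rows is only one way to handle missing values; imputing them may suit some columns better.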
Data analysis:
- Execute Data_Analysis.ipynb
- Execute international_domestic_stuff.ipynb
Exploratory plots:
- Plot histograms of the decrease in airline numbers from 2018 to 2022
- Plot scatter plots
- Plot line plots to show the changes before and after the pandemic
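As an illustration of the before/after line plot, the sketch below uses made-up monthly counts and matplotlib's object-oriented `Figure` API. The March 2020 cutoff and the function name `plot_monthly_flights` are assumptions for illustration, not taken from the notebooks.

```python
import pandas as pd
from matplotlib.figure import Figure


def plot_monthly_flights(monthly, pandemic_start="2020-03-01"):
    """Draw monthly flight counts with a marker at an assumed pandemic start.

    `monthly` is a Series indexed by month start dates.
    """
    fig = Figure(figsize=(8, 3))
    ax = fig.subplots()
    ax.plot(monthly.index, monthly.values, marker="o")
    # Dashed vertical line at the assumed cutoff date.
    ax.axvline(pd.Timestamp(pandemic_start), linestyle="--", color="gray")
    ax.set_xlabel("Month")
    ax.set_ylabel("Number of flights")
    ax.set_title("Flights before and after the pandemic")
    return fig


# Made-up numbers, for illustration only.
months = pd.date_range("2020-01-01", periods=6, freq="MS")
monthly = pd.Series([900, 880, 600, 200, 250, 400], index=months)
fig = plot_monthly_flights(monthly)
```

The returned figure can be written to disk with `fig.savefig("flights.png")`.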
Statistical analysis:
- Calculate the mean, median, and sum of the total airline numbers
- Calculate the standard deviation
- Perform hypothesis testing
- Perform linear regression to test hypothesized correlations
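The statistics listed above can be sketched on made-up data as follows. The specific hypothesis test used in the notebooks is not stated here, so a permutation test for a difference in means stands in for it, and the regression uses `np.polyfit` rather than statsmodels to keep the example small. All numbers and function names are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up monthly flight counts, for illustration only.
before = pd.Series([900, 880, 910, 895])
after = pd.Series([600, 200, 250, 400])

# Descriptive statistics: mean, median, sum, and sample standard deviation.
summary = after.agg(["mean", "median", "sum", "std"])


def permutation_pvalue(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_resamples):
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[: len(a)].mean() - shuffled[len(a):].mean())
        hits += diff >= observed
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept, with R^2."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    r_squared = 1 - residuals.var() / y.var()
    return slope, intercept, r_squared


p_value = permutation_pvalue(before, after)
slope, intercept, r_squared = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

A small `p_value` suggests the before and after means differ by more than chance alone would explain; `r_squared` close to 1 indicates a nearly linear relationship.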
Result visualization:
- Plot bar charts
- Plot line charts
- Plot the result for each airport on a geographic map of the United States
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.