This is an analysis of data from the nycflights13 package, which contains information on flights from three New York airports in 2013. The dataset consists of 200,000 observations across 43 variables, along with a separate test set of 136,776 observations. The data cover flight departures, observed weather, airport locations, and flight details.
The goal is to predict whether a flight will be delayed. Along the way, I demonstrate several tasks: handling missing values, exploratory data analysis, and fitting various models to predict flight delay.