Flight delays negatively affect airlines, airports, and passengers. Predicting them is crucial to decision-making for all players in commercial aviation. However, developing accurate prediction models for flight delays has become cumbersome due to the complexity of the air transportation system, the number of available prediction methods, and the deluge of data related to that system. In this context, this paper presents a thorough literature review of approaches used to build flight delay prediction models from the Data Science perspective. We propose a taxonomy and summarize the initiatives used to address the flight delay prediction problem according to scope, data, and computational methods, giving special attention to the increasing usage of machine learning methods. We also present a timeline of major works that depicts the relationships between flight delay prediction problems and the research trends that address them.