Abstract: To address the large prediction error of neural network classification models on the imbalanced data in the Bureau of Transportation Statistics (BTS) flight dataset, the adaptive synthetic sampling approach (ADASYN) and the synthetic minority over-sampling technique (SMOTE) are used to balance the flight delay categories, and a random forest (RF) model is trained with Bayesian parameter tuning. The results show that, compared with the method without balanced sampling, the weighted-average precision, recall and F1 score improve by 19%, 8% and 16%, respectively; the classification accuracy improves by 8.03%, and the area under the curve (AUC), used as the model fit index, improves by 5.4%. Meanwhile, Graph WaveNet, a graph neural network model with multi-feature fusion, is used to predict the average flight delay time; the experimental results show that its mean absolute error (MAE) and root mean square error (RMSE) are reduced by 16% and 12.45%, respectively, compared with the single-feature model. These methods and results provide a reference for research on flight delay classification and prediction algorithms.
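
The interpolation idea behind SMOTE-style balancing can be illustrated with a minimal sketch. This is not the implementation used in this work: the function name `smote_sketch`, the toy feature values and the parameter choices are all hypothetical, and ADASYN's density-based weighting of hard-to-learn samples is omitted. Each synthetic sample is placed at a random position on the segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified illustration only; assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (departure hour, distance in 100 km) for a rare delay class.
minority = [(8.0, 5.0), (9.0, 6.0), (8.5, 5.5), (10.0, 7.0)]
majority_count = 10  # assumed size of the majority class

# Oversample the minority class until it matches the majority class size.
new_points = smote_sketch(minority, n_new=majority_count - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays within the region already covered by the rare category. In practice, a maintained library implementation of SMOTE and ADASYN would normally be preferred over a hand-written one.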