Studying the suitability of different data mining methods for delay analysis in construction projects


Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran


The main purpose of this paper is to investigate the suitability of diverse data mining techniques for construction delay analysis. The data for this research were obtained from 120 Iranian construction projects. The analysis consists of developing and evaluating various data mining models for factor selection, delay classification, and delay prediction. The results indicate that, with respect to accuracy and correlation indexes, a genetic algorithm with a K-NN learning model is the most suitable model for factor selection. Applying the genetic algorithm identified eight significant variables causing construction delay: changes in project manager, difficulties in project financing by the owner, number of employees, project duration, unforeseen events, project location, number of pieces of equipment, and how the project was obtained. The research also revealed that, for delay classification and delay prediction respectively, the bagging decision tree and the bagging neural network have the lowest error among the techniques compared. In addition, to allow comparison across the different data mining methods, the optimized parameter vectors of the selected models were identified.
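As an illustration of the factor selection idea, the following is a minimal sketch of a genetic algorithm that searches over feature subsets and scores each subset by the leave-one-out accuracy of a k-NN classifier. It is not the authors' implementation: the function names (`knn_loo_accuracy`, `ga_select`), the population size, mutation rate, tournament selection, single-point crossover, and the synthetic data set are all assumptions made here for illustration, and the data are not the 120 project records.

```python
import random

def knn_loo_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN using only features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # squared Euclidean distance from sample i to every other sample
        neighbours = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in cols), y[m])
            for m in range(len(X)) if m != i
        )
        votes = [label for _, label in neighbours[:k]]
        if max(set(votes), key=votes.count) == y[i]:
            correct += 1
    return correct / len(X)

def ga_select(X, y, pop_size=12, generations=8, p_mut=0.1, k=3, seed=0):
    """Search for a binary feature mask that maximises k-NN accuracy.
    All hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}  # avoid re-scoring masks that were already evaluated

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = knn_loo_accuracy(X, y, mask, k)
        return cache[key]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randrange(1, n)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)

# Synthetic demo data (assumed): features 0 and 1 carry the class signal,
# features 2 to 5 are pure noise.
data_rng = random.Random(42)
y_demo = [i % 2 for i in range(40)]
X_demo = [
    [label + data_rng.gauss(0, 0.3), -label + data_rng.gauss(0, 0.3)]
    + [data_rng.uniform(-3, 3) for _ in range(4)]
    for label in y_demo
]
mask, acc = ga_select(X_demo, y_demo)
print("selected feature mask:", mask)
print("leave-one-out accuracy:", round(acc, 3))
```

In this sketch each bit of the mask plays the role of one candidate delay factor, so a 1 in the returned mask corresponds to a factor the search retained. Scoring a subset by classifier accuracy is a wrapper-style approach; the indexes actually used in the study (accuracy and correlation) may be computed differently.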