Studying the suitability of different data mining methods for delay analysis in construction projects


Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran


The main purpose of this paper is to investigate the suitability of diverse data mining techniques for construction delay analysis. The data for this research were obtained from 120 Iranian construction projects. The analysis consists of developing and evaluating various data mining models for factor selection, delay classification, and delay prediction. The results indicate that, with respect to accuracy and correlation indexes, a genetic algorithm with a K-NN learning model is the most suitable model for factor selection. Applying the genetic algorithm identified eight significant variables causing construction delay: changes in project manager, difficulties in financing the project by the owner, number of employees, project duration, unforeseen events, project location, number of pieces of equipment, and how the project was obtained. The research also revealed that, for delay classification and delay prediction respectively, the bagging decision tree and the bagging neural network have the lowest error among the techniques compared. In addition, to compare the data mining methods, the optimized parameter vectors of the selected models were identified.

