Prediction of Employee Attendance Factors Using C4.5 Algorithm, Random Tree, Random Forest

Riza Fahlapi; Hermanto Hermanto; Antonius Yadi Kuntoro; Lasman Effendi; Ridatu Oca Nitra; Siti Nurlela

doi:10.18196/st.v23i1.7984

Authors

Riza Fahlapi STMIK Nusa Mandiri
Hermanto Hermanto STMIK Nusa Mandiri
Antonius Yadi Kuntoro STMIK Nusa Mandiri
Lasman Effendi STMIK Nusa Mandiri
Ridatu Oca Nitra STMIK Nusa Mandiri
Siti Nurlela STMIK Nusa Mandiri, Jakarta

DOI:

https://doi.org/10.18196/st.v23i1.7984

Keywords:

Employee, performance, accuracy, Random Tree, Random Forest

Abstract

This research examines worker performance based on standard working hours and the absences recorded for workers over a given period. Under disciplinary supervision, workers are expected to perform their work as well as possible within the predetermined working hours. The discipline of placement workers in arriving on time is measured continuously, every working day. Attendance is already monitored through an online attendance system, and the data downloaded from the online attendance provider serve as the main data. Further data were collected by filtering employee absence records and the supporting information on the categories of causes for failing to meet the work schedule. Company regulations govern the mobilization of workers by location and working hours and allow workers to be placed according to where they live, so that the work results the company wants are not affected, remain within reasonable limits, and can still be improved. The study assesses these factors as obstacles to the company in achieving its targets. The authors analyzed the prediction of employee lateness factors using three algorithms: the C4.5 algorithm achieved an accuracy of 79.37% and an AUC of 0.646, the Random Forest algorithm an accuracy of 78.58% and an AUC of 0.807, and the Random Tree algorithm an accuracy of 76.26% and an AUC of 0.610.
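The three reported results do not rank the algorithms the same way under both metrics. The following minimal sketch tabulates only the figures stated in the abstract and picks the best algorithm per metric. The variable and function names (`reported`, `best_by`) are illustrative assumptions and are not taken from the paper, which does not describe its tooling here.

```python
# Figures as reported in the abstract: (accuracy in percent, AUC).
reported = {
    "C4.5": (79.37, 0.646),
    "Random Forest": (78.58, 0.807),
    "Random Tree": (76.26, 0.610),
}


def best_by(metric_index):
    """Return the algorithm with the highest value for the chosen metric.

    metric_index 0 selects accuracy, 1 selects AUC.
    """
    return max(reported, key=lambda name: reported[name][metric_index])


best_accuracy = best_by(0)
best_auc = best_by(1)

# Print a small comparison table followed by the winner for each metric.
for name, (accuracy, auc) in reported.items():
    print(f"{name:<14} accuracy={accuracy:.2f}%  AUC={auc:.3f}")
print("Highest accuracy:", best_accuracy)
print("Highest AUC:", best_auc)
```

On these figures, C4.5 has the highest accuracy while Random Forest has the highest AUC, so which algorithm counts as best depends on the metric chosen.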
