Data Quality Enhancement for Decision Tree Algorithm using Knowledge-Based Model

Mar 23, 2020, Vol. 20 No. 2 (2020)

Abstract

Data mining is an approach to discovering knowledge or previously unrevealed patterns in large data sets using methods such as statistics, machine learning, and other data analysis techniques. However, the main limitation of these conventional techniques is that they ignore data relationships and semantics: the data are treated as meaningless numbers, and statistical methods are used for model building. For example, the decision tree, a classification method in data mining, is built from a given set of labeled data, and those data are classified without any understanding of their semantics or of the relationships between attributes. To capture the inherent meaning of the data and to take advantage of the relationships between data elements, we introduce a knowledge-based approach to improving data quality. The proposed approach uses an ontology as background knowledge to assist decision tree classification during data preparation. The ontology is used to infer the relationships between attributes and concepts. This relationship information helps the system identify related attributes that could assist the classification process. Two datasets from different domains, agriculture and economics, were used to evaluate the generalization of the proposed approach. Accuracy was used as the standard measure of success in evaluating the model. The experimental results showed that the proposed approach can efficiently enhance the performance of the data classification process.
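
As a rough illustration of the idea of using an ontology to identify related attributes during data preparation, the sketch below keeps only the attributes that a toy ontology links to the concept being classified. It is not the method evaluated in the article: the ontology contents, the attribute names, and the helper functions `is_related`, `select_related_attributes`, and `project` are all hypothetical, and a plain reachability check stands in for whatever inference the proposed approach actually performs.

```python
from collections import deque

# Hypothetical toy ontology (an assumption, not taken from the article):
# each attribute or concept maps to the concepts it is directly related to.
ONTOLOGY = {
    "rainfall": ["climate"],
    "temperature": ["climate"],
    "climate": ["crop_yield"],
    "soil_ph": ["soil"],
    "soil": ["crop_yield"],
    "farmer_id": ["administration"],
    "crop_yield": [],
    "administration": [],
}


def is_related(ontology, start, target):
    """Return True if `target` is reachable from `start` by following relations."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in ontology.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def select_related_attributes(ontology, attributes, target_concept):
    """Keep the attributes that the ontology links to the target concept."""
    return [a for a in attributes if is_related(ontology, a, target_concept)]


def project(rows, attributes):
    """Drop every column that is not in `attributes`."""
    return [{a: row[a] for a in attributes} for row in rows]


# Made-up records, used only to exercise the functions above.
rows = [
    {"rainfall": 120, "temperature": 29, "soil_ph": 6.5, "farmer_id": 17},
    {"rainfall": 40, "temperature": 35, "soil_ph": 5.1, "farmer_id": 42},
]

selected = select_related_attributes(ONTOLOGY, list(rows[0]), "crop_yield")
prepared = project(rows, selected)
```

Here `farmer_id` is dropped because the toy ontology relates it only to `administration`, while the climate and soil attributes are kept. The `prepared` rows would then be passed to a decision tree learner in place of the raw data.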

                                                                                                                        

Keywords: data analytics; data mining; ontology; semantic; classification; decision tree  

*Corresponding author: Tel.: +66 81 555 7499

E-mail: kraisakk@nu.ac.th

Author Information

Kraisak Kesorn*

Department of Computer Science and Information Technology, Faculty of Science, Naresuan University, Phitsanulok, Thailand
