Paper Title
The Application of Data Mining in the Production Processes
Paper Authors
Paper Abstract
Traditional statistical methods and measurements cannot process all industrial data correctly and in a timely manner. Open markets bring more customers, and production must increase to meet their requirements. Production processes now generate large volumes of data every day, and traditional statistics or limited measurements are not sufficient to handle them. Improving production and quality requires analyzing these data and extracting the information that shows how a process can be improved. Data mining has been applied successfully to industrial processes, and some algorithms, such as association rule mining and decision trees, have achieved strong results in various industrial and production fields. This study applied seven algorithms to production data to identify the best-performing algorithm for the industrial field. KNN, decision tree, SVM, random forest, ANN, naïve Bayes, and AdaBoost were used to classify the data based on three attributes without neglecting any variable, whether numerical or categorical. The best accuracy and area under the ROC curve (AUC) were obtained from the decision tree and its ensemble algorithms (random forest and AdaBoost). Thus, the decision tree is an appropriate algorithm for manufacturing and production data, especially because it can handle both numerical and categorical data.
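The abstract ranks the classifiers by accuracy and by area under the ROC curve. As a rough illustration of what these two metrics measure, the following minimal sketch computes both in plain Python for a binary problem. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example; they are not the study's data or implementation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Uses the pairwise-ranking identity: the AUC equals the probability that a
    randomly chosen positive is scored above a randomly chosen negative,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true classes and the scores a classifier assigned.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.5]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.667
print(f"AUC      = {roc_auc(y_true, scores):.3f}")   # AUC      = 0.778
```

Accuracy depends on the chosen threshold, while the AUC depends only on how the scores rank positives against negatives, which is one reason the two metrics are often reported together when comparing classifiers.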