Paper title
Comparative Study Between Distance Measures On Supervised Optimum-Path Forest Classification
Paper authors
Paper abstract
Machine learning has attracted considerable attention over the past decade because of its potential to solve far-reaching tasks such as image classification, object recognition, anomaly detection, and data forecasting. A standard approach to such applications is supervised learning, which relies on large sets of labeled data and is carried out by classifiers such as Logistic Regression, Decision Trees, Random Forests, and Support Vector Machines. An alternative to these traditional classifiers is the parameterless Optimum-Path Forest (OPF), which uses a graph-based methodology and a distance measure to create arcs between nodes and, from them, sets of trees that conquer the nodes, define their labels, and shape the forest. Its performance, however, is strongly tied to the choice of an appropriate distance measure, which may vary according to the nature of the dataset. This work therefore presents a comparative study of a wide range of distance measures applied to supervised Optimum-Path Forest classification. The experiments are conducted on well-known datasets from the literature, and the results are compared against benchmark classifiers, illustrating OPF's ability to adapt to distinct domains.
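To make the role of the distance measure concrete, the following is a minimal sketch of supervised OPF with a pluggable distance function. It assumes the commonly described formulation (a complete graph over the training samples, prototypes taken from minimum-spanning-tree edges that join different classes, and a path cost equal to the largest arc weight along the path). It is not the authors' implementation. All names here (`euclidean`, `manhattan`, `find_prototypes`, `fit`, `predict`) and the toy data are hypothetical and chosen for illustration only.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def find_prototypes(X, y, dist):
    """Prim's MST on the complete graph; the endpoints of MST edges
    that connect samples of different classes become prototypes."""
    n = len(X)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1 and y[parent[u]] != y[u]:
            prototypes.update((u, parent[u]))
        for v in range(n):
            if not in_tree[v]:
                d = dist(X[u], X[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return prototypes


def fit(X, y, dist):
    """Prototypes compete to conquer the remaining nodes; the cost of a
    path is the maximum arc weight (distance) found along it."""
    n = len(X)
    cost = [math.inf] * n
    label = list(y)
    done = [False] * n
    heap = []
    for p in find_prototypes(X, y, dist):
        cost[p] = 0.0
        heapq.heappush(heap, (0.0, p))
    while heap:
        c, u = heapq.heappop(heap)
        if done[u]:
            continue  # stale heap entry
        done[u] = True
        for v in range(n):
            if not done[v]:
                new_cost = max(c, dist(X[u], X[v]))
                if new_cost < cost[v]:
                    cost[v], label[v] = new_cost, label[u]
                    heapq.heappush(heap, (new_cost, v))
    return cost, label


def predict(X_train, cost, label, x, dist):
    """Assign x the label of the training node offering the cheapest path."""
    best = min(range(len(X_train)),
               key=lambda s: max(cost[s], dist(X_train[s], x)))
    return label[best]


# Toy data: two well-separated clusters in the plane.
X_train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_train = [0, 0, 0, 1, 1, 1]

for dist in (euclidean, manhattan):
    cost, label = fit(X_train, y_train, dist)
    print(dist.__name__,
          predict(X_train, cost, label, (0.5, 0.5), dist),
          predict(X_train, cost, label, (5.5, 5.5), dist))
```

Because `dist` is passed in as an ordinary function, swapping the distance measure changes both the prototypes that are selected and the paths that conquer each node, which is the effect the comparative study examines.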