Paper Title

Logic Guided Genetic Algorithms

Paper Authors

Dhananjay Ashok, Joseph Scott, Sebastian Wetzel, Maysum Panju, Vijay Ganesh

Abstract

We present a novel Auxiliary Truth enhanced Genetic Algorithm (GA) that uses logical or mathematical constraints as a means of data augmentation as well as to compute loss (in conjunction with the traditional MSE), with the aim of increasing both data efficiency and accuracy of symbolic regression (SR) algorithms. Our method, logic-guided genetic algorithm (LGGA), takes as input a set of labelled data points and auxiliary truths (ATs) (mathematical facts known a priori about the unknown function the regressor aims to learn) and outputs a specially generated and curated dataset that can be used with any SR method. Three key insights underpin our method: first, SR users often know simple ATs about the function they are trying to learn. Second, whenever an SR system produces a candidate equation inconsistent with these ATs, we can compute a counterexample to prove the inconsistency, and further, this counterexample may be used to augment the dataset and fed back to the SR system in a corrective feedback loop. Third, the value addition of these ATs is that their use in both the loss function and the data augmentation process leads to better rates of convergence, accuracy, and data efficiency. We evaluate LGGA against state-of-the-art SR tools, namely, Eureqa and TuringBot on 16 physics equations from "The Feynman Lectures on Physics" book. We find that using these SR tools in conjunction with LGGA results in them solving up to 30.0% more equations, needing only a fraction of the amount of data compared to the same tool without LGGA, i.e., resulting in up to a 61.9% improvement in data efficiency.
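The abstract's three insights can be illustrated with a minimal sketch of the corrective feedback loop, assuming a single symmetry auxiliary truth f(x, y) == f(y, x) (e.g., Newtonian gravity is symmetric in the two masses). All function names and the random-search counterexample strategy here are illustrative assumptions, not the authors' actual implementation:

```python
import random

def at_violation(candidate, x, y):
    """Degree to which a candidate equation breaks the symmetry AT at (x, y)."""
    return abs(candidate(x, y) - candidate(y, x))

def find_counterexample(candidate, trials=1000, tol=1e-6, seed=0):
    """Randomly search for a point witnessing an AT violation (insight 2)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(1.0, 10.0), rng.uniform(1.0, 10.0)
        if at_violation(candidate, x, y) > tol:
            return (x, y)
    return None  # no violation found: candidate is consistent with the AT

def augment_by_symmetry(dataset):
    """Use the AT itself to label new points: if ((x, y), z) is known,
    then ((y, x), z) is also valid under symmetry (data augmentation)."""
    return dataset + [((y, x), z) for (x, y), z in dataset]

def at_loss(candidate, dataset, weight=1.0):
    """Traditional MSE combined with a penalty for AT violations
    at the data points (insight 3)."""
    mse = sum((candidate(x, y) - z) ** 2 for (x, y), z in dataset) / len(dataset)
    penalty = sum(at_violation(candidate, x, y) for (x, y), _ in dataset) / len(dataset)
    return mse + weight * penalty

# An asymmetric candidate is caught; a symmetric one passes.
bad_candidate = lambda x, y: 2.0 * x + y   # violates f(x, y) == f(y, x)
good_candidate = lambda x, y: x * y        # satisfies the AT
```

In a full system, the counterexample returned by `find_counterexample` would be labelled (here via the AT) and fed back into the SR tool's training set, so the genetic algorithm is steered away from candidates that contradict the known mathematical facts.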