Paper Title
Submodularity In Machine Learning and Artificial Intelligence
Paper Authors
Paper Abstract
In this manuscript, we offer a gentle review of submodularity and supermodularity and their properties. We offer a plethora of submodular definitions; a full description of a number of example submodular functions and their generalizations; example discrete constraints; a discussion of basic algorithms for maximization, minimization, and other operations; a brief overview of continuous submodular extensions; and some historical applications. We then turn to how submodularity is useful in machine learning and artificial intelligence. This includes summarization, and we offer a complete account of the differences between and commonalities amongst sketching, coresets, extractive and abstractive summarization in NLP, data distillation and condensation, and data subset selection and feature selection. We discuss a variety of ways to produce a submodular function useful for machine learning, including heuristic hand-crafting, learning or approximately learning a submodular function or aspects thereof, and some advantages of the use of a submodular function as a coreset producer. We discuss submodular combinatorial information functions, and how submodularity is useful for clustering, data partitioning, parallel machine learning, active and semi-supervised learning, probabilistic modeling, and structured norms and loss functions.
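The abstract's central notion, submodularity, is the diminishing-returns property: adding an element to a smaller set helps at least as much as adding it to a larger superset. A minimal sketch (not from the paper; the ground set and helper names below are illustrative) using a classic coverage function:

```python
# Coverage f(S) = size of the union of the chosen subsets -- a
# standard example of a submodular function.
def coverage(chosen, universe_sets):
    covered = set()
    for i in chosen:
        covered |= universe_sets[i]
    return len(covered)

def marginal_gain(f, S, e, universe_sets):
    """Gain of adding element e to set S under f."""
    return f(S | {e}, universe_sets) - f(S, universe_sets)

# Hypothetical ground set: three subsets of {1,...,5}.
U = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5}}

# Diminishing returns: the gain of adding subset 2 to the smaller
# set {0} is at least its gain when added to the superset {0, 1}.
gain_small = marginal_gain(coverage, {0}, 2, U)     # {4, 5} newly covered
gain_large = marginal_gain(coverage, {0, 1}, 2, U)  # 4 already covered
assert gain_small >= gain_large
```

Here `gain_small` is 2 while `gain_large` is 1, since element 4 is already covered by subset 1 in the larger set; this inequality holding for every set pair and element is exactly the submodularity condition the abstract refers to.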