Paper Title
To what extent should we trust AI models when they extrapolate?
Paper Authors
Paper Abstract
Many applications affecting human lives rely on models that have come to be known under the umbrella of machine learning and artificial intelligence. These AI models are usually complicated mathematical functions that map from an input space to an output space. Stakeholders are interested in knowing the rationales behind models' decisions and functional behavior. We study this functional behavior in relation to the data used to create the models. On this topic, scholars have often assumed that models do not extrapolate, i.e., they learn from their training samples and process new input by interpolation. This assumption is questionable: we show that models extrapolate frequently; the extent of extrapolation varies and can be socially consequential. We demonstrate that extrapolation happens for a substantial portion of datasets, more than one would consider reasonable. How can we trust models if we do not know whether they are extrapolating? Given a model trained to recommend clinical procedures for patients, can we trust the recommendation when the model considers a patient older or younger than all the samples in the training set? If the training set consists mostly of White patients, to what extent can we trust its recommendations for Black and Hispanic patients? Along which dimension (race, gender, or age) does extrapolation happen? Even if a model is trained on people of all races, it may still extrapolate in significant ways related to race. The leading question is, to what extent can we trust AI models when they process inputs that fall outside their training set? This paper investigates several social applications of AI, showing how models extrapolate without notice. We also look at different sub-spaces of extrapolation for specific individuals subject to AI models and report how these extrapolations can be interpreted, not mathematically, but from a humanistic point of view.
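The abstract does not spell out how "processing an input outside the training set" is detected, but a common way to operationalize it is to test whether a query point lies inside the convex hull of the training samples; if it does not, answering the query requires extrapolation. The sketch below is a minimal illustration of that idea under this assumption; the function name `in_convex_hull` and the toy data are ours, not the paper's, and the membership test is posed as a standard linear-programming feasibility problem.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, train_points):
    """Return True if query point `x` lies inside the convex hull of `train_points`.

    Feasibility LP: find lambda >= 0 with sum(lambda) = 1 and
    lambda @ train_points = x. If no such lambda exists, the model
    must extrapolate to answer the query.
    """
    n = train_points.shape[0]
    c = np.zeros(n)                        # any feasible point suffices; objective is irrelevant
    A_eq = np.vstack([train_points.T,      # convex combination must reproduce x
                      np.ones((1, n))])    # weights must sum to 1
    b_eq = np.concatenate([x, [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.status == 0                 # status 0: a feasible combination was found

# Hypothetical usage with a small 2-D training set.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))
print(in_convex_hull(np.array([0.1, -0.2]), train))   # near the data: likely interpolation
print(in_convex_hull(np.array([10.0, 10.0]), train))  # far from the data: extrapolation
```

Restricting the columns of `train_points` to a subset of features (e.g., only age, or only race-related attributes) gives one way to probe the "sub-spaces of extrapolation" the abstract refers to, again under the convex-hull reading of extrapolation.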