Perona: Robust Infrastructure Fingerprinting for Resource-Efficient Big Data Analytics

Paper Title

Perona: Robust Infrastructure Fingerprinting for Resource-Efficient Big Data Analytics

Authors

Dominik Scheinert, Soeren Becker, Jonathan Bader, Lauritz Thamsen, Jonathan Will, Odej Kao

Abstract

Choosing a good resource configuration for big data analytics applications can be challenging, especially in cloud environments. Automated approaches are desirable because poor decisions can reduce performance and raise costs. Most existing automated approaches either build performance models from previous workload executions or profile resource configurations iteratively until a near-optimal solution has been found. In doing so, they obtain only an implicit understanding of the underlying infrastructure, which is difficult to transfer to alternative infrastructures; as a result, profiling and modeling insights are not sustained beyond very specific situations. We present Perona, a novel approach to robust infrastructure fingerprinting for use in the context of big data analytics. Perona applies common sets and configurations of benchmarking tools to the target resources, so that the resulting benchmark metrics are directly comparable and resources can be ranked. Insignificant benchmark metrics are discarded by learning a low-dimensional representation of the input metric vector, and previous benchmark executions are taken into account for context-awareness, which allows resource degradation to be detected. We evaluate our approach both on data gathered from our own experiments and within related work on resource configuration optimization, demonstrating that Perona captures the characteristics of benchmark runs in a compact manner and produces representations that can be used directly.
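
The fingerprinting idea in the abstract (compress a benchmark metric vector into a low-dimensional representation, then compare a new run against earlier runs to spot degradation) can be illustrated with a small sketch. This is not Perona's model: it swaps in a plain linear projection (PCA via SVD) as a stand-in for the learned representation, and the function names, the six synthetic metrics, and the distance-based score are all assumptions made for illustration.

```python
import numpy as np


def fit_projection(runs, k):
    """Fit a k-dimensional linear projection (PCA via SVD) on benchmark runs.

    runs: array of shape (n_runs, n_metrics), one row per benchmark execution.
    """
    mean = runs.mean(axis=0)
    std = runs.std(axis=0)
    std[std == 0] = 1.0  # constant metrics carry no signal; avoid division by zero
    z = (runs - mean) / std
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return mean, std, vt[:k]


def encode(x, mean, std, components):
    """Map one metric vector (or a batch of them) to its compact code."""
    return ((x - mean) / std) @ components.T


def degradation_score(history_codes, new_code):
    """Distance of a new code from the historical centroid,
    expressed in units of the typical historical spread."""
    centroid = history_codes.mean(axis=0)
    spread = np.linalg.norm(history_codes - centroid, axis=1).mean()
    return float(np.linalg.norm(new_code - centroid) / (spread + 1e-12))


# Synthetic history (assumption): 50 runs of 6 correlated metrics on one machine type.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))
history = 100 + 5 * latent @ np.ones((1, 6)) + rng.normal(0, 0.5, size=(50, 6))

mean, std, components = fit_projection(history, k=2)
codes = encode(history, mean, std, components)

healthy_run = history.mean(axis=0)   # a run that looks like the historical average
degraded_run = healthy_run - 20.0    # every metric drops noticeably

healthy_score = degradation_score(codes, encode(healthy_run, mean, std, components))
degraded_score = degradation_score(codes, encode(degraded_run, mean, std, components))
print(f"healthy: {healthy_score:.2f}, degraded: {degraded_score:.2f}")
```

A score well above the healthy baseline suggests that the new run deviates from what earlier benchmark executions on the same resource looked like. A linear projection was chosen here only because it is short and dependency-light; a learned encoder would take the place of `fit_projection` and `encode`.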
