Paper Title
Massively Parallel Open Modification Spectral Library Searching with Hyperdimensional Computing
Paper Authors
Paper Abstract
Mass spectrometry, commonly used for protein identification, generates a massive number of spectra that must be matched against a large database. In practice, most of these spectra remain unidentified or mismatched because of unexpected post-translational modifications. Open modification search (OMS) has been proposed as a strategy to improve the identification rate by considering every possible change in the spectra, but it expands the search space exponentially. In this work, we propose HyperOMS, which redesigns OMS based on hyperdimensional computing to cope with these challenges. Unlike existing algorithms that represent spectral data with floating-point numbers, HyperOMS encodes spectra as high-dimensional binary vectors and performs OMS efficiently in the high-dimensional space. Because it relies on massive parallelism and simple Boolean operations, HyperOMS can be executed efficiently on parallel computing platforms. Experimental results show that HyperOMS on GPU is up to $17\times$ faster and $6.4\times$ more energy efficient than the state-of-the-art GPU-based OMS tool while providing search quality comparable to that of competing search tools.
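To make the idea of encoding spectra as binary hypervectors and searching with Boolean operations more concrete, the following is a minimal sketch and not the HyperOMS implementation. It assumes an ID-level encoding, a common scheme in hyperdimensional computing, in which each m/z bin and each quantized intensity level gets a random binary hypervector. The dimensionality, bin count, level count, function names, and toy spectra are all hypothetical choices made for illustration.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
D = 2048          # hypervector dimensionality
NUM_BINS = 64     # number of m/z bins
NUM_LEVELS = 8    # number of quantized intensity levels

rng = random.Random(0)
# One random D-bit hypervector per m/z bin and per intensity level,
# stored as Python integers used as bit vectors.
ID_HVS = [rng.getrandbits(D) for _ in range(NUM_BINS)]
LEVEL_HVS = [rng.getrandbits(D) for _ in range(NUM_LEVELS)]


def encode_spectrum(peaks):
    """Encode a list of (mz_bin, intensity_level) peaks as one binary hypervector.

    Each peak is bound with XOR, and the bound vectors are bundled
    with a per-bit majority vote.
    """
    counts = [0] * D
    for mz_bin, level in peaks:
        bound = ID_HVS[mz_bin] ^ LEVEL_HVS[level]
        for i in range(D):
            counts[i] += (bound >> i) & 1
    threshold = len(peaks) / 2
    hv = 0
    for i, count in enumerate(counts):
        if count > threshold:
            hv |= 1 << i
    return hv


def hamming_similarity(a, b):
    """Number of bit positions in which two hypervectors agree (XOR + popcount)."""
    return D - bin(a ^ b).count("1")


def search(query_hv, library):
    """Return the name of the library entry most similar to the query."""
    return max(library, key=lambda name: hamming_similarity(query_hv, library[name]))


# Toy reference library with two hypothetical spectra.
spectrum_a = [(3, 1), (10, 4), (22, 7), (35, 2), (50, 5)]
spectrum_b = [(5, 0), (12, 6), (27, 3), (41, 1), (60, 2)]
library = {
    "A": encode_spectrum(spectrum_a),
    "B": encode_spectrum(spectrum_b),
}

# Query: spectrum A with one peak replaced, loosely imitating a modified peptide.
query_peaks = [(3, 1), (10, 4), (22, 7), (35, 2), (55, 5)]
query_hv = encode_spectrum(query_peaks)
best_match = search(query_hv, library)
print(best_match, hamming_similarity(query_hv, library[best_match]))
```

The similarity computation reduces to an XOR followed by a population count, which is the kind of simple, data-parallel Boolean operation the abstract refers to. A real open modification search would additionally handle precursor mass filtering, peak preprocessing, and false discovery rate control, none of which are shown here.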