Paper title
Bio-Inspired Hashing for Unsupervised Similarity Search
Paper authors
Paper abstract
The fruit fly Drosophila's olfactory circuit has inspired a new locality-sensitive hashing (LSH) algorithm, FlyHash. In contrast with classical LSH algorithms that produce low-dimensional hash codes, FlyHash produces sparse high-dimensional hash codes and has also been shown to have superior empirical performance compared to classical LSH algorithms in similarity search. However, FlyHash uses random projections and cannot learn from data. Building on inspiration from FlyHash and the ubiquity of sparse expansive representations in neurobiology, our work proposes a novel hashing algorithm, BioHash, that produces sparse high-dimensional hash codes in a data-driven manner. We show that BioHash outperforms previously published benchmarks for various hashing methods. Since our learning algorithm is based on a local and biologically plausible synaptic plasticity rule, our work provides evidence for the proposal that LSH might be a computational reason for the abundance of sparse expansive motifs in a variety of biological systems. We also propose a convolutional variant, BioConvHash, that further improves performance. From the perspective of computer science, BioHash and BioConvHash are fast, scalable and yield compressed binary representations that are useful for similarity search.
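To make the idea of a sparse high-dimensional hash code concrete, the following is a minimal sketch of a FlyHash-style scheme: a random sparse projection into many hash units, followed by a winner-take-all step that keeps only the k most active units. It is an illustration under stated assumptions, not the authors' implementation. The function names (`make_projection`, `sparse_hash`, `overlap`), the `fan_in` parameter, and the choice of overlap count as the similarity score are all hypothetical. It covers only the random-projection baseline; the learned weights of BioHash are not reproduced here, since the abstract does not specify the plasticity rule.

```python
import random


def make_projection(d, m, fan_in, seed=0):
    """Build a sparse binary random projection (assumed form).

    Each of the m hash units samples `fan_in` distinct input indices
    out of d. Storing index lists avoids a dense m x d matrix.
    """
    rng = random.Random(seed)
    return [rng.sample(range(d), fan_in) for _ in range(m)]


def sparse_hash(x, projection, k):
    """Return the indices of the k most active units (winner-take-all).

    The result represents a binary code of length len(projection)
    with exactly k ones; only the positions of the ones are stored.
    """
    activations = [sum(x[i] for i in idx) for idx in projection]
    # sorted() is stable, so ties are broken by unit index.
    order = sorted(range(len(activations)),
                   key=lambda j: activations[j], reverse=True)
    return frozenset(order[:k])


def overlap(code_a, code_b):
    """Number of shared active units; higher means more similar."""
    return len(code_a & code_b)


if __name__ == "__main__":
    d, m, k = 20, 200, 10  # illustrative sizes, not taken from the paper
    projection = make_projection(d, m, fan_in=4, seed=1)
    rng = random.Random(42)
    query = [rng.random() for _ in range(d)]
    near = [v + 0.01 * rng.random() for v in query]
    far = [rng.random() for _ in range(d)]
    q_code = sparse_hash(query, projection, k)
    print("overlap with near point:", overlap(q_code, sparse_hash(near, projection, k)))
    print("overlap with far point: ", overlap(q_code, sparse_hash(far, projection, k)))
```

In this sketch the code length m is larger than the input dimension d while only k units are active, which is the "sparse expansive" pattern the abstract describes. A data-driven method such as BioHash would replace the random index lists with learned weights.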