Paper Title

3DMolNet: A Generative Network for Molecular Structures

Authors

Vitali Nesterov, Mario Wieser, Volker Roth

Abstract

Recent advances in machine learning for quantum chemistry make it possible to predict the chemical properties of compounds and to generate novel molecules. Existing generative models mostly use string- or graph-based representations and usually do not encode the precise three-dimensional coordinates of the atoms. First attempts in this direction have been proposed, in which autoregressive or GAN-based models generate atom coordinates. These either lack a latent space (in the autoregressive setting), which rules out a smooth exploration of the compound space, or cannot generalize to varying chemical compositions. We propose a new approach to efficiently generate molecular structures that are not restricted to a fixed size or composition. Our model is based on the variational autoencoder, which learns a translation-, rotation-, and permutation-invariant low-dimensional representation of molecules. Our experiments yield a mean reconstruction error below 0.05 Å, outperforming the current state-of-the-art methods by a factor of four and falling below the spatial quantization error of most chemical descriptors. In a set of experiments, quantum chemical methods confirmed the compositional and structural validity of newly generated molecules.
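To make the invariance claim concrete, the following minimal sketch checks that a matrix of pairwise interatomic distances does not change when a structure is translated and rotated. Pairwise distances are a standard invariant descriptor chosen here for illustration only; the abstract does not state which representation the model uses. The function names and the water-like geometry are hypothetical.

```python
import math

def pairwise_distances(coords):
    """Return the matrix of Euclidean distances between all atom pairs."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def translate(coords, shift):
    """Shift every atom by the same 3D vector."""
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in coords]

def rotate_z(coords, angle):
    """Rotate every atom about the z-axis by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y, sin_a * x + cos_a * y, z)
            for x, y, z in coords]

def max_abs_difference(m1, m2):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

# Hypothetical water-like geometry in Angstrom (illustrative values only).
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

# Apply a rigid motion: translation followed by a 40 degree rotation.
moved = rotate_z(translate(water, (1.5, -2.0, 0.7)), math.radians(40.0))

difference = max_abs_difference(pairwise_distances(water),
                                pairwise_distances(moved))
print(difference < 1e-9)  # distances agree up to floating-point rounding
```

Reordering the atoms would permute the rows and columns of this matrix rather than leave it unchanged, so permutation invariance needs an additional mechanism that this sketch does not cover.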
