Application of support vector machine to three-dimensional shape-based virtual screening using comprehensive three-dimensional molecular shape overlay with known inhibitors.

Research paper by Tomohiro T Sato, Hitomi H Yuki, Daisuke D Takaya, Shunta S Sasaki, Akiko A Tanaka, Teruki T Honma

Indexed on: 20 Mar '12Published on: 20 Mar '12Published in: Journal of Chemical Information and Modeling


In this study, machine learning using support vector machine was combined with three-dimensional (3D) molecular shape overlay, to improve the screening efficiency. Since the 3D molecular shape overlay does not use fingerprints or descriptors to compare two compounds, unlike 2D similarity methods, the application of machine learning to a 3D shape-based method has not been extensively investigated. The 3D similarity profile of a compound is defined as the array of 3D shape similarities with multiple known active compounds of the target protein and is used as the explanatory variable of support vector machine. As the measures of 3D shape similarity for our new prediction models, the prediction performances of the 3D shape similarity metrics implemented in ROCS, such as ShapeTanimoto and ScaledColor, were validated, using the known inhibitors of 15 target proteins derived from the ChEMBL database. The learning models based on the 3D similarity profiles stably outperformed the original ROCS when more than 10 known inhibitors were available as the queries. The results demonstrated the advantages of combining machine learning with the 3D similarity profile to process the 3D shape information of plural active compounds.