Research Article | OPEN ACCESS
Retrieval Performance using Different Type of Similarity Coefficient for Virtual Screening
1Shereena Arif, 1Noor Zeemah Shamsheh Khan, 2Nurul Malim and 1Suhaila Zainudin
1Centre of Artificial Intelligence, Faculty of Information Sciences and Technology,
Universiti Kebangsaan Malaysia, 43650 UKM Bangi, Malaysia
2School of Computer Science, Universiti Sains Malaysia, 11800 Penang, Malaysia
Research Journal of Applied Sciences, Engineering and Technology 2015 5:391-395
Received: September 22, 2014 | Accepted: October 24, 2014 | Published: February 15, 2015
Abstract
Developing a new drug relies on chemical databases as reference sources for finding lead compounds. This study aims to determine the best similarity coefficient for virtual screening of chemical databases. We first calculated the structural resemblance between every pair of chemical structures within each activity class to obtain the Mean Pairwise Similarity (MPS), which characterizes the heterogeneity of the activity classes in the natural product and synthetic chemical databases. Each structure was represented by a 2D ECFC4 fingerprint, and the Tanimoto coefficient was used to calculate the similarity score between each pair of structures in the same activity class. The MPS for an activity class was obtained by averaging all similarity scores within that class. Next, three similarity coefficients (Tanimoto, Russell-Rao and Forbes) were used to calculate the similarity score between a query structure and each database structure. The results indicate that the Tanimoto coefficient outperforms the Russell-Rao and Forbes coefficients in retrieval tasks on chemical databases. We therefore recommend the Tanimoto coefficient for virtual screening in drug development. Further work is needed to determine the combination of similarity coefficient and fingerprint type that gives optimal retrieval performance.
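The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each fingerprint is reduced to the set of its "on" bit positions (ECFC4 is a count-based fingerprint, so this binary view is a simplification), and it uses the standard binary forms of the three coefficients: with a and b the numbers of bits set in the two structures, c the number of bits in common and n the fingerprint length, Tanimoto = c / (a + b - c), Russell-Rao = c / n and Forbes = n·c / (a·b). The function names and the `n_bits` parameter are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: common bits / bits set in either fingerprint."""
    common = len(a & b)
    union = len(a) + len(b) - common
    return common / union if union else 0.0


def russell_rao(a: set, b: set, n_bits: int) -> float:
    """Russell-Rao coefficient: common bits / fingerprint length (assumed n_bits)."""
    return len(a & b) / n_bits


def forbes(a: set, b: set, n_bits: int) -> float:
    """Forbes coefficient: n * common bits / (bits in a * bits in b)."""
    denom = len(a) * len(b)
    return n_bits * len(a & b) / denom if denom else 0.0


def mean_pairwise_similarity(fingerprints: list) -> float:
    """Average Tanimoto score over all unordered pairs in one activity class."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(tanimoto(x, y) for x, y in pairs) / len(pairs)


def rank_database(query: set, database: dict, coefficient) -> list:
    """Score every database structure against the query, best match first."""
    scored = [(name, coefficient(query, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

To rank with Russell-Rao or Forbes, the fingerprint length can be bound first, for example `rank_database(query, db, lambda q, fp: forbes(q, fp, 1024))`, where 1024 is an arbitrary illustrative length. Retrieval performance would then be judged by how many known actives appear near the top of each ranking.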
Keywords:
Chemoinformatics, mean pairwise similarity, retrieval, similarity search, virtual screening
Competing interests
The authors have no competing interests.
Open Access Policy
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ISSN (Online): 2040-7467
ISSN (Print): 2040-7459