ImpClust: An Algorithm to Cluster Chemical Datasets for Drug Discovery.

Bibliographic Details
Title:	ImpClust: An Algorithm to Cluster Chemical Datasets for Drug Discovery.
Authors:	Bhagat, Hutashan V., Kiran, G. Uday, Singh, Manminder
Source:	International Journal of Intelligent Engineering & Systems; 2024, Vol. 17 Issue 1, p54-64, 11p
Subject Terms:	DRUG discovery, LEAD compounds, CHEMICAL processes, ALGORITHMS, CLUSTER analysis (Statistics), FUZZY algorithms
Abstract:	Data clustering, an unsupervised machine learning technique, plays a critical part in drug discovery in chemoinformatics. Over the past decades, researchers have developed numerous clustering algorithms that are well suited to analyzing large, high-dimensional chemical datasets. One application is lead compound selection, the process of identifying a chemical compound that helps treat a disease and leads to the development of a new drug. The quantitative structure-property relationship (QSPR) process identifies compounds with similar properties by applying clustering algorithms to the structural descriptors of the chemical compounds. The quantitative structure-activity relationship (QSAR) process uses cluster analysis to identify empirical relationships between chemical structure and biological activity among similar compounds. Chemists also use cluster analysis to control the acute toxicity of chemical compounds during drug discovery. Considering these numerous applications, this paper proposes an improved clustering algorithm, ImpClust, to cluster similar compounds based on chemical composition. Five benchmark datasets are used to evaluate the performance of the proposed algorithm, and the experimental results are compared with those of five commonly used clustering algorithms. Five cluster validation indexes (DI-Index, COP-Index, DB-Index, CH-Index, and Silhouette Index) are used to evaluate the clusters formed by the different clustering algorithms. The experimental findings show that the proposed ImpClust algorithm achieves significantly higher scores for the Silhouette Index, DI-Index, and CH-Index, and significantly lower scores for the COP-Index and DB-Index, than the five existing clustering techniques. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index

Description
ISSN:	2185310X
DOI:	10.22266/ijies2024.0229.06
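
The abstract reports that higher values are better for the Silhouette Index and the DI-Index (Dunn index). As a rough illustration of why, the sketch below computes both for a made-up two-dimensional toy dataset under a good and a poor partition. It is not the ImpClust algorithm and does not reproduce the paper's datasets or results. The function names, the toy points, the use of Euclidean distance, and the particular Dunn variant (smallest between-cluster point distance divided by largest within-cluster diameter) are all assumptions made for this example.

```python
import math
from itertools import combinations


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest within-cluster diameter.

    Assumes at least two clusters and at least one cluster with two
    distinct points (otherwise the diameter is zero).
    """
    clusters = group_by_label(points, labels)
    max_diameter = max(
        math.dist(a, b)
        for members in clusters.values()
        for a, b in combinations(members, 2)
    )
    min_separation = min(
        math.dist(a, b)
        for (_, first), (_, second) in combinations(clusters.items(), 2)
        for a in first
        for b in second
    )
    return min_separation / max_diameter


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points.

    a = mean distance to the other members of the point's own cluster,
    b = smallest mean distance to the members of any other cluster.
    Points in singleton clusters score 0 by convention.
    """
    clusters = group_by_label(points, labels)
    scores = []
    for point, own in zip(points, labels):
        same = clusters[own]
        if len(same) == 1:
            scores.append(0.0)
            continue
        # The sum includes dist(point, point) == 0, hence len(same) - 1.
        a = sum(math.dist(point, q) for q in same) / (len(same) - 1)
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for label, members in clusters.items()
            if label != own
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
poor_labels = [0, 1, 0, 1, 0, 1]  # mixes the two groups

for name, labels in [("good", good_labels), ("poor", poor_labels)]:
    print(name, dunn_index(points, labels), silhouette_score(points, labels))
```

On this toy data the partition that matches the visible grouping scores higher on both indexes than the mixed one, which is the direction of comparison the abstract uses. The remaining indexes named in the abstract (COP, DB, CH) are not implemented here.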