Tautomeric Conflicts in Forty Small-Molecule Databases

Bibliographic Details
Title: Tautomeric Conflicts in Forty Small-Molecule Databases
Authors: Marc Nicklaus, Devendra Kumar Dhaked
Publication info: American Chemical Society (ACS), 2021.
Publication year: 2021
Description: We have analyzed forty different databases ranging in size from a few thousand to nearly 100 million molecules, comprising a total of over 200 million structures, for their tautomeric conflicts. A tautomeric conflict is defined as an occurrence of two or more structures within a data set that the applied tautomeric rules identify as being tautomers of each other. We tested a total of 119 detailed tautomeric transform rules expressed as SMIRKS, of which 79 yielded at least one conflict. The databases analyzed spanned a wide variety of types, including large aggregating databases, drug collections, and experimentally based structure collections. Almost all databases analyzed showed intra-database tautomeric conflicts. The conflict rates, as a percentage of the database, were typically in the range of a few tenths of a percent, which for the largest databases amounts to more than 100,000 cases per database.
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::22cf188a009b3affddf02367c5952a77
https://doi.org/10.26434/chemrxiv.14779254
Rights: OPEN
Accession number: edsair.doi...........22cf188a009b3affddf02367c5952a77
Database: OpenAIRE
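
As a rough illustration of the conflict definition in the description, the following minimal sketch groups structures by a tautomer-invariant key and reports every group with two or more distinct members. It is not the authors' method: the lookup table `TOY_TAUTOMER_KEY`, the function names, and the rate definition (share of unique structures involved in any conflict) are all assumptions made for this example, and a real analysis would derive the key from transform rules such as the SMIRKS mentioned above.

```python
from collections import defaultdict

# Hypothetical stand-in for real tautomer rules: maps a SMILES string to an
# arbitrary label shared by all members of its tautomer family.
TOY_TAUTOMER_KEY = {
    "Oc1ccccn1": "pyridone-family",      # 2-hydroxypyridine
    "O=c1cccc[nH]1": "pyridone-family",  # 2-pyridone
    "c1ccccc1": "benzene",
    "CCO": "ethanol",
}


def find_conflicts(smiles_list, key_of):
    """Return {key: sorted structures} for keys shared by 2+ distinct structures."""
    groups = defaultdict(set)
    for smi in smiles_list:
        groups[key_of(smi)].add(smi)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


def conflict_rate_percent(smiles_list, conflicts):
    """Percentage of unique structures that take part in any conflict (assumed definition)."""
    unique = set(smiles_list)
    involved = sum(len(members) for members in conflicts.values())
    return 100.0 * involved / len(unique) if unique else 0.0


database = ["Oc1ccccn1", "O=c1cccc[nH]1", "c1ccccc1", "CCO"]
conflicts = find_conflicts(database, TOY_TAUTOMER_KEY.get)
rate = conflict_rate_percent(database, conflicts)
print(conflicts, rate)
```

At the scale quoted in the description, a rate of 0.1% applied to 100 million structures corresponds to 100,000 cases, which matches the order of magnitude reported for the largest databases.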