Uma Nova Amostragem de Descritores para Predição de Atividade Biológica

Bibliographic details
Title: Uma Nova Amostragem de Descritores para Predição de Atividade Biológica (A New Descriptor Sampling for Biological Activity Prediction)
Authors: João Vitor Soares Tenório
Contributors: Loïc Pascal Gilles Cerf, João Paulo Ataide Martins, Raquel Cardoso de Melo, Renato Martins Assuncao
Source: Repositório Institucional da UFMG
Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
Publication info: Universidade Federal de Minas Gerais, 2018.
Publication year: 2018
Subject terms: Chemometrics, Machine Learning, QSAR (Biochemistry), QSAR, Computer Learning, Bioinformatics, Computing
Description: Computer-aided drug design (CADD) uses predictive models to design and improve compounds that have biological activity and can be used as drugs. LQTA-QSAR is a CADD technique in which the descriptors used to train the predictive model are sampled by placing the conformational ensemble profiles (CEP) of the compounds in a 3D grid and calculating the interaction between the CEP and a probe at the points of that grid. The problem with this sampling is that when the probe passes through points inside the CEP, descriptors with unrealistic values are sampled. This dissertation proposes a new sampling that takes the shape of the CEP into account and prevents the probe from passing through points inside or too close to the CEP. Experiments were carried out on sets of compounds used as drugs to treat various diseases. The proposal improved the precision of the predictive models in the six scenarios evaluated. The largest percentage increase obtained was 44%.
Machine learning methods are being used to solve a range of problems in bioinformatics and chemometrics. One such problem is computer-aided drug design (CADD), which uses predictive modeling to design and improve compounds that have biological activity and can be used as drugs. One of the techniques used in CADD is the study of quantitative structure-activity relationships (QSAR), which makes it possible to develop a predictive model relating the properties of the compounds to their biological activities. This model is typically a linear regression. LQTA-QSAR is a 4D-QSAR technique in which the descriptors used to train the predictive model are sampled by aligning the conformational ensemble profiles (CEP) of the compounds in a 3D grid and calculating the interaction between the CEP and a probe (which can be an atom, ion, or functional group) at each point of this grid.
The problem with this sampling is that the probe crosses the CEP: when the probe falls inside or close to an atom of the CEP, some descriptors take unrealistic values. To overcome this problem, this thesis proposes a new approach to sampling descriptors, which uses surface expansions defined by the convex hull to construct layers around the CEP through which the probe must pass. This sampling prevents the probe from passing through points inside or too close to the CEP. To validate the proposal, several experiments were carried out on sets of compounds that can be used as drugs for the treatment of several diseases. The results showed that the proposal built predictive models with greater precision than the original method in the six scenarios evaluated. The highest percentage increase was 44%. The thesis also proposes a workflow in which linear regression is replaced by a regression tree, which yields models that are easier to interpret. Experiments with this new workflow were also carried out in six scenarios: in one case the precision was superior to that of the linear models, and in the other cases it was lower but still satisfactory.
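The sampling problem described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the descriptor is modeled as a 12-6 Lennard-Jones energy, the parameter values (`epsilon`, `sigma`, `min_dist`) and the two-atom "CEP" are hypothetical, and a plain distance cutoff stands in for the convex-hull layer construction, which is not reproduced here.

```python
import itertools
import math


def lennard_jones(r, epsilon=0.1, sigma=3.0):
    """12-6 Lennard-Jones energy at distance r (illustrative parameters)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def grid_points(lo, hi, step):
    """Return the points of a regular cubic grid covering [lo, hi] per axis."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return list(itertools.product(axis, axis, axis))


def sample_descriptors(atoms, points, min_dist=None):
    """Map each grid point to a probe-atom interaction energy.

    If min_dist is given, points closer than min_dist to any atom are
    skipped. This is a simple distance filter, NOT the convex-hull
    layering proposed in the thesis.
    """
    descriptors = {}
    for p in points:
        dists = [math.dist(p, a) for a in atoms]
        if min_dist is not None and min(dists) < min_dist:
            continue
        # Clamp the distance so a probe sitting exactly on an atom
        # does not cause a division by zero.
        descriptors[p] = sum(lennard_jones(max(d, 1e-3)) for d in dists)
    return descriptors


# Hypothetical two-atom structure standing in for a CEP.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
points = grid_points(-4.0, 4.0, 1.0)

naive = sample_descriptors(atoms, points)
filtered = sample_descriptors(atoms, points, min_dist=2.5)

print("points sampled:", len(naive), "->", len(filtered))
print("largest energy:", max(naive.values()), "->", max(filtered.values()))
```

In this toy setting the unfiltered grid includes a point that coincides with an atom, so its energy is dominated by the repulsive term and dwarfs every other descriptor value. Excluding points near the atoms keeps the sampled values in a bounded range, which is the effect the abstract attributes to keeping the probe outside the CEP.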
Language: Portuguese
Access URL: https://explore.openaire.eu/search/publication?articleId=od______3056::c081e148ab20e2eac6803b3a9a806c16
Rights: OPEN
Accession number: edsair.od......3056..c081e148ab20e2eac6803b3a9a806c16
Database: OpenAIRE