Academic Journal
Improving structural variant clustering to reduce the negative effect of the breakpoint uncertainty problem
Title: | Improving structural variant clustering to reduce the negative effect of the breakpoint uncertainty problem |
---|---|
Authors: | Jan Geryk, Alzbeta Zinkova, Iveta Zedníková, Halina Simková, Vlastimil Stenzl, Marie Korabecna |
Source: | BMC Bioinformatics, Vol 22, Iss 1, Pp 1-14 (2021) |
Publisher Information: | BMC, 2021. |
Publication Year: | 2021 |
Collection: | LCC: Computer applications to medicine. Medical informatics; LCC: Biology (General) |
Subject Terms: | Structural variants, Breakpoints uncertainty problem, Whole genome sequencing, Mendelian inheritance error, Constrained clustering, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5 |
Description: | Abstract. Background: Structural variants (SVs) represent an important source of genetic variation. One of the most critical problems in their detection is breakpoint uncertainty, that is, the inability to determine their exact genomic position. Breakpoint uncertainty is characteristic of structural variants detected via short-read sequencing methods and complicates subsequent population analyses. The commonly used heuristic strategy reduces this issue by clustering/merging nearby structural variants of the same type before the data from individual samples are merged. Results: We compared the two most commonly used dissimilarity measures for SV clustering in terms of Mendelian inheritance errors (MIE), kinship prediction, and deviation from Hardy–Weinberg equilibrium. As a new measure of dataset consistency, we analyzed the occurrence of Mendelian-inconsistent SV clusters that can be collapsed into one Mendelian-consistent SV. We also developed a new method based on constrained clustering that explicitly identifies these types of clusters. Conclusions: We found that the dissimilarity measure based on the distance between SV breakpoints produces slightly better results than the measure based on SV overlap. This difference is evident in the trivial and corrected clustering strategies, but not in the constrained clustering strategy. However, the constrained clustering strategy provided the best results in all aspects, regardless of the dissimilarity measure used. |
Document Type: | article |
File Description: | electronic resource |
Language: | English |
ISSN: | 1471-2105 |
Relation: | https://doaj.org/toc/1471-2105 |
DOI: | 10.1186/s12859-021-04374-3 |
Access URL: | https://doaj.org/article/85600dc053d147b4a0c29b0ce31f7ec1 |
Accession Number: | edsdoj.85600dc053d147b4a0c29b0ce31f7ec1 |
Database: | Directory of Open Access Journals |
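
The abstract names two families of dissimilarity measure (breakpoint distance and overlap) and uses Mendelian inheritance errors as a quality criterion, but this record does not give their exact definitions. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `SV` type, the max-shift distance, the Jaccard-style overlap, and the 0/1/2 genotype coding are all illustrative choices.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical record for one structural variant call (end > start assumed)."""
    chrom: str
    start: int
    end: int
    svtype: str


def breakpoint_distance(a: SV, b: SV) -> int:
    """Assumed definition: the larger of the two breakpoint shifts, in bp."""
    return max(abs(a.start - b.start), abs(a.end - b.end))


def overlap_dissimilarity(a: SV, b: SV) -> float:
    """Assumed definition: 1 minus the Jaccard overlap of the two intervals."""
    inter = max(0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - inter
    return 1.0 - inter / union


def mendelian_consistent(father: int, mother: int, child: int) -> bool:
    """Autosomal trio check; genotypes are alternate-allele counts (0, 1 or 2)."""
    # Alleles a parent with a given genotype can transmit (0 = ref, 1 = alt).
    transmits = {0: {0}, 1: {0, 1}, 2: {1}}
    return any(f + m == child
               for f in transmits[father]
               for m in transmits[mother])


if __name__ == "__main__":
    a = SV("chr1", 1000, 2000, "DEL")
    b = SV("chr1", 1100, 2050, "DEL")
    print(breakpoint_distance(a, b))
    print(round(overlap_dissimilarity(a, b), 3))
    print(mendelian_consistent(father=0, mother=1, child=2))
```

In a clustering step, either function could serve as the pairwise dissimilarity between same-type SVs on the same chromosome; a genotype that fails `mendelian_consistent` in a trio would count as one Mendelian inheritance error.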