BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS and TSEBRA

Bibliographic Details
Title: BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS and TSEBRA
Authors: Gabriel, Lars, Brůna, Tomáš, Hoff, Katharina J., Ebel, Matthis, Lomsadze, Alexandre, Borodovsky, Mark, Stanke, Mario
Source: bioRxiv
Publication information: Cold Spring Harbor Laboratory, 2023.
Publication year: 2023
Subject terms: Article
Description: Gene prediction remains an active area of bioinformatics research. Challenges are presented by large eukaryotic genomes and heterogeneous data situations. To meet the challenges, several streams of evidence must be integrated, from protein homology and transcriptome data, as well as information derived from the genome itself. The amount and significance of the available evidence from transcriptomes and proteomes vary from genome to genome, between genes and even along a single gene. User-friendly and accurate annotation pipelines that can cope with such data heterogeneity are needed. The previously developed annotation pipelines BRAKER1 and BRAKER2 use RNA-Seq or protein data, respectively, but not both. The recently released GeneMark-ETP integrates all three types of data and achieves much higher levels of accuracy. We here present the BRAKER3 pipeline that builds on GeneMark-ETP and AUGUSTUS and further improves accuracy using the TSEBRA combiner. BRAKER3 annotates protein-coding genes in eukaryotic genomes using both short-read RNA-Seq and a large protein database along with statistical models learned iteratively and specifically for the target genome. We benchmarked the new pipeline on 11 species under controlled conditions on the assumed relatedness of the target species to available proteomes. BRAKER3 outperformed BRAKER1 and BRAKER2, increasing the average transcript-level F1-score by ~20 percentage points, most pronounced for species with large and complex genomes. BRAKER3 also outperforms MAKER2 and Funannotate. For the first time, we provide a Singularity container for the BRAKER software to minimize installation obstacles. Overall, BRAKER3 is an accurate, easy-to-use tool for the annotation of eukaryotic genomes.
Language: English
Access URL: https://explore.openaire.eu/search/publication?articleId=pmid________::bac399eb35e8f83cff31a0075de6a445
https://europepmc.org/articles/PMC10312602/
Rights: OPEN
Accession number: edsair.pmid..........bac399eb35e8f83cff31a0075de6a445
Database: OpenAIRE