Fundamental Limits of Pooled-DNA Sequencing

Bibliographic Details
Title: Fundamental Limits of Pooled-DNA Sequencing
Authors: Najafi, Amir; Nashta-ali, Damoun; Motahari, Seyed Abolfazl; Khani, Mehrdad; Khalaj, Babak H.; Rabiee, Hamid R.
Publication Year: 2016
Collection: Computer Science
Mathematics
Subject Terms: Computer Science - Information Theory
Description: In this paper, fundamental limits in sequencing a set of closely related DNA molecules are addressed. This problem, called pooled-DNA sequencing, encompasses many interesting problems such as haplotype phasing, metagenomics, and conventional pooled-DNA sequencing in the absence of tagging. From an information-theoretic point of view, we derive fundamental limits on the number and length of DNA reads required to achieve a reliable assembly of all the pooled DNA sequences. In particular, pooled-DNA sequencing from both noiseless and noisy reads is investigated. In the noiseless case, necessary and sufficient conditions for perfect assembly are derived. Moreover, asymptotically tight lower and upper bounds on the assembly error probability are obtained under a biologically plausible probabilistic model. For the noisy case, we propose two novel DNA read denoising methods, together with corresponding upper bounds on the assembly error probabilities. We show that, under mild conditions, the performance of reliable assembly converges to that of the noiseless regime when, for a given read length, the number of DNA reads is sufficiently large. The emergence of long-read DNA sequencing technologies in recent years suggests that our results are applicable to real-world settings.
Comment: 39 pages, Submitted to IEEE Transactions on Information Theory
Document Type: Working Paper
Access URL: http://arxiv.org/abs/1604.04735
Accession Number: edsarx.1604.04735
Database: arXiv