Weakly supervised cross-modal learning in high-content screening

Bibliographic Details
Title: Weakly supervised cross-modal learning in high-content screening
Authors: Gabriel Watkinson, Ethan Cohen, Nicolas Bourriez, Ihab Bendidi, Guillaume Bollot, Auguste Genovesio
Publication Year: 2023
Collection: Computer Science, Quantitative Biology
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Quantitative Biology - Quantitative Methods
Description: With the surge in available data from various modalities, there is a growing need to bridge the gap between different data types. In this work, we introduce a novel approach to learning cross-modal representations between image data and molecular representations for drug discovery. We propose EMM and IMM, two innovative loss functions built on top of CLIP that leverage weak supervision and cross-site replicates in High-Content Screening. Evaluating our model against a known baseline on cross-modal retrieval, we show that the proposed approach learns better representations and mitigates batch effects. In addition, we present a preprocessing method for the JUMP-CP dataset that reduces the required storage from 85 TB to a more usable 7 TB while retaining all perturbations and most of the information content.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2311.04678
Accession Number: edsarx.2311.04678
Database: arXiv