Benchmarking Multi-Domain Active Learning on Image Classification

Bibliographic Details
Title: Benchmarking Multi-Domain Active Learning on Image Classification
Authors: Li, Jiayi; Taori, Rohan; Hashimoto, Tatsunori B.
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Description: Active learning aims to enhance model performance by strategically labeling informative data points. While extensively studied, its effectiveness on large-scale, real-world datasets remains underexplored. Existing research primarily focuses on single-source data, ignoring the multi-domain nature of real-world data. We introduce a multi-domain active learning benchmark to bridge this gap. Our benchmark demonstrates that traditional single-domain active learning strategies are often less effective than random selection in multi-domain scenarios. We also introduce CLIP-GeoYFCC, a novel large-scale image dataset built around geographical domains, in contrast to existing genre-based domain datasets. Analysis on our benchmark shows that all multi-domain strategies exhibit significant tradeoffs, with no strategy outperforming across all datasets or all metrics, emphasizing the need for future research.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2312.00364
Accession Number: edsarx.2312.00364
Database: arXiv