Density Ratio Estimation and Neyman Pearson Classification with Missing Data

Bibliographic Details
Title: Density Ratio Estimation and Neyman Pearson Classification with Missing Data
Authors: Givens, Josh; Liu, Song; Reeve, Henry W J
Publication Year: 2023
Collection: Computer Science
Statistics
Subject Terms: Statistics - Machine Learning, Computer Science - Machine Learning, Statistics - Methodology
Description: Density Ratio Estimation (DRE) is an important machine learning technique with many downstream applications. We consider the challenge of DRE with missing not at random (MNAR) data. In this setting, we show that using standard DRE methods leads to biased results while our proposal (M-KLIEP), an adaptation of the popular DRE procedure KLIEP, restores consistency. Moreover, we provide finite sample estimation error bounds for M-KLIEP, which demonstrate minimax optimality with respect to both sample size and worst-case missingness. We then adapt an important downstream application of DRE, Neyman-Pearson (NP) classification, to this MNAR setting. Our procedure both controls Type I error and achieves high power, with high probability. Finally, we demonstrate promising empirical performance on both synthetic data and real-world data with simulated missingness.
Comment: 40 pages, 11 figures. To be published in the proceedings of AISTATS 2023
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2302.10655
Accession Number: edsarx.2302.10655
Database: arXiv
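
The description refers to KLIEP, the standard DRE procedure that M-KLIEP adapts. For orientation, here is a minimal sketch of ordinary KLIEP on fully observed data, assuming a log-linear ratio model whose features are the raw inputs. It is not the paper's M-KLIEP and applies no missingness correction. The function names `fit_kliep_loglinear` and `density_ratio`, the step size, and the toy Gaussian data are all illustrative assumptions rather than anything taken from the record.

```python
import numpy as np


def fit_kliep_loglinear(x_nu, x_de, n_steps=500, lr=0.1):
    """Fit theta for r(x) = exp(theta . x) / mean_de[exp(theta . x)].

    Plain KLIEP on complete data (no missingness handling): maximise
    mean_nu[theta . x] - log mean_de[exp(theta . x)] by gradient ascent.
    The raw-input feature map is an assumption made for this sketch.
    """
    x_nu = np.asarray(x_nu, dtype=float)
    x_de = np.asarray(x_de, dtype=float)
    theta = np.zeros(x_nu.shape[1])
    mean_nu = x_nu.mean(axis=0)
    for _ in range(n_steps):
        logits = x_de @ theta
        # Softmax weights over denominator samples (shifted for stability).
        w = np.exp(logits - logits.max())
        w /= w.sum()
        # Gradient of the concave objective with respect to theta.
        grad = mean_nu - w @ x_de
        theta += lr * grad
    return theta


def density_ratio(x, x_de, theta):
    """Evaluate the fitted ratio, normalised to average 1 over x_de."""
    x = np.asarray(x, dtype=float)
    x_de = np.asarray(x_de, dtype=float)
    log_norm = np.log(np.mean(np.exp(x_de @ theta)))
    return np.exp(x @ theta - log_norm)


# Toy data: numerator N(1, 1), denominator N(0, 1).
# The true ratio is exp(x - 0.5), so theta should land near 1.
rng = np.random.default_rng(0)
x_nu = rng.normal(loc=1.0, scale=1.0, size=(2000, 1))
x_de = rng.normal(loc=0.0, scale=1.0, size=(2000, 1))
theta = fit_kliep_loglinear(x_nu, x_de)
ratios_de = density_ratio(x_de, x_de, theta)
```

In a Neyman-Pearson setting, an estimated ratio of this kind would typically be thresholded to trade Type I error against power. The description indicates that applying such a complete-data estimator directly to MNAR data gives biased results, which is the gap M-KLIEP is proposed to close.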