Predicting Batting Averages in Specific Matchups Using Generalized Linked Matrix Factorization

Bibliographic Details
Title: Predicting Batting Averages in Specific Matchups Using Generalized Linked Matrix Factorization
Authors: O'Connell, Michael J.
Publication Year: 2024
Collection: Statistics
Subject Terms: Statistics - Methodology
Description: Predicting batting averages for specific batters against specific pitchers is a challenging problem in baseball. Previous methods for estimating batting averages in these matchups have used regression models that can incorporate the pitcher's and batter's individual batting averages. However, these methods are limited in their flexibility to include many additional parameters because of the challenges of high-dimensional data in regression. Dimension reduction methods can be used to incorporate many predictors into the model by finding a lower-rank set of patterns among them, providing added flexibility. This paper illustrates that dimension reduction methods can be useful for predicting batting averages. To incorporate binomial data (batting averages) as well as additional data about each batter and pitcher, this paper proposes a novel dimension reduction method that uses alternating generalized linear models to estimate shared patterns across three data sources related to batting averages. While the novel method slightly outperforms existing methods for imputing batting averages based on simulations and a cross-validation study, the biggest advantage is that it can easily incorporate other sources of data. As data-collection technology continues to improve, more variables will be available, and this method will be more accurate with more informative data in the future.
Comment: 19 pages, 4 figures
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2402.01914
Accession Number: edsarx.2402.01914
Database: arXiv
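
Illustrative sketch: The description mentions estimating low-rank patterns from binomial data with alternating generalized linear models. The following Python sketch shows that general idea for a single batter-by-pitcher matrix of hits and at-bats. It is not the paper's method: it omits the linking across three data sources, and the function names, the small ridge penalty, the initialization, and the fixed number of sweeps are all assumptions made here for illustration.

```python
import numpy as np


def sigmoid(z):
    # Inverse logit link; clipping avoids overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def binomial_loglik(hits, trials, P):
    # Binomial log-likelihood up to the constant binomial coefficient.
    P = np.clip(P, 1e-12, 1.0 - 1e-12)
    return float(np.sum(hits * np.log(P) + (trials - hits) * np.log(1.0 - P)))


def fit_binomial_glm(X, hits, trials, ridge=1e-3, n_iter=50):
    # Logistic GLM without intercept, fitted by Newton's method (IRLS).
    # The ridge term is an assumption added here for numerical stability.
    beta = np.zeros(X.shape[1])
    penalty = ridge * np.eye(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (hits - trials * p) - ridge * beta
        hess = (X.T * (trials * p * (1.0 - p))) @ X + penalty
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta


def alternating_binomial_factorization(hits, trials, rank=2, n_sweeps=20, seed=0):
    # Model (assumed for this sketch): logit(P) = U @ V.T with U (batters x rank)
    # and V (pitchers x rank). Each sweep fits one GLM per row with V fixed,
    # then one GLM per column with U fixed.
    rng = np.random.default_rng(seed)
    n_rows, n_cols = hits.shape
    V = 0.5 * rng.standard_normal((n_cols, rank))
    V[:, 0] = 1.0  # first component starts as a per-batter intercept
    U = np.zeros((n_rows, rank))
    for _ in range(n_sweeps):
        for i in range(n_rows):
            U[i] = fit_binomial_glm(V, hits[i], trials[i])
        for j in range(n_cols):
            V[j] = fit_binomial_glm(U, hits[:, j], trials[:, j])
    return U, V, sigmoid(U @ V.T)
```

Calling `alternating_binomial_factorization(hits, trials)` with two integer arrays of the same shape returns the two factor matrices and a matrix of fitted matchup probabilities. Cells with zero at-bats contribute nothing to the likelihood, so their fitted values act as imputations from the low-rank structure.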