Home Article

A Polynomial Algorithm for Best Subset Selection

2022-04-11

Speaker: WANG Xueqinprofessor of Statistics at the Department of Statistics and Finance, School of Management, University of Science and Technology of China (USTC)

Venue: Tencent meeting ID: 935-419-590

Abstract:

Best subset selection aims to find a small subset of predictors that lead to the most desirable and pre-defined prediction accuracy in a linear regression model. It is not only the most fundamental problem in regression analysis, but also has far reaching applications in every facet of research including computer science and medicine. We introduce a polynomial algorithm which under mild conditions, solves the problem. This algorithm exploits the idea of sequencing and splicing to reach the stable solution in finite steps when the sparsity level of the model is fixed but unknown. We define a novel information criterion that the algorithm uses to select the true sparsity level with a high probability. We show when the algorithm produces a stable optimal solution that is the oracle estimator of the true parameters with probability one. We also demonstrate the power of the algorithm in several numerical studies.