Infinitely Imbalanced Logistic Regression

Speaker: 

Art B. Owen

Affiliation: 

Stanford University

Date: 

Fri, 10/02/2012 - 4:00pm to 5:00pm

Venue: 

OMB-145 - Old Main Building, Kensington Campus

Abstract: 

In binary classification problems it is common for the two classes to be imbalanced: one class is very rare compared to the other. In this work we consider the infinitely imbalanced case, where one class has a finite sample size and the other class's sample size grows without bound. For logistic regression, the infinitely imbalanced case often has a useful solution. Under mild conditions, the intercept diverges as expected, but the rest of the coefficient vector approaches a nontrivial and useful limit. That limit can be expressed in terms of exponential tilting and is the minimizer of a convex objective function. The limiting form of logistic regression suggests a computational shortcut for fraud detection problems.
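As a rough illustration of the exponential tilting characterization, the sketch below is not taken from the talk. It assumes one common reading of the result: the limiting slope vector beta reweights the common-class points by exp(x'beta) so that their weighted mean equals the rare-class mean, which is the same as minimizing the convex function log mean exp((x - x_bar)'beta). The names `tilting_limit` and `tilted_mean`, the simulated Gaussian data, and the undamped Newton iteration are all assumptions made for the example. A solution exists only when the rare-class mean lies well inside the spread of the common-class points.

```python
import numpy as np


def tilted_mean(x_common, beta):
    """Mean of the common-class points under weights proportional to exp(x'beta)."""
    s = x_common @ beta
    w = np.exp(s - s.max())  # subtract the max for numerical stability
    w /= w.sum()
    return w @ x_common


def tilting_limit(x_common, x_bar_rare, n_iter=50, tol=1e-10):
    """Hypothetical helper: minimize log mean exp((x - x_bar)'beta) by Newton's method.

    The gradient is (tilted mean - x_bar) and the Hessian is the tilted
    covariance, so the minimizer matches the tilted mean to x_bar.
    """
    z = np.asarray(x_common, dtype=float) - np.asarray(x_bar_rare, dtype=float)
    beta = np.zeros(z.shape[1])
    for _ in range(n_iter):
        s = z @ beta
        w = np.exp(s - s.max())
        w /= w.sum()
        grad = w @ z                      # tilted mean minus rare-class mean
        if np.linalg.norm(grad) < tol:
            break
        zc = z - grad                     # center at the tilted mean
        hess = (zc * w[:, None]).T @ zc   # tilted covariance matrix
        beta -= np.linalg.solve(hess, grad)
    return beta


# Simulated data (an assumption for illustration): common class ~ N(0, I) in 2-D.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((100_000, 2))
x_bar = np.array([0.5, -0.3])  # pretend mean of the few rare-class points

beta = tilting_limit(x0, x_bar)
print("limiting slope estimate:", beta)
print("tilted common-class mean:", tilted_mean(x0, beta))
```

Only the rare-class mean enters the calculation, not the individual rare-class points. For a standard Gaussian common class the tilted distribution is again Gaussian with its mean shifted by beta, so the estimate here should land close to `x_bar` itself. A production version would add step-size damping to the Newton update.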

School Seminar Series: