Machine Learning Seminar: Preventing Fairness Gerrymandering in Machine Learning
The most prevalent notions of fairness in machine learning are statistical definitions: they fix a small collection of pre-defined attributes (such as race, gender, age or disability), and then ask for parity of some statistic of the classifier across these attributes. Constraints of this form are susceptible to intentional or inadvertent "fairness gerrymandering", in which a classifier appears to be fair on each attribute marginally, but badly violates the fairness constraint on one or more structured subgroups (such as disabled Hispanic women over age 55). We instead propose statistical notions of fairness that bind across exponentially (or infinitely) many subgroups, defined by a structured class of functions over the protected attributes. This interpolates between statistical definitions of fairness and recently proposed individual notions of fairness, but raises several interesting computational challenges.
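The gerrymandering phenomenon can be made concrete with a toy example (hypothetical, not taken from the paper): over two binary protected attributes, an XOR-style rule that predicts positive exactly when the two attributes agree achieves perfect statistical parity on each attribute marginally, while giving the cross-cutting subgroups a zero positive rate.

```python
import itertools

# Hypothetical toy population: four equal-size cells defined by two binary
# protected attributes (call them "race" and "gender"), 25 people per cell.
data = [(race, gender)
        for race, gender in itertools.product([0, 1], repeat=2)
        for _ in range(25)]

def classify(race, gender):
    # XOR-style rule: positive exactly when the two attributes agree.
    return 1 if race == gender else 0

def positive_rate(predicate):
    """Positive-classification rate within the subgroup selected by predicate."""
    group = [(r, g) for (r, g) in data if predicate(r, g)]
    return sum(classify(r, g) for r, g in group) / len(group)

# Marginal statistical parity holds exactly on each attribute...
print(positive_rate(lambda r, g: r == 0))  # 0.5
print(positive_rate(lambda r, g: r == 1))  # 0.5
print(positive_rate(lambda r, g: g == 1))  # 0.5
# ...yet a structured subgroup (an intersection) is treated maximally unfairly.
print(positive_rate(lambda r, g: r == 0 and g == 1))  # 0.0
```

Auditing only the marginal groups would certify this classifier as fair, which is exactly the failure mode that subgroup fairness is designed to rule out.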
We describe an algorithm that provably converges to the best subgroup-fair classifier. This algorithm is based on a formulation of subgroup fairness as a two-player zero-sum game between a Learner and an Auditor. We provide an extensive empirical evaluation of our algorithm on a number of fairness-sensitive datasets.
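The game dynamic above can be sketched in miniature. The following is a hypothetical illustration, not the paper's actual algorithm (which works with learning oracles over large hypothesis classes): here both players operate over tiny explicit classes, the Learner runs multiplicative weights over all 16 rules on two binary attributes, and the Auditor best-responds with the conjunction-defined subgroup exhibiting the largest size-weighted statistical-parity violation of the Learner's current randomized classifier.

```python
import itertools
import math

# Toy population: four cells over two binary protected attributes, with
# hypothetical labels chosen so accuracy and subgroup fairness conflict.
data = [(a1, a2) for a1 in (0, 1) for a2 in (0, 1) for _ in range(25)]
labels = {(0, 0): 1, (0, 1): 0, (1, 0): 1, (1, 1): 1}

cells = list(itertools.product((0, 1), repeat=2))
# Learner's class: all 16 deterministic rules (subsets of cells labeled positive).
classifiers = [frozenset(c) for r in range(5)
               for c in itertools.combinations(cells, r)]
# Auditor's class: conjunctions over the protected attributes (None = "don't care").
subgroups = [(v1, v2) for v1 in (None, 0, 1) for v2 in (None, 0, 1)]

def predict(h, x):
    return int(x in h)

def in_group(g, x):
    return all(v is None or v == x[i] for i, v in enumerate(g))

def error(h):
    return sum(predict(h, x) != labels[x] for x in data) / len(data)

def violation(mixture, g):
    """Size-weighted statistical-parity gap of a randomized classifier on g."""
    members = [x for x in data if in_group(g, x)]
    if not members:
        return 0.0
    def rate(xs):
        return sum(w * predict(h, x)
                   for h, w in mixture.items() for x in xs) / len(xs)
    return len(members) / len(data) * abs(rate(data) - rate(members))

weights = {h: 1.0 for h in classifiers}
plays, lam, eta = [], 5.0, 0.5
for _ in range(50):
    total = sum(weights.values())
    mixture = {h: w / total for h, w in weights.items()}
    plays.append(mixture)
    # Auditor best-responds: the most violated subgroup for the current mixture.
    g = max(subgroups, key=lambda g: violation(mixture, g))
    # Learner's multiplicative-weights update against error plus fairness penalty.
    for h in classifiers:
        loss = error(h) + lam * violation({h: 1.0}, g)
        weights[h] *= math.exp(-eta * loss)

# The average of the Learner's plays approximates the game's equilibrium.
avg = {h: sum(p[h] for p in plays) / len(plays) for h in classifiers}
worst = max(violation(avg, g) for g in subgroups)
print(f"worst subgroup violation of averaged play: {worst:.3f}")
```

The averaged play corresponds to a randomized classifier; at an equilibrium of the zero-sum game, no subgroup in the Auditor's class retains a large violation, which is the sense in which the dynamic converges to a subgroup-fair solution.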
Joint work with Seth Neel, Aaron Roth and Zhiwei Steven Wu, based on papers at ICML 2018 and FAT* 2019.