Monograph records
https://feeds.library.caltech.edu/people/Ligett-K/monograph.rss
A Caltech Library Repository Feed
http://www.rssboard.org/rss-specification
Generator: python-feedgen
Language: en
Last updated: Thu, 30 Nov 2023 18:11:39 +0000

Learning to Prune: Speeding up Repeated Computations
https://resolver.caltech.edu/CaltechAUTHORS:20190626-145729225
Authors: Alabi, Daniel; Kalai, Adam Tauman; Ligett, Katrina; Musco, Cameron; Tzamos, Christos; Vitercik, Ellen
Year: 2019
DOI: 10.48550/arXiv.1904.11875
It is common to encounter situations where one must solve a sequence of similar computational problems. Running a standard algorithm with worst-case runtime guarantees on each instance will fail to take advantage of valuable structure shared across the problem instances. For example, when a commuter drives from work to home, there are typically only a handful of routes that will ever be the shortest path. A naive algorithm that does not exploit this common structure may spend most of its time checking roads that will never be in the shortest path. More generally, we can often ignore large swaths of the search space that will likely never contain an optimal solution.
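The commuter intuition suggests caching previously optimal solutions and only occasionally re-running a full search. The sketch below is an illustrative simplification (the class and helper names and the fixed exploration probability are assumptions of this sketch, not the paper's algorithm, which adapts its exploration to guarantee correctness with high probability):

```python
import random

class LearnedPruner:
    """Illustrative explore-exploit pruner: remember every solution that has
    ever been optimal, and only occasionally fall back to a full search."""

    def __init__(self, full_search, explore_prob=0.2, seed=0):
        self.full_search = full_search    # exact but slow solver: instance -> solution
        self.explore_prob = explore_prob  # chance of re-checking the full search space
        self.candidates = set()           # solutions observed to be optimal so far
        self.rng = random.Random(seed)

    def solve(self, instance, cost):
        # Explore: run the exact solver and remember its answer.
        if not self.candidates or self.rng.random() < self.explore_prob:
            best = self.full_search(instance)
            self.candidates.add(best)
            return best
        # Exploit: evaluate only the pruned candidate set.
        return min(self.candidates, key=lambda s: cost(instance, s))

# Toy commuter example: three fixed routes with per-day traffic costs.
routes = ("highway", "surface", "scenic")

def day_cost(traffic, route):
    return traffic[route]

pruner = LearnedPruner(full_search=lambda t: min(routes, key=lambda r: t[r]))
for day in range(10):
    traffic = {r: random.random() for r in routes}
    answer = pruner.solve(traffic, day_cost)
```

On each "day", the pruner either checks all roads (explore) or only the handful of routes that have ever been best (exploit), mirroring the shared structure described above.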
We present an algorithm that learns to maximally prune the search space on repeated computations, thereby reducing runtime while provably outputting the correct solution each period with high probability. Our algorithm employs a simple explore-exploit technique resembling those used in online algorithms, though our setting is quite different. We prove that, with respect to our model of pruning search spaces, our approach is optimal up to constant factors. Finally, we illustrate the applicability of our model and algorithm to three classic problems: shortest-path routing, string search, and linear programming. We present experiments confirming that our simple algorithm is effective at significantly reducing the runtime of solving repeated computations.
https://authors.library.caltech.edu/records/4wnh4-mv004

A necessary and sufficient stability notion for adaptive generalization
https://resolver.caltech.edu/CaltechAUTHORS:20190626-101449888
Authors: Ligett, Katrina; Shenfeld, Moshe
Year: 2019
DOI: 10.48550/arXiv.1906.00930
We introduce a new notion of the stability of computations, which holds under post-processing and adaptive composition. We show that this notion is both necessary and sufficient to ensure generalization in the face of adaptivity, for any computation that responds to bounded-sensitivity linear queries while providing accuracy with respect to the data sample set. The stability notion is based on quantifying the effect that observing a computation's outputs has on the posterior over the data sample elements. We show a separation between this stability notion and previously studied notions.
https://authors.library.caltech.edu/records/4hdgc-1nf60

Truthful Linear Regression
https://resolver.caltech.edu/CaltechAUTHORS:20190627-150412956
Authors: Cummings, Rachel; Ioannidis, Stratis; Ligett, Katrina
Year: 2019
DOI: 10.48550/arXiv.1506.03489
We consider the problem of fitting a linear model to data held by individuals who are concerned about their privacy. Incentivizing most players to truthfully report their data to the analyst constrains our design to mechanisms that provide a privacy guarantee to the participants; we use differential privacy to model individuals' privacy losses. This immediately poses a problem: differentially private computation of a linear model necessarily produces a biased estimate, and existing approaches to designing mechanisms that elicit data from privacy-sensitive individuals do not generalize well to biased estimators. We overcome this challenge through an appropriate design of the computation and payment scheme.
https://authors.library.caltech.edu/records/h40vh-exq36

Penalizing Unfairness in Binary Classification
https://resolver.caltech.edu/CaltechAUTHORS:20190627-153828844
Authors: Bechavod, Yahav; Ligett, Katrina
Year: 2019
DOI: 10.48550/arXiv.1707.00044
We present a new approach for mitigating unfairness in learned classifiers. In particular, we focus on binary classification tasks over individuals from two populations, where, as our criterion for fairness, we wish to achieve similar false positive rates in both populations, and similar false negative rates in both populations. As a proof of concept, we implement our approach and empirically evaluate its ability to achieve both fairness and accuracy, using datasets from the fields of criminal risk assessment, credit, lending, and college admissions.
https://authors.library.caltech.edu/records/gzzjp-dfz12

Equal Opportunity in Online Classification with Partial Feedback
https://resolver.caltech.edu/CaltechAUTHORS:20190626-152715232
Authors: Bechavod, Yahav; Ligett, Katrina; Roth, Aaron; Waggoner, Bo; Wu, Zhiwei Steven
Year: 2019
DOI: 10.48550/arXiv.1902.02242
We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates, introduced as "equal opportunity" in Hardt et al. (2016)) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle efficient algorithm that achieves the upper bound.
https://authors.library.caltech.edu/records/79zf1-8qz79

Efficiently characterizing games consistent with perturbed equilibrium observations
https://resolver.caltech.edu/CaltechAUTHORS:20190627-103855805
Authors: Ziani, Juba; Chandrasekaran, Venkat; Ligett, Katrina
Year: 2019
DOI: 10.48550/arXiv.1603.01318
We study the problem of characterizing the set of games that are consistent with observed equilibrium play. Our contribution is to develop and analyze a new methodology based on convex optimization to address this problem for many classes of games and observation models of interest. Our approach provides a sharp, computationally efficient characterization of the extent to which a particular set of observations constrains the space of games that could have generated them. This allows us to solve a number of variants of this problem as well as to quantify the power of games from particular classes (e.g., zero-sum, potential, linearly parameterized) to explain player behavior. We illustrate our approach with numerical simulations.
https://authors.library.caltech.edu/records/j6dy9-09z38

Take it or Leave it: Running a Survey when Privacy Comes at a Cost
https://resolver.caltech.edu/CaltechAUTHORS:20190628-104835073
Authors: Ligett, Katrina; Roth, Aaron
Year: 2019
DOI: 10.48550/arXiv.1202.4741
In this paper, we consider the problem of estimating a potentially sensitive (individually stigmatizing) statistic on a population. In our model, individuals are concerned about their privacy, and experience some cost as a function of their privacy loss. Nevertheless, they would be willing to participate in the survey if they were compensated for their privacy cost. These cost functions are not publicly known, however, nor do we make Bayesian assumptions about their form or distribution. Individuals are rational and will misreport their costs for privacy if doing so is in their best interest. Ghosh and Roth recently showed that in this setting, when costs for privacy loss may be correlated with private types and individuals value differential privacy, no individually rational direct revelation mechanism can compute any non-trivial estimate of the population statistic. In this paper, we circumvent this impossibility result by proposing a modified notion of how individuals experience cost as a function of their privacy loss, and by giving a mechanism which does not operate by direct revelation. Instead, our mechanism can randomly approach individuals from a population and make them a take-it-or-leave-it offer. This is intended to model the abilities of a surveyor who may stand on a street corner and approach passers-by.
https://authors.library.caltech.edu/records/hea4q-5ry15

Putting Peer Prediction Under the Micro(economic)scope and Making Truth-telling Focal
https://resolver.caltech.edu/CaltechAUTHORS:20190628-082029235
Authors: Kong, Yuqing; Schoenebeck, Grant; Ligett, Katrina
Year: 2019
DOI: 10.48550/arXiv.1603.07319
Peer-prediction is a (meta-)mechanism which, given any proper scoring rule, produces a mechanism to elicit privately-held, non-verifiable information from self-interested agents. Formally, truth-telling is a strict Nash equilibrium of the mechanism. Unfortunately, there may be other equilibria as well (including uninformative equilibria where all players simply report the same fixed signal, regardless of their true signal) and, typically, the truth-telling equilibrium does not have the highest expected payoff. The main result of this paper is to show that, in the symmetric binary setting, by tweaking peer-prediction, in part by carefully selecting the proper scoring rule it is based on, we can make the truth-telling equilibrium focal---that is, truth-telling has higher expected payoff than any other equilibrium.
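The mechanism's reliance on a proper scoring rule can be illustrated with a minimal sketch of vanilla binary peer prediction. The common-prior posterior values below are illustrative assumptions, and the paper's modifications for making truth-telling focal are not reflected here:

```python
def brier_score(prediction_prob_of_one, outcome):
    """Quadratic (Brier-style) proper scoring rule for a binary outcome.
    prediction_prob_of_one: reported probability that outcome == 1.
    Shifted so that scores lie in [0, 1]."""
    return 1.0 - (outcome - prediction_prob_of_one) ** 2

def peer_prediction_payment(report_i, report_j, posterior):
    """Classic peer-prediction payment for binary signals: agent i is scored on
    how well the posterior induced by her report predicts peer j's report.

    posterior[s]: assumed common-prior probability that the peer's signal is 1,
    given one's own signal is s (an assumption of this sketch, not the paper's).
    """
    return brier_score(posterior[report_i], report_j)

# Example: signals are positively correlated under the common prior.
posterior = {1: 0.8, 0: 0.3}  # P(peer's signal = 1 | my signal)
pay_truthful = peer_prediction_payment(1, 1, posterior)  # report matches signal
pay_lie = peer_prediction_payment(0, 1, posterior)       # misreported signal
```

Because the scoring rule is proper, truthfully reporting one's signal maximizes expected payoff against a truthful peer; the paper's contribution concerns the other equilibria this baseline mechanism admits.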
Along the way, we prove the following: in the setting where agents receive binary signals we 1) classify all equilibria of the peer-prediction mechanism; 2) introduce a new technical tool for understanding scoring rules, which allows us to make truth-telling pay better than any other informative equilibrium; 3) leverage this tool to provide an optimal version of the previous result; that is, we optimize the gap between the expected payoff of truth-telling and other informative equilibria; and 4) show that with a slight modification to the peer prediction framework, we can, in general, make the truth-telling equilibrium focal---that is, truth-telling pays more than any other equilibrium (including the uninformative equilibria).
https://authors.library.caltech.edu/records/fe520-djp54

A simple and practical algorithm for differentially private data release
https://resolver.caltech.edu/CaltechAUTHORS:20190628-094218811
Authors: Hardt, Moritz; Ligett, Katrina; McSherry, Frank
Year: 2019
DOI: 10.48550/arXiv.1012.4763
We present new theoretical results on differentially private data release useful with respect to any target class of counting queries, coupled with experimental results on a variety of real world data sets.
Specifically, we study a simple combination of the multiplicative weights approach of [Hardt and Rothblum, 2010] with the exponential mechanism of [McSherry and Talwar, 2007]. The multiplicative weights framework allows us to maintain and improve a distribution approximating a given data set with respect to a set of counting queries. We use the exponential mechanism to select those queries most incorrectly tracked by the current distribution. Combining the two, we quickly approach a distribution that agrees with the data set on the given set of queries up to small error.
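A minimal sketch of this multiplicative-weights plus exponential-mechanism loop follows; the privacy-budget split, noise scale, and update step size are illustrative simplifications rather than the paper's exact calibration:

```python
import math
import random

def mwem(true_hist, queries, epsilon, T, seed=0):
    """Sketch of the multiplicative-weights / exponential-mechanism loop.

    true_hist: normalized histogram over the data domain (the private data).
    queries:   counting queries given as 0/1 indicator vectors.
    """
    rng = random.Random(seed)
    n = len(true_hist)
    synth = [1.0 / n] * n        # synthetic distribution, initially uniform
    eps_t = epsilon / (2 * T)    # split the budget across rounds (illustrative)

    def answer(q, hist):
        return sum(qi * hi for qi, hi in zip(q, hist))

    for _ in range(T):
        # Exponential mechanism: prefer queries the current synth tracks badly.
        errors = [abs(answer(q, true_hist) - answer(q, synth)) for q in queries]
        weights = [math.exp(eps_t * e / 2) for e in errors]
        q = queries[rng.choices(range(len(queries)), weights=weights)[0]]
        # Laplace mechanism: noisy measurement of the selected query.
        u = rng.random() - 0.5
        noisy = answer(q, true_hist) + math.copysign(math.log(1 - 2 * abs(u)) / eps_t, u)
        # Multiplicative-weights update nudging synth toward the noisy answer.
        err = noisy - answer(q, synth)
        synth = [h * math.exp(q[i] * err / 2) for i, h in enumerate(synth)]
        total = sum(synth)
        synth = [h / total for h in synth]
    return synth

synth = mwem(true_hist=[0.5, 0.3, 0.2, 0.0],
             queries=[[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]],
             epsilon=5.0, T=20)
```

Each round spends a little privacy budget selecting a badly-tracked query and a little measuring it, then reweights the synthetic distribution so that records the query counts gain or lose mass as needed.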
The resulting algorithm and its analysis are simple, but nevertheless improve upon previous work in terms of both error and running time. We also empirically demonstrate the practicality of our approach on several data sets commonly used in the statistical community for contingency table release.
https://authors.library.caltech.edu/records/js3wn-n1217

Privacy-Compatibility For General Utility Metrics
https://resolver.caltech.edu/CaltechAUTHORS:20190702-104130244
Authors: Kleinberg, Robert; Ligett, Katrina
Year: 2019
DOI: 10.48550/arXiv.1010.2705
In this note, we present a complete characterization of the utility metrics that allow for non-trivial differential privacy guarantees.
https://authors.library.caltech.edu/records/4q56d-e6z05

Information-Sharing and Privacy in Social Networks
https://resolver.caltech.edu/CaltechAUTHORS:20190702-110206311
Authors: Kleinberg, Jon; Ligett, Katrina
Year: 2019
DOI: 10.48550/arXiv.1003.0469
We present a new model for reasoning about the way information is shared among friends in a social network, and the resulting ways in which it spreads. Our model formalizes the intuition that revealing personal information in social settings involves a trade-off between the benefits of sharing information with friends, and the risks that additional gossiping will propagate it to people with whom one is not on friendly terms. We study the behavior of rational agents in such a situation, and we characterize the existence and computability of stable information-sharing networks, in which agents do not have an incentive to change the partners with whom they share information. We analyze the implications of these stable networks for social welfare, and the resulting fragmentation of the social network.
https://authors.library.caltech.edu/records/gh6jz-0gh03

Differential Privacy with Compression
https://resolver.caltech.edu/CaltechAUTHORS:20190702-110751042
Authors: Zhou, Shuheng; Ligett, Katrina; Wasserman, Larry
Year: 2019
DOI: 10.48550/arXiv.0901.1365
This work studies formal utility and privacy guarantees for a simple multiplicative database transformation, where the data are compressed by a random linear or affine transformation, reducing the number of data records substantially, while preserving the number of original input variables. We provide an analysis framework inspired by a recent concept known as differential privacy (Dwork 06). Our goal is to show that, despite the general difficulty of achieving the differential privacy guarantee, it is possible to publish synthetic data that are useful for a number of common statistical learning applications. This includes high dimensional sparse regression (Zhou et al. 07), principal component analysis (PCA), and other statistical measures (Liu et al. 06) based on the covariance of the initial data.
https://authors.library.caltech.edu/records/kcd3f-5va43
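The random linear transformation described above can be sketched as follows; the i.i.d. Gaussian entries and the 1/sqrt(k) scaling are a standard choice assumed here for illustration, not necessarily the paper's exact construction:

```python
import math
import random

def compress(data, k, seed=0):
    """Replace the n data records (rows) with k << n random Gaussian
    combinations of them, keeping the original d columns/variables.
    The 1/sqrt(k) scaling preserves covariance structure in expectation."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    # k x n projection matrix with i.i.d. N(0, 1) entries.
    P = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    # Y = (1/sqrt(k)) * P * X has k rows and d columns.
    return [[sum(P[i][j] * data[j][c] for j in range(n)) / math.sqrt(k)
             for c in range(d)]
            for i in range(k)]

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]  # 4 records, 2 variables
Y = compress(X, k=2)  # 2 synthetic rows, same 2 variables
```

The published matrix Y has far fewer rows than X but the same columns, which is what lets covariance-based analyses such as sparse regression and PCA still be run on it.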