CaltechAUTHORS: Monograph

CaltechAUTHORS: Monograph https://feeds.library.caltech.edu/people/Chandrasekaran-V/monograph.rss A Caltech Library Repository Feed http://www.rssboard.org/rss-specification python-feedgen en Tue, 25 Feb 2025 19:00:17 -0800 Complexity of Inference in Graphical Models https://resolver.caltech.edu/CaltechAUTHORS:20121011-133225017 Year: 2010 Graphical models provide a convenient representation for a broad class of probability distributions. Due to their powerful and sophisticated modeling capabilities, such models have found numerous applications in machine learning and other areas. In this paper we consider the complexity of commonly encountered tasks involving graphical models such as the computation of the mode of a posterior probability distribution (i.e., MAP estimation), and the computation of marginal probabilities or the partition function. It is well-known that such inference problems are hard in the worst case, but are tractable for models with bounded treewidth. We ask whether treewidth is the only structural criterion of the underlying graph that enables tractable inference. In other words, is there some class of structures with unbounded treewidth in which inference is tractable? Subject to a combinatorial hypothesis due to Robertson, Seymour, and Thomas (1994), we show that low treewidth is indeed the only structural restriction that can ensure tractability. More precisely we show that for every growing family of graphs indexed by tree-width, there exists a choice of potential functions such that the corresponding inference problem is intractable. Thus even for the "best case" graph structures of high treewidth, there is no polynomial-time inference algorithm. Our analysis employs various concepts from complexity theory and graph theory, with graph minors playing a prominent role. https://resolver.caltech.edu/CaltechAUTHORS:20121011-133225017 Resource Allocation for Statistical Estimation https://resolver.caltech.edu/CaltechAUTHORS:20190702-091807009 Year: 2014 DOI: 10.48550/arXiv.1412.6613 Statistical estimation in many contemporary settings involves the acquisition, analysis, and aggregation of datasets from multiple sources, which can have significant differences in character and in value. Due to these variations, the effectiveness of employing a given resource (e.g., a sensing device or computing power) for gathering or processing data from a particular source depends on the nature of that source. As a result, the appropriate division and assignment of a collection of resources to a set of data sources can substantially impact the overall performance of an inferential strategy. In this expository article, we adopt a general view of the notion of a resource and its effect on the quality of a data source, and we describe a framework for the allocation of a given set of resources to a collection of sources in order to optimize a specified metric of statistical efficiency. We discuss several stylized examples involving inferential tasks such as parameter estimation and hypothesis testing based on heterogeneous data sources, in which optimal allocations can be computed either in closed form or via efficient numerical procedures based on convex optimization. https://resolver.caltech.edu/CaltechAUTHORS:20190702-091807009 Sufficient Dimension Reduction and Modeling Responses Conditioned on Covariates: An Integrated Approach via Convex Optimization https://resolver.caltech.edu/CaltechAUTHORS:20170614-101104339 Year: 2015 DOI: 10.48550/arXiv.1508.03852 Given observations of a collection of covariates and responses (Y,X) ∈ R^p × R^q, sufficient dimension reduction (SDR) techniques aim to identify a mapping f: R^q → R^k with k ≪ q such that Y|f(X) is independent of X. The image f(X) summarizes the relevant information in a potentially large number of covariates X that influence the responses Y. In many contemporary settings, the number of responses p is also quite large, in addition to a large number q of covariates. This leads to the challenge of fitting a succinctly parameterized statistical model to Y|f(X), which is a problem that is usually not addressed in a traditional SDR framework. In this paper, we present a computationally tractable convex relaxation based estimator for simultaneously (a) identifying a linear dimension reduction f(X) of the covariates that is sufficient with respect to the responses, and (b) fitting several types of structured low-dimensional models-factor models, graphical models, latent-variable graphical models - to the conditional distribution of Y|f(X). We analyze the consistency properties of our estimator in a high-dimensional scaling regime. We also illustrate the performance of our approach on a newsgroup dataset and on a dataset consisting of financial asset prices. https://resolver.caltech.edu/CaltechAUTHORS:20170614-101104339 Efficiently characterizing games consistent with perturbed equilibrium observations https://resolver.caltech.edu/CaltechAUTHORS:20190627-103855805 Year: 2016 DOI: 10.48550/arXiv.1603.01318 We study the problem of characterizing the set of games that are consistent with observed equilibrium play. Our contribution is to develop and analyze a new methodology based on convex optimization to address this problem for many classes of games and observation models of interest. Our approach provides a sharp, computationally efficient characterization of the extent to which a particular set of observations constrains the space of games that could have generated them. This allows us to solve a number of variants of this problem as well as to quantify the power of games from particular classes (e.g., zero-sum, potential, linearly parameterized) to explain player behavior. We illustrate our approach with numerical simulations. https://resolver.caltech.edu/CaltechAUTHORS:20190627-103855805 A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers https://resolver.caltech.edu/CaltechAUTHORS:20201016-144218520 Year: 2017 DOI: 10.48550/arXiv.1701.01207 Regularization techniques are widely employed in optimization-based approaches for solving ill-posed inverse problems in data analysis and scientific computing. These methods are based on augmenting the objective with a penalty function, which is specified based on prior domain-specific expertise to induce a desired structure in the solution. We consider the problem of learning suitable regularization functions from data in settings in which precise domain knowledge is not directly available. Previous work under the title of `dictionary learning' or `sparse coding' may be viewed as learning a regularization function that can be computed via linear programming. We describe generalizations of these methods to learn regularizers that can be computed and optimized via semidefinite programming. Our framework for learning such semidefinite regularizers is based on obtaining structured factorizations of data matrices, and our algorithmic approach for computing these factorizations combines recent techniques for rank minimization problems along with an operator analog of Sinkhorn scaling. Under suitable conditions on the input data, our algorithm provides a locally linearly convergent method for identifying the correct regularizer that promotes the type of structure contained in the data. Our analysis is based on the stability properties of Operator Sinkhorn scaling and their relation to geometric aspects of determinantal varieties (in particular tangent spaces with respect to these varieties). The regularizers obtained using our framework can be employed effectively in semidefinite programming relaxations for solving inverse problems. https://resolver.caltech.edu/CaltechAUTHORS:20201016-144218520