Committee Feed
https://feeds.library.caltech.edu/people/Chandrasekaran-V/committee.rss
A Caltech Library Repository Feed (RSS specification: http://www.rssboard.org/rss-specification; generator: python-feedgen; language: en; last build: Mon, 15 Apr 2024 15:06:23 +0000)
Convex Analysis for Minimizing and Learning Submodular Set Functions
https://resolver.caltech.edu/CaltechTHESIS:05312013-151014984
Authors: Peter Stobbe (ID: Stobbe-Peter)
Year: 2013
DOI: 10.7907/1A1J-SA64
<p>The connections between convexity and submodularity are explored, for purposes of minimizing and learning submodular set functions.</p>
<p>First, we develop a novel method for minimizing a particular class of submodular functions, which can be expressed as a sum of concave functions composed with modular functions. The basic algorithm uses an accelerated first order method applied to a smoothed version of its convex extension. The smoothing algorithm is particularly novel as it allows us to treat general concave potentials without needing to construct a piecewise linear approximation as with graph-based techniques.</p>
<p>Second, we derive the general conditions under which it is possible to find a minimizer of a submodular function via a convex problem. This provides a framework for developing submodular minimization algorithms. The framework is then used to develop several algorithms that can be run in a distributed fashion. This is particularly useful for applications where the submodular objective function consists of a sum of many terms, each term dependent on a small part of a large data set.</p>
<p>Lastly, we approach the problem of learning set functions from an unorthodox perspective---sparse reconstruction. We demonstrate an explicit connection between the problem of learning set functions from random evaluations and that of sparse signals. Based on the observation that the Fourier transform for set functions satisfies exactly the conditions needed for sparse reconstruction algorithms to work, we examine some different function classes under which uniform reconstruction is possible.</p>
https://thesis.library.caltech.edu/id/eprint/7798
A Geometric Analysis of Convex Demixing
https://resolver.caltech.edu/CaltechTHESIS:05202013-091317123
Authors: Michael Brian McCoy (ID: McCoy-Michael-Brian; ORCID: 0000-0002-9479-2090)
Year: 2013
DOI: 10.7907/156S-EZ89
<p>Demixing is the task of identifying multiple signals given only their sum and prior information about their structures. Examples of demixing problems include (i) separating a signal that is sparse with respect to one basis from a signal that is sparse with respect to a second basis; (ii) decomposing an observed matrix into low-rank and sparse components; and (iii) identifying a binary codeword with impulsive corruptions. This thesis describes and analyzes a convex optimization framework for solving an array of demixing problems.</p>
<p>Our framework includes a random orientation model for the constituent signals that ensures the structures are incoherent. This work introduces a summary parameter, the statistical dimension, that reflects the intrinsic complexity of a signal. The main result indicates that the difficulty of demixing under this random model depends only on the total complexity of the constituent signals involved: demixing succeeds with high probability when the sum of the complexities is less than the ambient dimension; otherwise, it fails with high probability.</p>
<p>The fact that a phase transition between success and failure occurs in demixing is a consequence of a new inequality in conic integral geometry. Roughly speaking, this inequality asserts that a convex cone behaves like a subspace whose dimension is equal to the statistical dimension of the cone. When combined with a geometric optimality condition for demixing, this inequality provides precise quantitative information about the phase transition, including the location and width of the transition region.</p>
https://thesis.library.caltech.edu/id/eprint/7726
Topics in Randomized Numerical Linear Algebra
https://resolver.caltech.edu/CaltechTHESIS:06102013-100609092
Authors: Alex A. Gittens (ID: Gittens-Alex-A)
Year: 2013
DOI: 10.7907/3K1S-R458
<p>This thesis studies three classes of randomized numerical linear algebra algorithms, namely: (i) randomized matrix sparsification algorithms, (ii) low-rank approximation algorithms that use randomized unitary transformations, and (iii) low-rank approximation algorithms for symmetric positive-semidefinite (SPSD) matrices. </p>
<p>Randomized matrix sparsification algorithms set randomly chosen entries of the input matrix to zero. When the approximant is substituted for the original matrix in computations, its sparsity allows one to employ faster sparsity-exploiting algorithms. This thesis contributes bounds on the approximation error of nonuniform randomized sparsification schemes, measured in the spectral norm and two NP-hard norms that are of interest in computational graph theory and subset selection applications.</p>
<p> Low-rank approximations based on randomized unitary transformations have several desirable properties: they have low communication costs, are amenable to parallel implementation, and exploit the existence of fast transform algorithms. This thesis investigates the tradeoff between the accuracy and cost of generating such approximations. State-of-the-art spectral and Frobenius-norm error bounds are provided. </p>
<p> The last class of algorithms considered are SPSD "sketching" algorithms. Such sketches can be computed faster than approximations based on projecting onto mixtures of the columns of the matrix. The performance of several such sketching schemes is empirically evaluated using a suite of canonical matrices drawn from machine learning and data analysis applications, and a framework is developed for establishing theoretical error bounds. </p>
<p> In addition to studying these algorithms, this thesis extends the Matrix Laplace Transform framework to derive Chernoff and Bernstein inequalities that apply to all the eigenvalues of certain classes of random matrices. These inequalities are used to investigate the behavior of the singular values of a matrix under random sampling, and to derive convergence rates for each individual eigenvalue of a sample covariance matrix.</p>
https://thesis.library.caltech.edu/id/eprint/7880
Efficient Methods for Stochastic Optimal Control
https://resolver.caltech.edu/CaltechTHESIS:05312014-011052261
Authors: Matanya Benasher Horowitz (ID: Horowitz-Matanya-Benasher; email: matanya.horowitz@gmail.com)
Year: 2014
DOI: 10.7907/D40A-9E03
<p>The Hamilton Jacobi Bellman (HJB) equation is central to stochastic optimal control (SOC) theory, yielding the optimal solution to general problems with known dynamics and a given cost functional. Under the assumption of a quadratic cost on the control input, it is well known that the HJB reduces to a particular partial differential equation (PDE). While powerful, this reduction is not commonly used, as the PDE is of second order, is nonlinear, and examples exist where the problem may not have a solution in a classical sense. Furthermore, each state of the system appears as another dimension of the PDE, giving rise to the curse of dimensionality. Since the number of degrees of freedom required to solve the optimal control problem grows exponentially with dimension, the problem becomes intractable for systems of all but modest dimension.</p>
<p>In the last decade researchers have found that under certain, fairly non-restrictive structural assumptions, the HJB may be transformed into a linear PDE, with an interesting analogue in the discretized domain of Markov Decision Processes (MDP). The work presented in this thesis uses the linearity of this particular form of the HJB PDE to push the computational boundaries of stochastic optimal control.</p>
<p>This is done by crafting together previously disjoint lines of research in computation. The first of these is the use of Sum of Squares (SOS) techniques for synthesis of control policies. A candidate polynomial with variable coefficients is proposed as the solution to the stochastic optimal control problem. An SOS relaxation is then taken to the partial differential constraints, leading to a hierarchy of semidefinite relaxations with improving sub-optimality gap. The resulting approximate solutions are shown to be guaranteed over- and under-approximations for the optimal value function. It is shown that these results extend to arbitrary parabolic and elliptic PDEs, yielding a novel method for Uncertainty Quantification (UQ) of systems governed by partial differential constraints. Domain decomposition techniques are also made available, allowing for such problems to be solved via parallelization and low-order polynomials.</p>
<p>The optimization-based SOS technique is then contrasted with the Separated Representation (SR) approach from the applied mathematics community. The technique allows for systems of equations to be solved through a low-rank decomposition that results in algorithms that scale linearly with dimensionality. Its application in stochastic optimal control allows for previously uncomputable problems to be solved quickly, scaling to such complex systems as the Quadcopter and VTOL aircraft. This technique may be combined with the SOS approach, yielding not only a numerical technique, but also an analytical one that allows for entirely new classes of systems to be studied and for stability properties to be guaranteed.</p>
<p>The analysis of the linear HJB is completed by the study of its implications in application. It is shown that the HJB and a popular technique in robotics, the use of navigation functions, sit on opposite ends of a spectrum of optimization problems, upon which tradeoffs may be made in problem complexity. Analytical solutions to the HJB in these settings are available in simplified domains, yielding guidance towards optimality for approximation schemes. Finally, the use of HJB equations in temporal multi-task planning problems is investigated. It is demonstrated that such problems are reducible to a sequence of SOC problems linked via boundary conditions. The linearity of the PDE allows us to pre-compute control policy primitives and then compose them, at essentially zero cost, to satisfy a complex temporal logic specification.</p>
https://thesis.library.caltech.edu/id/eprint/8453
The Power of Quantum Fourier Sampling
https://resolver.caltech.edu/CaltechTHESIS:05302014-131308138
Authors: William Jason Fefferman (ID: Fefferman-William-Jason)
Year: 2014
DOI: 10.7907/6HJB-MC69
<p>How powerful are Quantum Computers? Despite the prevailing belief that Quantum Computers are more powerful than their classical counterparts, this remains a conjecture backed by little formal evidence. Shor's famous factoring algorithm [Shor97] gives an example of a problem that can be solved efficiently on a quantum computer with no known efficient classical algorithm. Factoring, however, is unlikely to be NP-Hard, meaning that few unexpected formal consequences would arise, should such a classical algorithm be discovered. Could it then be the case that any quantum algorithm can be simulated efficiently classically? Likewise, could it be the case that Quantum Computers can quickly solve problems much harder than factoring? If so, where does this power come from, and what classical computational resources do we need to solve the hardest problems for which there exist efficient quantum algorithms?</p>
<p>We make progress toward understanding these questions through studying the relationship between classical nondeterminism and quantum computing. In particular, is there a problem that can be solved efficiently on a Quantum Computer that cannot be efficiently solved using nondeterminism? In this thesis we address this problem from the perspective of sampling problems. Namely, we give evidence that approximately sampling the Quantum Fourier Transform of an efficiently computable function, while easy quantumly, is hard for any classical machine in the Polynomial Time Hierarchy. In particular, we prove the existence of a class of distributions that can be sampled efficiently by a Quantum Computer, that likely cannot be approximately sampled in randomized polynomial time with an oracle for the Polynomial Time Hierarchy.</p>
<p>Our work complements and generalizes the evidence given in Aaronson and Arkhipov's work [AA2013] where a different distribution with the same computational properties was given. Our result is more general than theirs, but requires a more powerful quantum sampler. </p>
https://thesis.library.caltech.edu/id/eprint/8443
Optimal Uncertainty Quantification via Convex Optimization and Relaxation
https://resolver.caltech.edu/CaltechTHESIS:10162013-111333269
Authors: Shuo Han (ID: Han-Shuo)
Year: 2014
DOI: 10.7907/X00K-T615
<p>Many engineering applications face the problem of bounding the expected value of a quantity of interest (performance, risk, cost, etc.) that depends on stochastic uncertainties whose probability distribution is not known exactly. Optimal uncertainty quantification (OUQ) is a framework that aims at obtaining the best bound in these situations by explicitly incorporating available information about the distribution. Unfortunately, this often leads to non-convex optimization problems that are numerically expensive to solve.</p>
<p>This thesis focuses on efficient numerical algorithms for OUQ problems. It begins by investigating several classes of OUQ problems that can be reformulated as convex optimization problems. Conditions on the objective function and information constraints under which a convex formulation exists are presented. Since the size of the optimization problem can become quite large, solutions for scaling up are also discussed. Finally, the capability of analyzing a practical system through such convex formulations is demonstrated by a numerical example of energy storage placement in power grids.</p>
<p>When an equivalent convex formulation is unavailable, it is possible to find a convex problem that provides a meaningful bound for the original problem, also known as a convex relaxation. As an example, the thesis investigates the setting used in Hoeffding's inequality. The naive formulation requires solving a collection of non-convex polynomial optimization problems whose number grows doubly exponentially. After structures such as symmetry are exploited, it is shown that both the number and the size of the polynomial optimization problems can be reduced significantly. Each polynomial optimization problem is then bounded by its convex relaxation using sums-of-squares. These bounds are found to be tight in all the numerical examples tested in the thesis and are significantly better than Hoeffding's bounds.</p>
https://thesis.library.caltech.edu/id/eprint/7991
Convex Relaxation for Low-Dimensional Representation: Phase Transitions and Limitations
https://resolver.caltech.edu/CaltechTHESIS:08182014-091546460
Authors: Samet Oymak (ID: Oymak-Samet)
Year: 2015
DOI: 10.7907/Z9S46PWX
<p>There is a growing interest in taking advantage of possible patterns and structures in data so as to extract the desired information and overcome the curse of dimensionality. In a wide range of applications, including computer vision, machine learning, medical imaging, and social networks, the signal that gives rise to the observations can be modeled to be approximately sparse and exploiting this fact can be very beneficial. This has led to an immense interest in the problem of efficiently reconstructing a sparse signal from limited linear observations. More recently, low-rank approximation techniques have become prominent tools to approach problems arising in machine learning, system identification and quantum tomography.</p>
<p>In sparse and low-rank estimation problems, the challenge is the inherent intractability of the objective function, and one needs efficient methods to capture the low-dimensionality of these models. Convex optimization is often a promising tool to attack such problems. An intractable problem with a combinatorial objective can often be "relaxed" to obtain a tractable but almost as powerful convex optimization problem. This dissertation studies convex optimization techniques that can take advantage of low-dimensional representations of the underlying high-dimensional data. We provide provable guarantees that ensure that the proposed algorithms will succeed under reasonable conditions, and answer questions of the following flavor:</p>
<UL>
<LI> For a given number of measurements, can we reliably estimate the true signal?</LI>
<LI> If so, how good is the reconstruction as a function of the model parameters?</LI>
</UL>
<p>More specifically, i) Focusing on linear inverse problems, we generalize the classical error bounds known for the least-squares technique to the lasso formulation, which incorporates the signal model. ii) We show that intuitive convex approaches do not perform as well as expected when it comes to signals that have multiple low-dimensional structures simultaneously. iii) Finally, we propose convex relaxations for the graph clustering problem and give sharp performance guarantees for a family of graphs arising from the so-called stochastic block model. We pay particular attention to the following aspects. For i) and ii), we aim to provide a general geometric framework, in which the results on sparse and low-rank estimation can be obtained as special cases. For i) and iii), we investigate the precise performance characterization, which yields the right constants in our bounds and the true dependence between the problem parameters.</p>
https://thesis.library.caltech.edu/id/eprint/8635
Recovering Structured Signals in High Dimensions via Non-Smooth Convex Optimization: Precise Performance Analysis
https://resolver.caltech.edu/CaltechTHESIS:06032016-144604076
Authors: Christos Thrampoulidis (ID: Thrampoulidis-Christos; ORCID: 0000-0001-9053-9365)
Year: 2016
DOI: 10.7907/Z998850V
<p>The typical scenario that arises in modern large-scale inference problems is one where the ambient dimension of the unknown signal is very large (e.g., high-resolution images, recommendation systems), yet its desired properties lie in some low-dimensional structure such as sparsity or low-rankness. In the past couple of decades, non-smooth convex optimization methods have emerged as a powerful tool to extract those structures, since they are often computationally efficient and offer enough flexibility while remaining amenable to performance analysis. In particular, since the advent of Compressed Sensing (CS) there has been significant progress in this direction. One of the key ideas is that random linear measurements offer an efficient way to acquire structured signals. When the measurement matrix has entries iid from a wide class of distributions (including Gaussians), a series of recent works has established a complete and transparent theory that precisely captures the performance in the noiseless setting. In the more practical scenario of noisy measurements, the performance analysis task becomes significantly more challenging, and corresponding precise and unifying results have hitherto remained scarce. The available class of optimization methods, often referred to as regularized M-estimators, is now richer; additional factors (e.g., the noise distribution, the loss function, and the regularizer parameter) and several different measures of performance (e.g., squared-error, probability of support recovery) need to be taken into account.</p>
<p>This thesis develops a novel analytical framework that overcomes these challenges, and establishes precise asymptotic performance guarantees for regularized M-estimators under Gaussian measurement matrices. In particular, the framework allows for a unifying analysis among different instances (such as the Generalized LASSO, and the LAD, to name a few) and accounts for a wide class of performance measures. Among others, we show results on the mean-squared-error of the Generalized-LASSO method and make insightful connections to the classical theory of ordinary least squares and to noiseless CS. Empirical evidence is presented that suggests the Gaussian assumption is not necessary. Beyond iid measurement matrices, motivated by practical considerations, we study certain classes of random matrices with orthogonal rows and establish their superior performance when compared to Gaussians.</p>
<p>A prominent application of this generic theory is on the analysis of the bit-error rate (BER) of the popular convex-relaxation of the Maximum Likelihood decoder for recovering BPSK signals in a massive Multiple Input Multiple Output setting. Our precise BER analysis allows comparison of these schemes to the unattainable Matched-filter bound, and further suggests means to provably boost their performance. </p>
<p>The last challenge is to evaluate the performance under non-linear measurements. For the Generalized LASSO, it is shown that this is (asymptotically) equivalent to the one under noisy linear measurements with appropriately scaled variance. This encompasses state-of-the-art theoretical results on one-bit CS, and is also used to prove that the optimal quantizer of the measurements that minimizes the estimation error of the Generalized LASSO is the celebrated Lloyd-Max quantizer.</p>
<p>The framework is based on Gaussian process methods; in particular, on a new strong and tight version of a classical comparison inequality (due to Gordon, 1988) in the presence of additional convexity assumptions. We call this the Convex Gaussian Min-max Theorem (CGMT).</p>
https://thesis.library.caltech.edu/id/eprint/9836
Convex Programming-Based Phase Retrieval: Theory and Applications
https://resolver.caltech.edu/CaltechTHESIS:05312016-051759406
Authors: Kishore Jaganathan (ID: Jaganathan-Kishore; email: kishorejaganathan@gmail.com)
Year: 2016
DOI: 10.7907/Z9C82775
<p>Phase retrieval is the problem of recovering a signal from its Fourier magnitude. This inverse problem arises in many areas of engineering and applied physics, and has been studied for nearly a century. Due to the absence of Fourier phase, the available information is incomplete in general. Classic identifiability results state that phase retrieval of one-dimensional signals is impossible, and that phase retrieval of higher-dimensional signals is almost surely possible under mild conditions. However, there are no efficient recovery algorithms with theoretical guarantees. Classic algorithms are based on the method of alternating projections. These algorithms do not have theoretical guarantees, and have limited recovery abilities due to the issue of convergence to local optima.</p>
<p>Recently, there has been a renewed interest in phase retrieval due to technological advances in measurement systems and theoretical developments in structured signal recovery. In particular, it is now possible to obtain specific kinds of additional magnitude-only information about the signal, depending on the application. The premise is that, by carefully redesigning the measurement process, one could potentially overcome the issues of phase retrieval. To this end, another approach could be to impose certain kinds of prior on the signal, depending on the application. On the algorithmic side, convex programming based approaches have played a key role in modern phase retrieval, inspired by their success in provably solving several quadratic constrained problems.</p>
<p>In this work, we study several variants of phase retrieval using modern tools, with focus on applications like X-ray crystallography, diffraction imaging, optics, astronomy and radar. In the one-dimensional setup, we first develop conditions, which when satisfied, allow unique reconstruction. Then, we develop efficient recovery algorithms based on convex programming, and provide theoretical guarantees. The theory and algorithms we develop are independent of the dimension of the signal, and hence can be used in all the aforementioned applications. We also perform a comparative numerical study of the convex programming and the alternating projection based algorithms. Numerical simulations clearly demonstrate the superior ability of the convex programming based methods, both in terms of successful recovery in the noiseless setting and stable reconstruction in the noisy setting.</p>
https://thesis.library.caltech.edu/id/eprint/9814
Distributed Optimal Control of Cyber-Physical Systems: Controller Synthesis, Architecture Design and System Identification
https://resolver.caltech.edu/CaltechTHESIS:03312016-100604768
Authors: Nikolai Matni (ID: Matni-Nikolai; ORCID: 0000-0003-4936-3921; email: nikolai.matni@gmail.com)
Year: 2016
DOI: 10.7907/Z99884Z0
<p>The centralized paradigm of a single controller and a single plant upon which modern control theory is built is no longer applicable to modern cyber-physical systems of interest, such as the power grid, software-defined networks or automated highway systems, as these are all large-scale and spatially distributed. Both the scale and the distributed nature of these systems have motivated the decentralization of control schemes into local sub-controllers that measure, exchange and act on locally available subsets of the globally available system information. This decentralization of control logic leads to different decision makers acting on asymmetric information sets, introduces the need for coordination between them, and perhaps not surprisingly makes the resulting optimal control problem much harder to solve. In fact, shortly after such questions were posed, it was realized that seemingly simple decentralized optimal control problems are computationally intractable to solve, with the Witsenhausen counterexample being a famous instance of this phenomenon. Spurred on by this perhaps discouraging result, a concerted 40-year effort to identify tractable classes of distributed optimal control problems culminated in the notion of quadratic invariance, which loosely states that if sub-controllers can exchange information with each other at least as quickly as the effect of their control actions propagates through the plant, then the resulting distributed optimal control problem admits a convex formulation.</p>
<p>The identification of quadratic invariance as an appropriate means of "convexifying" distributed optimal control problems led to a renewed enthusiasm in the controller synthesis community, resulting in a rich set of results over the past decade. The contributions of this thesis can be seen as being a part of this broader family of results, with a particular focus on closing the gap between theory and practice by relaxing or removing assumptions made in the traditional distributed optimal control framework. Our contributions are to the foundational theory of distributed optimal control, and fall under three broad categories, namely controller synthesis, architecture design and system identification.</p>
<p>We begin by providing two novel controller synthesis algorithms. The first is a solution to the distributed <i>H</i><sub>∞</sub> optimal control problem subject to delay constraints, and provides the only known exact characterization of delay-constrained distributed controllers satisfying an <i>H</i><sub>∞</sub> norm bound. The second is an explicit dynamic programming solution to a two player LQR state-feedback problem with varying delays. Accommodating varying delays represents an important first step in combining distributed optimal control theory with the area of Networked Control Systems that considers lossy channels in the feedback loop. Our next set of results are concerned with controller architecture design. When designing controllers for large-scale systems, the architectural aspects of the controller such as the placement of actuators, sensors, and the communication links between them can no longer be taken as given -- indeed the task of designing this architecture is now as important as the design of the control laws themselves. To address this task, we formulate the Regularization for Design (RFD) framework, which is a unifying computationally tractable approach, based on the model matching framework and atomic norm regularization, for the simultaneous co-design of a structured optimal controller and the architecture needed to implement it. Our final result is a contribution to distributed system identification. Traditional system identification techniques such as subspace identification are not computationally scalable, and destroy rather than leverage any a priori information about the system's interconnection structure. We argue that in the context of system identification, an essential building block of any scalable algorithm is the ability to estimate local dynamics within a large interconnected system. To that end we propose a promising heuristic for identifying the dynamics of a subsystem that is still connected to a large system. 
We exploit the fact that the transfer function of the local dynamics is low-order, but full-rank, while the transfer function of the global dynamics is high-order, but low-rank, to formulate this separation task as a nuclear norm minimization problem. Finally, we conclude with a brief discussion of future research directions, with a particular emphasis on how to incorporate the results of this thesis, and those of optimal control theory in general, into a broader theory of dynamics, control and optimization in layered architectures.</p>
https://thesis.library.caltech.edu/id/eprint/9637
Optimization and Control of Power Flow in Distribution Networks
https://resolver.caltech.edu/CaltechTHESIS:12092015-021431773
Authors: Masoud Farivar (ID: Farivar-Masoud; ORCID: 0000-0001-7298-3526; email: mfarivar@gmail.com)
Year: 2016
DOI: 10.7907/Z9JW8BSM
<p>Climate change is arguably the most critical issue facing our generation and the next. As we move towards a sustainable future, the grid is rapidly evolving with the integration of more and more renewable energy resources and the emergence of electric vehicles. In particular, large-scale adoption of residential and commercial solar photovoltaic (PV) plants is completely changing the traditional slowly-varying unidirectional power flow nature of distribution systems. A high share of intermittent renewables poses several technical challenges, including voltage and frequency control. But along with these challenges, renewable generators also bring with them millions of new DC-AC inverter controllers each year. These fast power electronic devices can provide an unprecedented opportunity to increase energy efficiency and improve power quality, if combined with well-designed inverter control algorithms. The main goal of this dissertation is to develop scalable power flow optimization and control methods that achieve system-wide efficiency, reliability, and robustness for the power distribution networks of the future with high penetration of distributed inverter-based renewable generators.</p>
<p>Proposed solutions to power flow control problems in the literature range from fully centralized to fully local ones. In this thesis, we will focus on the two ends of this spectrum. In the first half of this thesis (chapters 2 and 3), we seek optimal solutions to voltage control problems given a centralized architecture with complete information. These solutions are particularly important for better understanding the overall system behavior and can serve as a benchmark to compare the performance of other control methods against. To this end, we first propose a branch flow model (BFM) for the analysis and optimization of radial and meshed networks. This model leads to a new approach to solve optimal power flow (OPF) problems using a two-step relaxation procedure, which has proven to be both reliable and computationally efficient in dealing with the non-convexity of power flow equations in radial and weakly-meshed distribution networks. We will then apply the results to the fast time-scale inverter var control problem and evaluate the performance on real-world circuits in Southern California Edison’s service territory.</p>
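As a toy illustration of the branch flow (DistFlow) equations underlying this approach, the sketch below solves a single line from a substation to one load by fixed-point iteration; all numerical values are assumed for illustration and are not taken from the thesis.

```python
# Single-line DistFlow sketch (assumed per-unit values, not from the thesis).
r, x = 0.01, 0.02        # line resistance and reactance (p.u.)
p, q = 0.5, 0.2          # real and reactive load at the receiving bus (p.u.)
v0 = 1.0                 # squared voltage magnitude at the substation

l = 0.0                  # squared current magnitude, found by fixed point
for _ in range(50):
    P, Q = p + r * l, q + x * l          # sending-end power = load + line loss
    l = (P**2 + Q**2) / v0               # current from sending-end power and voltage
v1 = v0 - 2 * (r * P + x * Q) + (r**2 + x**2) * l   # DistFlow voltage drop

# The convex (SOCP) relaxation replaces l = (P^2 + Q^2)/v0 with
# l >= (P^2 + Q^2)/v0; exactness means the inequality is tight at the optimum.
print(v1)   # receiving-end squared voltage, slightly below v0
```

This only simulates the power flow physics; the thesis's contribution is the relaxation and its exactness analysis, not this recursion.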
<p>The second half (chapters 4 and 5), however, is dedicated to studying local control approaches, as they are the only options available for immediate implementation on today’s distribution networks that lack sufficient monitoring and communication infrastructure. In particular, we will follow a reverse- and forward-engineering approach to study the recently proposed piecewise linear volt/var control curves. It is the aim of this dissertation to tackle some key problems in these two areas and contribute by providing a rigorous theoretical basis for future work.</p>https://thesis.library.caltech.edu/id/eprint/9317A Direct Approach to Robustness Optimization
https://resolver.caltech.edu/CaltechTHESIS:08122015-172710296
Authors: {'items': [{'email': 'euclid85@gmail.com', 'id': 'You-Seungil', 'name': {'family': 'You', 'given': 'Seungil'}, 'show_email': 'NO'}]}
Year: 2016
DOI: 10.7907/Z9X34VDV
This dissertation reformulates and streamlines the core tools of robustness analysis for linear time invariant systems using now-standard methods in convex optimization. In particular, robust performance analysis can be formulated as a primal convex optimization in the form of a semidefinite program using a semidefinite representation of a set of Gramians. The same approach with semidefinite programming duality is applied to develop a linear matrix inequality test for well-connectedness analysis, and many existing results such as the Kalman-Yakubovich-Popov lemma and various scaled small gain tests are derived in an elegant fashion. More importantly, unlike the classical approach, the decision variable in this novel optimization framework contains all inner products of signals in a system, and an algorithm for constructing an input and state pair of a system corresponding to the optimal solution of robustness optimization is presented based on this information. This insight may open up new research directions, and as one such example, this dissertation proposes a semidefinite programming relaxation of a cardinality constrained variant of the H∞ norm, which we term sparse H∞ analysis, where an adversarial disturbance can use only a limited number of channels. Finally, sparse H∞ analysis is applied to the linearized swing dynamics in order to detect potential vulnerable spots in power networks.
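The semidefinite programs themselves are beyond a short snippet, but the quantity being analyzed can be illustrated: the sketch below estimates the H∞ norm of a small, assumed state-space system by gridding frequencies, which the convex formulations in the thesis instead compute exactly.

```python
import numpy as np

# Hedged sketch: frequency-gridded estimate of the H-infinity norm of an
# assumed toy system (the thesis uses semidefinite programming instead).
A = np.array([[-1.0, 1.0], [0.0, -2.0]])   # stable: eigenvalues -1, -2
B = np.eye(2)
C = np.eye(2)
D = np.zeros((2, 2))

def gain(w):
    """Largest singular value of G(jw) = C (jwI - A)^{-1} B + D."""
    G = C @ np.linalg.solve(1j * w * np.eye(2) - A, B) + D
    return np.linalg.svd(G, compute_uv=False)[0]

freqs = np.logspace(-2, 2, 400)
hinf = max(gain(w) for w in freqs)     # sup over frequency of the worst gain
```

The sparse H∞ variant described above would additionally restrict the disturbance to a limited number of input channels, which gridding alone cannot capture.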
https://thesis.library.caltech.edu/id/eprint/9101Algebraic Techniques in Coding Theory: Entropy Vectors, Frames, and Constrained Coding
https://resolver.caltech.edu/CaltechTHESIS:09042015-171723764
Authors: {'items': [{'email': 'matt.thill@gmail.com', 'id': 'Thill-Matthew-David', 'name': {'family': 'Thill', 'given': 'Matthew David'}, 'orcid': '0000-0003-0885-6260', 'show_email': 'NO'}]}
Year: 2016
DOI: 10.7907/Z9F18WNW
<p>The study of codes, classically motivated by the need to communicate information reliably in the presence of error, has found new life in fields as diverse as network communication and distributed storage of data, and even has connections to the design of linear measurements used in compressive sensing. But in all contexts, a code typically involves exploiting the algebraic or geometric structure underlying an application. In this thesis, we examine several problems in coding theory, and try to gain some insight into the algebraic structure behind them.</p>
<p>The first is the study of the entropy region: the space of all possible vectors of joint entropies that can arise from a set of discrete random variables. Understanding this region is essentially the key to optimizing network codes for a given network. To this end, we employ a group-theoretic method of constructing random variables producing so-called "group-characterizable" entropy vectors, which are capable of approximating any point in the entropy region. We show how small groups can be used to produce entropy vectors which violate the Ingleton inequality, a fundamental bound on entropy vectors arising from the random variables involved in linear network codes. We discuss the suitability of these groups for designing codes for networks that could potentially outperform linear coding.</p>
<p>The second topic we discuss is the design of frames with low coherence, closely related to finding spherical codes in which the codewords are unit vectors spaced out around the unit sphere so as to minimize the magnitudes of their mutual inner products. We show how to build frames by selecting a cleverly chosen set of representations of a finite group to produce a "group code" as described by Slepian decades ago. We go on to reinterpret our method as selecting a subset of rows of a group Fourier matrix, allowing us to study and bound our frames' coherences using character theory. We discuss the usefulness of our frames in sparse signal recovery using linear measurements.</p>
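The row-selection view of group frames described above can be made concrete for the simplest group, the cyclic group Z_n, whose Fourier matrix is the DFT matrix. The sketch below picks an assumed (not optimized) set of rows and computes the resulting frame's coherence directly.

```python
import numpy as np

# Hedged illustration: a frame from selected rows of the Fourier matrix of
# the cyclic group Z_31. The modulus n and row choice are assumptions made
# for illustration; the thesis bounds coherence via character theory.
n, rows = 31, [0, 1, 3, 7, 15]
F = np.exp(-2j * np.pi * np.outer(rows, np.arange(n)) / n)
frame = F / np.sqrt(len(rows))            # columns become unit vectors in C^5

gram = np.abs(frame.conj().T @ frame)     # |<f_i, f_j>| for all column pairs
np.fill_diagonal(gram, 0.0)
coherence = gram.max()                    # max inner product over distinct columns
```

Since no d-fold rotation can make five distinct Fourier entries coincide, the coherence is strictly below 1; how close it gets to the Welch lower bound depends on the row choice, which is exactly what the group-theoretic constructions control.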
<p>The final problem we investigate is that of coding with constraints, most recently motivated by the demand for ways to encode large amounts of data using error-correcting codes so that any small loss can be recovered from a small set of surviving data. Most often, this involves using a systematic linear error-correcting code in which each parity symbol is constrained to be a function of some subset of the message symbols. We derive bounds on the minimum distance of such a code based on its constraints, and characterize when these bounds can be achieved using subcodes of Reed-Solomon codes.</p>https://thesis.library.caltech.edu/id/eprint/9141Graph Clustering: Algorithms, Analysis and Query Design
https://resolver.caltech.edu/CaltechTHESIS:09222017-130217881
Authors: {'items': [{'email': 'ramya.kvinayak@gmail.com', 'id': 'Korlakai-Vinayak-Ramya', 'name': {'family': 'Korlakai Vinayak', 'given': 'Ramya'}, 'orcid': '0000-0003-0248-9551', 'show_email': 'YES'}]}
Year: 2018
DOI: 10.7907/Z9RR1WFK
<p>A wide range of applications in engineering as well as the natural and social sciences have datasets that are unlabeled. Clustering plays a major role in exploring structure in such unlabeled datasets. Owing to the heterogeneity in the applications and the types of datasets available, there are plenty of clustering objectives and algorithms. In this thesis we focus on two such clustering problems: <i>Graph Clustering</i> and <i>Crowdsourced Clustering</i>.</p>
<p>In the first part, we consider the problem of graph clustering and study convex-optimization-based clustering algorithms. Datasets are often messy -- ridden with noise, outliers (items that do not belong to any clusters), and missing data. Therefore, we are interested in algorithms that are robust to such discrepancies. We present and analyze convex-optimization-based clustering algorithms which aim to recover the low-rank matrix that encodes the underlying cluster structure for two clustering objectives: <i>clustering partially observed graphs</i> and <i>clustering similarity matrices with outliers</i>. Using block models as generative models, we characterize the performance of these convex clustering algorithms. In particular, we provide <i>explicit bounds</i>, without any large unknown constants, on the problem parameters that determine the success and failure of these convex approaches.</p>
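The low-rank structure targeted by these convex programs can be seen in a few lines: under a block model, the ideal cluster matrix (1 when two items share a cluster, 0 otherwise) has rank equal to the number of clusters. The cluster sizes and edge densities below are assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hedged sketch: a stochastic block model and the low-rank cluster matrix
# that the convex clustering programs aim to recover.
sizes = [40, 30, 30]                                     # assumed cluster sizes
labels = np.repeat(np.arange(len(sizes)), sizes)
L = (labels[:, None] == labels[None, :]).astype(float)   # ideal cluster matrix

p_in, p_out = 0.7, 0.1                                   # assumed edge densities
probs = np.where(L == 1, p_in, p_out)
adj = (rng.random(L.shape) < probs).astype(float)        # noisy observed graph
adj = np.triu(adj, 1); adj = adj + adj.T                 # symmetric, no self-loops

print(np.linalg.matrix_rank(L))   # rank equals the number of clusters: 3
```

The observed adjacency matrix `adj` is a noisy, possibly partially observed version of `L`; the explicit bounds in the thesis characterize when convex programs recover `L` from such data.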
<p>In the second part, we consider the problem of crowdsourced clustering -- the task of clustering items using answers from non-expert crowd workers who can answer similarity comparison queries. Since the workers are not experts, they provide noisy answers. Further, due to budget constraints, we cannot make all possible comparisons between items in the dataset. Thus, it is important to <i>design queries that can reduce the noise in the responses</i> and <i>design algorithms that can work with noisy and partial data</i>. We demonstrate that random triangle queries (where three items are compared per query) provide less noisy data as well as a greater quantity of data, for a fixed query budget, as compared to random edge queries (where two items are compared per query). We extend the analysis of convex clustering algorithms to show that the exact recovery guarantees hold for triangle queries despite involving dependent edges. In addition to random querying strategies, we also present a novel <i>active querying</i> algorithm that is guaranteed to find all the clusters regardless of their sizes and without the knowledge of any parameters as long as the workers are better than random guessers. We also provide a tight upper bound on the number of queries made by the proposed active querying algorithm. Apart from providing theoretical guarantees for the clustering algorithms, we also apply our algorithms to real datasets.</p>https://thesis.library.caltech.edu/id/eprint/10447Online Algorithms: From Prediction to Decision
https://resolver.caltech.edu/CaltechTHESIS:10182017-210853845
Authors: {'items': [{'email': 'niangjun@gmail.com', 'id': 'Chen-Niangjun', 'name': {'family': 'Chen', 'given': 'Niangjun'}, 'orcid': '0000-0002-2289-9737', 'show_email': 'YES'}]}
Year: 2018
DOI: 10.7907/Z95M63W4
<p>Making use of predictions is a crucial, but under-explored, area of sequential decision problems with limited information. While in practice most online algorithms rely on predictions to make real time decisions, in theory their performance is only analyzed in simplified models of prediction noise, either adversarial or i.i.d. The goal of this thesis is to bridge this divide between theory and practice: to study online algorithms under more practical prediction models, gain a better understanding of the value of prediction, and design online algorithms that make the best use of predictions.</p>
<p>This thesis makes three main contributions. First, we propose a stochastic prediction error model that generalizes prior models in the learning and stochastic control communities, incorporates correlation among prediction errors, and captures the fact that predictions improve as time passes. Using this general prediction model, we prove that Averaging Fixed Horizon Control (AFHC) can simultaneously achieve sublinear regret and constant competitive ratio in expectation using only a constant-sized prediction window, overcoming the hardness results in adversarial prediction models. Second, to understand the optimal use of noisy prediction, we introduce a new class of policies, Committed Horizon Control (CHC), that generalizes two popular policies, Receding Horizon Control (RHC) and Averaging Fixed Horizon Control (AFHC). Our results explicitly characterize the optimal use of prediction in a CHC policy as a function of properties of the prediction noise, e.g., variance and correlation structure. Third, we apply the general prediction model and algorithm design framework to the deferrable load control problem in power systems. Our proposed model predictive algorithm provides significant reduction in the variance of total load in the power system. Throughout this thesis, we provide both average-case analysis and concentration results for our proposed online algorithms, highlighting that the typical performance is tightly concentrated around the average-case performance.</p>https://thesis.library.caltech.edu/id/eprint/10530Impact of Transmission Network Topology on Electrical Power Systems
https://resolver.caltech.edu/CaltechTHESIS:05312019-191005982
Authors: {'items': [{'email': 'linqi.guo.cms@gmail.com', 'id': 'Guo-Linqi', 'name': {'family': 'Guo', 'given': 'Linqi'}, 'orcid': '0000-0001-5771-2752', 'show_email': 'NO'}]}
Year: 2019
DOI: 10.7907/EN8K-W872
<p>Power system reliability is a crucial component in the development of sustainable infrastructure. Because of the intricate interactions among power system components, it is often difficult to make general inferences on how the transmission network topology impacts performance of the grid in different scenarios. This complexity poses significant challenges for research in the modeling, control, and management of power systems.</p>
<p>In this work, we develop a theory that aims to address this challenge from both the fast-timescale and steady state aspects of power grids. Our analysis builds upon the transmission network Laplacian matrix, and reveals new properties of this well-studied concept in spectral graph theory that are specifically tailored to the power system context. A common theme of this work is the representation of certain physical quantities in terms of graphical structures, which allows us to establish algebraic results on power grid performance using purely topological information. This view is particularly powerful and often leads to surprisingly simple characterizations of complicated system behaviors. Depending on the timescale of the underlying problem, our results can be roughly categorized into the study of frequency regulation and the study of cascading failures.</p>
<p><i>Fast-timescale: Frequency Regulation</i>. We first study how the transmission network impacts power system robustness against disturbances in the transient phase. Towards this goal, we develop a framework based on the Laplacian spectrum that captures the interplay among network topology, system inertia, and generator/load damping. This framework shows that the impact of network topology in frequency regulation can be quantified through the network Laplacian eigenvalues, and that such eigenvalues fully determine the grid robustness against low frequency perturbations. Moreover, we can explicitly decompose the frequency signal along scaled Laplacian eigenvectors when damping-inertia ratios are uniform across the buses. The insights revealed by this framework explain why load-side participation in frequency regulation not only makes the system respond faster, but also helps lower the system nadir after a disturbance, providing useful guidelines in the controller design. We simulate an improved controller reverse engineered from our results on the IEEE 39-bus New England interconnection system, and illustrate its robustness against high frequency oscillations compared to both the conventional droop control and a recent controller design.</p>
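The Laplacian spectrum central to this framework is easy to compute for a toy network; the sketch below builds the Laplacian of an assumed 5-bus path network and inspects the eigenvalues that, per the discussion above, govern robustness to low frequency disturbances.

```python
import numpy as np

# Hedged sketch: graph Laplacian of an assumed toy 5-bus path network
# (real test systems like the IEEE 39-bus grid would use its line list).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
n = 5
Lap = np.zeros((n, n))
for i, j in edges:
    Lap[i, i] += 1; Lap[j, j] += 1    # degree terms
    Lap[i, j] -= 1; Lap[j, i] -= 1    # off-diagonal coupling

eigvals = np.sort(np.linalg.eigvalsh(Lap))
# eigvals[0] is 0 (the constant mode); eigvals[1], the algebraic
# connectivity, sets the slowest oscillation mode of the swing dynamics.
```

With uniform damping-inertia ratios, the frequency response decomposes along the corresponding eigenvectors, which is the decomposition the paragraph above refers to.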
<p>We then switch to a more combinatorial problem that seeks to characterize the controllability and observability of the power system in frequency regulation if only a subset of buses are equipped with controllers/sensors. Our results show that the controllability/observability of the system depends on two orthogonal conditions: (a) intrinsic structure of the system graph, and (b) algebraic coverage of buses with controllers/sensors. Condition (a) encodes information on graph symmetry and is shown to hold for almost all practical systems. Condition (b) captures how buses interact with each other through the network and can be verified using the eigenvectors of the graph Laplacian matrix. Based on this characterization, the optimal placement of controllers and sensors in the network can be formulated as a set cover problem. We demonstrate how our results identify the critical buses in real systems using a simulation in the IEEE 39-bus New England interconnection test system. In particular, for this testbed a single well-chosen bus is capable of providing full controllability and observability.</p>
<p><i>Steady State: Cascading Failures</i>. Cascading failures in power systems exhibit non-monotonic, non-local propagation patterns which make the analysis and mitigation of failures difficult. By studying the transmission network Laplacian matrix, we reveal two useful structures that make the analysis of this complex evolution more tractable: (a) In contrast to the lack of monotonicity in the physical system, there is a rich collection of monotonicity properties we can exploit in the spectrum of the Laplacian matrix. This allows us to systematically design topological measures that are monotonic over the cascading event. (b) Power redistribution patterns are closely related to the distribution of different types of trees in the power network topology. Such graphical interpretation captures Kirchhoff's law in a precise way and naturally suggests that we can eliminate long-distance propagation of system disturbances by forming a tree-partition.</p>
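The bridges that separate the tree-partition's cells can be found with a standard depth-first search (Tarjan's bridge-finding); the sketch below is that classical algorithm on an assumed toy graph, not the thesis's own procedure.

```python
# Hedged sketch: classical DFS bridge-finding on an assumed toy network.
def find_bridges(n, edges):
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx)); adj[v].append((u, idx))
    disc = [-1] * n          # discovery times
    low = [0] * n            # lowest discovery time reachable via back edges
    bridges = []
    timer = [0]

    def dfs(u, parent_edge):
        disc[u] = low[u] = timer[0]; timer[0] += 1
        for v, idx in adj[u]:
            if idx == parent_edge:
                continue
            if disc[v] == -1:
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:          # no back edge around (u, v)
                    bridges.append(edges[idx])
            else:
                low[u] = min(low[u], disc[v])

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return bridges

# Two triangle "cells" joined by the single line (2, 3): only it is a bridge.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
print(find_bridges(6, edges))  # [(2, 3)]
```

In the tree-partition view above, tripping a non-bridge line keeps the impact inside its cell (a triangle here), while tripping the bridge (2, 3) propagates globally.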
<p>We then show that the tree-partition of transmission networks provides a precise analytical characterization of line failure localizability. Specifically, when a non-bridge line is tripped, the impact of this failure only propagates within well-defined components, which we refer to as cells, of the tree-partition defined by the bridges. In contrast, when a bridge line is tripped, the impact of this failure propagates globally across the network, affecting the power flow on all remaining transmission lines. This characterization suggests that it is possible to improve the system robustness by switching off certain transmission lines, so as to create more, smaller components in the tree-partition; thus spatially localizing line failures and making the grid less vulnerable to large-scale outages. We illustrate this approach using the IEEE 118-bus test system and demonstrate that switching off a negligible portion of transmission lines allows the impact of line failures to be significantly more localized without substantial changes in line congestion.</p>
<p><i>Unified Controller on Tree-partitions</i>. Combining our results from both the fast-timescale and steady state behaviors of power grids, we propose a distributed control strategy that offers strong guarantees in both the mitigation and localization of cascading failures in power systems. This control strategy leverages a new controller design known as Unified Controller (UC) from the frequency regulation literature, and revolves around the powerful properties that emerge when the management areas that UC operates over form a tree-partition. After an initial failure, the proposed strategy always prevents successive failures from happening, and regulates the system to the desired steady state where the impact of initial failures is localized as much as possible. For extreme failures that cannot be localized, the proposed framework has a configurable design that progressively involves and coordinates across more control areas for failure mitigation and, as a last resort, imposes minimal load shedding. We compare the proposed control framework with the classical Automatic Generation Control (AGC) on the IEEE 118-bus test system. Simulation results show that our novel control greatly improves the system robustness in terms of the <i>N-1</i> security standard, and localizes the impact of initial failures in the majority of the load profiles that are examined. Moreover, the proposed framework incurs significantly less load loss, if any, compared to AGC, in all of our case studies.</p>https://thesis.library.caltech.edu/id/eprint/11590Fitting Convex Sets to Data: Algorithms and Applications
https://resolver.caltech.edu/CaltechTHESIS:09282018-091842941
Authors: {'items': [{'email': 'sohyongsheng87@gmail.com', 'id': 'Soh-Yong-Sheng', 'name': {'family': 'Soh', 'given': 'Yong Sheng'}, 'orcid': '0000-0003-3367-1401', 'show_email': 'YES'}]}
Year: 2019
DOI: 10.7907/jkmq-b430
<p>This thesis concerns the geometric problem of finding a convex set that best fits a given dataset. Our question serves as an abstraction for data-analytical tasks arising in a range of scientific and engineering applications. We focus on two specific instances:</p>
<p>1. A key challenge that arises in solving inverse problems is ill-posedness due to a lack of measurements. A prominent family of methods for addressing such issues is based on augmenting optimization-based approaches with a convex penalty function so as to induce a desired structure in the solution. These functions are typically chosen using prior knowledge about the data. In Chapter 2, we study the problem of learning convex penalty functions directly from data for settings in which we lack the domain expertise to choose a penalty function. Our solution relies on suitably transforming the problem of learning a penalty function into a fitting task.</p>
<p>2. In Chapter 3, we study the problem of fitting tractably-described convex sets given the optimal value of linear functionals evaluated in different directions.</p>
<p>Our computational procedures for fitting convex sets are based on a broader framework in which we search among families of sets that are parameterized as linear projections of a fixed structured convex set. The utility of such a framework is that our procedures reduce to the computation of simple primitives at each iteration, and these primitives can be further performed in parallel. In addition, by choosing structured sets that are non-polyhedral, our framework provides a principled way to search over expressive collections of non-polyhedral descriptions; in particular, convex sets that can be described via semidefinite programming provide a rich source of non-polyhedral sets, and such sets feature prominently in this thesis.</p>
<p>We provide performance guarantees for our procedures. Our analyses rely on understanding geometrical aspects of determinantal varieties, building on ideas from empirical processes as well as random matrix theory. We demonstrate the utility of our framework with numerical experiments on synthetic data as well as applications in image denoising and computational geometry.</p>
<p>As secondary contributions, we consider the following:</p>
<p>1. In Chapter 4, we consider the problem of optimally approximating a convex set as a spectrahedron of a given size. Spectrahedra are sets that can be expressed as feasible regions of a semidefinite program.</p>
<p>2. In Chapter 5, we consider change-point estimation in a sequence of high-dimensional signals given noisy observations. Our method integrates classical approaches with a convex optimization-based step that is useful for exploiting structure in high-dimensional data.</p>https://thesis.library.caltech.edu/id/eprint/11208Geometry-Driven Model Reduction
https://resolver.caltech.edu/CaltechTHESIS:12102018-011614041
Authors: {'items': [{'email': 'max.budninskiy@gmail.com', 'id': 'Budninskiy-Maxim-A', 'name': {'family': 'Budninskiy', 'given': 'Maxim A.'}, 'orcid': '0000-0002-9288-0249', 'show_email': 'NO'}]}
Year: 2019
DOI: 10.7907/0RCX-0369
<p>In this thesis we bring discrete differential geometry to bear on model reduction, both in the context of data analysis and numerical simulation of physical phenomena.</p>
<p>First, we present a novel controllable as-isometric-as-possible embedding method for low- and high-dimensional geometric datasets through sparse matrix eigenanalysis. This approach is equally suitable for performing nonlinear dimensionality reduction on big data and nonlinear shape editing of 3D meshes and pointsets. At the core of our approach is the construction of a "multi-Laplacian" quadratic form that is assembled from local operators whose kernels only contain locally affine functions. Minimizing this quadratic form produces an embedding that best preserves all relative coordinates of points within their local neighborhoods. We demonstrate the improvements that our approach brings over existing nonlinear local manifold learning methods on a number of datasets, and formulate the first eigen-based as-rigid-as-possible shape deformation technique by applying our affine-kernel embedding approach to 3D data augmented with user-imposed constraints on select vertices.</p>
<p>Second, we introduce a new global manifold learning approach based on metric connection for generating a quasi-isometric, low-dimensional mapping from a sparse and irregular sampling of an arbitrary low-dimensional manifold embedded in a high-dimensional space. Our geometric procedure computes a low-dimensional embedding that best preserves all pairwise geodesic distances over the input pointset similarly to one of the staples of manifold learning, the Isomap algorithm, and exhibits the same strong resilience to noise. While Isomap relies on Dijkstra's shortest path algorithm to approximate geodesic distances over the input pointset, we instead propose to compute them through "parallel transport unfolding," a discrete form of Cartan's development, to offer robustness to poor sampling and arbitrary topology. Our novel approach to evaluating geodesic distances using discrete differential geometry results in a markedly improved robustness to irregularities and sampling voids. In particular, it does not suffer from Isomap's limitation to geodesically convex sampled domains. Moreover, it involves only simple linear algebra, significantly improves the accuracy of all pairwise geodesic distance approximations, and has the same computational complexity as Isomap. We also show that our connection-based distance estimation can be used for faster variants of Isomap such as Landmark-Isomap.</p>
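The geodesic-distance step that Isomap performs with Dijkstra's algorithm (and that parallel transport unfolding replaces) can be sketched on a toy manifold; the circle sampling and neighborhood size below are assumptions for illustration.

```python
import heapq
import math

# Hedged sketch of Isomap's geodesic step: shortest paths on a k-nearest
# neighbor graph over points sampled from an assumed toy manifold (a circle).
pts = [(math.cos(t), math.sin(t))
       for t in (2 * math.pi * i / 12 for i in range(12))]

def knn_graph(pts, k=2):
    n = len(pts)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        order = sorted(range(n), key=lambda j: math.dist(pts[i], pts[j]))
        for j in order[1:k + 1]:                 # skip the point itself
            d = math.dist(pts[i], pts[j])
            graph[i].append((j, d)); graph[j].append((i, d))
    return graph

def dijkstra(graph, src):
    dist = {v: math.inf for v in graph}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

geo = dijkstra(knn_graph(pts), 0)
# Graph distance from point 0 to the antipodal point 6 approximates the
# arc length pi (about 3.14), not the straight-line chord length 2.
```

Parallel transport unfolding keeps this graph but replaces the hop-sum with a discrete Cartan development along paths, which is what buys the robustness to poor sampling described above.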
<p>Finally, we introduce an operator-adapted multiresolution analysis for finite-element differential forms. From a given continuous, linear, bijective, and self-adjoint positive-definite operator <i>L</i>, a hierarchy of basis functions and associated wavelets for discrete differential forms is constructed in a fine-to-coarse fashion and in quasilinear time. The resulting wavelets are <i>L</i>-orthogonal across all scales, and can be used to obtain a Galerkin discretization of the operator with a block diagonal stiffness matrix composed of uniformly well-conditioned and sparse blocks. Because our approach applies to arbitrary differential <i>p</i>-forms, we can derive both scalar-valued and vector-valued wavelets that block diagonalize a prescribed operator. Our construction applies to various types of computational grids, offers arbitrary smoothness orders of basis functions and wavelets, and can accommodate linear differential constraints such as divergence-freeness. We also demonstrate the benefits of the operator-adapted multiresolution decomposition for coarse-graining and model reduction of linear and nonlinear partial differential equations.</p>
<p>We conclude with a short discussion on how future work in geometric model reduction may impact other related topics such as semi-supervised learning.</p>https://thesis.library.caltech.edu/id/eprint/11303Statistical Methods for Gene Differential Expression Analysis of RNA-Sequencing
https://resolver.caltech.edu/CaltechTHESIS:10102018-143313907
Authors: {'items': [{'email': 'donglinyi@gmail.com', 'id': 'Yi-Lynn-Donglin', 'name': {'family': 'Yi', 'given': 'Lynn Donglin'}, 'orcid': '0000-0003-4575-0158', 'show_email': 'NO'}]}
Year: 2019
DOI: 10.7907/0YE6-2217
<p>RNA-Sequencing ("RNA-Seq") is performed to measure gene expression, often to ask the question of what genes are differentially expressed across various biological conditions. Statistical methods have been used to model RNA-Seq quantifications in order to determine differential expression, and have traditionally been divided into gene-level methods and transcript-level methods. There has been little attempt to bridge this statistical divide, although transcript expression and gene expression are biologically inextricably linked. In this thesis, we provide a case study of a comparative differential expression analysis, demonstrating that many differential expression events happen on the isoform level, and that performing an analysis using only summarized gene quantifications would fail to capture these events. Furthermore, we develop statistical methods that unify the transcript-level and gene-level analysis. In bulk RNA-Seq, by using p-value aggregation methods, we are able to translate transcript-level results into gene-level results under a unified framework. For single cell RNA-Seq, we propose using multiple logistic regression, leveraging the high dimensionality of the data in order to determine if the transcript quantifications pertaining to a gene are able to constitute a linear discriminant for cell type. This method combines differential transcript expression analysis and differential gene expression analysis into a unified framework which we call “gene differential expression.” Lastly, we demonstrate that our methods could be used on transcript compatibility counts instead of transcript quantifications in order to bypass ambiguous read assignment and improve accuracy. 
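The p-value aggregation idea can be made concrete with Fisher's classical method, shown below as a hedged stand-in; the thesis evaluates several aggregation rules, and the specific p-values here are invented for illustration.

```python
import math

# Hedged sketch: Fisher's method for combining independent transcript-level
# p-values into one gene-level p-value (one of several aggregation rules;
# e.g., Lancaster's method would weight transcripts unevenly).
def fisher_combine(pvalues):
    """X = -2 * sum(log p) follows chi^2 with 2k degrees of freedom."""
    k = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    # chi^2 survival function with even dof 2k has the closed form
    # exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no stats library is needed.
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

# Three invented transcript-level p-values for one gene; the single strong
# isoform-level signal yields a significant gene-level p-value.
print(round(fisher_combine([0.01, 0.20, 0.50]), 4))  # 0.0318
```

This is why isoform-level events can surface at the gene level under aggregation even when the summed gene counts look unremarkable.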
We show that transcript compatibility counts obtained via transcriptome pseudoalignment are comparable in quantification accuracy to quantifications from genome alignment methods.</p>https://thesis.library.caltech.edu/id/eprint/11226Time-Varying Optimization and Its Application to Power System Operation
https://resolver.caltech.edu/CaltechTHESIS:01222019-221628111
Authors: {'items': [{'email': 'tyj518@gmail.com', 'id': 'Tang-Yujie', 'name': {'family': 'Tang', 'given': 'Yujie'}, 'orcid': '0000-0002-4921-8372', 'show_email': 'YES'}]}
Year: 2019
DOI: 10.7907/6N9W-3J20
The main topic of this thesis is time-varying optimization, which studies algorithms that can track optimal trajectories of optimization problems that evolve with time. A typical time-varying optimization algorithm is implemented in a running fashion in the sense that the underlying optimization problem is updated during the iterations of the algorithm, and is especially suitable for optimizing large-scale fast varying systems. Motivated by applications in power system operation, we propose and analyze first-order and second-order running algorithms for time-varying nonconvex optimization problems.
The first-order algorithm we propose is the regularized proximal primal-dual gradient algorithm, and we develop a comprehensive theory on its tracking performance. Specifically, we provide analytical results in terms of tracking a KKT point, and derive bounds for the tracking error defined as the distance between the algorithmic iterates and a KKT trajectory. We then provide sufficient conditions under which there exists a set of algorithmic parameters that guarantee that the tracking error bound holds. Qualitatively, the sufficient conditions for the existence of feasible parameters suggest that the problem should be "sufficiently convex" around a KKT trajectory to overcome the nonlinearity of the nonconvex constraints. The study of feasible algorithmic parameters motivates us to analyze the continuous-time limit of the discrete-time algorithm, which we formulate as a system of differential inclusions; results on its tracking performance as well as feasible and optimal algorithmic parameters are also derived. Finally, we derive conditions under which the KKT points for a given time instant will always be isolated so that bifurcations or merging of KKT trajectories do not happen.
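A drastically simplified, assumed instance can illustrate the running primal-dual idea: the iterate is updated while the problem itself drifts, and the tracking error stays bounded. The objective, constraint, step size, and regularization below are all invented for this sketch; the thesis's algorithm adds proximal terms and a full error analysis.

```python
import math

# Hedged toy version of a running (regularized) primal-dual gradient method
# tracking the time-varying problem
#     minimize (x - sin t)^2   subject to   x <= 0.5.
# The optimizer is x*(t) = min(sin t, 0.5); all constants are assumed.
alpha, eps = 0.2, 0.05          # step size and dual regularization
x, lam = 0.0, 0.0
errs = []
for k in range(2000):
    t = 0.01 * k                          # the problem drifts each iteration
    grad = 2 * (x - math.sin(t)) + lam    # d/dx of the Lagrangian
    x -= alpha * grad                     # one primal gradient step
    lam = max(0.0, lam + alpha * ((x - 0.5) - eps * lam))  # regularized dual step
    errs.append(abs(x - min(math.sin(t), 0.5)))

print(round(max(errs[500:]), 3))   # tracking error after burn-in stays small
```

The point of the sketch is only qualitative: a single gradient step per time instant suffices to stay near the moving KKT trajectory, which is the behavior the tracking-error bounds above quantify.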
The second-order algorithms we develop are approximate Newton methods that incorporate second-order information. We first propose the approximate Newton method for a special case where there are no explicit inequality or equality constraints. It is shown that good estimation of second-order information is important for achieving satisfactory tracking performance. We also propose a specific version of the approximate Newton method based on L-BFGS-B that handles box constraints. Then, we propose two variants of the approximate Newton method that handle explicit inequality and equality constraints. The first variant employs penalty functions to obtain a modified version of the original problem, so that the approximate Newton method for the special case can be applied. The second variant can be viewed as an extension of sequential quadratic programming to the time-varying setting.
Finally, we discuss the application of the proposed algorithms to power system operation. We formulate the time-varying optimal power flow problem, and introduce a partition of the decision variables that enables us to model the power system by an implicit power flow map. The implicit power flow map allows us to incorporate real-time feedback measurements naturally into the algorithm. The use of real-time feedback measurements is a central idea in real-time optimal power flow algorithms, as it helps reduce the computational burden and potentially improves robustness against model mismatch. We then present in detail two real-time optimal power flow algorithms, one based on the regularized proximal primal-dual gradient algorithm, and the other based on the approximate Newton method with the penalty approach.
https://thesis.library.caltech.edu/id/eprint/11358
Online Platforms in Networked Markets: Transparency, Anticipation and Demand Management
https://resolver.caltech.edu/CaltechTHESIS:03132019-143428796
Authors: {'items': [{'email': 'johnpzf@gmail.com', 'id': 'Pang-John-Zhen-Fu', 'name': {'family': 'Pang', 'given': 'John Zhen Fu'}, 'orcid': '0000-0002-6485-7922', 'show_email': 'YES'}]}
Year: 2019
DOI: 10.7907/XY8M-8D94
<p>The global economy has been transformed by the introduction of online platforms in the past two decades. Companies such as Uber and Amazon have undergone massive growth and are a critical part of the world economy today. Understanding these online platforms, their designs, and how participation changes with anticipation and uncertainty can help us identify the necessary ingredients for successful implementation of online platforms in the future, especially for those with underlying network constraints, e.g., the electricity grid.</p>
<p>This thesis makes three main contributions. First, we identify and compare common access and allocation control designs for online platforms, and highlight their trade-offs between transparency and control. We make these comparisons under a networked Cournot competition model and consider three popular designs: (i) open access, (ii) discriminatory access, and (iii) controlled allocation. Our findings reveal that designs that control access are more efficient than designs that control allocations, but open access designs are susceptible to substantial search costs. Next, we study the impact of demand management in a networked Stackelberg model considering network constraints and producer anticipation. We provide insights on limiting manipulation in these constrained networked marketplaces with nodal prices, and show that demand management mechanisms that traditionally aid system stability also play a vital role economically. In particular, we show that demand management empowers consumers and gives them "market power" to counter that of producers, limiting the impact of producer anticipation and the potential for manipulation. Lastly, we study how participants (e.g., drivers on Uber) make competitive real-time production (driving) decisions. To that end, we design a novel pursuit algorithm for online optimization under limited inventory constraints. Our analysis yields an algorithm that is competitive, achieves optimal results in the well-known one-way trading problem, and applies to new variants of the original problem.</p>
https://thesis.library.caltech.edu/id/eprint/11425
Convex Relaxations for Graph and Inverse Eigenvalue Problems
https://resolver.caltech.edu/CaltechTHESIS:01152020-210801253
Authors: {'items': [{'email': 'utkancandogan@gmail.com', 'id': 'Candogan-Utkan-Onur', 'name': {'family': 'Candogan', 'given': 'Utkan Onur'}, 'orcid': '0000-0002-1416-4909', 'show_email': 'NO'}]}
Year: 2020
DOI: 10.7907/ZV0D-SW58
<p>This thesis is concerned with presenting tractable, convex-optimization-based solutions for three fundamental problems:</p>
<p>1. <i>Planted subgraph problem</i>: Given two graphs, identifying the subset of vertices of the larger graph corresponding to the smaller one.</p>
<p>2. <i>Graph edit distance problem</i>: Given two graphs, calculating the number of edge/vertex additions and deletions required to transform one graph into the other.</p>
<p>3. <i>Affine inverse eigenvalue problem</i>: Given a subspace <b>ε</b> ⊂ 𝕊ⁿ and a vector of eigenvalues λ ∈ ℝⁿ, finding a symmetric matrix with spectrum λ contained in <b>ε</b>.</p>
<p>These combinatorial and algebraic problems frequently arise in various application domains such as social networks, computational biology, chemoinformatics, and control theory. Nevertheless, exactly solving them in practice is only possible for very small instances due to their complexity. For each of these problems, we introduce convex relaxations which succeed in providing exact or approximate solutions in a computationally tractable manner.</p>
<p>Our relaxations for the two graph problems are based on convex graph invariants, which are functions of graphs that do not depend on a particular labeling. One of these convex relaxations, coined the Schur-Horn orbitope, corresponds to the convex hull of all matrices with a given spectrum, and plays a prominent role in this thesis. Specifically, we utilize relaxations based on the Schur-Horn orbitope in the context of the planted subgraph problem and the graph edit distance problem. For both of these problems, we identify conditions under which the Schur-Horn orbitope-based relaxations exactly solve the corresponding problem with overwhelming probability. In particular, we demonstrate that these relaxations turn out to be particularly effective when the underlying graph has a spectrum composed of few distinct eigenvalues with high multiplicities. In addition to relaxations based on the Schur-Horn orbitope, we also consider outer-approximations based on other convex graph invariants, such as the stability number and the maximum-cut value, for the graph edit distance problem. On the other hand, for the inverse eigenvalue problem, we investigate two relaxations arising from a sum of squares hierarchy. These relaxations have different approximation qualities, and accordingly induce different computational costs. We utilize our framework to generate solutions for, or certify unsolvability of, the underlying inverse eigenvalue problem.</p>
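The Schur-Horn orbitope admits a simple spectral membership test via the classical majorization characterization: the convex hull of all symmetric matrices with spectrum λ is exactly the set of symmetric matrices whose eigenvalues are majorized by λ. A small numerical check of that characterization (the example matrices are invented):

```python
import numpy as np

def in_schur_horn_orbitope(M, lam, tol=1e-9):
    """Membership test for the Schur-Horn orbitope of spectrum lam:
    symmetric M lies in conv{U diag(lam) U^T} iff trace(M) = sum(lam)
    and the sorted eigenvalues of M are majorized by sorted(lam)."""
    mu = np.sort(np.linalg.eigvalsh(M))[::-1]          # descending
    lam = np.sort(np.array(lam, float))[::-1]
    if abs(mu.sum() - lam.sum()) > tol:
        return False
    return bool(np.all(np.cumsum(mu) <= np.cumsum(lam) + tol))

rng = np.random.default_rng(0)
lam = [3.0, 1.0, -1.0]
# A convex combination of two rotations of diag(lam) must lie inside.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
M = 0.5 * np.diag(lam) + 0.5 * Q @ np.diag(lam) @ Q.T
print(in_schur_horn_orbitope(M, lam))                          # True
print(in_schur_horn_orbitope(np.diag([4.0, 0.0, -1.0]), lam))  # False: top eigenvalue too large
```

The second matrix has the right trace but its largest eigenvalue exceeds that of λ, so it violates majorization and lies outside the orbitope.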
<p>We particularly emphasize the computational aspect of our relaxations throughout this thesis. We corroborate the utility of our methods with various numerical experiments.</p>
https://thesis.library.caltech.edu/id/eprint/13622
Riemannian Optimization for Convex and Non-Convex Signal Processing and Machine Learning Applications
https://resolver.caltech.edu/CaltechTHESIS:06012020-120425051
Authors: {'items': [{'email': 'douik.ahmed@gmail.com', 'id': 'Douik-Ahmed', 'name': {'family': 'Douik', 'given': 'Ahmed'}, 'orcid': '0000-0001-7791-9443', 'show_email': 'NO'}]}
Year: 2020
DOI: 10.7907/jt3c-0m30
The performance of most algorithms for signal processing and machine learning applications highly depends on the underlying optimization algorithms. Multiple techniques, such as interior-point methods and semidefinite programming, have been proposed for solving convex and non-convex problems. However, it is well known that these algorithms are not ideally suited for large-scale optimization with a high number of variables and/or constraints. This thesis exploits a novel optimization method, known as Riemannian optimization, for efficiently solving convex and non-convex problems with signal processing and machine learning applications. Unlike most optimization techniques, whose complexities increase with the number of constraints, Riemannian methods smartly exploit the structure of the search space, i.e., the set of feasible solutions, to reduce the embedding dimension and efficiently solve optimization problems in a reasonable time. However, such efficiency comes at the expense of universality, as the geometry of each manifold needs to be investigated individually. This thesis explains the steps of designing first- and second-order Riemannian optimization methods for smooth matrix manifolds through the study and design of optimization algorithms for various applications. In particular, this thesis focuses on contemporary applications in signal processing and machine learning, such as community detection, graph-based clustering, phase retrieval, and indoor and outdoor location determination. Simulation results are provided to attest to the efficiency of the proposed methods against popular generic and specialized solvers for each of the above applications.
https://thesis.library.caltech.edu/id/eprint/13758
Universality Laws and Performance Analysis of the Generalized Linear Models
https://resolver.caltech.edu/CaltechTHESIS:06092020-005908250
Authors: {'items': [{'email': 'eabbasia@gmail.com', 'id': 'Abbasi-Ehsan', 'name': {'family': 'Abbasi', 'given': 'Ehsan'}, 'orcid': '0000-0002-0185-7933', 'show_email': 'NO'}]}
Year: 2020
DOI: 10.7907/873c-ej41
<p>In the past couple of decades, non-smooth convex optimization has emerged as a powerful tool for the recovery of structured signals (sparse, low rank, etc.) from noisy linear or non-linear measurements in a variety of applications in genomics, signal processing, wireless communications, and machine learning. Taking advantage of the particular structure of the unknown signal of interest is critical since, in most of these applications, the dimension <i>p</i> of the signal to be estimated is comparable to, or even larger than, the number of observations <i>n</i>. With the advent of compressive sensing, there have been a large number of theoretical results studying the estimation performance of non-smooth convex optimization in such a <i>high-dimensional setting</i>.</p>
<p>A popular approach for estimating an unknown signal β₀ ∈ ℝ<i>ᵖ</i> in a <i>generalized linear model</i>, with observations <b>y</b> = g(<b>X</b>β₀) ∈ ℝ<i>ⁿ</i>, is via the estimator β̂ = arg min<sub>β</sub> <i>L</i>(<b>y</b>, <b>X</b>β) + λ<i>f</i>(<i>β</i>). Here, <i>L</i>(•,•) is a loss function which is convex with respect to its second argument, and <i>f</i>(•) is a regularizer that enforces the structure of the unknown β₀. We first analyze the generalization error performance of this estimator, for the case where the entries of <b>X</b> are drawn <i>independently from a real standard Gaussian</i> distribution. The <i>precise</i> nature of our analysis permits an accurate performance comparison between different instances of these estimators, and allows one to optimally tune the hyperparameters based on the model parameters. We apply our result to some of the most popular cases of generalized linear models, such as M-estimators in linear regression, logistic regression and generalized margin maximizers in binary classification problems, and Poisson regression in count data models. The key ingredient of our proof is the <i>Convex Gaussian Min-max Theorem (CGMT)</i>, which is a tight version of the Gaussian comparison inequality proved by Gordon in 1988. Unfortunately, having real i.i.d. entries in the feature matrix <b>X</b> is crucial in this theorem, and it cannot be naturally extended to other cases.</p>
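One concrete instance of the displayed estimator takes the squared loss L(y, Xβ) = ½‖y − Xβ‖² and the sparsity-promoting regularizer f(β) = ‖β‖₁, which can be solved by proximal gradient descent (ISTA). The dimensions, noise level, and regularization weight below are illustrative:

```python
import numpy as np

# Sparse recovery via  min_b 0.5*||y - Xb||^2 + lam*||b||_1  (ISTA).
rng = np.random.default_rng(1)
n, p, lam = 100, 30, 2.0
X = rng.standard_normal((n, p))
beta0 = np.zeros(p); beta0[:3] = [4.0, -3.0, 2.0]   # sparse ground truth
y = X @ beta0 + 0.1 * rng.standard_normal(n)

step = 1.0 / np.linalg.norm(X, 2) ** 2              # 1/L for the smooth part
beta = np.zeros(p)
for _ in range(500):
    g = X.T @ (X @ beta - y)                        # gradient of the loss
    z = beta - step * g
    beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
print(round(float(np.linalg.norm(beta - beta0) / np.linalg.norm(beta0)), 3))
```

The soft-thresholding step is exactly the proximal operator of λ‖·‖₁, so each iteration alternates a gradient step on the loss with the prox of the regularizer.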
<p>For some special cases, however, we prove universality properties and indirectly extend these results to more general designs of the feature matrix <b>X</b>, where the entries are not necessarily real, independent, or identically distributed. This extension enables us to analyze problems that the CGMT could not handle, such as models with quadratic measurements, phase-lift in phase retrieval, and data recovery in massive MIMO, and it helps us settle a few long-standing open problems in these areas.</p>
https://thesis.library.caltech.edu/id/eprint/13804
Latent-Variable Modeling: Algorithms, Inference, and Applications
https://resolver.caltech.edu/CaltechTHESIS:09222019-132051506
Authors: {'items': [{'email': 'armeen.taeb@gmail.com', 'id': 'Taeb-Armeen', 'name': {'family': 'Taeb', 'given': 'Armeen'}, 'orcid': '0000-0002-5647-3160', 'show_email': 'YES'}]}
Year: 2020
DOI: 10.7907/YRF1-7W29
<p>Many driving factors of physical systems are often latent or unobserved. Thus, understanding such systems crucially relies on accounting for the influence of the latent structure. This thesis makes advances in three aspects of latent-variable modeling: inference, algorithms, and applications. Specifically, we develop and explore latent-variable techniques that a) ensure interpretable and statistically significant models, b) can be efficiently optimized to identify the best fit to data, and c) provide useful insights in real-world applications. The specific contributions of this thesis are:</p>
<p>1. We employ a latent-variable graphical modeling technique to develop the first state-wide statistical model of the California reservoir network. With this model, we precisely characterize the system-wide response of the network to hypothetical drought conditions, and propose guidelines for more sustainable reservoir management.</p>
<p>2. Motivated by the previous application, we provide a geometric framework to assess the extent to which our latent variable model has learned true or false discoveries about the relevant physical phenomena. Our approach generalizes the classical notions of true and false discoveries in mathematical statistics that rely on the discrete structure of the decision space to settings where the decision space is continuous and more complicated. We highlight the utility of this viewpoint in problems involving subspace selection and low-rank estimation.</p>
<p>3. We propose a convex optimization procedure to fit a latent-variable graphical model for generalized linear models. This framework provides a flexible approach to model non-Gaussian variables including Poisson, Bernoulli, and exponential variables. A particularly novel aspect of our formulation is that it incorporates regularizers that are tailored to the type of latent variables.</p>
<p>4. We describe a computationally efficient framework to learn a latent-variable model with high-dimensional and non-iid data. This framework is based on factoriable precision operators that decouple the component associated with the observational dependencies and the component associated with interdependencies among the variables.</p>
<p>5. We propose a convex optimization technique to provide semantics to latent variables of a factor model. This approach is based on linking auxiliary variables -- chosen based on domain expertise -- to these latent variables.</p>
https://thesis.library.caltech.edu/id/eprint/11799
The Adaptive Charging Network Research Portal: Systems, Tools, and Algorithms
https://resolver.caltech.edu/CaltechTHESIS:05282021-174411678
Authors: {'items': [{'email': 'zach401@gmail.com', 'id': 'Lee-Zachary-Jordan', 'name': {'family': 'Lee', 'given': 'Zachary Jordan'}, 'orcid': '0000-0002-5358-2388', 'show_email': 'NO'}]}
Year: 2021
DOI: 10.7907/8eqg-e110
<p>Millions of electric vehicles (EVs) will enter service in the next decade, generating gigawatt-hours of additional energy demand. Charging these EVs cleanly, affordably, and without excessive stress on the grid will require advances in charging system design, hardware, monitoring, and control. Collectively, we refer to these advances as smart charging. While researchers have explored smart charging for over a decade, very few smart charging systems have been deployed in practice, leaving a sizeable gap between the research literature and the real world. In particular, we find that research is often based on simplified theoretical models. These simple models make analysis tractable but do not account for the complexities of physical systems. Moreover, researchers often lack the data needed to evaluate the performance of their algorithms on real workloads or apply techniques like machine learning. Even when promising algorithms are developed, they are rarely deployed since field tests can be costly and time-consuming.</p>
<p>The goal of this thesis is to develop systems, tools, and algorithms to bridge these gaps between theory and practice.</p>
<p>First, we describe the architecture of a first-of-its-kind smart charging system we call the Adaptive Charging Network (ACN).
Next, we use data and models from the ACN to develop a suite of tools to help researchers. These tools include ACN-Data, a public dataset of over 80,000 charging sessions; ACN-Sim, an open-source simulator based on realistic models; and ACN-Live, a platform for field testing algorithms on the ACN. Finally, we describe the algorithms we have developed using these tools. For example, we propose a practical and robust algorithm based on model predictive control, which can reduce infrastructure requirements by over 75%, increase operator profits by up to 3.4 times, and significantly reduce strain on the electric power grid. Other examples include a pricing scheme that fairly allocates costs to users considering time-of-use tariffs and demand charges, and a data-driven approach to optimally size on-site solar generation with smart EV charging systems.</p>
https://thesis.library.caltech.edu/id/eprint/14191
Applications of Convex Analysis to Signomial and Polynomial Nonnegativity Problems
https://resolver.caltech.edu/CaltechTHESIS:05202021-194439071
Authors: {'items': [{'email': 'rjmurray201693@gmail.com', 'id': 'Murray-Riley-John', 'name': {'family': 'Murray', 'given': 'Riley John'}, 'orcid': '0000-0003-1461-6458', 'show_email': 'NO'}]}
Year: 2021
DOI: 10.7907/vn9x-xj10
<p>Here is a question that is easy to state, but often hard to answer:</p>
<p><i>Is this function nonnegative on this set?</i></p>
<p>When faced with such a question, one often makes appeals to known inequalities. One crafts arguments that are <i>sufficient</i> to establish the nonnegativity of the function, rather than determining the function's precise range of values. This thesis studies sufficient conditions for nonnegativity of signomials and polynomials. Conceptually, signomials may be viewed as generalized polynomials that feature arbitrary real exponents, but with variables restricted to the positive orthant.</p>
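The relative entropy flavor of such sufficient conditions can be seen on a toy signomial. Below, nonnegativity of f(t) = exp(2t) + exp(−2t) − 2 is certified by a balancing vector ν, following the AM/GM-exponential condition from the SAGE literature; the example itself is invented for illustration:

```python
import numpy as np

# AGE-style certificate for  f(t) = exp(2t) + exp(-2t) - 2:
# positive coefficients c on exponents alpha, and a negative coefficient
# c0 = -2 on the exponent 0. Nonnegativity follows if some nu >= 0 balances
# the exponents relative to the negative term's exponent (here 0), i.e.
# sum(nu_i * alpha_i) = 0, and satisfies the relative entropy bound
#   sum(nu_i * log(nu_i / (e * c_i))) <= c0.
c = np.array([1.0, 1.0])
alpha = np.array([2.0, -2.0])
c0 = -2.0

nu = np.array([1.0, 1.0])                        # candidate certificate
balanced = bool(np.isclose(nu @ alpha, 0.0))
rel_ent = float(np.sum(nu * np.log(nu / (np.e * c))))   # evaluates to -2
certified = balanced and rel_ent <= c0 + 1e-12
print(certified)

# Sanity check against direct evaluation on a grid.
t = np.linspace(-3, 3, 601)
fmin = float(np.min(np.exp(2 * t) + np.exp(-2 * t) - 2))
print(fmin >= -1e-12)
```

This instance is tight: the relative entropy bound holds with equality, matching the fact that f touches zero at t = 0.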
<p>Our methods leverage efficient algorithms for a type of convex optimization known as relative entropy programming (REP). By virtue of this integration with REP, our methods can help answer questions like the following:</p>
<p>Is there some function, in this particular space of functions, that is nonnegative on this set?</p>
<p>The ability to answer such questions is <i>extremely</i> useful in applied mathematics.
Alternative approaches in this same vein (e.g., methods for polynomials based on semidefinite programming)
have been used successfully as convex relaxation frameworks for nonconvex optimization, as mechanisms for analyzing dynamical systems, and even as tools for solving nonlinear partial differential equations.</p>
<p>This thesis builds from the <i>sums of arithmetic-geometric exponentials</i> or <i>SAGE</i> approach to signomial nonnegativity. The term "exponential" appears in the SAGE acronym because SAGE parameterizes signomials in terms of exponential functions.</p>
<p>Our first round of contributions concern the original SAGE approach. We employ basic techniques in convex analysis and convex geometry to derive structural results for spaces of SAGE signomials and exactness results for SAGE-based REP relaxations of nonconvex signomial optimization problems.
We frame our analysis primarily in terms of the coefficients of a signomial's basis expansion rather than in terms of signomials themselves.
The effect of this framing is that our results for signomials readily transfer to polynomials. In particular, we are led to define a new concept of <i>SAGE polynomials</i>. For sparse polynomials, this method offers an exponential efficiency improvement relative to certificates of nonnegativity obtained through semidefinite programming.</p>
<p>We go on to create the <i>conditional SAGE</i> methodology for exploiting convex substructure in constrained signomial nonnegativity problems.
The basic insight here is that since the standard relative entropy representation of SAGE signomials is obtained by a suitable application of convex duality, we are free to incorporate additional convex constraints into the duality argument. In the course of explaining this idea, we provide illustrative examples in signomial optimization and the analysis of chemical dynamics.</p>
<p>The majority of this thesis is dedicated to exploring fundamental questions surrounding conditional SAGE signomials. We approach these questions through analysis frameworks of <i>sublinear circuits</i> and <i>signomial rings</i>. These sublinear circuits generalize simplicial circuits of affine-linear matroids, and lead to rich modes of analysis for sets that are simultaneously convex in the usual sense and convex under a logarithmic transformation. The concept of signomial rings lets us develop a powerful signomial Positivstellensatz and an elementary signomial moment theory. The Positivstellensatz provides for an effective hierarchy of REP relaxations for approaching the value of a nonconvex signomial minimization problem from below, as well as a first-of-its-kind hierarchy for approaching the same value from above.</p>
<p>In parallel with our mathematical work, we have developed the sageopt Python package. Sageopt drives all the examples and experiments used throughout this thesis, and has been used by engineers to solve high-degree polynomial optimization problems at scales unattainable by alternative methods.
We conclude this thesis with an explanation of how our theoretical results affected sageopt's design.</p>
https://thesis.library.caltech.edu/id/eprint/14169
Signals on Networks: Random Asynchronous and Multirate Processing, and Uncertainty Principles
https://resolver.caltech.edu/CaltechTHESIS:09242020-094028488
Authors: {'items': [{'email': 'ogteke@gmail.com', 'id': 'Teke-Oguzhan', 'name': {'family': 'Teke', 'given': 'Oguzhan'}, 'orcid': '0000-0002-1131-5206', 'show_email': 'NO'}]}
Year: 2021
DOI: 10.7907/44dx-3g83
<p>The processing of signals defined on graphs has been of interest for many years, and finds applications in a diverse set of fields such as sensor networks, social and economic networks, and biological networks. In graph signal processing applications, signals are not defined as functions on a uniform time-domain grid but rather as vectors indexed by the vertices of a graph, where the underlying graph is assumed to model the irregular signal domain. Although analysis of such networked models is not new (it can be traced back to the consensus problem studied more than four decades ago), such models have recently been studied from the viewpoint of signal processing, in which the analysis is based on the "graph operator" whose eigenvectors serve as a Fourier basis for the graph of interest. With the help of the graph Fourier basis, a number of topics from classical signal processing (such as sampling, reconstruction, and filtering) have been extended to the case of graphs.</p>
<p>The main contribution of this thesis is to provide new directions in the field of graph signal processing and provide further extensions of topics in classical signal processing. The first part of this thesis focuses on a random and asynchronous variant of "graph shift," i.e., localized communication between neighboring nodes. Since the dynamical behavior of randomized asynchronous updates is very different from standard graph shift (i.e., state-space models), this part of the thesis focuses on the convergence and stability behavior of such random asynchronous recursions. Although non-random variants of asynchronous state recursions (possibly with non-linear updates) are well-studied problems with early results dating back to the late 1960s, this thesis considers the convergence (and stability) in the statistical mean-squared sense and presents precise conditions for stability by drawing parallels with switching systems. It is also shown that systems exhibit unexpected behavior under randomized asynchronicity: an unstable system (in the synchronous world) may be stabilized simply by the use of randomized asynchronicity. Moreover, randomized asynchronicity may result in a lower total computational complexity in certain parameter settings. The thesis presents applications of the random asynchronous model in the context of graph signal processing, including autonomous clustering of a network of agents, and a node-asynchronous communication protocol that implements a given rational filter on the graph.</p>
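The flavor of random asynchronous updates can be seen on a toy consensus-style recursion, in which a random subset of nodes wakes up each iteration and replaces its value with the average of its neighbors' values while the rest stay idle. This is an illustrative special case, not the general state-space model analyzed in the thesis; the graph and wake-up probability are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[0, 1, 1, 0],        # adjacency of a small connected,
              [1, 0, 1, 1],        # non-bipartite 4-node graph
              [1, 1, 0, 1],
              [0, 1, 1, 0]], float)
deg = A.sum(axis=1)
x = rng.standard_normal(4)
spread0 = x.max() - x.min()
for _ in range(2000):
    awake = rng.random(4) < 0.3                 # each node updates w.p. 0.3
    x_new = x.copy()
    x_new[awake] = (A @ x)[awake] / deg[awake]  # average of neighbors
    x = x_new
spread = x.max() - x.min()
print(spread < 1e-6 * max(spread0, 1.0))
```

Each asynchronous update is an averaging step, so the range of node values never expands, and on a connected graph the values contract to consensus almost surely; the thesis's analysis characterizes this kind of behavior in the mean-squared sense for general (not merely averaging) updates.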
<p>The second part of the thesis focuses on extensions of the following topics in classical signal processing to the case of graphs: multirate processing and filter banks, discrete uncertainty principles, and energy compaction filters for optimal filter design. The thesis also considers an application to heat diffusion over networks.</p>
<p>Multirate systems and filter banks find many applications in signal processing theory and implementations. Despite the possibility of extending 2-channel filter banks to bipartite graphs, this thesis shows that this relation cannot be generalized to <i>M</i>-channel systems on <i>M</i>-partite graphs. As a result, the extension of classical multirate theory to graphs is nontrivial, and such extensions cannot be obtained without certain mathematical restrictions on the graph. The thesis provides the necessary conditions on the graph such that fundamental building blocks of multirate processing remain valid in the graph domain. In particular, it is shown that when the underlying graph satisfies a condition called the <i>M</i>-block cyclic property, classical multirate theory can be extended to graphs.</p>
<p>The uncertainty principle is an essential mathematical concept in science and engineering, and uncertainty principles generally state that a signal cannot have an arbitrarily "short" description in the original basis and in the Fourier basis simultaneously. Based on the fact that graph signal processing proposes two different bases (i.e., vertex and the graph Fourier domains) to represent graph signals, this thesis shows that the total number of nonzero elements of a graph signal and its representation in the graph Fourier domain is lower bounded by a quantity depending on the underlying graph. The thesis also presents the necessary and sufficient condition for the existence of 2-sparse and 3-sparse eigenvectors of a connected graph. When such eigenvectors exist, the uncertainty bound is very low, tight, and independent of the global structure of the graph.</p>
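The 2-sparse eigenvector phenomenon is easy to exhibit on a concrete graph: in the complete graph, any two vertices play interchangeable roles, so a difference of two standard basis vectors is a Laplacian eigenvector. A quick check with K₄ as the (illustrative) example:

```python
import numpy as np

# The complete graph K_4 has a 2-sparse Laplacian eigenvector: with
# L = 4*I - J (J the all-ones matrix), any vector orthogonal to the
# all-ones vector is an eigenvector with eigenvalue 4 -- in particular
# e_1 - e_2, which has only two nonzero entries.
n = 4
A = np.ones((n, n)) - np.eye(n)        # adjacency of K_4
L = np.diag(A.sum(axis=1)) - A         # graph Laplacian
v = np.zeros(n); v[0], v[1] = 1.0, -1.0
print(np.allclose(L @ v, n * v))       # eigenvector with eigenvalue n = 4
print(int(np.count_nonzero(v)))        # 2 nonzero entries
```

A 2-sparse vector in the vertex domain that is simultaneously 1-sparse in the graph Fourier domain (it is itself a basis vector) sits at the low end of the uncertainty trade-off, consistent with the tight, graph-independent bound described above.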
<p>The thesis also considers the classical spectral concentration problem. In the context of polynomial graph filters, the problem reduces to the polynomial concentration problem studied more generally by Slepian in the 1970s. The thesis studies the asymptotic behavior of the optimal solution in the case of narrow bandwidth. Different examples of graphs are also compared in order to show that the maximum energy compaction and the optimal filter depend heavily on the graph spectrum.</p>
<p>In the last part, the thesis considers the estimation of the starting time of a heat diffusion process from its noisy measurements, when there is a single point source located on a known vertex of a graph but the starting time is unknown. In particular, the Cramér-Rao lower bound for the estimation problem is derived, and it is shown that for graphs with higher connectivity the problem has a larger lower bound, making the estimation problem more difficult.</p>
https://thesis.library.caltech.edu/id/eprint/13965
Structured Signal Recovery from Nonlinear Measurements with Applications in Phase Retrieval and Linear Classification
https://resolver.caltech.edu/CaltechTHESIS:05172021-044724906
Authors: {'items': [{'email': 'fariborzsalehi93@gmail.com', 'id': 'Salehi-Fariborz', 'name': {'family': 'Salehi', 'given': 'Fariborz'}, 'orcid': '0000-0002-9679-1016', 'show_email': 'NO'}]}
Year: 2021
DOI: 10.7907/1c69-wq71
<p>Nonlinear models are widely used in signal processing, statistics, and machine learning to model real-world applications. A popular class of such models is the single-index model, where the response variable is related to a linear combination of the explanatory variables through a link function. In other words, if x ∈ R<sup>p</sup> denotes the input signal, the posterior mean of the generated output y has the form E[y|x] = ρ(x<sup>T</sup>w), where ρ : R → R is a known function (referred to as the link function), and w ∈ R<sup>p</sup> is the vector of unknown parameters. When ρ(•) is invertible, this class of models is called generalized linear models (GLMs). GLMs are commonly used in statistics and are often viewed as flexible generalizations of linear regression. Given n measurements (samples) from this model, D = {(x<sub>i</sub>, y<sub>i</sub>) | 1 ≤ i ≤ n}, the goal is to estimate the parameter vector w. While the model parameters are assumed to be unknown, in many applications these parameters follow certain structures (sparse, low-rank, group-sparse, etc.). Knowledge of this structure can be used to form more accurate estimators.</p>
<p>The main contribution of this thesis is to provide a precise performance analysis for convex optimization programs that are used for parameter estimation in two important classes of single-index models. These classes are: (1) phase retrieval in signal processing, and (2) binary classification in statistical learning.</p>
<p>The first class of models studied in this thesis is the phase retrieval problem, where the goal is to recover a discrete complex-valued signal from amplitudes of its linear combinations. Methods based on convex optimization have recently gained significant attention in the literature. The conventional convex-optimization-based methods resort to the idea of lifting, which makes them computationally inefficient. In addition to providing an analysis of the recovery threshold for the semidefinite-programming-based methods, this thesis studies the performance of a new convex relaxation for the phase retrieval problem, known as phasemax, which is computationally more efficient as it does not lift the signal to higher dimensions. Furthermore, to address the case of structured signals, regularized phasemax is introduced along with a precise characterization of the conditions for its perfect recovery in the asymptotic regime.</p>
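In the real-valued case, phasemax reduces to a linear program, which makes a small sketch easy. Here the anchor vector is taken to be the true signal itself purely for illustration (in practice it comes from an initialization step), and the dimensions are invented:

```python
import numpy as np
from scipy.optimize import linprog

# Real-valued phasemax sketch: given magnitudes b_i = |<a_i, x>| and an
# anchor correlated with x, solve
#   maximize <anchor, z>  subject to  -b <= A z <= b.
rng = np.random.default_rng(3)
n, m = 10, 80
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
b = np.abs(A @ x)
anchor = x                                   # idealized anchor, for illustration

res = linprog(c=-anchor,                     # linprog minimizes, so negate
              A_ub=np.vstack([A, -A]),
              b_ub=np.concatenate([b, b]),
              bounds=[(None, None)] * n)     # unconstrained sign of z
z = res.x
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x))
print(err < 1e-4 * np.linalg.norm(x))
```

With enough Gaussian measurements and a well-correlated anchor, the LP's solution coincides with the true signal (up to global sign), which is the recovery regime the thesis characterizes precisely.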
<p>The next important application studied in this thesis is binary classification in statistical learning. While classification models have been studied in the literature since the 1950s, the understanding of their performance has been incomplete until very recently. Inspired by the maximum likelihood (ML) estimator in logistic models, we analyze a class of optimization programs that attempt to find the model parameters by minimizing an objective that consists of a loss function (which is often inspired by the ML estimator) and an additive regularization term that enforces our knowledge of the structure. There are two operating regimes for this problem depending on the separability of the training data set D. In the asymptotic regime, where the number of samples and the number of parameters grow to infinity, a phase transition phenomenon is demonstrated that occurs at a certain over-parameterization ratio. We compute this phase transition for the setting where the underlying data is drawn from a Gaussian distribution.</p>
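Separability of a data set, which determines the operating regime above, can itself be decided by a small linear-programming feasibility problem; a sketch, with both data sets invented for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def separable(X, y):
    """Check linear separability of labeled data (y in {-1, +1}) by testing
    feasibility of y_i * <w, x_i> >= 1 for some w; any feasible w is a
    perfect linear classifier."""
    n, p = X.shape
    # Rewrite the constraints as  -y_i * <w, x_i> <= -1  for linprog.
    res = linprog(c=np.zeros(p),               # pure feasibility problem
                  A_ub=-(y[:, None] * X),
                  b_ub=-np.ones(n),
                  bounds=[(None, None)] * p)
    return res.status == 0                     # 0 = feasible solution found

X1 = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y1 = np.array([1.0, 1.0, -1.0, -1.0])
print(separable(X1, y1))                       # separable

X2 = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
y2 = np.array([1.0, 1.0, -1.0, -1.0])          # XOR pattern
print(separable(X2, y2))                       # not linearly separable
```

In the XOR case the constraints force both w₁ + w₂ ≥ 1 and −(w₁ + w₂) ≥ 1 simultaneously, so the LP is infeasible, which is exactly the non-separable regime where the ML estimator is well-defined.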
<p>In the case where the data is non-separable, the ML estimator is well-defined, and its attributes have been studied in classical statistics. However, these classical results fail to provide a reasonable estimate in the regime where the number of parameters is proportional to the number of samples. One contribution of this thesis is to provide an exact analysis of the performance of regularized logistic regression when the number of parameters is proportional to the number of samples. When the data is separable (the so-called interpolating regime), there exist multiple linear classifiers that perfectly fit the training data. In this regime, we introduce and analyze the performance of "extended margin maximizers" (EMMs). Inspired by the max-margin classifier, EMM classifiers simultaneously consider maximizing the margin and the structure of the parameter. Lastly, we discuss another generalization of the max-margin classifier, referred to as the robust max-margin classifier, that takes into account perturbations by an adversary. It is shown that for a broad class of loss functions, gradient descent iterates (with proper step sizes) converge to the robust max-margin classifier.</p>
https://thesis.library.caltech.edu/id/eprint/14150
Cascading Failures in Power Systems: Modeling, Characterization, and Mitigation
https://resolver.caltech.edu/CaltechTHESIS:06032022-035416994
Authors: {'items': [{'email': 'liangch93@gmail.com', 'id': 'Liang-Chen', 'name': {'family': 'Liang', 'given': 'Chen'}, 'orcid': '0000-0002-0015-7206', 'show_email': 'YES'}]}
Year: 2022
DOI: 10.7907/8817-xy25
<p>Reliability is a critical goal for power systems. Due to the connectivity of power grids, an initial failure may trigger a cascade of failures and eventually lead to a large-scale blackout, causing significant economic and social impacts. Cascading failure analysis thus draws wide attention from power system practitioners and researchers. A well-known observation is that cascading failures in power systems propagate non-locally because of the complex mechanism of power grids. Such non-local propagation makes it particularly challenging to model, analyze and control the failure process. In this thesis, we tackle these challenges by establishing a mathematical theory to model and characterize failure patterns, discover structural properties of failure propagation, and design novel techniques for failure mitigation.</p>
<p>First, we propose a failure propagation model that captures both the fast-timescale system frequency dynamics and the slow-timescale line tripping process. This model provides mathematical justification for the widely used static DC model and can be generalized to capture a variety of failure propagation patterns induced by different control mechanisms of the power grid. More importantly, it provides the flexibility to design real-time control algorithms for failure mitigation and localization.</p>
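A minimal numerical sketch of how frequency dynamics justify the static DC model, using a standard linearized swing equation (the uniform inertia M, damping D, and the 3-bus ring are illustrative choices, not the thesis's model): the dynamics settle at exactly the operating point given by the static DC power flow equations.

```python
import numpy as np

# 3-bus ring; all line susceptances b = 1, injections sum to zero.
lines = [(0, 1), (1, 2), (0, 2)]
p = np.array([1.0, 0.0, -1.0])

# Static DC power flow: solve L theta = p for the network Laplacian L;
# the flow on line (i, j) is b * (theta_i - theta_j) = theta_i - theta_j.
L = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
theta_dc = np.linalg.pinv(L) @ p
flows_dc = np.array([theta_dc[i] - theta_dc[j] for i, j in lines])

# Fast-timescale frequency dynamics (linearized swing equation):
#   M theta'' = p - D theta' - L theta
theta, omega = np.zeros(3), np.zeros(3)
M, D, dt = 1.0, 2.0, 0.01
for _ in range(20_000):
    acc = (p - D * omega - L @ theta) / M
    theta, omega = theta + dt * omega, omega + dt * acc
flows_dyn = np.array([theta[i] - theta[j] for i, j in lines])

print(flows_dc, flows_dyn)  # the dynamics settle at the DC solution
```

The fixed point of the swing dynamics is precisely L theta = p, so the equilibrium line flows coincide with the static DC flows (here 1/3, 1/3, and 2/3).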
<p>Second, we provide a complete characterization of line failures under the static DC model. Our results unveil a deep connection between power redistribution patterns and the network block decomposition. More specifically, we show that a non-cut line failure in a block only impacts the branch power flows on transmission lines within that block. In contrast, a cut-set line failure propagates globally, depending on both the power balancing rules and the network topology. Further, we discuss three types of interface networks for connecting the sub-grids, each achieving better failure localization performance.</p>
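The localization of non-cut failures can be checked numerically on a toy network: two triangle blocks glued at a single cut vertex, with unit susceptances and illustrative injections. Tripping the non-cut line (0, 1) changes flows only inside its own block:

```python
import numpy as np

def dc_flows(lines, p, n):
    """Static DC power flow: flow on (i, j) is theta_i - theta_j (b = 1)."""
    L = np.zeros((n, n))
    for i, j in lines:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    theta = np.linalg.pinv(L) @ p
    return {(i, j): theta[i] - theta[j] for i, j in lines}

# Two triangle blocks {0,1,2} and {2,3,4} glued at the cut vertex 2.
lines = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
p = np.array([1.0, 0.0, 0.0, 0.0, -1.0])

before = dc_flows(lines, p, 5)
after = dc_flows([l for l in lines if l != (0, 1)], p, 5)  # trip (0, 1)

# Flows in the other block are untouched by the non-cut failure.
for line in [(2, 3), (3, 4), (2, 4)]:
    print(line, before[line], after[line])
```

The intuition matches the block-decomposition result: the net power crossing the cut vertex is fixed by the blocks' aggregate injections, so removing a non-cut line redistributes flow only within its own block.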
<p>Third, we study corrective control algorithms for failure mitigation. We integrate a distributed frequency control strategy with the network block decomposition to provide provable failure mitigation and localization guarantees for line failures. This strategy operates on the frequency control timescale and supplements existing corrective mechanisms, improving grid reliability and operational efficiency. We further explore failure mitigation via direct post-contingency injection adjustments. Specifically, we propose an optimization-based control method with strong structural properties, which is highly desirable in large-scale power networks.</p>https://thesis.library.caltech.edu/id/eprint/14939Optimization of Distribution Power Networks: from Single-Phase to Multi-Phase
https://resolver.caltech.edu/CaltechTHESIS:06012022-005449566
Authors: {'items': [{'email': 'fengyuzhou1994@gmail.com', 'id': 'Zhou-Fengyu', 'name': {'family': 'Zhou', 'given': 'Fengyu'}, 'orcid': '0000-0002-2639-6491', 'show_email': 'NO'}]}
Year: 2022
DOI: 10.7907/tg26-9857
<p>Distributed energy resources play an important role in today's distribution power systems. The Optimal Power Flow (OPF) problem is fundamental in power systems, as many important applications, such as economic dispatch, battery placement, unit commitment, and voltage control, can be formulated as an OPF. A paradoxical observation is the problem's complexity in theory but simplicity in practice. On the one hand, the problem is well known to be non-convex and NP-hard, so it is unlikely that any simple algorithm can solve all problem instances efficiently. On the other hand, many known algorithms perform extremely well in practice on both standard test cases and real-world systems. This thesis attempts to reconcile this apparent contradiction.</p>
<p>Specifically, this thesis focuses on two types of properties that may underlie the simplicity in practice of OPF problems. The first property is the exactness of relaxations, meaning that one can find a convex relaxation of the original non-convex problem such that the two problems share the same optimal solution. This property allows us to convexify the non-convex problem without altering the optimal solution or cost. The second property is that all locally optimal solutions of the non-convex problem are also globally optimal. This property allows us to apply local algorithms such as gradient descent without being trapped at spurious local optima. We focus on distribution systems with radial networks (i.e., the underlying graphs are trees). We consider both single-phase models and unbalanced multi-phase models, since most real-world distribution systems are multi-phase and unbalanced, and distributed energy resources (DERs) can be connected in either Wye or Delta configurations.</p>
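The exactness idea can be illustrated on a deliberately tiny example: a single-line, real-power-only DistFlow instance (a sketch of the mechanism, not the thesis's multi-phase model; all constants are illustrative). The non-convex current equality ell = p0^2/v0 is relaxed to an inequality, yet the minimizer of the relaxed problem satisfies it with equality:

```python
import numpy as np
from scipy.optimize import minimize

# Toy 2-bus branch flow (DistFlow) instance, real power only:
# a substation at fixed voltage v0 serves a load d through a line of
# resistance r. Variables x = (p0, ell): sending-end power, squared current.
# Exact model:   p0 - r*ell = d,   ell  = p0**2 / v0   (non-convex)
# Relaxation:    p0 - r*ell = d,   ell >= p0**2 / v0   (convex feasible set)
r, d, v0 = 0.1, 1.0, 1.0

res = minimize(
    fun=lambda x: x[0],                       # minimize generation p0
    x0=np.array([1.5, 2.0]),
    constraints=[
        {"type": "eq",   "fun": lambda x: x[0] - r * x[1] - d},
        {"type": "ineq", "fun": lambda x: x[1] - x[0] ** 2 / v0},
    ],
    method="SLSQP",
)
p0, ell = res.x
print(p0, ell, ell - p0 ** 2)  # the relaxed inequality is tight at optimum
```

Minimizing generation pushes the line loss r*ell down until the relaxed inequality binds, recovering the solution of the original non-convex problem (here p0 = (1 - sqrt(1 - 4rd)) / (2r)).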
<p>The main results of this thesis are two-fold. In the first half, we propose a class of sufficient conditions for a non-convex problem to simultaneously have an exact relaxation and no spurious local optima. We then apply the result to single-phase systems and conclude that if no bus has an injection lower bound, then both properties (exactness and global optimality) hold. While the same condition was already known to be sufficient for exactness, our work is the first to extend it to global optimality. In the second half, we focus on the exactness property for multi-phase systems. For systems without Delta connections, exactness is guaranteed if 1) the binding constraints are sparse in the network at optimality, or 2) all nodal prices fall within a narrow range. Using the DC model as an approximation, we further analyze the OPF sensitivity and explain why nodal prices tend to be close to each other. In the presence of Delta connections, we conclude that the inexactness can be resolved either by post-processing an optimal solution or by adding a regularization term to the cost function. Both methods achieve global optimality for IEEE standard test cases.</p>https://thesis.library.caltech.edu/id/eprint/14656