Advisor Feed
https://feeds.library.caltech.edu/people/Hassibi-B/advisor.rss
A Caltech Library Repository Feed
Specification: http://www.rssboard.org/rss-specification
Generator: python-feedgen
Language: en
Last build: Fri, 08 Dec 2023 21:39:26 +0000
Coherent Optical Array Receiver for PPM Signals Under Atmospheric Turbulence
https://resolver.caltech.edu/CaltechETD:etd-05252006-221314
Authors: Muñoz Fernández, Michela
Year: 2006
DOI: 10.7907/VDSA-SA42
<p>The performance of a coherent free-space optical communications system operating in the presence of turbulence is investigated. Maximum Likelihood Detection techniques are employed to optimally detect Pulse Position Modulated signals with a focal-plane detector array and to reconstruct the turbulence-degraded signals.</p>
<p>Laboratory equipment and experimental setup used to carry out these experiments at the Jet Propulsion Laboratory are described. The key components include two lasers operating at 1064 nm wavelength for use with coherent detection, a 16-element (4 × 4) InGaAs focal-plane detector array, and a data-acquisition and signal-processing assembly needed to sample and collect the data and analyze the results. The detected signals are combined using the least-mean-square (LMS) algorithm. In the first part of the experimental results, we show convergence of the algorithm for experimentally obtained signal tones in the presence of atmospheric turbulence. The second part of the experimental results shows adaptive combining of experimentally obtained heterodyned pulse position modulated (PPM) signals with pulse-to-pulse coherence in the presence of simulated spatial distortions resembling atmospheric turbulence. The adaptively combined PPM signals are phased up via an LMS algorithm suitably optimized to operate with PPM in the presence of additive shot noise. A convergence analysis of the algorithm is presented, and results with both computer-simulated and experimentally obtained PPM signals are analyzed.</p>
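The LMS combining described above can be sketched numerically. The following is a hypothetical illustration only (not the actual CORE processing chain; the array size, gains, and noise level are invented for the example): sixteen noisy, differently faded copies of a reference tone are adaptively combined with the update w ← w + μ·e·x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: 16-element array, per-element fading, small noise.
n_elem, n_samp, mu = 16, 2000, 0.01
t = np.arange(n_samp)
tone = np.cos(0.2 * np.pi * t)                    # reference signal d[k]
gains = rng.normal(0.5, 0.2, n_elem)              # turbulence-like per-element fading
x = gains[:, None] * tone + 0.05 * rng.normal(size=(n_elem, n_samp))

w = np.zeros(n_elem)                              # combining weights
sq_err = []
for k in range(n_samp):
    y = w @ x[:, k]                               # combined array output
    e = tone[k] - y                               # instantaneous error
    w += mu * e * x[:, k]                         # LMS weight update
    sq_err.append(e * e)

early, late = np.mean(sq_err[:100]), np.mean(sq_err[-100:])
```

With these (invented) parameters the squared error decays as the weights converge, mirroring the kind of convergence behavior reported for the experimental signal tones.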
<p>The third part of the experimental results, in which the main goal of this thesis is achieved, includes an investigation of the performance of the Coherent Optical Receiver Experiment (CORE) at JPL. Bit Error Rate (BER) results are presented for single and multichannel optical receivers where quasi shot noise-limited performance is achieved under simulated turbulence conditions using noncoherent postdetection processing techniques. Theoretical BER expressions are compared with experimentally obtained BER results, and array combining gains are presented. BER results are shown as a function of signal-to-noise ratio (SNR), photons per symbol, and photons per bit (PPB).</p>
https://thesis.library.caltech.edu/id/eprint/2061
Broadband Wireless Broadcast Channels: Throughput, Performance, and PAPR Reduction
https://resolver.caltech.edu/CaltechETD:etd-08292005-100440
Authors: Sharif, Masoud
Year: 2006
DOI: 10.7907/25JK-Z952
<p>The ever-growing demand for higher rates and better quality of service in cellular systems has attracted many researchers to study techniques to boost the capacity and improve the performance of cellular systems. The main candidates to increase the capacity are to use multiple antennas or to increase the bandwidth. This thesis attempts to solve a few challenges regarding scheduling schemes in the downlink of cellular networks, and the implementation of modulation schemes suited for wideband channels.</p>
<p>Downlink scheduling in cellular systems is known to be a bottleneck for future broadband wireless communications. Information theoretic results on broadcast channels provide the limits for the maximum achievable rates for each receiver and transmission schemes to achieve them. It turns out that the sum-rate capacity (the sum of the transmission rates to all users, also called the throughput) of a multi-antenna broadcast channel heavily depends on the availability of channel state information (CSI) at the transmitter. Unfortunately, the dirty paper coding (DPC) scheme, which achieves the capacity region, is extremely computationally intensive, especially in the multiuser context. Furthermore, relying on the assumption that full CSI is available from all the n users may not be feasible in practice.</p>
<p>In the first part of the thesis, we obtain the scaling law of the sum-rate capacity for large n and for a homogeneous fading MIMO (multiple input multiple output) broadcast channel, and then propose a simple scheme that requires only partial CSI and yet achieves the same scaling law. Another important issue in downlink scheduling is to maintain fairness among users with different distances to the transmitter. Interestingly, we prove that our scheduling scheme becomes fair provided that the number of transmit antennas is large enough. We further analyze the impact of using a throughput-optimal scheduling on the delay in sending information to the users. Finally, we look into the problem of differentiated rate scheduling, in which different users demand different sets of rates. We obtain explicit scheduling schemes to achieve the rate constraints.</p>
<p>In the second part of the thesis, we focus on orthogonal frequency division multiplexing (OFDM), which is the most promising technique for broadband wireless channels (mainly due to its simplicity of channel equalization even in a severe multipath fading environment). The main disadvantage of this modulation, however, is its high peak to mean envelope power ratio (PMEPR). This is due to the fact that the OFDM signal consists of many (say n) harmonically related subcarriers which may, in the worst-case, add up constructively and lead to large peaks (of order n) in the signal.</p>
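The worst-case peak described above is rarely seen in practice. A quick Monte Carlo sketch (an illustration written for this summary, not code from the thesis) shows typical PAPR values for n = 128 random QPSK subcarriers sitting far below the 10·log₁₀(n) ≈ 21 dB worst case:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 128, 200
papr_db = []
for _ in range(trials):
    # random unit-power QPSK symbols, one per subcarrier
    sym = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], n) / np.sqrt(2)
    sig = np.fft.ifft(sym) * np.sqrt(n)          # Nyquist-sampled OFDM symbol
    papr = np.max(np.abs(sig) ** 2) / np.mean(np.abs(sig) ** 2)
    papr_db.append(10 * np.log10(papr))

worst_case_db = 10 * np.log10(n)                 # ~21 dB if all carriers align
typical_db = np.mean(papr_db)                    # empirically near 10*log10(ln n)
```

The typical value clusters around 10·log₁₀(ln n) ≈ 7 dB, consistent with the log n behavior discussed next.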
<p>Despite this worst-case performance, we show that when each subcarrier is chosen from some given constellation, the PMEPR behaves like log n almost surely, for large n. This implies that there exist almost full-rate codes with a PMEPR of log n for large n. We further prove that there exist codes with rate not vanishing to zero such that the PMEPR is less than a constant (independent of n). We also construct high-rate codes with a guaranteed PMEPR of log n. Simulation results show that in a system with 128 subcarriers and using 16QAM, the PMEPR of a multicarrier signal can be reduced from 13.5 to 3.4, which is within 1.6 dB of the PMEPR of a single carrier system.</p>
https://thesis.library.caltech.edu/id/eprint/3264
Distributed Estimation and Control in Networked Systems
https://resolver.caltech.edu/CaltechETD:etd-08172006-130145
Authors: Gupta, Vijay
Year: 2007
DOI: 10.7907/KWN2-X741
<p>Rapid advances in information processing, communication and sensing technologies have enabled more and more devices to be provided with embedded processors, networking capabilities and sensors. For the field of estimation and control, it is now possible to consider an architecture in which many simple components communicate and cooperate to achieve a joint team goal. This distributed (or networked) architecture promises much in terms of performance, reliability and simplicity of design; however, at the same time, it requires extending the traditional theories of control, communication and computation and, in fact, looking at a unified picture of the three fields. A systematic theory of how to design distributed systems is currently lacking.</p>
<p>This dissertation takes the first steps towards understanding the effects of imperfect information flow in distributed systems from an estimation and control perspective and coming up with new design principles to counter these effects. Designing networked systems is difficult because such systems challenge two basic assumptions of traditional control theory: the presence of a central node with access to all the information about the system, and the perfect transmission of information among components. We formulate and solve many problems that deal with the removal of one, or both, of these assumptions. The chief idea explored in this dissertation is the joint design of information flow and the control law. While traditional control design has concentrated on calculating the optimal control input by assuming a particular information flow between the components, our approach seeks to synthesize the optimal information flow along with the optimal control law that satisfies the constraints of the information flow. Thus, besides the question of 'What should an agent do?', the questions of 'Whom should an agent talk to?', 'What should an agent communicate?', 'When should an agent communicate?' and so on also have to be answered. The design of the information flow represents an important degree of freedom available to the system designer that has hitherto largely been ignored. As we demonstrate in the dissertation, the joint design of information flow and the optimal control input satisfying the constraints of that information flow yields large improvements in performance over simply trying to fit traditional design theories on distributed systems.</p>
<p>We begin by formulating a distributed control problem in which many agents in a formation need to cooperate to minimize a joint cost function. We provide numerical algorithms to synthesize the optimal constrained control law that involve solving linear equations only and hence are free from the numerical issues plaguing the other approaches proposed in the literature. We then provide and analyze a model to understand the issue of designing the topology according to which the agents interact. The results are surprising: there are cases in which allowing communication between two agents is, in fact, detrimental to the performance.</p>
<p>We then move on to consider the effects of communication channels on control performance. To counter such effects, we propose the idea of encoding information for the purpose of estimation and control prior to transmission. Although information-theoretic techniques are not applicable in our problem, we are able to solve for a recursive yet optimal encoder / decoder structure in many cases. This information-flow-oriented design approach has unique advantages: it is optimal for any packet drop pattern; it easily includes the effect of known but random delays; and it lets us escape the limits set by reliability for transmission of data across a network, by using intermediate nodes as 'repeaters' similar to a digital communication network.</p>
<p>We finally take a look at combining the effects of multiple sources of information and communication channels on estimation and control. We look at a distributed estimation problem in which, at every time step, only a subset out of many sensors can transmit information to the estimator. This is also a representative resource allocation problem. We propose the idea of stochastic communication patterns that allows us to include the effects of communication channels explicitly during system design. Thus, instead of tree-search based algorithms proposed in the literature, we provide stochastic scheduling algorithms that can take into account the random packet drop effect of the channels. We also consider a distributed control problem with switching topologies and solve for the optimal controller. The tools that we develop are applicable to many other scenarios and we demonstrate some of them in the dissertation.</p>
<p>Along the way, we look at many other related problems in the dissertation. As an example, we provide initial results about the issue of robustness of a distributed system design to a malfunctioning agent. This notion is currently lacking in the control and estimation community, but has to be a part of any effective theory for designing networked or distributed systems.</p>
https://thesis.library.caltech.edu/id/eprint/3157
Wireless Networks: New Models and Results
https://resolver.caltech.edu/CaltechETD:etd-10032006-113124
Authors: Gowaikar, Radhika
Year: 2007
DOI: 10.7907/9652-QK53
<p>Wireless communications have gained much currency in the last few decades. In this thesis we present results regarding several wireless communication systems, in particular, wireless networks.</p>
<p>For some time now, it has been known that in an ad hoc network in which nodes share the wireless medium, and the connection strengths between nodes follow a distance-based decay law, the throughput scales like O(√n), where n is the number of nodes. In Chapter 2 we introduce randomness in the connection strengths and examine the effects of this on the throughput. We assume that all the channels are drawn independently from a common distribution and are not governed by a distance-decay law. It turns out that the aggregate information flow depends strongly on the distribution from which the channel strengths are drawn. For certain distributions, a throughput of n/(log n)<sup>d</sup> with d>0 is possible, which is a significant improvement over the O(√n) results known previously. In Chapter 3, we generalize the network model to two-scale networks. This model incorporates the distance-decay law for nodes that are separated by large distances, while maintaining randomness in close neighborhoods of a node. For certain networks, we show that a throughput of the form n<sup>1/t-1</sup>/log<sup>2</sup>n is achievable, where t>2 is a parameter that depends on the distribution of the connection at the local scale and is independent of the decay law that operates at a global scale.</p>
<p>In Chapter 4, we consider a model of an erasure wireless network, in which every node is connected to certain other nodes by erasure links, on which packets or bits are lost with some probability and received accurately otherwise. Each node is constrained to send the same message on all outgoing channels, thus incorporating the broadcast feature, and we assume that there is no interference in the network, other than through the possible correlation of erasure occurrences. For such networks and in certain multicast scenarios, we obtain the precise capacity region. This region has a nice max-flow, min-cut interpretation and can be achieved using linear codes. We do require the side-information regarding erasure locations on all links to be available to the destinations. Thus, we have the capacity region for a non-trivial class of wireless networks.</p>
<p>Recent results for wireline networks show that in several scenarios, it is optimal to operate these networks by making each link error-free. In Chapter 5, we consider Gaussian networks with broadcast and interference, and erasure networks with broadcast, and show that in the presence of these wireless features, it is suboptimal to make each link or sub-network error-free. We then consider these networks with the constraint that each node is permitted to either retransmit the received information or decode it and retransmit the original source information. We propose a greedy algorithm that determines the optimal operation for each node, such that the rate achievable at the destination is maximized. Further, we present decentralized implementations of this algorithm that allow each node to determine for itself the optimal operation that it needs to perform.</p>
<p>In Chapter 6, we consider a point-to-point communication system, involving multiple antennas at the transmitter and the receiver. These systems can give high data rates provided we can perform optimum, or maximum-likelihood, decoding of the received message. This problem typically reduces to that of finding the lattice point closest to a given point x in N-dimensional space. This is an integer least-squares problem and is NP-complete. The sphere decoder is an algorithm that performs decoding in an efficient manner by searching for the closest point only within a spherical region around x. In Chapter 6, we propose an algorithm that performs decoding in a sub-optimal manner by pruning the search region based on the statistics of the problem. This algorithm offers significant computational savings relative to the sphere decoder and allows us to trade off performance with computational complexity. Bounds on the error performance as well as the complexity are presented.</p>
https://thesis.library.caltech.edu/id/eprint/3885
Performance Limits and Design Issues in Wireless Networks
https://resolver.caltech.edu/CaltechETD:etd-11052006-173021
Authors: Farajidana, Amir
Year: 2007
DOI: 10.7907/PRMQ-0644
<p>The increasing utilization of networks, especially wireless networks, for different applications and in different aspects of modern life, has directed a great deal of attention towards the analysis and optimal design of networks. Distinguishing features of the wireless environment and the distributed nature of the network setup have raised many important challenges in finding the performance limits of different tasks such as communication, control, and computation over networks. There are also many design issues concerning the complexity and the robustness of wireless systems that should be addressed for a thorough understanding and an efficient operation of wireless networked systems. This thesis deals with a few of the challenges associated with the fundamental performance limits and optimal design of wireless networks.</p>
<p>In the first part, we analyze performance limits of two applications for a special class of wireless networks called wireless erasure networks. These networks incorporate some of the essential features of the wireless environment. We look at the performance limits of two applications over these networks. The first application is data transmission with two different traffic patterns, namely multicast and broadcast. The capacity region and the optimal coding scheme for the multicast scenario are found, and outer and inner bounds on the capacity region for the broadcast scenario are provided. The second application considered in this thesis is estimation and control of a dynamical process at a remote location connected through a wireless erasure network to a sensor observing the process. In this case, we characterize the minimum steady-state error and its dependency on the parameters of the network. The final problem considered in the first part of the thesis concerns power consumption (as a performance measure) in wireless networks. We propose and analyze a simple scheme based on the idea of distributed beamforming that yields power savings for dense sensor and ad-hoc networks. We quantify this gain relative to the case in which nodes communicate in isolation without participating in the network.</p>
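The beamforming gain behind such a scheme can be illustrated with a toy phasor calculation (a generic sketch, not the thesis's actual protocol): when n unit-power nodes align their carrier phases, the received amplitudes add coherently and the received power scales like n², versus roughly n under random phases.

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 64, 2000

# Coherent case: n aligned unit phasors sum to amplitude n, so power n^2.
coherent = abs(n * 1.0) ** 2

# Incoherent case: random phases; average received power is only about n.
phases = rng.uniform(0, 2 * np.pi, size=(trials, n))
incoherent = np.mean(np.abs(np.exp(1j * phases).sum(axis=1)) ** 2)
```

The ratio coherent/incoherent is roughly n, which is the kind of dense-network gain the distributed beamforming idea exploits.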
<p>The second part of the thesis deals with two design issues in the downlink of cellular wireless networks. The first issue is related to quality of service provisioning in the downlink scenario. We investigate the problem of differentiated rate scheduling, in which different users demand different sets of rates. We obtain explicit and practical scheduling schemes that achieve the rate constraints and at the same time maximize the throughput. These schemes are based on the idea of opportunistic beamforming, are simple, and require only a small amount of feedback to the transmitter. We further show that the throughput loss due to imposing the rate constraints is negligible for large systems.</p>
<p>The next issue considered in this thesis is the robustness of the capacity region of multiple antenna Gaussian broadcast channels to the channel estimation error at the transmitter and the users. These channels are mathematical models for the downlink of cellular systems. We provide an inner bound on the capacity region of these channels and show that this inner bound is equivalent to the capacity region of a dual multiple access channel with a noise covariance that depends on the transmit powers. This duality is exploited to show the effect of the estimation error on the sum-rate for a large number of users and in the large power regime. Finally, a training-based scheme for block fading multiple antenna broadcast channels is proposed.</p>
https://thesis.library.caltech.edu/id/eprint/4413
Asymptotic Analysis of Wireless Systems with Rayleigh Fading
https://resolver.caltech.edu/CaltechETD:etd-04252007-122857
Authors: Rao, Chaitanya Kumar
Year: 2007
DOI: 10.7907/YYD6-VP11
<p>This thesis looks at ways to improve either the reliability or the rate at which we can successfully transmit information over Rayleigh-fading wireless communication channels.</p>
<p>We study four wireless schemes, the first in the low signal-to-noise ratio (SNR) regime, the remaining three at high SNR. The analysis provides insights that apply to more general SNRs.</p>
<p>Firstly we investigate a point-to-point multiple antenna link at low SNR. At low SNR channel estimates can be unreliable, and therefore we assume the channel is unknown to both transmitter and receiver. Adopting a block-fading model we find the mutual information between transmitter and receiver up to second order in the SNR. The expression is valid for input distributions with regular behavior of fourth- and sixth-order moments, in particular most practical schemes. Subject to input-signal constraints, we determine the optimal signaling to maximize this mutual information.</p>
<p>We undertake high SNR analysis by finding the diversity-multiplexing gain trade-off of three further wireless systems with fading. Using techniques from existing works we find the optimal diversity-multiplexing gain trade-off for an M by N multiple antenna system with R single antenna relays. This uses a two-stage protocol in which the source first transmits to relays, then the relays multiply their received signal by a unitary matrix, before forwarding the result to the receiver. The trade-off is found to be equal to that of a multiple-input multiple-output (MIMO) link with R transmit and min{M,N} receive antennas.</p>
<p>Next we consider a network with two source-destination pairs (an interference channel) and establish relationships between the rate and diversity achievable by certain schemes. We show through two more schemes how cooperation amongst the nodes achieves a higher diversity, but at a reduced system rate. These schemes can easily be generalized from two to m source-destination pairs.</p>
<p>A final scheme is considered where n relay nodes are added to the m source-destination pairs and act to cancel interference, with the aim of increasing diversity. The outage behavior of this scheme is analyzed, and it is shown that for sufficiently many relay nodes, a diversity linear in n can be obtained.</p>
https://thesis.library.caltech.edu/id/eprint/1499
Optimization Algorithms in Wireless and Quantum Communications
https://resolver.caltech.edu/CaltechETD:etd-12032007-113628
Authors: Stojnic, Mihailo
Year: 2008
DOI: 10.7907/D6RN-ZD88
<p>Since the first communication systems were developed, the scientific community has been witnessing attempts to increase the amount of information that can be transmitted. In the last 10-15 years there has been a tremendous amount of research towards developing multi-antenna systems which would hopefully provide high-data-rate transmissions. However, increasing the overall amount of transmitted information increases the complexity of the necessary signal processing. A large portion of this thesis deals with several important issues in the signal processing of multi-antenna systems. In almost every case the goal is to develop a technique/algorithm so that the overall complexity of the signal processing is significantly decreased.</p>
<p>In the first part of the thesis a very important problem of signal detection in MIMO (multiple-input multiple-output) systems is considered. The problem is analyzed in two different scenarios: when the transmission medium (channel) 1) is known and 2) is unknown at the receiver. The former case is often called coherent and the latter non-coherent MIMO detection. Both cases usually amount to solving highly complex NP-hard combinatorial optimization problems. For the coherent case we develop a significant improvement of the traditional sphere decoder algorithm commonly used for this type of detection. An interesting connection between the new improved algorithm and H-infinity estimation theory is established, and the performance improvement over the standard sphere decoder is demonstrated. For the non-coherent case we develop a counterpart to the standard sphere decoder, the so-called out-sphere decoder. The complexity of the algorithm is viewed as a random variable; its expected value is analyzed and shown to be significantly smaller than that of an exhaustive search. In the non-coherent case, in addition to the complexity analysis of the exact out-sphere decoder, we analyze the performance loss of a suboptimal technique. We show that only a moderate loss of a few dB in power required at the transmitter will occur if a polynomial algorithm based on semi-definite relaxation is used in place of any exact technique (none of which is known to be polynomial).</p>
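The standard sphere decoder that serves as the baseline here can be sketched as a depth-first tree search with radius pruning. The following minimal Python version (an illustrative sketch, not the improved algorithm or the out-sphere decoder developed in the thesis) solves min_s ||y - Hs|| over a finite alphabet after a QR decomposition:

```python
import numpy as np

def sphere_decode(H, y, alphabet):
    """Depth-first sphere decoder for min_s ||y - H s||, entries of s in `alphabet`.
    Any partial path whose accumulated distance exceeds the best full candidate
    found so far lies outside the current sphere and is pruned."""
    Q, R = np.linalg.qr(H)          # H = QR with R upper triangular
    z = Q.T @ y
    n = H.shape[1]
    best = {"dist": np.inf, "s": None}

    def search(level, s, dist):
        if dist >= best["dist"]:    # outside the current sphere: prune
            return
        if level < 0:               # full-length candidate found
            best["dist"], best["s"] = dist, s.copy()
            return
        for sym in alphabet:
            s[level] = sym
            r = z[level] - R[level, level:] @ s[level:]
            search(level - 1, s, dist + r * r)

    search(n - 1, np.zeros(n), 0.0)
    return best["s"]

# Noiseless demo: the decoder must recover the transmitted +/-1 vector.
rng = np.random.default_rng(5)
H = rng.normal(size=(4, 4))
s_true = rng.choice([-1.0, 1.0], 4)
s_hat = sphere_decode(H, H @ s_true, [-1.0, 1.0])
```

The pruning step is what distinguishes this from exhaustive search: whole subtrees of candidates are discarded as soon as their partial distance exceeds the best radius found so far.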
<p>In the second part of the thesis we consider a few problems that arise in wireless broadcast channels. Namely, we consider the problem of designing the information symbol vector at the transmitter. A polynomial linear precoding technique is constructed. It enables achieving data rates very close to the ones achieved with the DPC (dirty paper coding) technique. Additionally, for another suboptimal polynomial scheme (so-called nulling and cancelling), we show that it asymptotically achieves the same data rate as the optimal, exponentially complex, DPC.</p>
<p>In the last part of the thesis we consider a quantum counterpart of the signal detection from classical communication. In quantum systems the signals are quantum states and the quantum detection problem amounts to designing measurement operators which have to satisfy certain quantum mechanics laws. A specific type of quantum detection called unambiguous detection, which has numerous applications including quantum filtering, has recently attracted a lot of attention in the research community. We develop a general framework for numerically solving this problem using the tools from the convex optimization theory. Furthermore, in the special case where the two quantum states are of rank 2, we construct an explicit analytical solution for the measurement operators.</p>
<p>At the end we would like to emphasize that the contribution of this thesis goes beyond the specific problems mentioned here. Most algorithmic optimization techniques developed in this thesis are generally applicable. While our results were originally motivated by wireless and quantum communications applications, we believe that the developed techniques will find applications in many different areas where similar optimization problems appear.</p>
https://thesis.library.caltech.edu/id/eprint/4747
Compressive Sensing for Sparse Approximations: Constructions, Algorithms, and Analysis
https://resolver.caltech.edu/CaltechTHESIS:10262009-081233260
Authors: Xu, Weiyu
Year: 2010
DOI: 10.7907/F63K-GT12
<p>Compressive sensing is an emerging research field that has applications in signal processing, error correction, medical imaging, seismology, and many other areas. It promises to efficiently recover a sparse signal vector via a much smaller number of linear measurements than its dimension. Naturally, how to design these linear measurements, how to reconstruct the original high-dimensional signals efficiently and accurately, and how to analyze the sparse signal recovery algorithms are important issues in the development of compressive sensing. This thesis is devoted to addressing these fundamental issues in the field of compressive sensing.</p>
<p>In compressive sensing, random measurement matrices are generally used and ℓ₁ minimization algorithms often use linear programming or other optimization methods to recover the sparse signal vectors. But explicitly constructible measurement matrices providing performance guarantees have been elusive, and ℓ₁ minimization algorithms are often very demanding in computational complexity for applications involving very large problem dimensions. In chapter 2, we propose and discuss a compressive sensing scheme with deterministic performance guarantees using deterministic, explicitly constructible expander graph-based measurement matrices, and show that the sparse signal recovery can be achieved with linear complexity. This is the first compressive sensing scheme that combines linear decoding complexity, deterministic performance guarantees of linear sparsity recovery, and deterministic, explicitly constructible measurement matrices.</p>
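The linear-programming route to ℓ₁ minimization mentioned above can be sketched in a few lines (a generic illustration using a random Gaussian matrix and SciPy's linprog, not the expander-based scheme of chapter 2): writing x = u - v with u, v ≥ 0 turns min ||x||₁ subject to Ax = b into a standard-form LP.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
n, m, k = 60, 30, 4                       # ambient dim, measurements, sparsity
A = rng.normal(size=(m, n)) / np.sqrt(m)  # illustrative random Gaussian matrix
x_true = np.zeros(n)
support = rng.choice(n, k, replace=False)
x_true[support] = rng.normal(size=k)
b = A @ x_true                            # compressed measurements

# l1 minimization as an LP: x = u - v with u, v >= 0,
# minimize sum(u) + sum(v) subject to A(u - v) = b
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]
```

For a signal this sparse (k = 4 nonzeros from m = 30 measurements), the LP recovers x_true exactly with overwhelming probability, which is the phenomenon the thesis's Grassmann angle framework quantifies.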
<p>The popular and powerful ℓ₁ minimization algorithms generally give better sparsity recovery performance than known greedy decoding algorithms. In chapter 3, starting from a necessary and sufficient null-space condition for achieving a certain signal recovery accuracy, using high-dimensional geometry, we give a unified <i>null-space Grassmann angle</i>-based analytical framework for compressive sensing. This new framework gives sharp quantitative trade-offs between the signal sparsity and the recovery accuracy of the ℓ₁ optimization for approximately sparse signals. Our results concern the fundamental "balancedness" properties of linear subspaces and so may be of independent mathematical interest.</p>
<p>The conventional approach to compressed sensing assumes no prior information on the unknown signal other than the fact that it is sufficiently sparse over a particular basis. In many applications, however, additional prior information is available. In chapter 4, we will consider a particular model for the sparse signal that assigns a probability of being zero or nonzero to each entry of the unknown vector. The standard compressed sensing model is therefore a special case where these probabilities are all equal. Following the introduction of the <i>null-space Grassmann angle</i>-based analytical framework in this thesis, we are able to characterize the optimal recoverable sparsity thresholds using weighted ℓ₁ minimization algorithms with the prior information.</p>
<p>The role of ℓ₁ minimization algorithms in recovering sparse signals from incomplete measurements is now well understood, and sharp recoverable sparsity thresholds for ℓ₁ minimization have been obtained. The iterative reweighted ℓ₁ minimization algorithms and related algorithms have been empirically observed to boost the recoverable sparsity thresholds for certain types of signals, but no rigorous theoretical results had been established to prove this fact. In chapter 5, we provide a theoretical foundation for analyzing the iterative reweighted ℓ₁ algorithms. In particular, we show that for a nontrivial class of signals, iterative reweighted ℓ₁ minimization can indeed deliver recoverable sparsity thresholds larger than those of ℓ₁ minimization. Again, our results are based on the null-space Grassmann angle-based analytical framework.</p>
<p>Evolving from compressive sensing problems, where we are interested in recovering sparse vector signals from compressed linear measurements, we turn our attention in chapter 6 to recovering matrices of low rank from compressed linear measurements, a challenging problem that arises in many applications in machine learning, control theory, and discrete geometry. This class of optimization problems is NP-hard, and for most practical problems there are no efficient algorithms that yield exact solutions. A popular heuristic replaces the rank function with the nuclear norm of the decision variable and has been shown to provide the optimal low rank solution in a variety of scenarios. We analytically assess the practical performance of this heuristic for finding the minimum rank matrix subject to linear constraints. We start from the characterization of a necessary and sufficient condition that determines when this heuristic finds the minimum rank solution. We then obtain probabilistic bounds on the matrix dimensions and rank and the number of constraints, such that our conditions for success are satisfied for almost all linear constraint sets as the matrix dimensions tend to infinity. Empirical evidence shows that these probabilistic bounds provide accurate predictions of the heuristic's performance in non-asymptotic scenarios.</p>
https://thesis.library.caltech.edu/id/eprint/5329
Estimation Using Quantized Innovations for Wireless Sensor Networks
https://resolver.caltech.edu/CaltechTHESIS:06242010-072954504
Authors: Norris, Noele R.
Year: 2010
DOI: 10.7907/BXRJ-MT98
Recent advancements in integrated small-scale micro-electromechanical systems technology have created cheap, low-power sensors that can be used in wireless sensor networks, an increasingly popular technology because of its potentially diverse applications. However, sensor networks have many constraints, such as limited bandwidth and power, which have inspired a considerable amount of research into energy-efficient detection and estimation algorithms using quantized observations. Though optimal estimation algorithms using quantized innovations have recently been developed to tackle this problem, no bounds are available on the error of the resulting optimal filter. Because tight bounds on the estimation error are essential in determining the stabilizability of the corresponding closed-loop dynamical system, and thus the applicability of a filter to a specific system, this project focuses on developing error bounds from a close study of the filtering algorithms. Initial attempts were unable to show that the estimation error of a system using quantized innovations follows a Riccati recursion. Thus, a number of different algorithms and coder-estimator pairs were analyzed to determine performance and to better understand means of proving stabilizability. Our primary goal is to better understand the evolution of the lower and upper bounds on estimation errors under measurement quantization, so that filters with verifiable performance specifications can be systematically designed for particular dynamical systems.
https://thesis.library.caltech.edu/id/eprint/5957
Entropy Region and Network Information Theory
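One concrete coder-estimator pair from this literature is the sign-of-innovations Kalman filter (SOI-KF) of Ribeiro, Giannakis, and Roumeliotis, in which only the one-bit sign of the innovation is transmitted. A scalar sketch follows; the model parameters are illustrative assumptions, and the update equations follow the published SOI-KF rather than a result of the thesis above:

```python
import numpy as np

rng = np.random.default_rng(2)
a, q, r = 0.9, 1.0, 0.5       # state transition, process noise, measurement noise
x, xh, P = 0.0, 0.0, 1.0      # true state, estimate, error covariance
errs = []
for _ in range(5000):
    x = a * x + rng.normal(scale=np.sqrt(q))      # true dynamics
    y = x + rng.normal(scale=np.sqrt(r))          # analog measurement at the sensor
    xh, P = a * xh, a * a * P + q                 # time update
    s = np.sign(y - xh)                           # 1-bit quantized innovation
    K = np.sqrt(2 / np.pi) * P / np.sqrt(P + r)   # SOI-KF gain
    xh = xh + K * s
    P = P - (2 / np.pi) * P * P / (P + r)         # covariance shrinks by a 2/pi factor
    errs.append((x - xh) ** 2)
mse = np.mean(errs)
```

The empirical MSE sits between the clairvoyant Kalman filter's steady-state error and roughly π/2 times it, illustrating how little is lost by transmitting a single bit per measurement.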
https://resolver.caltech.edu/CaltechTHESIS:06092011-140731576
Authors: Shadbakht, Sormeh
Year: 2011
DOI: 10.7907/P8ZB-4D40
<p>This dissertation takes a step toward a general framework for solving network information theory problems by studying the capacity region of networks through the entropy region.</p>
<p>We first show that the capacity of a large class of acyclic memoryless multiuser information theory problems can be formulated as a convex optimization over the region of entropy vectors of the network random variables. This capacity characterization is universal, and is advantageous over previous formulations in that it is single-letter. Moreover, it is significant in that it reveals the fundamental role of the entropy region in determining the capacity of network information theory problems.</p>
<p>With this viewpoint, the rest of the thesis is dedicated to the study of the entropy region, and its consequences for networks. A full characterization of the entropy region has proven to be a very challenging problem, and thus, we mostly consider inner bound constructions. For discrete random variables, our approaches include characterization of entropy vectors with a lattice-derived probability distribution, the entropy region of binary random variables, and the linear representable region. Through these characterizations, and using matroid representability results, we study linear coding capacity of networks in different scenarios (e.g., binary operations in a network, or networks with two sources).</p>
<p>We also consider continuous random variables by studying the entropy region of jointly Gaussian random variables. In particular, we determine the sufficiency of Gaussian random variables for characterizing the entropy region of 3 random variables in general. For more than 3 random variables, we point out the set of minimal necessary and sufficient conditions for a vector to be an entropy vector of jointly Gaussian random variables.</p>
<p>Finally, in the absence of a full analytical characterization of the entropy region, it is desirable to be able to perform numerical optimization over this space. In this regard, we propose a certain Monte Carlo method that enables one to numerically optimize entropy functions of discrete random variables, and also the achievable rates in wired networks. This method can be further adjusted for decentralized operation of networks. The promise of this technique is shown through various simulations of several interesting network problems.</p>
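Before any such optimization, one needs to evaluate the objective: the entropy vector of a set of discrete random variables. A minimal sketch for three binary random variables (the joint distribution here is randomly generated purely for illustration), including a check of one submodularity inequality that every entropy vector must satisfy:

```python
import numpy as np
from itertools import combinations

def joint_entropy(p, axes_keep):
    """Entropy (in bits) of the marginal over the coordinates in axes_keep."""
    drop = tuple(i for i in range(p.ndim) if i not in axes_keep)
    q = p.sum(axis=drop) if drop else p
    q = q[q > 0]
    return float(-(q * np.log2(q)).sum())

rng = np.random.default_rng(3)
p = rng.random((2, 2, 2))
p /= p.sum()                                   # random joint pmf of 3 binary RVs

# entropy vector: H(S) for every nonempty subset S of {0, 1, 2}
h = {S: joint_entropy(p, S)
     for k in (1, 2, 3) for S in combinations(range(3), k)}

# submodularity: H(A) + H(B) >= H(A ∪ B) + H(A ∩ B), with A = {0,1}, B = {1,2}
lhs = h[(0, 1)] + h[(1, 2)]
rhs = h[(0, 1, 2)] + h[(1,)]
```

The 2³ − 1 = 7 subset entropies form one point of the entropy region; a Monte Carlo method of the kind described above searches over distributions p to optimize functions of such points.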
https://thesis.library.caltech.edu/id/eprint/6514
Random Matrix Recursions in Estimation, Control, and Adaptive Filtering
https://resolver.caltech.edu/CaltechTHESIS:06022011-214438378
Authors: Vakili, Ali
Year: 2011
DOI: 10.7907/HCKN-7W53
<p>This dissertation is devoted to the study of estimation and control over systems that can be described by linear time-varying state-space models. Examples of such systems are encountered frequently in systems theory, e.g., wireless sensor networks, adaptive filtering, and distributed control. Recent developments in distributed catastrophe surveillance, smart transportation, and power grid control systems further motivate such a study.</p>
<p>While linear time-invariant systems are well understood, there is no general theory that captures the various aspects of their time-varying counterparts. With few exceptions, tackling these problems boils down to studying time-varying linear or nonlinear recursive matrix equations, known as Lyapunov and Riccati recursions, which are notoriously hard to analyze. We employ the theory of random matrices to elucidate different facets of these recursions and answer several important questions about the performance, stability, and convergence of estimation and control over such systems.</p>
<p>We make two general assumptions. First, we assume that the coefficient matrices are drawn from jointly stationary matrix-valued random processes. The stationarity assumption hardly restricts the analysis, since almost all cases of practical interest fall into this category. Second, we assume that the state vector size, n, is relatively large. However, the law of large numbers guarantees fast convergence to the asymptotic results even for n as small as 10. Under these assumptions, we develop a framework capable of characterizing the steady-state and transient behavior of adaptive filters, and of control and estimation over communication networks. This framework proves promising, successfully tackling several problems for the first time in the literature.</p>
<p>We first study random Lyapunov recursions and characterize their transient and steady-state behavior. Lyapunov recursions appear in several classes of adaptive filters and also as lower bounds of random Riccati recursions in distributed Kalman filtering. We then look at random Riccati recursions whose nonlinearity makes them much more complicated to study. We investigate standard recursive-least-squares (RLS) filtering and extend our analysis beyond the standard case to filtering with multiple measurements, as well as the case of intermittent measurements. Finally, we study Kalman filtering with intermittent observations, which is frequently used to model wireless sensor networks. In all of these cases we obtain interesting universal laws that depend on the structure of the problem, rather than specific model parameters. We verify the accuracy of our results through various simulations for systems with as few as 10 states.</p>
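The flavor of these random Riccati recursions can be seen in a short simulation of Kalman filtering with intermittent (Bernoulli) observations; the system matrices and arrival probabilities below are illustrative assumptions, not the ones studied in the thesis:

```python
import numpy as np

def riccati_step(P, A, C, Q, R, arrived):
    """One step of the random Riccati recursion for intermittent observations."""
    Pp = A @ P @ A.T + Q                       # time update
    if not arrived:                            # measurement lost: no correction
        return Pp
    S = C @ Pp @ C.T + R
    K = Pp @ C.T @ np.linalg.inv(S)
    return Pp - K @ C @ Pp                     # measurement update

rng = np.random.default_rng(4)
A = np.array([[1.0, 1.0], [0.0, 1.0]])         # marginally unstable double integrator
C = np.array([[1.0, 0.0]])
Q, R = np.eye(2), np.array([[1.0]])

def mean_trace(p_arrive, steps=2000):
    """Time-averaged trace of the error covariance under Bernoulli arrivals."""
    P = np.eye(2)
    traces = []
    for _ in range(steps):
        P = riccati_step(P, A, C, Q, R, rng.random() < p_arrive)
        traces.append(np.trace(P))
    return np.mean(traces[steps // 2:])

high_rate, low_rate = mean_trace(0.9), mean_trace(0.5)
```

The covariance trace is itself a random process; its time average grows as the arrival probability drops, which is exactly the kind of behavior the random-matrix framework characterizes.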
https://thesis.library.caltech.edu/id/eprint/6495
Peer Effects in Social Networks: Search, Matching Markets, and Epidemics
https://resolver.caltech.edu/CaltechTHESIS:05222012-145639265
Authors: Bodine-Baron, Elizabeth Anne
Year: 2012
DOI: 10.7907/GDV2-YF12
<p>Social network analysis emerged as an important area in sociology in the early 1930s, marking a shift from looking at individual attribute data to examining the relationships between people and groups. Surveying many different types of real-world networks, researchers quickly found that different types of social networks tend to share a common set of structural characteristics, including small diameter, high clustering, and heavy-tailed degree distributions. Moving beyond real networks, in the 1990s researchers began to propose random network models to explain these commonly observed social network structures. These models laid the foundation for investigation into problems where the underlying network plays a key role, from the spread of information and disease, to the design of distributed communication and search algorithms, to mechanism design and public policy. Here we focus on the role of peer effects in social networks. Through this lens, we develop a mathematically tractable random network model incorporating searchability, propose a novel way to model and analyze two-sided matching markets with externalities, model and calculate the cost of an epidemic spreading on a complex network, and examine the impact of conforming and non-conforming peer effects in vaccination decisions on public health policy.</p>
<p>Throughout this work, the goal is to bring together knowledge and techniques from diverse fields like sociology, engineering, and economics, exploiting our understanding of social network structure and generative models to understand deeper problems that, without this knowledge, could be intractable. Instead of crippling our analysis, social network characteristics allow us to reach deeper insights about the interaction between a particular problem and the network underlying it.</p>
https://thesis.library.caltech.edu/id/eprint/7064
Distributed Control and Computing: Optimal Estimation, Error Correcting Codes, and Interactive Protocols
https://resolver.caltech.edu/CaltechTHESIS:05302012-155108063
Authors: Sukhavasi, Ravi Teja
Year: 2012
DOI: 10.7907/7431-FH32
Emerging applications of networked control and distributed computing are characterized by decentralization of information and the need to exchange it over potentially unreliable communication networks. This results in novel interactive communication scenarios that are incompatible with conventional information- and coding-theoretic approaches. To address this gap, during the 1990s a new information-theoretic notion called anytime reliability and a new coding paradigm called tree codes were proposed. Although the central role of tree codes in several interactive communication problems, such as distributed control and computing, has been well understood, there have been no practical constructions to date. For the first time, we present an explicit ensemble of linear tree codes with efficient encoding and decoding for the class of erasure channels. In the process, we develop novel non-asymptotic sufficient conditions on the kind of communication reliability required to stabilize control systems over noisy channels. We also study the application of tree codes to interactive protocols over erasure networks and illustrate their benefits through the example of average consensus.
https://thesis.library.caltech.edu/id/eprint/7100
Combinatorial Regression and Improved Basis Pursuit for Sparse Estimation
https://resolver.caltech.edu/CaltechTHESIS:06072012-001523622
Authors: Khajehnejad, M. Amin
Year: 2012
DOI: 10.7907/04J6-Y832
<p>Sparse representations accurately model many real-world data sets. Some form of sparsity is conceivable in almost every practical application, from image and video processing, to spectral sensing in radar detection, to bio-computation and genomic signal processing. Modern statistics and estimation theory have come up with ways for efficiently accounting for sparsity in enhanced information retrieval systems. In particular, <em>compressed sensing</em> and <em>matrix rank minimization</em> are two newly born branches of dimensionality reduction techniques, with very promising horizons. Compressed sensing addresses the reconstruction of sparse signals from ill-conditioned linear measurements, a mathematical problem that arises in practical applications in one of the following forms: model fitting (regression), analog data compression, sub-Nyquist sampling, and data privacy. Low-rank matrix estimation addresses the reconstruction of multi-dimensional data (matrices) with strong coherence properties (low rank) under restricted sensing. This model is motivated by modern problems in machine learning, dynamic systems, and quantum computing.</p>
<p>This thesis provides an in-depth study of recent developments in the fields of compressed sensing and matrix rank minimization, and sets forth new directions for improved sparse recovery techniques. The contributions are threefold: the design of combinatorial structures for sparse encoding, the development of improved recovery algorithms, and extension of sparse vector recovery techniques to other problems.</p>
<p>We propose combinatorial structures for the measurement matrix that facilitate compressing sparse analog signal representations with better guarantees than any of the currently existing architectures. Our constructions are mostly deterministic and are based on ideas from expander graphs, LDPC error-correcting codes and combinatorial separators.</p>
<p>We propose novel reconstruction algorithms that are amenable to the combinatorial structures we study, and have various advantages over the conventional convex optimization techniques for sparse recovery. In addition, we separately study the convex optimization Basis Pursuit method for compressed sensing, and propose regularization schemes that expand the success domain for such algorithms. Our studies contain rigorous analysis, numerical simulations, and examples from practical applications.</p>
<p>Lastly, we extend some of our proposed techniques to low-rank matrix estimation and channel coding. These generalizations lead to the development of a novel and fast reconstruction algorithm for matrix rank minimization, and a modified regularized linear-programming-based decoding algorithm for detecting codewords of a linear LDPC code during an erroneous communication.</p>
https://thesis.library.caltech.edu/id/eprint/7142
Random Propagation in Complex Systems: Nonlinear Matrix Recursions and Epidemic Spread
https://resolver.caltech.edu/CaltechTHESIS:05232014-172754261
Authors: Ahn, Hyoung Jun
Year: 2014
DOI: 10.7907/MC7M-EE22
This dissertation studies the long-term behavior of random Riccati recursions and of a mathematical epidemic model. Riccati recursions arise in Kalman filtering: the error covariance matrix of the Kalman filter satisfies a Riccati recursion. Convergence conditions for time-invariant Riccati recursions are well studied. We focus on the time-varying case, and assume that the regressor matrix is random, independent and identically distributed according to a given distribution whose probability density function is continuous, supported on the whole space, and decays faster than any polynomial. We study the geometric convergence of the resulting probability distribution. We also study the global dynamics of epidemic spread over complex networks for various models. For instance, in the discrete-time Markov chain model, each node is either healthy or infected at any given time. In this setting, the number of states increases exponentially with the size of the network. The Markov chain has a unique stationary distribution in which all nodes are healthy with probability 1. Since the probability distribution of a Markov chain on a finite state space converges to its stationary distribution, this model implies that the epidemic dies out after a long enough time. To analyze the Markov chain model, we study a nonlinear epidemic model whose state at any given time is the vector of marginal probabilities of infection of each node in the network. Convergence to the origin in this epidemic map implies the extinction of the epidemic. The nonlinear model is upper-bounded by its linearization at the origin. As a result, the origin is the globally stable unique fixed point of the nonlinear model whenever the linear upper bound is stable. When the linear upper bound is unstable, the nonlinear model has a second fixed point, and we analyze the stability of this second fixed point for both discrete-time and continuous-time models.
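The linear-upper-bound argument sketched above can be reproduced numerically for the standard discrete-time mean-field SIS map; the random graph and the infection/recovery rates β, δ below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20
Adj = (rng.random((n, n)) < 0.2).astype(float)
Adj = np.triu(Adj, 1)
Adj = Adj + Adj.T                              # random undirected contact graph

beta, delta = 0.02, 0.4                        # infection and recovery rates

def sis_map(p):
    """Mean-field SIS update: p_i' = (1 - p_i)(1 - prod_j(1 - beta*A_ij*p_j)) + (1 - delta)*p_i."""
    infect = 1 - np.prod(1 - beta * Adj * p, axis=1)
    return (1 - p) * infect + (1 - delta) * p

# linearization at the origin: p' ≈ (beta*A + (1 - delta) I) p
J = beta * Adj + (1 - delta) * np.eye(n)
rho = max(abs(np.linalg.eigvals(J)))           # spectral radius of the linear upper bound

p = 0.5 * np.ones(n)                           # start with every node half-infected
for _ in range(300):
    p = sis_map(p)
```

Since 1 − ∏(1 − xⱼ) ≤ Σxⱼ, each iterate satisfies p(t) ≤ Jᵗ p(0) elementwise, so ρ(J) < 1 forces the infection probabilities to zero, matching the extinction statement above.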
Returning to the Markov chain model, we claim that the stability of the linear upper bound for the nonlinear model is strongly related to the extinction time of the Markov chain. We show that a stable linear upper bound is a sufficient condition for fast extinction, and that the probability of survival is bounded by the nonlinear epidemic map.
https://thesis.library.caltech.edu/id/eprint/8391
An Integrated Design Approach to Power Systems: From Power Flows to Electricity Markets
https://resolver.caltech.edu/CaltechTHESIS:06012014-040224456
Authors: Bose, Subhonmesh
Year: 2014
DOI: 10.7907/FRGW-AF26
The power system is on the brink of change. Engineering needs, economic forces, and environmental factors are the main drivers of this change. The vision is to build a smart electrical grid, and a smarter market mechanism around it, to fulfill mandates on clean energy. Looking at engineering and economic issues in isolation is no longer an option; an integrated design approach is needed. In this thesis, I revisit some of the classical questions on the engineering operation of power systems that deal with the nonconvexity of the power flow equations. I then explore how these power flow equations interact with electricity markets, in order to address the fundamental issue of market power in a deregulated market environment. Finally, motivated by the emergence of new storage technologies, I present a result on the investment decision problem of placing storage over a power network. The goal of this study is to demonstrate that modern optimization and game theory can provide unique insights into this complex system. Some of the ideas carry over to applications beyond power systems.
https://thesis.library.caltech.edu/id/eprint/8458
Convex Relaxation for Low-Dimensional Representation: Phase Transitions and Limitations
https://resolver.caltech.edu/CaltechTHESIS:08182014-091546460
Authors: Oymak, Samet
Year: 2015
DOI: 10.7907/Z9S46PWX
<p>There is a growing interest in taking advantage of possible patterns and structures in data so as to extract the desired information and overcome the curse of dimensionality. In a wide range of applications, including computer vision, machine learning, medical imaging, and social networks, the signal that gives rise to the observations can be modeled to be approximately sparse and exploiting this fact can be very beneficial. This has led to an immense interest in the problem of efficiently reconstructing a sparse signal from limited linear observations. More recently, low-rank approximation techniques have become prominent tools to approach problems arising in machine learning, system identification and quantum tomography.</p>
<p>In sparse and low-rank estimation problems, the challenge is the inherent intractability of the objective function, and one needs efficient methods to capture the low-dimensionality of these models. Convex optimization is often a promising tool to attack such problems. An intractable problem with a combinatorial objective can often be "relaxed" to obtain a tractable but almost as powerful convex optimization problem. This dissertation studies convex optimization techniques that can take advantage of low-dimensional representations of the underlying high-dimensional data. We provide provable guarantees that ensure that the proposed algorithms will succeed under reasonable conditions, and answer questions of the following flavor:</p>
<UL>
<LI> For a given number of measurements, can we reliably estimate the true signal?</LI>
<LI> If so, how good is the reconstruction as a function of the model parameters?</LI>
</UL>
<p>More specifically, i) Focusing on linear inverse problems, we generalize the classical error bounds known for the least-squares technique to the lasso formulation, which incorporates the signal model. ii) We show that intuitive convex approaches do not perform as well as expected when it comes to signals that have multiple low-dimensional structures simultaneously. iii) Finally, we propose convex relaxations for the graph clustering problem and give sharp performance guarantees for a family of graphs arising from the so-called stochastic block model. We pay particular attention to the following aspects. For i) and ii), we aim to provide a general geometric framework, in which the results on sparse and low-rank estimation can be obtained as special cases. For i) and iii), we investigate the precise performance characterization, which yields the right constants in our bounds and the true dependence between the problem parameters.</p>
https://thesis.library.caltech.edu/id/eprint/8635
Information-Theoretic Studies and Capacity Bounds: Group Network Codes and Energy Harvesting Communication Systems
https://resolver.caltech.edu/CaltechTHESIS:04272015-133555770
Authors: Mao, Wei
Year: 2015
DOI: 10.7907/Z9ZS2TFB
Network information theory and channels with memory are two important but difficult frontiers of information theory. In this two-part dissertation, we study these two areas, each comprising one part. In the first part we study the so-called entropy vectors via finite group theory, and the network codes constructed from finite groups. In particular, we identify the smallest finite group that violates the Ingleton inequality, an inequality respected by all linear network codes but not by all entropy vectors. Based on the analysis of this group, we generalize it to several families of Ingleton-violating groups, which may be used to design good network codes. In that regard, we study the network codes constructed from finite groups, and in particular show that linear network codes are embedded in the group network codes constructed from these Ingleton-violating families. Furthermore, such codes are strictly more powerful than linear network codes, as they are able to violate the Ingleton inequality while linear network codes cannot. In the second part, we study the impact of memory on channel capacity through a novel communication system: the energy harvesting channel. Unlike traditional communication systems, the transmitter of an energy harvesting channel is powered by an exogenous energy harvesting device and a finite-sized battery. As a consequence, at each time the system can only transmit a symbol whose energy consumption is no more than the energy currently available. This new type of power supply introduces an unprecedented input constraint for the channel, which is random, instantaneous, and has memory. Furthermore, the energy harvesting process is naturally observed causally at the transmitter, while no such information is provided to the receiver. Both of these features pose great challenges for the analysis of the channel capacity. 
In this work we use techniques from channels with side information and from finite-state channels to obtain lower and upper bounds on the capacity of the energy harvesting channel. In particular, we study the stationarity and ergodicity conditions of a surrogate channel to compute and optimize the achievable rates for the original channel. In addition, for practical code design of the system we study the pairwise error probabilities of the input sequences.
https://thesis.library.caltech.edu/id/eprint/8834
Convex Programming-Based Phase Retrieval: Theory and Applications
https://resolver.caltech.edu/CaltechTHESIS:05312016-051759406
Authors: Jaganathan, Kishore
Year: 2016
DOI: 10.7907/Z9C82775
<p>Phase retrieval is the problem of recovering a signal from its Fourier magnitude. This inverse problem arises in many areas of engineering and applied physics, and has been studied for nearly a century. Due to the absence of Fourier phase, the available information is incomplete in general. Classic identifiability results state that phase retrieval of one-dimensional signals is impossible, and that phase retrieval of higher-dimensional signals is almost surely possible under mild conditions. However, there are no efficient recovery algorithms with theoretical guarantees. Classic algorithms are based on the method of alternating projections. These algorithms do not have theoretical guarantees, and have limited recovery abilities due to the issue of convergence to local optima.</p>
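For reference, the classic alternating-projection (error-reduction) iteration mentioned above can be sketched for a real signal with known support and oversampled Fourier magnitudes; the sizes and iteration count are illustrative assumptions, and, as noted, such iterations can stagnate at local optima even though the Fourier-magnitude residual is non-increasing:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 16, 64                        # signal length, oversampled DFT size
x_true = rng.standard_normal(n)
mag = np.abs(np.fft.fft(x_true, m))  # measured Fourier magnitudes (zero-padded signal)

x = rng.standard_normal(n)           # random initialization
res = []
for _ in range(500):
    X = np.fft.fft(x, m)
    res.append(np.linalg.norm(np.abs(X) - mag))   # Fourier-magnitude residual
    X = mag * np.exp(1j * np.angle(X))            # project onto the magnitude constraint
    x = np.real(np.fft.ifft(X))[:n]               # project onto support (and realness)
```

Fienup's classical argument shows the residual sequence cannot increase; whether it reaches zero or stalls depends on the initialization, which is exactly the local-optima issue the convex approaches are designed to avoid.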
<p>Recently, there has been a renewed interest in phase retrieval due to technological advances in measurement systems and theoretical developments in structured signal recovery. In particular, it is now possible to obtain specific kinds of additional magnitude-only information about the signal, depending on the application. The premise is that, by carefully redesigning the measurement process, one could potentially overcome the issues of phase retrieval. Another approach is to impose certain kinds of priors on the signal, depending on the application. On the algorithmic side, convex programming based approaches have played a key role in modern phase retrieval, inspired by their success in provably solving several quadratically constrained problems.</p>
<p>In this work, we study several variants of phase retrieval using modern tools, with a focus on applications like X-ray crystallography, diffraction imaging, optics, astronomy, and radar. In the one-dimensional setup, we first develop conditions which, when satisfied, allow unique reconstruction. Then, we develop efficient recovery algorithms based on convex programming, and provide theoretical guarantees. The theory and algorithms we develop are independent of the dimension of the signal, and hence can be used in all the aforementioned applications. We also perform a comparative numerical study of the convex programming and the alternating projection based algorithms. Numerical simulations clearly demonstrate the superior ability of the convex programming based methods, both in terms of successful recovery in the noiseless setting and stable reconstruction in the noisy setting.</p>
https://thesis.library.caltech.edu/id/eprint/9814
Algebraic Techniques in Coding Theory: Entropy Vectors, Frames, and Constrained Coding
https://resolver.caltech.edu/CaltechTHESIS:09042015-171723764
Authors: Thill, Matthew David
Year: 2016
DOI: 10.7907/Z9F18WNW
<p>The study of codes, classically motivated by the need to communicate information reliably in the presence of error, has found new life in fields as diverse as network communication, distributed storage of data, and even has connections to the design of linear measurements used in compressive sensing. But in all contexts, a code typically involves exploiting the algebraic or geometric structure underlying an application. In this thesis, we examine several problems in coding theory, and try to gain some insight into the algebraic structure behind them.</p>
<p>The first is the study of the entropy region - the space of all possible vectors of joint entropies which can arise from a set of discrete random variables. Understanding this region is essentially the key to optimizing network codes for a given network. To this end, we employ a group-theoretic method of constructing random variables producing so-called "group-characterizable" entropy vectors, which are capable of approximating any point in the entropy region. We show how small groups can be used to produce entropy vectors which violate the Ingleton inequality, a fundamental bound on entropy vectors arising from the random variables involved in linear network codes. We discuss the suitability of these groups to design codes for networks which could potentially outperform linear coding.</p>
<p>The second topic we discuss is the design of frames with low coherence, closely related to finding spherical codes in which the codewords are unit vectors spaced out around the unit sphere so as to minimize the magnitudes of their mutual inner products. We show how to build frames by selecting a cleverly chosen set of representations of a finite group to produce a "group code" as described by Slepian decades ago. We go on to reinterpret our method as selecting a subset of rows of a group Fourier matrix, allowing us to study and bound our frames' coherences using character theory. We discuss the usefulness of our frames in sparse signal recovery using linear measurements.</p>
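A classical abelian special case of this row-selection idea: taking the rows of a DFT matrix indexed by a difference set (here the quadratic residues mod 31, a Paley difference set) yields a frame whose coherence meets the Welch lower bound. This is a standard textbook example, not one of the thesis's non-abelian constructions:

```python
import numpy as np

p = 31
qr = sorted({(i * i) % p for i in range(1, p)})   # quadratic residues mod 31: a (31, 15, 7) difference set
j = np.arange(p)
F = np.exp(-2j * np.pi * np.outer(qr, j) / p)     # rows of the DFT of Z_31 indexed by the difference set
F /= np.sqrt(len(qr))                             # 31 unit-norm frame vectors in C^15

G = np.abs(F.conj().T @ F)                        # Gram matrix of absolute inner products
np.fill_diagonal(G, 0)
mu = G.max()                                      # coherence
welch = np.sqrt((p - len(qr)) / (len(qr) * (p - 1)))  # Welch lower bound
```

Because the index set is a difference set, every off-diagonal inner product has the same magnitude, so the coherence achieves the Welch bound exactly; character-theoretic arguments generalize this computation to non-abelian groups.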
<p>The final problem we investigate is that of coding with constraints, most recently motivated by the demand for ways to encode large amounts of data using error-correcting codes so that any small loss can be recovered from a small set of surviving data. Most often, this involves using a systematic linear error-correcting code in which each parity symbol is constrained to be a function of some subset of the message symbols. We derive bounds on the minimum distance of such a code based on its constraints, and characterize when these bounds can be achieved using subcodes of Reed-Solomon codes.</p>
https://thesis.library.caltech.edu/id/eprint/9141
Recovering Structured Signals in High Dimensions via Non-Smooth Convex Optimization: Precise Performance Analysis
https://resolver.caltech.edu/CaltechTHESIS:06032016-144604076
Authors: Thrampoulidis, Christos
Year: 2016
DOI: 10.7907/Z998850V
<p>The typical scenario that arises in modern large-scale inference problems is one where the ambient dimension of the unknown signal is very large (e.g., high-resolution images, recommendation systems), yet its desired properties lie in some low-dimensional structure such as sparsity or low-rankness. In the past couple of decades, non-smooth convex optimization methods have emerged as a powerful tool to extract those structures, since they are often computationally efficient and also offer enough flexibility while remaining amenable to performance analysis. In particular, since the advent of Compressed Sensing (CS) there has been significant progress in this direction. One of the key ideas is that random linear measurements offer an efficient way to acquire structured signals. When the measurement matrix has i.i.d. entries from a wide class of distributions (including Gaussians), a series of recent works have established a complete and transparent theory that precisely captures the performance in the noiseless setting. In the more practical scenario of noisy measurements the performance analysis task becomes significantly more challenging, and corresponding precise and unifying results have hitherto remained scarce. The available class of optimization methods, often referred to as regularized M-estimators, is now richer; additional factors (e.g., the noise distribution, the loss function, and the regularizer parameter) and several different measures of performance (e.g., squared error, probability of support recovery) need to be taken into account.</p>
<p>This thesis develops a novel analytical framework that overcomes these challenges, and establishes precise asymptotic performance guarantees for regularized M-estimators under Gaussian measurement matrices. In particular, the framework allows for a unifying analysis among different instances (such as the Generalized LASSO and the LAD, to name a few) and accounts for a wide class of performance measures. Among others, we show results on the mean-squared error of the Generalized LASSO method and make insightful connections to the classical theory of ordinary least squares and to noiseless CS. Empirical evidence is presented that suggests the Gaussian assumption is not necessary. Beyond i.i.d. measurement matrices, motivated by practical considerations, we study certain classes of random matrices with orthogonal rows and establish their superior performance when compared to Gaussians.</p>
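The squared-error behavior being characterized can be observed in a small experiment with the ℓ₁-regularized LASSO, solved here by plain ISTA/proximal gradient; the dimensions, noise level, and regularizer tuning are illustrative assumptions rather than the precise scalings from the analysis:

```python
import numpy as np

rng = np.random.default_rng(7)
n, m, k, sigma = 200, 100, 10, 0.1
A = rng.standard_normal((m, n)) / np.sqrt(m)       # approximately unit-norm columns
x0 = np.zeros(n)
x0[rng.choice(n, k, replace=False)] = 1.0          # k-sparse ground truth
y = A @ x0 + sigma * rng.standard_normal(m)        # noisy linear measurements

lam = sigma * np.sqrt(np.log(n))                   # illustrative regularizer tuning
L = np.linalg.norm(A, 2) ** 2                      # Lipschitz constant of the gradient
x = np.zeros(n)
for _ in range(2000):                              # ISTA: gradient step + soft threshold
    x = x - A.T @ (A @ x - y) / L
    x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0.0)

nmse = np.linalg.norm(x - x0) ** 2 / np.linalg.norm(x0) ** 2
```

The normalized squared error stays small here; the framework in the thesis predicts exactly how such errors concentrate as functions of m/n, the sparsity, the noise, and the regularizer.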
<p>A prominent application of this generic theory is on the analysis of the bit-error rate (BER) of the popular convex-relaxation of the Maximum Likelihood decoder for recovering BPSK signals in a massive Multiple Input Multiple Output setting. Our precise BER analysis allows comparison of these schemes to the unattainable Matched-filter bound, and further suggests means to provably boost their performance. </p>
<p>The last challenge is to evaluate the performance under non-linear measurements. For the Generalized LASSO, it is shown that this is (asymptotically) equivalent to the performance under noisy linear measurements with appropriately scaled variance. This encompasses state-of-the-art theoretical results on one-bit CS, and is also used to prove that the optimal quantizer of the measurements, in the sense of minimizing the estimation error of the Generalized LASSO, is the celebrated Lloyd-Max quantizer.</p>
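The Lloyd-Max design mentioned above can be illustrated with the classical alternating fixed-point iteration (Lloyd's algorithm). This is a generic sketch, not code from the thesis; the initialization and sample sizes are arbitrary choices:

```python
import random

def lloyd_max(samples, k, iters=100):
    """Lloyd's algorithm for a k-level scalar quantizer (Lloyd-Max).

    Alternates nearest-level partitioning with centroid (conditional-mean)
    updates, which never increases the mean-squared quantization error.
    """
    samples = sorted(samples)
    # Initialize the k levels at evenly spaced sample quantiles.
    levels = [samples[(i * len(samples)) // k] for i in range(k)]
    for _ in range(iters):
        # Cell boundaries are midpoints between adjacent levels.
        bounds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
        cells = [[] for _ in range(k)]
        for x in samples:
            j = sum(x > b for b in bounds)  # index of the cell containing x
            cells[j].append(x)
        # Each level moves to the mean of its cell.
        levels = [sum(c) / len(c) if c else levels[j]
                  for j, c in enumerate(cells)]
    return levels

def mse(samples, levels):
    """Mean-squared error of nearest-level quantization."""
    return sum(min((x - l) ** 2 for x in [x] for l in levels)
               for x in samples) / len(samples)

random.seed(0)
data = [random.gauss(0, 1) for _ in range(5000)]
q4 = lloyd_max(data, 4)
```

For a standard Gaussian source, the optimal 4-level quantizer attains a mean-squared error of roughly 0.12, which the iteration above approaches from almost any reasonable initialization.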
<p>The framework is based on Gaussian process methods; in particular, on a new strong and tight version of a classical comparison inequality (due to Gordon, 1988) in the presence of additional convexity assumptions. We call this the Convex Gaussian Min-max Theorem (CGMT).</p>https://thesis.library.caltech.edu/id/eprint/9836Optimization and Control of Power Flow in Distribution Networks
https://resolver.caltech.edu/CaltechTHESIS:12092015-021431773
Authors: Farivar, Masoud
Year: 2016
DOI: 10.7907/Z9JW8BSM
<p>Climate change is arguably the most critical issue facing our generation and the next. As we move towards a sustainable future, the grid is rapidly evolving with the integration of more and more renewable energy resources and the emergence of electric vehicles. In particular, large-scale adoption of residential and commercial solar photovoltaic (PV) plants is completely changing the traditional slowly varying, unidirectional power flows of distribution systems. A high share of intermittent renewables poses several technical challenges, including voltage and frequency control. But along with these challenges, renewable generators also bring with them millions of new DC-AC inverter controllers each year. These fast power electronic devices can provide an unprecedented opportunity to increase energy efficiency and improve power quality, if combined with well-designed inverter control algorithms. The main goal of this dissertation is to develop scalable power flow optimization and control methods that achieve system-wide efficiency, reliability, and robustness for the power distribution networks of the future, with high penetration of distributed inverter-based renewable generators.</p>
<p>Proposed solutions to power flow control problems in the literature range from fully centralized to fully local. In this thesis, we focus on the two ends of this spectrum. In the first half of the thesis (chapters 2 and 3), we seek optimal solutions to voltage control problems, assuming a centralized architecture with complete information. These solutions are particularly important for understanding the overall system behavior and can serve as a benchmark against which to compare the performance of other control methods. To this end, we first propose a branch flow model (BFM) for the analysis and optimization of radial and meshed networks. This model leads to a new approach to solving optimal power flow (OPF) problems using a two-step relaxation procedure, which has proven both reliable and computationally efficient in dealing with the non-convexity of power flow equations in radial and weakly meshed distribution networks. We then apply the results to the fast time-scale inverter var control problem and evaluate the performance on real-world circuits in Southern California Edison’s service territory.</p>
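As a concrete illustration of the inverter var control just mentioned, consider a piecewise-linear volt/var droop curve of the kind studied later in the thesis: it maps the locally measured voltage to a reactive-power setpoint. The breakpoint values below are illustrative placeholders, not the curves analyzed in the thesis:

```python
def volt_var(v, v_min=0.95, v_dead_lo=0.98, v_dead_hi=1.02, v_max=1.05, q_max=1.0):
    """Piecewise-linear volt/var droop curve (all quantities in per-unit).

    The inverter injects reactive power when the local voltage sags below
    the deadband, absorbs it when the voltage swells above, and saturates
    at +/- q_max outside [v_min, v_max].
    """
    if v <= v_min:
        return q_max
    if v < v_dead_lo:                  # linear ramp down toward the deadband
        return q_max * (v_dead_lo - v) / (v_dead_lo - v_min)
    if v <= v_dead_hi:                 # deadband: no var support
        return 0.0
    if v < v_max:                      # linear ramp into var absorption
        return -q_max * (v - v_dead_hi) / (v_max - v_dead_hi)
    return -q_max
```

The curve is monotonically nonincreasing in voltage, which is the structural property the reverse/forward-engineering analysis of such controllers relies on.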
<p>The second half (chapters 4 and 5), however, is dedicated to studying local control approaches, as they are the only options available for immediate implementation on today’s distribution networks, which lack sufficient monitoring and communication infrastructure. In particular, we follow a reverse- and forward-engineering approach to study the recently proposed piecewise-linear volt/var control curves. It is the aim of this dissertation to tackle some key problems in these two areas and to contribute a rigorous theoretical basis for future work.</p>https://thesis.library.caltech.edu/id/eprint/9317Error-Correcting Codes for Networks, Storage and Computation
https://resolver.caltech.edu/CaltechTHESIS:06092017-013147699
Authors: Halbawi, Wael
Year: 2017
DOI: 10.7907/Z9J67F08
<p>The advent of the information age has presented three challenges related to the way we deal with data. First, there is an unprecedented demand for transmitting data at high rates. Second, the massive amounts of data being collected from various sources need to be stored across time. Third, the data collected needs to be processed in order to extract meaningful information from it. The interconnected nature of modern systems designed to perform these tasks has revealed new difficulties when it comes to ensuring their resilience against sources of performance degradation. In the context of network communication and distributed data storage, system-level noise and adversarial errors have to be combated with efficient error-correction schemes. In the case of distributed computation, the heterogeneous nature of computing clusters can potentially diminish the speedups promised by parallel algorithms, calling for schemes that mitigate the effect of slow machines and communication delay.</p>
<p>This thesis addresses the problem of designing efficient fault-tolerance schemes for the three scenarios just described. In the network communication setting, a family of multiple-source multicast networks that employ linear network coding is considered, for which capacity-achieving distributed error-correcting codes, based on classical algebraic constructions, are designed. The codes require no coordination between the source nodes and are end-to-end: except for the source nodes and the destination node, the operation of the network remains unchanged.</p>
<p>In the context of data storage, balanced error-correcting codes are constructed so that the encoding effort required is balanced out across the storage nodes. In particular, it is shown that for a fixed row weight, any cyclic Reed-Solomon code possesses a generator matrix in which the number of nonzeros is the same across the columns. In the balanced and sparsest case, where each row of the generator matrix is a minimum distance codeword, the maximal encoding time over the storage nodes is minimized, a property that is appealing in write-intensive settings. Analogous constructions are presented for a locally recoverable code construction due to Tamo and Barg.</p>
<p>Lastly, the problem of mitigating stragglers in a distributed computation setup is addressed, where a function of some dataset is computed in parallel. Using Reed-Solomon coding techniques, a scheme is proposed that allows for the recovery of the function under consideration from the minimum number of machines possible. The only assumption made on the function is that it is additively separable, which renders the scheme useful in distributed gradient descent implementations. Furthermore, a theoretical model for the run time of the scheme is presented. When the return time of the machines is modeled probabilistically, the model can be used to optimally pick the scheme's parameters so that the expected computation time is minimized. The recovery is performed using an algorithm that runs in quadratic time and linear space, a notable improvement compared to state-of-the-art schemes.</p>
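The decoding side of such a Reed-Solomon-style scheme can be sketched as follows. For simplicity, each coded response below combines all k partial results (full replication), which ignores the careful data placement that makes the thesis's scheme practical; it only illustrates that an additively separable result is recoverable from any k of the n worker responses:

```python
from fractions import Fraction

def vandermonde_row(alpha, k):
    """Row of evaluations 1, alpha, ..., alpha^(k-1) (Reed-Solomon style)."""
    return [Fraction(alpha) ** i for i in range(k)]

# k data partitions; the target is the sum of the per-partition partial results
# (e.g., partial gradient sums for an additively separable function).
partials = [Fraction(v) for v in [3, -1, 4, 2]]
k = len(partials)
n = 7                                   # workers; up to n - k stragglers tolerated
alphas = list(range(1, n + 1))          # distinct evaluation points, one per worker

# Worker j returns one coded linear combination of the partials.
responses = {j: sum(c * g for c, g in zip(vandermonde_row(alphas[j], k), partials))
             for j in range(n)}

def solve(rows, rhs):
    """Gaussian elimination over the rationals (exact)."""
    kk = len(rows)
    m = [row[:] + [b] for row, b in zip(rows, rhs)]
    for col in range(kk):
        piv = next(r for r in range(col, kk) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(kk):
            if r != col and m[r][col] != 0:
                m[r] = [a - m[r][col] * b for a, b in zip(m[r], m[col])]
    return [m[r][kk] for r in range(kk)]

# Any k of the n workers suffice: here the three slowest never respond.
fast = [0, 2, 4, 6]
decoded = solve([vandermonde_row(alphas[j], k) for j in fast],
                [responses[j] for j in fast])
recovered_sum = sum(decoded)
```

Because the k-by-k Vandermonde submatrix at any k distinct points is invertible, every size-k subset of responses decodes to the same result.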
<p>The unifying theme of the three scenarios is the construction of error-correcting codes whose encoding functions adhere to certain constraints. It is shown that in many cases, these constraints can be satisfied by classical constructions. As a result, the schemes presented are deterministic, operate over small finite fields and can be decoded using efficient algorithms.</p>https://thesis.library.caltech.edu/id/eprint/10328Graph Clustering: Algorithms, Analysis and Query Design
https://resolver.caltech.edu/CaltechTHESIS:09222017-130217881
Authors: Korlakai Vinayak, Ramya
Year: 2018
DOI: 10.7907/Z9RR1WFK
<p>A wide range of applications in engineering as well as the natural and social sciences have datasets that are unlabeled. Clustering plays a major role in exploring structure in such unlabeled datasets. Owing to the heterogeneity in the applications and the types of datasets available, there are plenty of clustering objectives and algorithms. In this thesis we focus on two such clustering problems: <i>Graph Clustering</i> and <i>Crowdsourced Clustering</i>.</p>
<p>In the first part, we consider the problem of graph clustering and study convex-optimization-based clustering algorithms. Datasets are often messy -- ridden with noise, outliers (items that do not belong to any clusters), and missing data. Therefore, we are interested in algorithms that are robust to such discrepancies. We present and analyze convex-optimization-based clustering algorithms which aim to recover the low-rank matrix that encodes the underlying cluster structure for two clustering objectives: <i>clustering partially observed graphs</i> and <i>clustering similarity matrices with outliers</i>. Using block models as generative models, we characterize the performance of these convex clustering algorithms. In particular, we provide <i>explicit bounds</i>, without any large unknown constants, on the problem parameters that determine the success and failure of these convex approaches.</p>
<p>In the second part, we consider the problem of crowdsourced clustering -- the task of clustering items using answers from non-expert crowd workers who can answer similarity comparison queries. Since the workers are not experts, they provide noisy answers. Further, due to budget constraints, we cannot make all possible comparisons between items in the dataset. Thus, it is important to <i>design queries that can reduce the noise in the responses</i> and <i>design algorithms that can work with noisy and partial data</i>. We demonstrate that random triangle queries (where three items are compared per query) provide less noisy data as well as greater quantity of data, for a fixed query budget, as compared to random edge queries (where two items are compared per query). We extend the analysis of convex clustering algorithms to show that the exact recovery guarantees hold for triangle queries despite involving dependent edges. In addition to random querying strategies, we also present a novel <i>active querying</i> algorithm that is guaranteed to find all the clusters regardless of their sizes and without the knowledge of any parameters as long as the workers are better than random guessers. We also provide a tight upper bound on the number of queries made by the proposed active querying algorithm. Apart from providing theoretical guarantees for the clustering algorithms we also apply our algorithms to real datasets.</p>https://thesis.library.caltech.edu/id/eprint/10447Universality Laws and Performance Analysis of the Generalized Linear Models
https://resolver.caltech.edu/CaltechTHESIS:06092020-005908250
Authors: Abbasi, Ehsan
Year: 2020
DOI: 10.7907/873c-ej41
<p>In the past couple of decades, non-smooth convex optimization has emerged as a powerful tool for the recovery of structured signals (sparse, low rank, etc.) from noisy linear or non-linear measurements in a variety of applications in genomics, signal processing, wireless communications, and machine learning. Taking advantage of the particular structure of the unknown signal of interest is critical, since in most of these applications the dimension <i>p</i> of the signal to be estimated is comparable to, or even larger than, the number of observations <i>n</i>. Since the advent of Compressive Sensing, a large number of theoretical results have studied the estimation performance of non-smooth convex optimization in such a <i>high-dimensional setting</i>.</p>
<p>A popular approach for estimating an unknown signal β₀ ∈ ℝ<i>ᵖ</i> in a <i>generalized linear model</i>, with observations <b>y</b> = g(<b>X</b>β₀) ∈ ℝ<i>ⁿ</i>, is via solving the estimator β̂ = arg min<sub>β</sub> <i>L</i>(<b>y</b>, <b>X</b>β) + <i>λf</i>(<i>β</i>). Here, <i>L</i>(•,•) is a loss function which is convex with respect to its second argument, and <i>f</i>(•) is a regularizer that enforces the structure of the unknown β₀. We first analyze the generalization error of this estimator for the case where the entries of <b>X</b> are drawn <i>independently from the real standard Gaussian</i> distribution. The <i>precise</i> nature of our analysis permits an accurate performance comparison between different instances of these estimators, and allows one to optimally tune the hyperparameters based on the model parameters. We apply our result to some of the most popular cases of generalized linear models, such as M-estimators in linear regression, logistic regression and generalized margin maximizers in binary classification problems, and Poisson regression in count data models. The key ingredient of our proof is the <i>Convex Gaussian Min-max Theorem (CGMT)</i>, which is a tight version of the Gaussian comparison inequality proved by Gordon in 1988. Unfortunately, having real i.i.d. entries in the feature matrix <b>X</b> is crucial to this theorem, and it cannot be naturally extended to other cases.</p>
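One familiar instance of the estimator above takes the loss L to be the squared loss and the regularizer f to be the l1 norm, i.e., the LASSO. A minimal proximal-gradient (ISTA) sketch, with illustrative dimensions, noise level, and step size (not tuned by the theory in this chapter):

```python
import random

def soft(z, t):
    """Soft-thresholding: the proximal operator of t * ||.||_1."""
    return (z - t) if z > t else (z + t) if z < -t else 0.0

def ista(X, y, lam, step, iters):
    """Proximal gradient for  min_b 0.5*||y - X b||^2 + lam*||b||_1,
    i.e. the estimator  argmin_b L(y, Xb) + lam*f(b)  with squared
    loss and an l1 regularizer enforcing sparsity."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(iters):
        r = [sum(X[i][j] * b[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * r[i] for i in range(n)) for j in range(p)]
        # Gradient step on the loss, then the prox of the l1 penalty.
        b = [soft(b[j] - step * grad[j], step * lam) for j in range(p)]
    return b

random.seed(1)
n, p = 60, 20
beta0 = [3.0, -2.0] + [0.0] * (p - 2)        # sparse ground truth
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(X[i][j] * beta0[j] for j in range(p)) + 0.1 * random.gauss(0, 1)
     for i in range(n)]
bhat = ista(X, y, lam=2.0, step=0.003, iters=1500)
```

With Gaussian measurements and mild noise, the recovered coefficients sit close to the ground truth and the off-support entries are driven to (exactly) zero by the thresholding.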
<p>For some special cases, however, we prove universality properties and indirectly extend these results to more general designs of the feature matrix <b>X</b>, where the entries are not necessarily real, independent, or identically distributed. This extension enables us to analyze problems that the CGMT could not handle, such as models with quadratic measurements, phase-lift in phase retrieval, and data recovery in massive MIMO, and helps settle a few long-standing open problems in these areas.</p>https://thesis.library.caltech.edu/id/eprint/13804Riemannian Optimization for Convex and Non-Convex Signal Processing and Machine Learning Applications
https://resolver.caltech.edu/CaltechTHESIS:06012020-120425051
Authors: Douik, Ahmed
Year: 2020
DOI: 10.7907/jt3c-0m30
The performance of most algorithms for signal processing and machine learning applications depends strongly on the underlying optimization algorithms. Multiple techniques have been proposed for solving convex and non-convex problems, such as interior-point methods and semidefinite programming. However, it is well known that these algorithms are not ideally suited for large-scale optimization with a high number of variables and/or constraints. This thesis exploits a novel optimization method, known as Riemannian optimization, for efficiently solving convex and non-convex problems in signal processing and machine learning applications. Unlike most optimization techniques, whose complexities increase with the number of constraints, Riemannian methods smartly exploit the structure of the search space, i.e., the set of feasible solutions, to reduce the embedded dimension and efficiently solve optimization problems in a reasonable time. However, such efficiency comes at the expense of universality, as the geometry of each manifold needs to be investigated individually. This thesis explains the steps of designing first- and second-order Riemannian optimization methods for smooth matrix manifolds through the study and design of optimization algorithms for various applications. In particular, the thesis focuses on contemporary applications in signal processing and machine learning, such as community detection, graph-based clustering, phase retrieval, and indoor and outdoor location determination. Simulation results are provided to attest to the efficiency of the proposed methods against popular generic and specialized solvers for each of the above applications.https://thesis.library.caltech.edu/id/eprint/13758Linear Codes with Constrained Generator Matrices
https://resolver.caltech.edu/CaltechTHESIS:05242021-223430388
Authors: Yildiz, Hikmet
Year: 2021
DOI: 10.7907/qz6m-wp22
<p>Designing good error-correcting codes whose generator matrix has a support constraint, i.e., one for which only certain entries of the generator matrix are allowed to be nonzero, has found many recent applications, including distributed coding and storage, linear network coding, multiple-access networks, and weakly secure data exchange. The dual problem, where the parity-check matrix has a support constraint, comes up in the design of locally repairable codes. The central problem here is to design codes with the largest possible minimum distance, subject to the given support constraint on the generator matrix. When the distance metric is the Hamming distance, the codes of interest are Reed-Solomon codes, in which case the problem was formulated as the "GM-MDS conjecture." In the rank-metric case, the same problem can be considered for Gabidulin codes. This thesis provides solutions to these problems and discusses the remaining open problems.</p>https://thesis.library.caltech.edu/id/eprint/14172Large-Scale Intelligent Systems: From Network Dynamics to Optimization Algorithms
https://resolver.caltech.edu/CaltechTHESIS:11182020-210226861
Authors: Azizan Ruhi, Navid
Year: 2021
DOI: 10.7907/j494-j572
<p>The expansion of large-scale technological systems such as electrical grids, transportation networks, health care systems, telecommunication networks, the Internet (of things), and other societal networks has created numerous challenges and opportunities at the same time. These systems are often not yet as robust, efficient, sustainable, or smart as we would want them to be. Fueled by the massive amounts of data generated by all these systems, and with the recent advances in making sense out of data, there is a strong desire to make them more intelligent. However, developing <i>large-scale intelligent systems</i> is a multifaceted problem, involving several major challenges. First, large-scale systems typically exhibit <i>complex dynamics</i> due to the large number of entities interacting over a network. Second, because the system is composed of many interacting entities that make decentralized (and often self-interested) decisions, one has to properly design <i>incentives and markets</i> for such systems. Third, the massive computational needs caused by the scale of the system necessitate performing computations in a <i>distributed</i> fashion, which in turn requires devising new algorithms. Finally, one has to create algorithms that can <i>learn from</i> the copious amounts of data and generalize well. This thesis makes several contributions related to each of these four challenges.</p>
<p>Analyzing and understanding the network dynamics exhibited in societal systems is crucial for developing systems that are robust and efficient. In Part I of this thesis, we study one of the most important families of network dynamics, namely, that of <i>epidemics</i>, or <i>spreading processes</i>. Studying such processes is relevant for understanding and controlling the spread of, e.g., contagious diseases among people, ideas or fake news in online social networks, computer viruses in computer networks, or cascading failures in societal networks. We establish several results on the exact Markov chain model and the nonlinear "mean-field" approximations for various kinds of epidemics (i.e., SIS, SIRS, SEIRS, SIV, SEIV, and their variants).</p>
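One common discrete-time mean-field (NIMFA-style) approximation of the SIS Markov chain can be simulated directly; such approximations upper-bound the exact chain's infection probabilities. The graph, rates, and particular variant below are illustrative choices and are not tied to the thesis's exact models:

```python
def sis_mean_field(adj, beta, delta, p0, steps):
    """Iterate p_i(t+1) = (1-delta)*p_i + (1-p_i)*(1 - prod_j (1 - beta*p_j))
    over the neighbors j of i: recovery at rate delta, independent infection
    attempts at rate beta along each edge (the mean-field decoupling)."""
    p = p0[:]
    n = len(adj)
    for _ in range(steps):
        nxt = []
        for i in range(n):
            no_infection = 1.0
            for j in range(n):
                if adj[i][j]:
                    no_infection *= 1.0 - beta * p[j]
            nxt.append((1.0 - delta) * p[i] + (1.0 - p[i]) * (1.0 - no_infection))
        p = nxt
    return p

# Complete graph on 5 nodes, started from 50% infection probability everywhere.
K5 = [[0 if i == j else 1 for j in range(5)] for i in range(5)]
p_sub = sis_mean_field(K5, beta=0.05, delta=0.5, p0=[0.5] * 5, steps=300)  # below threshold
p_sup = sis_mean_field(K5, beta=0.40, delta=0.1, p0=[0.5] * 5, steps=300)  # above threshold
```

Below the epidemic threshold (roughly beta times the adjacency spectral radius smaller than delta for this variant) the infection dies out; above it, the iteration settles at a nonzero endemic fixed point.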
<p>Designing incentives and markets for large-scale systems is critical for their efficient operation and ensuring an alignment between the agents' decentralized decisions and the global goals of the system. To that end, in Part II of this thesis, we study these issues in markets with <i>non-convex</i> costs as well as <i>networked</i> markets, which are of vital importance for, e.g., the smart grid. We propose novel pricing schemes for such markets, which satisfy all the desired market properties. We also reveal issues in the current incentives for distributed energy resources, such as renewables, and design optimization algorithms for efficient management of aggregators of such resources.</p>
<p>With the growing amounts of data generated by large-scale systems, and the fact that the data may already be dispersed across many units, it is becoming increasingly necessary to run computational tasks in a distributed fashion. Part III concerns developing algorithms for distributed computation. We propose a novel consensus-based algorithm for the task of solving large-scale <i>systems of linear equations</i>, which is one of the most fundamental problems in linear algebra, and a key step at the heart of many algorithms in scientific computing, machine learning, and beyond. In addition, in order to deal with the issue of heterogeneous delays in distributed computation, caused by slow machines, we develop a new <i>coded computation</i> technique. In both cases, the proposed methods offer significant speed-ups relative to the existing approaches.</p>
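This is not the accelerated consensus-based algorithm developed in the thesis, but a minimal, unaccelerated sketch of the underlying idea: each machine holds one equation of the system, projects the shared iterate onto its own solution set, and the projections are averaged into the next consensus iterate:

```python
def row_project(x, a, b):
    """Project x onto the hyperplane {z : a . z = b} held by one machine."""
    dot = sum(ai * xi for ai, xi in zip(a, x))
    norm2 = sum(ai * ai for ai in a)
    c = (dot - b) / norm2
    return [xi - c * ai for xi, ai in zip(x, a)]

def consensus_solve(A, b, iters=5000):
    """Averaged-projection consensus for a consistent system A x = b:
    every machine projects the shared iterate onto its own equation,
    then the projections are averaged (the consensus step)."""
    m, p = len(A), len(A[0])
    x = [0.0] * p
    for _ in range(iters):
        projs = [row_project(x, A[i], b[i]) for i in range(m)]
        x = [sum(pr[j] for pr in projs) / m for j in range(p)]
    return x

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 5.0]]
x_true = [1.0, 2.0, 3.0]
b = [sum(ai * xi for ai, xi in zip(row, x_true)) for row in A]
x_hat = consensus_solve(A, b)
```

For consistent systems this converges to the common solution, albeit slowly; the speedups reported in the thesis come precisely from accelerating this kind of projection-plus-consensus scheme.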
<p>Over the past decade, <i>deep learning</i> methods have become the most successful learning algorithms in a wide variety of tasks. However, the reasons behind their success (as well as their failures in some respects) are largely unexplained. It is widely believed that the success of deep learning is not just due to the deep architecture of the models, but also due to the behavior of the optimization algorithms, such as stochastic gradient descent (SGD), used for training them. In Part IV of this thesis, we characterize several properties, such as minimax optimality and implicit regularization, of SGD, and more generally, of the family of <i>stochastic mirror descent (SMD)</i>. While SGD performs an implicit regularization, this regularization can be effectively controlled using SMD with a proper choice of mirror, which in turn can improve the generalization error.</p>https://thesis.library.caltech.edu/id/eprint/14001Structured Signal Recovery from Nonlinear Measurements with Applications in Phase Retrieval and Linear Classification
https://resolver.caltech.edu/CaltechTHESIS:05172021-044724906
Authors: Salehi, Fariborz
Year: 2021
DOI: 10.7907/1c69-wq71
<p>Nonlinear models are widely used in signal processing, statistics, and machine learning to model real-world applications. A popular class of such models is the single-index model, where the response variable is related to a linear combination of dependent variables through a link function. In other words, if x ∈ R<sup>p</sup> denotes the input signal, the posterior mean of the generated output y has the form E[y|x] = ρ(x<sup>T</sup>w), where ρ : R → R is a known function (referred to as the link function), and w ∈ R<sup>p</sup> is the vector of unknown parameters. When ρ(•) is invertible, this class of models is called generalized linear models (GLMs). GLMs are commonly used in statistics and are often viewed as flexible generalizations of linear regression. Given n measurements (samples) from this model, D = {(x<sub>i</sub>, y<sub>i</sub>) | 1 ≤ i ≤ n}, the goal is to estimate the parameter vector w. While the model parameters are assumed to be unknown, in many applications they follow certain structures (sparse, low-rank, group-sparse, etc.). Knowledge of this structure can be used to form more accurate estimators.</p>
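For the logistic link ρ(t) = 1/(1 + e⁻ᵗ), the single-index model above is exactly logistic regression, and w can be estimated by maximizing the likelihood. A self-contained sketch with illustrative dimensions (full-batch gradient descent on the logistic loss, not the convex programs analyzed in the thesis):

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

random.seed(3)
p, n = 3, 4000
w_true = [1.0, -1.0, 0.5]
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
# Labels drawn from the single-index model E[y|x] = sigmoid(x . w_true).
y = [1 if random.random() < sigmoid(sum(xi * wi for xi, wi in zip(x, w_true)))
     else 0 for x in X]

# Maximum-likelihood fit by gradient descent on the (convex) logistic loss.
w = [0.0] * p
eta = 0.5
for _ in range(300):
    grad = [0.0] * p
    for x, yi in zip(X, y):
        err = sigmoid(sum(xi * wi for xi, wi in zip(x, w))) - yi
        for j in range(p):
            grad[j] += err * x[j] / n
    w = [wj - eta * gj for wj, gj in zip(w, grad)]
```

With n much larger than p the ML estimate concentrates around w_true; the proportional regime n ≍ p studied in the thesis is precisely where this classical behavior breaks down.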
<p>The main contribution of this thesis is to provide a precise performance analysis for convex optimization programs that are used for parameter estimation in two important classes of single-index models. These classes are: (1) phase retrieval in signal processing, and (2) binary classification in statistical learning.</p>
<p>The first class of models studied in this thesis is the phase retrieval problem, where the goal is to recover a discrete complex-valued signal from the amplitudes of its linear combinations. Methods based on convex optimization have recently gained significant attention in the literature. The conventional convex-optimization-based methods resort to the idea of lifting, which makes them computationally inefficient. In addition to providing an analysis of the recovery threshold for the semidefinite-programming-based methods, this thesis studies the performance of a new convex relaxation for the phase retrieval problem, known as phasemax, which is computationally more efficient as it does not lift the signal to higher dimensions. Furthermore, to address the case of structured signals, regularized phasemax is introduced, along with a precise characterization of the conditions for its perfect recovery in the asymptotic regime.</p>
<p>The next important application studied in this thesis is binary classification in statistical learning. While classification models have been studied in the literature since the 1950s, the understanding of their performance had been incomplete until very recently. Inspired by the maximum likelihood (ML) estimator in logistic models, we analyze a class of optimization programs that attempt to find the model parameters by minimizing an objective that consists of a loss function (often inspired by the ML estimator) and an additive regularization term that enforces our knowledge of the structure. There are two operating regimes for this problem, depending on the separability of the training data set D. In the asymptotic regime, where the number of samples and the number of parameters grow to infinity, a phase-transition phenomenon is demonstrated that occurs at a certain over-parameterization ratio. We compute this phase transition for the setting where the underlying data is drawn from a Gaussian distribution.</p>
<p>In the case where the data is non-separable, the ML estimator is well-defined, and its attributes have been studied in classical statistics. However, these classical results fail to provide a reasonable estimate in the regime where the number of data points is proportional to the number of parameters. One contribution of this thesis is an exact analysis of the performance of regularized logistic regression when the number of training samples is proportional to the number of parameters. When the data is separable (a.k.a. the interpolating regime), there exist multiple linear classifiers that perfectly fit the training data. In this regime, we introduce and analyze the performance of "extended margin maximizers" (EMMs). Inspired by the max-margin classifier, EMM classifiers simultaneously consider maximizing the margin and the structure of the parameter. Lastly, we discuss another generalization of the max-margin classifier, referred to as the robust max-margin classifier, which takes into account perturbations by an adversary. It is shown that for a broad class of loss functions, gradient descent iterates (with proper step sizes) converge to the robust max-margin classifier.</p>https://thesis.library.caltech.edu/id/eprint/14150Regret-Optimal Control
https://resolver.caltech.edu/CaltechTHESIS:06062022-021936716
Authors: Goel, Gautam
Year: 2022
DOI: 10.7907/8t6d-4t53
<p>Optimal controllers are usually designed to minimize cost under the assumption that the disturbance they encounter is drawn from some specific class. For example, in H₂ control the disturbance is assumed to be generated by a stochastic process and the controller is designed to minimize its expected cost, while in H<sub>∞</sub> control the disturbance is assumed to be generated adversarially and the controller is designed to minimize its worst-case cost. This approach suffers from an obvious drawback: a controller which encounters a disturbance which falls outside of the class it was designed to handle may perform poorly. This observation naturally motivates the design of adaptive controllers which dynamically adjust their control strategy as they causally observe the disturbance instead of blindly following a prescribed strategy.</p>
<p>Inspired by online learning, we propose <i>data-dependent regret</i> as a criterion for controller design. In the regret-optimal control paradigm, causal controllers are designed to minimize regret against a hypothetical <i>optimal noncausal controller</i>, which selects the cost-minimizing sequence of control actions given noncausal access to the disturbance sequence. Controllers with low regret retain a performance guarantee irrespective of how the disturbance is generated; it is this universality which makes our approach an attractive alternative to traditional H₂ and H<sub>∞</sub> control. The regret of the causal controller is bounded by some measure of the complexity of the disturbance sequence; we consider several different complexity measures, including the energy of the disturbance sequence, which measures the size of the disturbance, and the pathlength of the disturbance, which measures its variation over time. We also consider the alternative metric of <i>competitive ratio</i>, which is the worst-case ratio between the cost incurred by the causal controller and the cost incurred by the optimal noncausal controller. This metric can also be viewed as a special case of data-dependent regret, where the complexity measure is simply the offline optimal cost. For each of these complexity measures, we derive a corresponding control algorithm with optimal data-dependent regret. The key technique we introduce is an operator-theoretic reduction from regret-optimal control to H<sub>∞</sub> control; each of the regret-optimal controllers we obtain can be interpreted as an H<sub>∞</sub> controller in a synthetic system of larger dimension. We also extend regret-optimal control to the more challenging <i>measurement-feedback</i> setting, where the online controller must choose control actions without directly observing the disturbance sequence, using only noisy linear measurements of the state.</p>
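A toy finite-horizon version of this comparison, for a scalar system x_{t+1} = a x_t + u_t + w_t with cost Σ x_t² + u_t²: the causal controller is the standard H₂ (LQR) feedback, while the clairvoyant benchmark minimizes the quadratic cost with the whole disturbance sequence in hand. This illustrates only the regret quantity itself, not the operator-theoretic controllers constructed in the thesis; the system, gain, and horizon are arbitrary:

```python
import random

def solve_lin(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(M)
    a = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        a[c] = [v / a[c][c] for v in a[c]]
        for r in range(n):
            if r != c:
                a[r] = [vr - a[r][c] * vc for vr, vc in zip(a[r], a[c])]
    return [a[r][n] for r in range(n)]

def lqr_gain(a, iters=200):
    """Scalar discrete Riccati iteration for x' = a x + u, cost sum x^2 + u^2."""
    P = 1.0
    for _ in range(iters):
        P = 1.0 + a * a * P - (a * P) ** 2 / (1.0 + P)
    return a * P / (1.0 + P)

def costs(a, w):
    """Cost of the causal LQR controller vs. the optimal noncausal controller."""
    T = len(w)
    K = lqr_gain(a)
    x, J_causal = 0.0, 0.0
    for t in range(T):
        u = -K * x
        x = a * x + u + w[t]
        J_causal += x * x + u * u
    # Noncausal optimum: states are x = C(u + w), so minimize ||C(u+w)||^2 + ||u||^2
    # jointly over the whole input sequence u, i.e. solve (C'C + I) u = -C'C w.
    C = [[a ** (t - s) if s <= t else 0.0 for s in range(T)] for t in range(T)]
    CtC = [[sum(C[r][i] * C[r][j] for r in range(T)) for j in range(T)] for i in range(T)]
    rhs = [-sum(CtC[i][j] * w[j] for j in range(T)) for i in range(T)]
    M = [[CtC[i][j] + (1.0 if i == j else 0.0) for j in range(T)] for i in range(T)]
    u = solve_lin(M, rhs)
    xs = [sum(C[t][s] * (u[s] + w[s]) for s in range(T)) for t in range(T)]
    J_opt = sum(v * v for v in xs) + sum(v * v for v in u)
    return J_causal, J_opt

random.seed(4)
w = [random.gauss(0, 1) for _ in range(8)]
J_causal, J_opt = costs(0.9, w)
```

Because both controllers are linear in the disturbance and the cost is quadratic, the regret J_causal − J_opt scales exactly quadratically with the disturbance energy, consistent with energy-based regret bounds.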
<p>We show that the competitive controller can be arbitrarily well-approximated by the class of disturbance-action-controller (DAC) policies. The convexity of this class of policies makes it amenable to online optimization via a reduction to online convex optimization with memory, and this class has hence attracted much recent attention in online learning. Using our approximation result, we show how to obtain algorithms which achieve the "best-of-both-worlds": sublinear policy regret against DAC policies and approximate competitive ratio. These performance guarantees can even be extended to the "adaptive control" setting, where the controller does not know the system dynamics ahead of time and must perform online system identification.</p>
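The disturbance-action idea can be seen in a toy scalar example: any stabilizing state-feedback law u_t = −K x_t is reproduced, up to truncated memory, by a DAC policy u_t = Σ_{i<H} M_i w_{t−1−i}, because the closed-loop state is itself a geometric sum of past disturbances. The gain and memory length below are arbitrary illustrative choices (this mimics a fixed feedback law, not the competitive controller from the thesis):

```python
import random

def simulate(a, w, policy):
    """Roll out x_{t+1} = a x_t + u_t + w_t and accumulate sum x^2 + u^2.
    The policy sees the current state and the history of past disturbances."""
    x, J, hist = 0.0, 0.0, []
    for t in range(len(w)):
        u = policy(x, hist)
        x = a * x + u + w[t]
        J += x * x + u * u
        hist.append(w[t])
    return J

a, K, H = 0.9, 0.5, 30                        # plant, a stabilizing gain, DAC memory
# Under u = -K x the state is sum_i (a-K)^i w_{t-1-i}, so these DAC
# weights reproduce the feedback law on the same disturbance sequence.
M = [-K * (a - K) ** i for i in range(H)]

random.seed(5)
w = [random.gauss(0, 1) for _ in range(200)]

J_feedback = simulate(a, w, lambda x, h: -K * x)
J_dac = simulate(a, w, lambda x, h: sum(M[i] * h[-1 - i]
                                        for i in range(min(H, len(h)))))
```

As the memory H grows, the two costs coincide to numerical precision; this convex parametrization of (approximately) arbitrary linear behaviors is what makes DAC policies amenable to online convex optimization.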
<p>We present numerical experiments in a linear dynamical system which demonstrate how the performance of regret-optimal controllers varies as a function of the complexity of the disturbance. We extend regret-optimal control to nonlinear dynamical systems using model-predictive control (MPC) and present experiments which suggest that regret-optimal control is a promising approach to adapting to model error in nonlinear control.</p>https://thesis.library.caltech.edu/id/eprint/14947