Monograph records
https://feeds.library.caltech.edu/people/Hassibi-B/monograph.rss
A Caltech Library Repository Feed (RSS specification: http://www.rssboard.org/rss-specification; generator: python-feedgen; language: en; last build: Fri, 08 Dec 2023 12:14:41 +0000)

Optimal LQG Control Across a Packet-Dropping Link
https://resolver.caltech.edu/CaltechCDSTR:2004.007
Authors: Gupta, Vijay; Spanos, Demetri; Hassibi, Babak; Murray, Richard M.
Year: 2004
We examine optimal Linear Quadratic Gaussian (LQG) control for a system in which communication between the sensor (at the output of the plant) and the controller occurs across a packet-dropping link. We extend the familiar LQG separation principle to this setting, which allows us to solve the problem using a standard LQR state-feedback design together with an optimal algorithm for propagating and using the information across the unreliable link. We present one such optimal algorithm, which consists of a Kalman filter at the sensor side of the link and a switched linear filter at the controller side. Our design does not assume any statistical model of the packet-drop events and is thus optimal for an arbitrary packet-drop pattern. Further, the solution is appealing from a practical point of view because it can be implemented as a small modification of an existing LQG control design.
https://authors.library.caltech.edu/records/s9j8v-mpz83

A Sub-optimal Algorithm to Synthesize Control Laws for a Network of Dynamic Agents
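The structure described above (a Kalman filter at the sensor side, a switched linear filter at the controller side) can be illustrated with a toy scalar simulation. All constants below (plant, noise variances, feedback gain, a 30% drop pattern) are illustrative assumptions, not taken from the report:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar plant x_{k+1} = a x_k + b u_k + w_k, y_k = x_k + v_k.
a, b = 1.2, 1.0        # unstable open loop
q, r = 0.1, 0.1        # process / measurement noise variances
K = a / b              # an assumed stabilizing state-feedback gain

T = 200
drops = rng.random(T) < 0.3     # an arbitrary 30% packet-drop pattern

x = 0.0                # true state
xs, Ps = 0.0, 1.0      # sensor-side Kalman filter estimate and covariance
xc = 0.0               # controller-side switched estimate
u = 0.0
history = []
for k in range(T):
    y = x + rng.normal(0.0, np.sqrt(r))
    # Sensor-side Kalman filter (it knows the previously applied input u):
    xp, Pp = a * xs + b * u, a * a * Ps + q
    G = Pp / (Pp + r)
    xs, Ps = xp + G * (y - xp), (1.0 - G) * Pp
    # Controller side: take the KF estimate when the packet arrives,
    # otherwise propagate the previous estimate through the plant model.
    xc = (a * xc + b * u) if drops[k] else xs
    u = -K * xc
    x = a * x + b * u + rng.normal(0.0, np.sqrt(q))
    history.append(x)

print(max(abs(v) for v in history))   # the loop stays stable despite drops
```

The switch between "use the received estimate" and "propagate through the model" is the switched linear filter; no statistical model of the drop process is needed anywhere in the loop.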
https://resolver.caltech.edu/CaltechCDSTR:2004.006
Authors: Gupta, Vijay; Hassibi, Babak; Murray, Richard M.
Year: 2004
We study the problem of synthesizing an LQR controller when the matrix describing the control law is constrained to lie in a particular vector space. Our motivation is the use of such control laws to stabilize networks of autonomous agents in a decentralized fashion, with the information flow dictated by the constraints of a pre-specified topology. In this paper we consider the finite-horizon version of the problem and provide both a computationally intensive optimal solution and a sub-optimal solution that is computationally more tractable. We then apply the technique to the decentralized vehicle formation control problem and show that the loss in performance due to the sub-optimal solution is small; the topology itself, however, can have a large effect on performance.
https://authors.library.caltech.edu/records/4978t-yv173

On a stochastic sensor selection algorithm with applications in sensor scheduling and sensor coverage
https://resolver.caltech.edu/CaltechCDSTR:2004.008
Authors: Gupta, Vijay; Chung, Timothy H.; Hassibi, Babak; Murray, Richard M.
Year: 2004
In this note we consider the following problem. Suppose a set of sensors is jointly trying to estimate a process. One sensor takes a measurement at every time step, and the measurements are then exchanged among all the sensors. What is the sensor schedule that results in the minimum error covariance? We describe a stochastic sensor selection strategy that is easy to implement and computationally tractable. The problem described above arises in many domains, of which we discuss two. In the sensor selection problem, there are multiple sensors that cannot operate simultaneously (e.g., sonars in the same frequency band), so measurements need to be scheduled. In the sensor coverage problem, a geographical area needs to be covered by mobile sensors, each with limited range. From every position the sensors obtain a different viewpoint of the area, so the sensors need to optimize their positions. The algorithm is applied to both problems and illustrated through simple examples.
https://authors.library.caltech.edu/records/27qye-ckm71

The Squared-Error of Generalized LASSO: A Precise Analysis
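A minimal sketch of a stochastic schedule of the kind described above, for a scalar process observed by one of two sensors per step. The dynamics, noise variances, and selection probabilities are illustrative assumptions, not the note's optimized schedule:

```python
import numpy as np

rng = np.random.default_rng(1)

# Scalar process x_{k+1} = a x_k + w_k, one sensor active per step.
a, q = 1.1, 0.2           # illustrative dynamics / process-noise variance
R = [0.05, 1.0]           # the two sensors' measurement-noise variances
pi = [0.6, 0.4]           # probabilities of the stochastic schedule

def riccati_step(P, r):
    """One predict/update of the estimation-error covariance
    for a scalar Kalman filter with a sensor of variance r."""
    Pp = a * a * P + q
    return Pp - Pp * Pp / (Pp + r)

P, avg, T = 1.0, 0.0, 10_000
for _ in range(T):
    P = riccati_step(P, R[rng.choice(2, p=pi)])
    avg += P / T

print(avg)   # average error covariance under this random schedule
```

Optimizing over the probabilities `pi` (rather than over the exponentially many deterministic schedules) is what keeps a strategy of this flavor computationally tractable.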
https://resolver.caltech.edu/CaltechAUTHORS:20150121-073901094
Authors: Oymak, Samet; Thrampoulidis, Christos; Hassibi, Babak
Year: 2013
DOI: 10.48550/arXiv.1311.0830
We consider the problem of estimating an unknown signal x_0 from noisy linear observations y = Ax_0 + z ∈ ℝ^m. In many practical instances, x_0 has a certain structure that can be captured by a structure-inducing convex function f(·); for example, the ℓ_1 norm can be used to encourage a sparse solution. To estimate x_0 with the aid of f(·), we consider the well-known LASSO method and provide a sharp characterization of its performance. Our study falls under a generic framework, where the entries of the measurement matrix A and the noise vector z have zero-mean normal distributions with variances 1 and σ^2, respectively. For the LASSO estimator x^*, we ask: "What is the precise estimation error as a function of the noise level σ, the number of observations m, and the structure of the signal?" In particular, we calculate the Normalized Squared Error (NSE), defined as ∥x^* − x_0∥_2^2/σ^2. We show that the structure of the signal x_0 and the choice of the function f(·) enter the error formulae through the summary parameters D_f(x_0, ℝ^+) and D_f(x_0, λ), which are defined as the "Gaussian squared-distances" to the subdifferential cone and to the λ-scaled subdifferential of f at x_0, respectively. The first estimator assumes a-priori knowledge of f(x_0) and is given by arg min_x {∥y − Ax∥_2 subject to f(x) ≤ f(x_0)}. We prove that its worst-case NSE is achieved as σ → 0 and concentrates around D_f(x_0, ℝ^+)/(m − D_f(x_0, ℝ^+)). Secondly, we consider arg min_x {∥y − Ax∥_2 + λf(x)} for some penalty parameter λ ≥ 0. This time, the NSE formula depends on the choice of λ and is given by D_f(x_0, λ)/(m − D_f(x_0, λ)) over a range of λ. The last estimator is arg min_x {(1/2)∥y − Ax∥_2^2 + στf(x)}. We establish a mapping between this and the second estimator and propose a formula for its NSE. As useful side results, we find explicit formulae for the optimal estimation performance and the optimal penalty parameters λ_best and τ_best. Finally, for a number of important structured signal classes, we translate our abstract formulae to closed-form upper bounds on the NSE.
https://authors.library.caltech.edu/records/0p3fd-p1239

A Tight Version of the Gaussian min-max theorem in the Presence of Convexity
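The summary parameter D_f(x_0, λ) can be estimated by Monte Carlo once the scaled subdifferential is written out; for f = ℓ_1 the distance has a simple per-coordinate form. The signal size, sparsity, and λ below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def gaussian_distance_l1(x0, lam, trials=2000):
    """Monte Carlo estimate of D_f(x0, lam) for f = the l1 norm: the expected
    squared distance of a standard Gaussian to lam * (subdifferential of f at x0)."""
    on = x0 != 0
    G = rng.standard_normal((trials, x0.size))
    # On the support, the scaled subdifferential is the single point lam*sign(x0_i);
    # off the support, it is the interval [-lam, lam] (distance = soft threshold).
    d_on = (G[:, on] - lam * np.sign(x0[on])) ** 2
    d_off = np.maximum(np.abs(G[:, ~on]) - lam, 0.0) ** 2
    return (d_on.sum(axis=1) + d_off.sum(axis=1)).mean()

n, k, lam = 200, 10, 2.0          # illustrative dimensions and penalty
x0 = np.zeros(n); x0[:k] = 1.0
D = gaussian_distance_l1(x0, lam)
print(D)   # roughly k*(1 + lam**2) plus a small off-support tail term
```

Plugging such an estimate into D_f(x_0, λ)/(m − D_f(x_0, λ)) gives a numerical handle on the NSE formula quoted in the abstract.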
https://resolver.caltech.edu/CaltechAUTHORS:20150120-073025721
Authors: Thrampoulidis, Christos; Oymak, Samet; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1408.4837
Gaussian comparison theorems are useful tools in probability theory; they are
essential ingredients in the classical proofs of many results in empirical
processes and extreme value theory. More recently, they have been used
extensively in the analysis of underdetermined linear inverse problems. A
prominent role in the study of those problems is played by Gordon's Gaussian
min-max theorem. It has been observed that the use of the Gaussian min-max
theorem produces results that are often tight. Motivated by recent work due to
M. Stojnic, we argue explicitly that the theorem is tight under additional
convexity assumptions. To illustrate the usefulness of the result we provide an
application example from the field of noisy linear inverse problems.
https://authors.library.caltech.edu/records/139kb-ezw50

The Cost of an Epidemic over a Complex Network: A Random Matrix Approach
https://resolver.caltech.edu/CaltechAUTHORS:20150121-074956983
Authors: Bose, Subhonmesh; Bodine-Baron, Elizabeth; Hassibi, Babak; Wierman, Adam
Year: 2015
DOI: 10.48550/arXiv.1309.2236
In this paper we quantify the total economic impact of an epidemic over a
complex network using tools from random matrix theory. Incorporating the direct
and indirect costs of infection, we calculate the disease cost in the large
graph limit for an SIS (Susceptible - Infected - Susceptible) infection
process. We also give an upper bound on this cost for arbitrary finite graphs
and illustrate both calculated costs using extensive simulations on random and
real-world networks. We extend these calculations by considering the total
social cost of an epidemic, accounting for both the immunization and disease
costs for various immunization strategies and determining the optimal
immunization. Our work focuses on the transient behavior of the epidemic, in
contrast to previous research, which typically focuses on determining the
steady-state system equilibrium.
https://authors.library.caltech.edu/records/r0fwf-k8w72

Simple Bounds for Noisy Linear Inverse Problems with Exact Side Information
https://resolver.caltech.edu/CaltechAUTHORS:20150121-072935827
Authors: Oymak, Samet; Thrampoulidis, Christos; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1312.0641
This paper considers the linear inverse problem where we wish to estimate a structured signal x_0 from its corrupted observations. When the problem is ill-posed, it is natural to associate a convex function f(·) with the structure of the signal; for example, the ℓ_1 norm can be used for sparse signals. To carry out the estimation, we consider two well-known convex programs: 1) the second-order cone program (SOCP), and 2) the LASSO. Assuming Gaussian measurements, we show that, if precise information about the value f(x_0) or the ℓ_2-norm of the noise is available, one can do a particularly good job at estimation. In particular, the reconstruction error becomes proportional to the "sparsity" of the signal rather than to the ambient dimension of the noise vector. We connect our results to the existing literature and discuss their relation to the standard least-squares problem. Our error bounds are non-asymptotic and sharp; they apply to arbitrary convex functions and do not assume any distribution on the noise.
https://authors.library.caltech.edu/records/wn519-r4504

Recovering Jointly Sparse Signals via Joint Basis Pursuit
https://resolver.caltech.edu/CaltechAUTHORS:20150126-075526346
Authors: Oymak, Samet; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1202.3531
This work considers recovery of signals that are sparse over two bases. For instance, a signal might be sparse in both time and frequency, or a matrix can be low rank and sparse simultaneously. To facilitate recovery, we consider minimizing the sum of the ℓ_1-norms that correspond to each basis, which is a tractable convex approach. We find novel optimality conditions which indicate a gain over traditional approaches in which ℓ_1 minimization is done over only one basis. Next, we analyze these optimality conditions for the particular case of time-frequency bases. Denoting sparsity in the first and second bases by k_1 and k_2 respectively, we show that, for a general class of signals, this approach requires as few as O(max{k_1, k_2} log log n) measurements for successful recovery, hence overcoming the classical requirement of Θ(min{k_1, k_2} log(n/min{k_1, k_2})) for ℓ_1 minimization when k_1 ≈ k_2. Extensive simulations show that our analysis is approximately tight.
https://authors.library.caltech.edu/records/rzzyr-yfp60

Error Correcting Codes for Distributed Control
https://resolver.caltech.edu/CaltechAUTHORS:20150127-072035730
Authors: Sukhavasi, Ravi Teja; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1112.4236
The problem of stabilizing an unstable plant over a noisy communication link is an increasingly important one that arises in applications of networked control systems. Although the work of Schulman and Sahai over the past two
decades, and their development of the notions of "tree codes" and "anytime capacity", provides the theoretical framework for studying such problems, there has been scant practical progress in this area because explicit
constructions of tree codes with efficient encoding and decoding did not exist. To stabilize an unstable plant driven by bounded noise over a noisy channel one
needs real-time encoding and real-time decoding and a reliability which increases exponentially with decoding delay, which is what tree codes
guarantee. We prove that linear tree codes occur with high probability and, for erasure channels, give an explicit construction with an expected decoding
complexity that is constant per time instant. We give novel sufficient conditions on the rate and reliability required of the tree codes to stabilize vector plants and argue that they are asymptotically tight. This work takes an
important step towards controlling plants over noisy channels, and we demonstrate the efficacy of the method through several examples.
https://authors.library.caltech.edu/records/00hgb-7nf27

Finding Dense Clusters via "Low Rank + Sparse" Decomposition
https://resolver.caltech.edu/CaltechAUTHORS:20150127-073728199
Authors: Oymak, Samet; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1104.5186v1
Finding "densely connected clusters" in a graph is an important and well-studied problem with various applications in pattern recognition, social networking, and data mining. Recently, Ames and Vavasis suggested a novel method for finding cliques in a graph by using convex optimization over the adjacency matrix of the graph. There have also been recent advances in decomposing a given matrix into its "low rank" and "sparse" components. In this paper, inspired by these results, we view "densely connected clusters" as imperfect cliques, where the imperfections correspond to missing edges, which are relatively sparse. We analyze the problem in a probabilistic setting and aim to detect disjointly planted clusters. Our main result shows that one can find dense clusters in a graph as long as the clusters are sufficiently large. We conclude by discussing possible extensions and future research directions.
https://authors.library.caltech.edu/records/5b90s-sfe90

Sparse Recovery of Positive Signals with Minimal Expansion
https://resolver.caltech.edu/CaltechAUTHORS:20150130-072455533
Authors: Khajehnejad, M. Amin; Dimakis, Alexandros G.; Xu, Weiyu; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.0902.4045
We investigate the sparse recovery problem of reconstructing a high-dimensional non-negative sparse vector from lower-dimensional linear measurements. While much work has focused on dense measurement matrices, sparse measurement schemes are crucial in applications, such as DNA microarrays and sensor networks, where dense measurements are not practically feasible. One possible construction uses the adjacency matrices of expander graphs, which often leads to recovery algorithms much more efficient than ℓ_1 minimization. However, to date, constructions based on expanders have required very high expansion coefficients, which can make the construction of such graphs difficult and the size of the recoverable sets small. In this paper, we construct sparse measurement matrices for the recovery of non-negative vectors using perturbations of the adjacency matrix of an expander graph with a much smaller expansion coefficient. We present a necessary and sufficient condition for ℓ_1 optimization to successfully recover the unknown vector and obtain expressions for the recovery threshold. For certain classes of measurement matrices, this necessary and sufficient condition is further equivalent to the existence of a "unique" vector in the constraint set, which opens the door to alternatives to ℓ_1 minimization. We further show that the minimal expansion we use is necessary for any graph for which sparse recovery is possible, and that our construction is therefore tight. We then present a novel recovery algorithm that exploits expansion and is much faster than ℓ_1 optimization. Finally, we demonstrate through theoretical bounds, as well as simulation, that our method is robust to noise and approximate sparsity.
https://authors.library.caltech.edu/records/f1v7j-ht714

New Null Space Results and Recovery Thresholds for Matrix Rank Minimization
https://resolver.caltech.edu/CaltechAUTHORS:20150129-072018732
Authors: Oymak, Samet; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1011.6326
Nuclear norm minimization (NNM) has recently gained significant attention for its use in rank minimization problems. Similar to compressed sensing, recovery thresholds for NNM have been studied using null space characterizations. However, simulations show that these thresholds are far from optimal, especially in the low-rank region. In this paper we apply the recent analysis of Stojnic for compressed sensing to the null space conditions of NNM. The resulting thresholds are significantly better, and in particular our weak threshold appears to match simulation results. Further, our curves suggest that for any rank growing linearly with the matrix size n, an oversampling factor of only about three relative to the model complexity suffices for weak recovery. As in the compressed sensing case, we analyze the conditions for weak, sectional, and strong thresholds. Additionally, a separate analysis is given for the special case of positive semidefinite matrices. We conclude by discussing simulation results and future research directions.
https://authors.library.caltech.edu/records/z0z8h-65144

Compressive Sensing over the Grassmann Manifold: a Unified Geometric Framework
https://resolver.caltech.edu/CaltechAUTHORS:20150129-073448689
Authors: Xu, Weiyu; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1005.3729
ℓ_1 minimization is often used for finding the sparse solutions of an under-determined linear system. In this paper we focus on finding sharp performance bounds on recovering approximately sparse signals using ℓ_1 minimization, possibly under noisy measurements. While the restricted isometry property is powerful for the analysis of recovering approximately sparse signals with noisy measurements, the known bounds on the achievable sparsity level (the "sparsity" in this paper means the size of the set of nonzero or significant elements in a signal vector) can be quite loose. The neighborly polytope analysis, which yields sharp bounds for ideally sparse signals, cannot be readily generalized to approximately sparse signals. Starting from a necessary and sufficient condition, the "balancedness" property of linear subspaces, for achieving a certain signal recovery accuracy, we give a unified null space Grassmann angle-based geometric framework for analyzing the performance of ℓ_1 minimization. By investigating the "balancedness" property, this unified framework characterizes sharp quantitative tradeoffs between the considered sparsity and the recovery accuracy of the ℓ_1 optimization. As a consequence, this generalizes the neighborly polytope result for ideally sparse signals. Besides the robustness in the "strong" sense for all sparse signals, we also discuss the notions of "weak" and "sectional" robustness. Our results concern fundamental properties of linear subspaces and so may be of independent mathematical interest.
https://authors.library.caltech.edu/records/cag96-jyt96

Scheduling for Distributed Sensor Networks
https://resolver.caltech.edu/CaltechAUTHORS:20150211-071040084
Authors: Gupta, Vijay; Chung, Timothy H.; Hassibi, Babak; Murray, Richard M.
Year: 2015
We examine the problem of distributed estimation when only one sensor can take a measurement per time step. The measurements are then exchanged among the sensors. The problem is motivated by the use of sonar range-finders by the vehicles on the Caltech Multi-Vehicle Wireless Testbed. We solve for the optimal recursive estimation algorithm when the sensor switching schedule is given. Then we investigate several approaches for determining an optimal sensor switching strategy. We see that this problem in general involves searching a tree, and we propose and analyze two strategies for pruning the tree to keep the computation limited. The first is a sliding-window strategy motivated by the Viterbi algorithm, and the second uses thresholding. We also study a technique that chooses the sensors randomly from a probability distribution, which can then be optimized. The performance of the algorithms is illustrated with the help of numerical examples.
https://authors.library.caltech.edu/records/1phkj-1vd63

Precise Error Analysis of the ℓ_2-LASSO
https://resolver.caltech.edu/CaltechAUTHORS:20150302-080245100
Authors: Thrampoulidis, Christos; Panahi, Ashkan; Guo, Daniel; Hassibi, Babak
Year: 2015
DOI: 10.48550/arXiv.1502.04977
A classical problem that arises in numerous signal processing applications asks for the reconstruction of an unknown, k-sparse signal x_0 ∈ ℝ^n from underdetermined, noisy, linear measurements y = Ax_0 + z ∈ ℝ^m. One standard approach is to solve the convex program x̂ = arg min_x ∥y − Ax∥_2 + λ∥x∥_1, which is known as the ℓ_2-LASSO. We assume that the entries of the sensing matrix A and of the noise vector z are i.i.d. Gaussian with variances 1/m and σ^2. In the large-system limit, when the problem dimensions grow to infinity at constant rates, we precisely characterize the limiting behavior of the normalized squared error ∥x̂ − x_0∥_2^2/σ^2. Our numerical illustrations validate our theoretical predictions.
https://authors.library.caltech.edu/records/vgdky-tz751

Fully-diverse multiple-antenna signal constellations and fixed-point-free Lie groups
https://resolver.caltech.edu/CaltechAUTHORS:20150302-171200624
Authors: Hassibi, Babak; Khorrami, Mohammad
Year: 2015
A group of unitary matrices is called fixed-point-free (fpf) if all non-identity elements of the
group have no eigenvalues at unity. Such groups are useful in multiple-antenna communications,
especially in multiple-antenna differential modulation, since they constitute a fully-diverse constellation.
In [1] all finite fpf groups have been classified. In this note we consider infinite groups and,
in particular, their most interesting case: Lie groups. Two such fpf Lie groups are currently widely
used in communications: the group of unit modulus scalars, from which various phase modulation
schemes, such as QPSK, are derived, and the 2 x 2 orthogonal designs of Alamouti, on which many
two-transmit-antenna schemes are based. In Lie-group-theoretic jargon these are referred to as
U(1) and SU(2). A natural question is whether there exist other fpf Lie groups. We answer this
question in the negative: U(1) and SU(2) are all there are.
https://authors.library.caltech.edu/records/5dcf7-tat17

A Krein Space Interpretation of the Kalman-Yakubovich-Popov Lemma
https://resolver.caltech.edu/CaltechAUTHORS:20150218-071435753
Authors: Hassibi, Babak; Kailath, Thomas
Year: 2015
In this note we give a Krein space interpretation of the celebrated Kalman-Yakubovich-Popov (KYP) lemma by introducing state-space models driven by inputs that lie in an indefinite-metric space. Such state-space models can be considered as generalizations of standard stochastic state-space models driven by stationary stochastic processes (which lie in a definite, or so-called Hilbert, space). In this framework, the KYP lemma corresponds to a certain decomposition in Krein space.
https://authors.library.caltech.edu/records/367nt-wke89

Robustness Analysis of a List Decoding Algorithm for Compressed Sensing
https://resolver.caltech.edu/CaltechAUTHORS:20150223-073119058
Authors: Chowdhury, Mainak; Oymak, Samet; Khajehnejad, M. Amin; Hassibi, Babak
Year: 2015
We analyze the noise robustness of sparse signal reconstruction based on the compressive sensing equivalent of a list-decoding algorithm for Reed-Solomon codes: the Coppersmith-Sudan algorithm. We use results from the perturbation analysis of singular subspaces of matrices to prove the existence of bounds on the noise levels (in the measurements) below which the error in the recovered signal (with respect to the original sparse signal) is guaranteed to be upper bounded. Numerical simulations are presented which compare the experimental recovery probability to the theoretical lower bound.
https://authors.library.caltech.edu/records/zqte4-h5n32

Stochastic Linear Bandits with Hidden Low Rank Structure
https://resolver.caltech.edu/CaltechAUTHORS:20190327-085817695
Authors: Lale, Sahin; Azizzadenesheli, Kamyar; Anandkumar, Anima; Hassibi, Babak
Year: 2019
DOI: 10.48550/arXiv.1901.09490
High-dimensional representations often have a lower-dimensional underlying structure. This is particularly the case in many decision-making settings. For example, when the representation of actions is generated by a deep neural network, it is reasonable to expect a low-rank structure, whereas conventional structures like sparsity are no longer valid. Subspace recovery methods, such as Principal Component Analysis (PCA), can find the underlying low-rank structures in the feature space and reduce the complexity of the learning tasks. In this work, we propose Projected Stochastic Linear Bandit (PSLB), an algorithm for high-dimensional stochastic linear bandits (SLB) when the representation of actions has an underlying low-dimensional subspace structure. PSLB deploys PCA-based projection to iteratively find the low-rank structure in SLBs. We show that deploying projection methods assures dimensionality reduction and results in a tighter regret upper bound that is in terms of the dimensionality of the subspace and its properties, rather than the dimensionality of the ambient space. We cast the image classification task into the SLB setting and empirically show that, when a pre-trained DNN provides the high-dimensional feature representations, deploying PSLB results in a significant reduction of regret and faster convergence to an accurate model compared to a state-of-the-art algorithm.
https://authors.library.caltech.edu/records/3emwx-f8t15

Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
https://resolver.caltech.edu/CaltechAUTHORS:20190402-101505900
Authors: Azizan, Navid; Hassibi, Babak
Year: 2019
DOI: 10.48550/arXiv.1806.00952
Stochastic descent methods (of the gradient and mirror varieties) have become increasingly popular in optimization. In fact, it is now widely recognized that the success of deep learning is not only due to the special deep architecture of the models, but also due to the behavior of the stochastic descent methods used, which play a key role in reaching "good" solutions that generalize well to unseen data. In an attempt to shed some light on why this is the case, we revisit some minimax properties of stochastic gradient descent (SGD) for the square loss of linear models---originally developed in the 1990's---and extend them to general stochastic mirror descent (SMD) algorithms for general loss functions and nonlinear models. In particular, we show that there is a fundamental identity which holds for SMD (and SGD) under very general conditions, and which implies the minimax optimality of SMD (and SGD) for sufficiently small step size, and for a general class of loss functions and general nonlinear models. We further show that this identity can be used to naturally establish other properties of SMD (and SGD), namely convergence and implicit regularization for over-parameterized linear models (in what is now being called the "interpolating regime"), some of which have been shown in certain cases in prior literature. We also argue how this identity can be used in the so-called "highly over-parameterized" nonlinear setting (where the number of parameters far exceeds the number of data points) to provide insights into why SMD (and SGD) may have similar convergence and implicit regularization properties for deep learning.
https://authors.library.caltech.edu/records/n5r8g-zqt60

Stable Blind Deconvolution over the Reals from Additional Autocorrelations
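The implicit-regularization claim for over-parameterized linear models can be checked numerically: for the square loss, SGD initialized at zero stays in the row space of the data matrix and so converges to the minimum-ℓ2-norm interpolating solution. The dimensions and step size below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

# Over-parameterized linear model: n unknowns, m < n data points, so the
# set of interpolating solutions is non-empty.
m, n = 20, 100
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

w = np.zeros(n)                      # SGD initialized at zero
eta = 0.01
for _ in range(2000):
    for i in rng.permutation(m):
        w -= eta * (A[i] @ w - y[i]) * A[i]   # gradient of one sample's square loss

# Every update is a multiple of a row of A, so w remains in the row space;
# the unique interpolating point of the row space is the min-norm solution.
w_minnorm = np.linalg.pinv(A) @ y
print(np.linalg.norm(w - w_minnorm))
```

At any interpolating point all per-sample gradients vanish, so SGD with a constant (small enough) step converges exactly here rather than oscillating, which is special to the interpolating regime.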
https://resolver.caltech.edu/CaltechAUTHORS:20190402-102240889
Authors: Walk, Philipp; Hassibi, Babak
Year: 2019
DOI: 10.48550/arXiv.1710.07879
Recently the one-dimensional time-discrete blind deconvolution problem was shown to be solvable uniquely, up to a global phase, by a semi-definite program for almost any signal, provided its autocorrelation is known. We show in this work that, under a sufficient zero separation of the corresponding signal in the z-domain, a stable reconstruction against additive noise is possible. Moreover, the stability constant depends on the signal dimension and on the magnitudes of the signal's first and last coefficients. We give an analytical expression for this constant by using spectral bounds for Vandermonde matrices.
https://authors.library.caltech.edu/records/8sx97-pmt87

Noncoherent Short-Packet Communication via Modulation on Conjugated Zeros
https://resolver.caltech.edu/CaltechAUTHORS:20190402-101721380
Authors: Walk, Philipp; Jung, Peter; Hassibi, Babak
Year: 2019
DOI: 10.48550/arXiv.1805.07876
We introduce a novel blind (noncoherent) communication scheme, called modulation on conjugate-reciprocal zeros (MOCZ), to reliably transmit short binary packets over unknown finite impulse response systems, as used, for example, to model underspread wireless multipath channels. In MOCZ, the information is modulated onto the zeros of the transmitted signal's z-transform. In the absence of additive noise, the zero structure of the signal is perfectly preserved at the receiver, no matter what the channel impulse response (CIR) is. Furthermore, by a proper selection of the zeros, we show that MOCZ is not only invariant to the CIR, but also robust against additive noise. Starting with the maximum-likelihood estimator, we define a low-complexity and reliable decoder and compare it to various state-of-the-art noncoherent schemes.
https://authors.library.caltech.edu/records/5r4qb-94711

Algorithms for Optimal Control with Fixed-Rate Feedback
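The zero-structure argument can be sketched in a toy noiseless experiment: each bit selects one zero of a conjugate-reciprocal pair, convolution with an unknown CIR cannot move those zeros, and the receiver simply tests at which candidate of each pair the received polynomial vanishes. The packet length, radius, and zero grid are illustrative, and this naive zero test is not the paper's ML decoder:

```python
import numpy as np

rng = np.random.default_rng(4)

K, R = 8, 1.5                          # packet length and zero radius (assumed)
bits = rng.integers(0, 2, K)
theta = 2 * np.pi * np.arange(K) / K   # one conjugate-reciprocal pair per bit:
#   bit 1 -> zero at R*exp(i*theta_k), bit 0 -> zero at exp(i*theta_k)/R
zeros = np.where(bits == 1, R * np.exp(1j * theta), np.exp(1j * theta) / R)
x = np.poly(zeros)                     # transmitted coefficients

h = rng.standard_normal(4) + 1j * rng.standard_normal(4)   # unknown CIR
y = np.convolve(h, x)                  # received signal: Y(z) = H(z) X(z),
                                       # so the zeros of X survive the channel
decoded = np.array([
    int(abs(np.polyval(y, R * np.exp(1j * t))) < abs(np.polyval(y, np.exp(1j * t) / R)))
    for t in theta
])
print(decoded, bits)
```

Because Y(z) = H(z)X(z), the received polynomial vanishes exactly at the transmitted zeros regardless of h, which is why the scheme needs no channel knowledge in the noiseless case.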
https://resolver.caltech.edu/CaltechAUTHORS:20190402-085532839
Authors: Khina, Anatoly; Nakahira, Yorie; Su, Yu; Yildiz, Hikmet; Hassibi, Babak
Year: 2019
DOI: 10.48550/arXiv.1809.04917
We consider a discrete-time linear quadratic Gaussian networked control setting where the (full-information) observer and controller are separated by a fixed-rate noiseless channel. The minimal rate required to stabilize such a system has been well studied. However, for a given fixed rate, how to quantize the states so as to optimize performance is an open question of great theoretical and practical significance. We concentrate on minimizing the control cost for first-order scalar systems. To that end, we use the Lloyd-Max algorithm and leverage properties of logarithmically concave functions and sequential Bayesian filtering to construct the optimal quantizer that greedily minimizes the cost at every time instant. By connecting the globally optimal scheme to the problem of scalar successive refinement, we argue that its gain over the proposed greedy algorithm is negligible. This is significant since the globally optimal scheme is often computationally intractable. All the results are proven for the more general case of disturbances with logarithmically concave distributions and rate-limited time-varying noiseless channels. We further extend the framework to event-triggered control by allowing information to be conveyed via an additional "silent symbol", i.e., by avoiding transmitting bits; by constraining the minimal probability of silence, we obtain a tradeoff between the transmission rate and the control cost for rates below one bit per sample.
https://authors.library.caltech.edu/records/zfnrd-9r247

Efficient and Robust Compressed Sensing using High-Quality Expander Graphs
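The Lloyd-Max step underlying the greedy scheme alternates a nearest-neighbor partition with a centroid condition. A sample-based sketch for a 2-bit quantizer of a Gaussian (log-concave) disturbance; the codebook size and initialization are arbitrary choices, not the paper's sequential construction:

```python
import numpy as np

rng = np.random.default_rng(5)

samples = rng.standard_normal(100_000)      # log-concave disturbance samples
levels = np.array([-1.5, -0.5, 0.5, 1.5])   # initial 4-level codebook

for _ in range(50):
    edges = (levels[:-1] + levels[1:]) / 2  # nearest-neighbor cell boundaries
    cells = np.digitize(samples, edges)     # partition step
    # Centroid step: each level moves to the conditional mean of its cell.
    levels = np.array([samples[cells == j].mean() for j in range(4)])

edges = (levels[:-1] + levels[1:]) / 2
mse = np.mean((samples - levels[np.digitize(samples, edges)]) ** 2)
print(levels, mse)   # near the known 4-level Gaussian optimum (~ ±0.45, ±1.51)
```

In the paper's setting this fixed-point iteration is rerun at every time instant against the Bayesian filtering posterior rather than against a static Gaussian.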
https://resolver.caltech.edu/CaltechAUTHORS:20190702-111100896
Authors: Jafarpour, Sina; Xu, Weiyu; Hassibi, Babak; Calderbank, Robert
Year: 2019
DOI: 10.48550/arXiv.0806.3802
Expander graphs have recently been proposed as a way to construct efficient compressed sensing algorithms. In particular, it has been shown that any n-dimensional vector that is k-sparse (with k ≪ n) can be fully recovered using O(k log(n/k)) measurements and only O(k log n) simple recovery iterations. In this paper we improve upon this result by considering expander graphs with expansion coefficient beyond 3/4 and show that, with the same number of measurements, only O(k) recovery iterations are required, which is a significant improvement when n is large. In fact, full recovery can be accomplished by at most 2k very simple iterations. The number of iterations can be made arbitrarily close to k, and the recovery algorithm can be implemented very efficiently using a simple binary search tree. We also show that, by tolerating a small penalty on the number of measurements (and not on the number of recovery iterations), one can use the efficient construction of a family of expander graphs to obtain explicit measurement matrices for this method. We compare our result with other recently developed expander-graph-based methods and argue that it compares favorably both in terms of the number of required measurements and in terms of the recovery time complexity. Finally, we show how our analysis extends to give a robust algorithm that finds the position and sign of the k significant elements of an almost k-sparse signal and then, using very simple optimization techniques, finds in sublinear time a k-sparse signal which approximates the original signal with very high precision.
https://authors.library.caltech.edu/records/gtnsm-awa96

The Power of Linear Controllers in LQR Control
https://resolver.caltech.edu/CaltechAUTHORS:20200214-105627925
Authors: Goel, Gautam; Hassibi, Babak
Year: 2020
DOI: 10.48550/arXiv.2002.02574
The Linear Quadratic Regulator (LQR) framework considers the problem of regulating a linear dynamical system perturbed by environmental noise. We compute the policy regret between three distinct control policies: i) the optimal online policy, whose linear structure is given by the Riccati equations; ii) the optimal offline linear policy, which is the best linear state-feedback policy given the noise sequence; and iii) the optimal offline policy, which selects the globally optimal control actions given the noise sequence. We fully characterize the optimal offline policy and show that it has a recursive form in terms of the optimal online policy and future disturbances. We also show that the cost of the optimal offline linear policy converges to the cost of the optimal online policy as the time horizon grows large, and consequently the optimal offline linear policy incurs linear regret relative to the optimal offline policy, even in the optimistic setting where the noise is drawn i.i.d. from a known distribution. Although we focus on the setting where the noise is stochastic, our results also imply new lower bounds on the policy regret achievable when the noise is chosen by an adaptive adversary.
https://authors.library.caltech.edu/records/0q07n-d8417

Regret Minimization in Partially Observable Linear Quadratic Control
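For context on the LQR record above: the optimal online policy it references is the one produced by the backward Riccati recursion. A minimal finite-horizon sketch, with a toy scalar system chosen for illustration (not from the paper):

```python
import numpy as np

def lqr_gains(A, B, Q, R, T):
    """Backward Riccati recursion for the finite-horizon LQR gains K_t
    in the optimal online policy u_t = -K_t x_t."""
    P = Q.copy()                            # terminal cost-to-go P_T = Q
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)       # Riccati update
        gains.append(K)
    return gains[::-1], P                   # gains ordered t = 0, ..., T-1

A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
gains, P = lqr_gains(A, B, Q, R, 50)
print(P[0, 0])   # ~1.618
```

For this scalar system the cost-to-go satisfies P = 1 + P/(1 + P) at the fixed point, so it converges to the golden ratio, which makes the recursion easy to sanity-check.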
https://resolver.caltech.edu/CaltechAUTHORS:20200214-105620768
Authors: Lale, Sahin; Azizzadenesheli, Kamyar; Hassibi, Babak; Anandkumar, Anima
Year: 2020
DOI: 10.48550/arXiv.2002.00082
We study the problem of regret minimization in partially observable linear quadratic control systems when the model dynamics are unknown a priori. We propose ExpCommit, an explore-then-commit algorithm that learns the model Markov parameters and then follows the principle of optimism in the face of uncertainty to design a controller. We propose a novel way to decompose the regret and provide an end-to-end sublinear regret upper bound for partially observable linear quadratic control. Finally, we provide stability guarantees and establish a regret upper bound of O(T^(2/3)) for ExpCommit, where T is the time horizon of the problem.
https://authors.library.caltech.edu/records/0zn0k-sk261

Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20200402-131032742
Authors: Lale, Sahin; Azizzadenesheli, Kamyar; Hassibi, Babak; Anandkumar, Anima
Year: 2020
DOI: 10.48550/arXiv.2003.11227
We study the problem of adaptive control in partially observable linear dynamical systems. We propose a novel algorithm, adaptive control online learning algorithm (AdaptOn), which efficiently explores the environment, estimates the system dynamics episodically and exploits these estimates to design effective controllers to minimize the cumulative costs. Through interaction with the environment, AdaptOn deploys online convex optimization to optimize the controller while simultaneously learning the system dynamics to improve the accuracy of controller updates. We show that when the cost functions are strongly convex, after T time steps of agent-environment interaction, AdaptOn achieves a regret upper bound of polylog(T). To the best of our knowledge, AdaptOn is the first algorithm which achieves polylog(T) regret in adaptive control of unknown partially observable linear dynamical systems, a setting which includes linear quadratic Gaussian (LQG) control.
https://authors.library.caltech.edu/records/t42qj-40094

Explore More and Improve Regret in Linear Quadratic Regulators
https://resolver.caltech.edu/CaltechAUTHORS:20201106-120155157
Authors: Lale, Sahin; Azizzadenesheli, Kamyar; Hassibi, Babak; Anandkumar, Anima
Year: 2020
DOI: 10.48550/arXiv.2007.12291
Stabilizing the unknown dynamics of a control system and minimizing regret in control of an unknown system are among the main goals in control theory and reinforcement learning. In this work, we pursue both these goals for adaptive control of linear quadratic regulators (LQR). Prior works accomplish either one of these goals at the cost of the other. The algorithms that are guaranteed to find a stabilizing controller suffer from high regret, whereas algorithms that focus on achieving low regret assume the presence of a stabilizing controller at the early stages of agent-environment interaction. In the absence of such a stabilizing controller, the lack at the early stages of reasonable model estimates, needed for (i) strategic exploration and (ii) the design of controllers that stabilize the system, results in regret that scales exponentially in the problem dimensions. We propose a framework for adaptive control that exploits the characteristics of linear dynamical systems and deploys additional exploration in the early stages of agent-environment interaction to guarantee that stabilizing controllers are designed sooner. We show that for the classes of controllable and stabilizable LQRs, where the latter is a generalization of prior work, these methods achieve O(√T) regret with a polynomial dependence on the problem dimensions.
https://authors.library.caltech.edu/records/1f8f5-1sv28

Robustifying Binary Classification to Adversarial Perturbation
https://resolver.caltech.edu/CaltechAUTHORS:20201109-152308798
Authors: Salehi, Fariborz; Hassibi, Babak
Year: 2020
DOI: 10.48550/arXiv.2010.15391
Despite the enormous success of machine learning models in various applications, most of these models lack resilience to (even small) perturbations in their input data. Hence, new methods to robustify machine learning models are essential. To this end, in this paper we consider the problem of binary classification with adversarial perturbations. Investigating the solution to a min-max optimization (which considers the worst-case loss in the presence of adversarial perturbations), we introduce a generalization of the max-margin classifier which takes into account the power of the adversary in manipulating the data. We refer to this classifier as the "Robust Max-margin" (RM) classifier. Under some mild assumptions on the loss function, we theoretically show that the gradient descent iterates (with sufficiently small step size) converge in direction to the RM classifier. Therefore, the RM classifier can be studied to compute various performance measures (e.g. generalization error) of binary classification with adversarial perturbations.
https://authors.library.caltech.edu/records/nwp8b-8r627

The Performance Analysis of Generalized Margin Maximizer (GMM) on Separable Data
https://resolver.caltech.edu/CaltechAUTHORS:20201109-155538204
Authors: Salehi, Fariborz; Abbasi, Ehsan; Hassibi, Babak
Year: 2020
DOI: 10.48550/arXiv.2010.15379
Logistic models are commonly used for binary classification tasks. The success of such models has often been attributed to their connection to maximum-likelihood estimators. It has been shown that the gradient descent algorithm, when applied to the logistic loss, converges to the max-margin classifier (a.k.a. hard-margin SVM). The performance of the max-margin classifier has been recently analyzed. Inspired by these results, in this paper, we present and study a more general setting, where the underlying parameters of the logistic model possess certain structures (sparse, block-sparse, low-rank, etc.) and introduce a more general framework (which is referred to as "Generalized Margin Maximizer", GMM). While classical max-margin classifiers minimize the ℓ₂-norm of the parameter vector subject to linearly separating the data, GMM minimizes any arbitrary convex function of the parameter vector. We provide a precise analysis of the performance of GMM via the solution of a system of nonlinear equations. We also provide a detailed study of three special cases: (1) ℓ₂-GMM, which is the max-margin classifier, (2) ℓ₁-GMM, which encourages sparsity, and (3) ℓ_∞-GMM, which is often used when the parameter vector has binary entries. Our theoretical results are validated by extensive simulation results across a range of parameter values, problem instances, and model structures.
https://authors.library.caltech.edu/records/ydr5t-x7j06

Regret-optimal control in dynamic environments
https://resolver.caltech.edu/CaltechAUTHORS:20201109-155541657
Authors: Goel, Gautam; Hassibi, Babak
Year: 2020
DOI: 10.48550/arXiv.2010.10473
We consider the control of linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which competes with the best dynamic sequence of control actions selected in hindsight, instead of the best controller in some specific class of controllers. This formulation is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We derive the structure of the regret-optimal online controller via a novel reduction to H_∞ control and present a clean data-dependent bound on its regret. We also present numerical simulations which confirm that our regret-optimal controller significantly outperforms the H₂ and H_∞ controllers in dynamic environments.
https://authors.library.caltech.edu/records/hcvyz-5em08

A Matrix Completion Approach to Linear Index Coding Problem
https://resolver.caltech.edu/CaltechAUTHORS:20210105-133410359
Authors: Esfahanizadeh, Homa; Lahouti, Farshad; Hassibi, Babak
Year: 2021
DOI: 10.48550/arXiv.1408.3046
In this paper, a general algorithm is proposed for rate analysis and code design of linear index coding problems. Specifically, a solution to the minimum-rank matrix completion problem over finite fields, which represents the linear index coding problem, is devised in order to find the optimal transmission rate given the vector length and the size of the field. The new approach can be applied to both scalar and vector linear index coding.
https://authors.library.caltech.edu/records/0edg7-b6d72

Regret-optimal Estimation and Control
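As background for the index-coding record above: minimum-rank matrix completion over a finite field rests on rank computation in that field. A sketch of the GF(2) primitive, with rows encoded as integer bitmasks (this is the building block, not the paper's completion algorithm):

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue                       # zero row: contributes no pivot
        rank += 1
        low = pivot & -pivot               # lowest set bit = pivot column
        # eliminate the pivot column from the remaining rows (XOR = GF(2) add)
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

print(gf2_rank([0b001, 0b010, 0b100]))  # identity: rank 3
print(gf2_rank([0b011, 0b101, 0b110]))  # third row = sum of first two: rank 2
```

A completion search would iterate over the free entries of the partially known matrix and keep the filling that minimizes this rank; the record above is about doing that efficiently.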
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210206675
Authors: Goel, Gautam; Hassibi, Babak
Year: 2021
DOI: 10.48550/arXiv.2106.12097
We consider estimation and control in linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal estimators and controllers which compete against a clairvoyant noncausal policy, instead of the best policy selected in hindsight from some fixed parametric class. We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form using operator-theoretic techniques from robust control and present tight, data-dependent bounds on the regret incurred by our algorithms in terms of the energy of the disturbances. Our results can be viewed as extending traditional robust estimation and control, which focuses on minimizing worst-case cost, to minimizing worst-case regret. We propose regret-optimal analogs of Model-Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics and present numerical experiments which show that our regret-optimal algorithms can significantly outperform standard approaches to estimation and control.
https://authors.library.caltech.edu/records/xvjtx-6b761

Explicit Regularization via Regularizer Mirror Descent
https://resolver.caltech.edu/CaltechAUTHORS:20220714-212452558
Authors: Azizan, Navid; Lale, Sahin; Hassibi, Babak
Year: 2022
DOI: 10.48550/arXiv.2202.10788
Despite perfectly interpolating the training data, deep neural networks (DNNs) can often generalize fairly well, in part due to the "implicit regularization" induced by the learning algorithm. Nonetheless, various forms of regularization, such as "explicit regularization" (via weight decay), are often used to avoid overfitting, especially when the data is corrupted. There are several challenges with explicit regularization, most notably unclear convergence properties. Inspired by convergence properties of stochastic mirror descent (SMD) algorithms, we propose a new method for training DNNs with regularization, called regularizer mirror descent (RMD). In highly overparameterized DNNs, SMD simultaneously interpolates the training data and minimizes a certain potential function of the weights. RMD starts with a standard cost, which is the sum of the training loss and a convex regularizer of the weights. Reinterpreting this cost as the potential of an "augmented" overparameterized network and applying SMD yields RMD. As a result, RMD inherits the properties of SMD and provably converges to a point "close" to the minimizer of this cost. RMD is computationally comparable to stochastic gradient descent (SGD) and weight decay, and is parallelizable in the same manner. Our experimental results on training sets with various levels of corruption suggest that the generalization performance of RMD is remarkably robust and significantly better than both SGD and weight decay, which implicitly and explicitly regularize the ℓ₂ norm of the weights. RMD can also be used to regularize the weights to a desired weight vector, which is particularly relevant for continual learning.
https://authors.library.caltech.edu/records/6y975-1rh51

Optimal Competitive-Ratio Control
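A minimal sketch of the mirror-descent step that RMD (the regularizer-mirror-descent record above) builds on, for a toy underdetermined linear model with the illustrative potential ψ(w) = ‖w‖₃³/3; plain gradient descent is recovered by taking the ℓ₂ potential:

```python
import numpy as np

# one linear equation, two unknowns: w1 + 2*w2 = 3 (underdetermined)
a = np.array([1.0, 2.0])
b = 3.0

q = 3.0
z = np.zeros(2)                       # mirror variable z = grad psi(w)
w = np.zeros(2)
eta = 0.01
for _ in range(5000):
    grad = (a @ w - b) * a            # gradient of the squared residual
    z -= eta * grad                   # descend in the mirror domain
    w = np.sign(z) * np.abs(z) ** (1 / (q - 1))   # invert grad psi

# The iterates interpolate the data while implicitly minimizing psi among
# interpolating solutions; the KKT condition grad psi(w) || a forces
# w2 = sqrt(2) * w1 here.
print(a @ w, w[1] / w[0])   # ~3.0 1.41421
```

Starting from z = 0, mirror descent on a linear model converges to the interpolating solution of minimal potential, which is the property RMD's augmented-network construction leverages.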
https://resolver.caltech.edu/CaltechAUTHORS:20220714-224600473
Authors: Sabag, Oron; Lale, Sahin; Hassibi, Babak
Year: 2022
DOI: 10.48550/arXiv.2206.01782
Inspired by competitive policy design approaches in online learning, new control paradigms such as competitive-ratio and regret-optimal control have recently been proposed as alternatives to the classical H₂ and H_∞ approaches. These competitive metrics compare the control cost of the designed controller against that of a clairvoyant controller, which has access to past, present, and future disturbances, in terms of a ratio and a difference, respectively. While prior work provided the optimal solution for the regret-optimal control problem, in competitive-ratio control the solution had only been provided for the sub-optimal problem. In this work, we derive the optimal solution to the competitive-ratio control problem. We show that the optimal competitive ratio formula can be computed as the maximal eigenvalue of a simple matrix, and provide a state-space controller that achieves the optimal competitive ratio. We conduct an extensive numerical study to verify this analytical solution, and demonstrate that the optimal competitive-ratio controller outperforms other controllers on several large-scale practical systems. The key techniques that underpin our explicit solution are a reduction of the control problem to a Nehari problem and a novel factorization of the clairvoyant controller's cost. We reveal an interesting relation between the explicit solutions that now exist for both competitive control paradigms by formulating a regret-optimal control framework with weight functions that can also be utilized for practical purposes.
https://authors.library.caltech.edu/records/gmgwb-xsd37

How to Query An Oracle? Efficient Strategies to Label Data
https://resolver.caltech.edu/CaltechAUTHORS:20220804-201317566
Authors: Lahouti, Farshad; Kostina, Victoria; Hassibi, Babak
Year: 2022
DOI: 10.48550/arXiv.2110.02341
We consider the basic problem of querying an expert oracle for labeling a dataset in machine learning. This is typically an expensive and time-consuming process, and we therefore seek ways to do so efficiently. The conventional approach involves comparing each sample with (the representative of) each class to find a match. In a setting with N equally likely classes, this involves N/2 pairwise comparisons (queries per sample) on average. We consider a k-ary query scheme with k ≥ 2 samples in a query that identifies (dis)similar items in the set while effectively exploiting the associated transitive relations. We present a randomized batch algorithm that operates on a round-by-round basis to label the samples and achieves a query rate of O(N/k²). In addition, we present an adaptive greedy query scheme, which achieves an average rate of ≈0.2N queries per sample with triplet queries. For the proposed algorithms, we investigate the query rate performance analytically and with simulations. Empirical studies suggest that each triplet query takes an expert at most 50% more time than a pairwise query, indicating the effectiveness of the proposed k-ary query schemes. We generalize the analyses to nonuniform class distributions when possible.
https://authors.library.caltech.edu/records/0ekyz-60y62

Reducing the LQG Cost with Minimal Communication
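The N/2 baseline quoted in the oracle-querying record above can be checked directly; a small sketch of the expected pairwise query count under sequential comparison against class representatives, with the last class inferred after N − 1 mismatches:

```python
def expected_pairwise_queries(n):
    """Average pairwise queries per sample with n equally likely classes:
    a match on class i (0-indexed) costs i + 1 queries, and the last class
    needs only n - 1, since it is inferred once every other class mismatches."""
    costs = [min(i + 1, n - 1) for i in range(n)]
    return sum(costs) / n

print(expected_pairwise_queries(100))  # 50.49, i.e. ~N/2
```

The closed form is (N − 1)(N + 2)/(2N) ≈ N/2, which is the per-sample rate the k-ary schemes above reduce to O(N/k²).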
https://resolver.caltech.edu/CaltechAUTHORS:20220804-201321456
Authors: Sabag, Oron; Tian, Peida; Kostina, Victoria; Hassibi, Babak
Year: 2022
DOI: 10.48550/arXiv.2109.12246
We study the linear quadratic Gaussian (LQG) control problem, in which the controller's observation of the system state is such that a desired cost is unattainable. To achieve the desired LQG cost, we introduce a communication link from the observer (encoder) to the controller. We investigate the optimal trade-off between the improved LQG cost and the consumed communication (information) resources, measured with the conditional directed information, across all encoding-decoding policies. The main result is a semidefinite programming formulation for that optimization problem in the finite-horizon scenario, which applies to time-varying linear dynamical systems. This result extends a seminal work by Tanaka et al., where the only information the controller knows about the system state arrives via a communication channel, to the scenario where the controller also has access to a noisy observation of the system state. As part of our derivation to show the optimality of an encoder that transmits a memoryless Gaussian measurement of the state, we show that the presence of the controller's observations at the encoder cannot reduce the minimal directed information. For time-invariant systems, where the optimal policy may be time-varying, we show in the infinite-horizon scenario that the optimal policy is time-invariant and can be computed explicitly from the solution of a finite-dimensional semidefinite program. The results are demonstrated via examples that show that even low-quality measurements can have a significant impact on the required communication resources.
https://authors.library.caltech.edu/records/6a9zd-tqs90

Stochastic Mirror Descent in Average Ensemble Models
https://resolver.caltech.edu/CaltechAUTHORS:20221222-234253993
Authors: Kargin, Taylan; Salehi, Fariborz; Hassibi, Babak
Year: 2022
DOI: 10.48550/arXiv.2210.15323
The stochastic mirror descent (SMD) algorithm is a general class of training algorithms, which includes the celebrated stochastic gradient descent (SGD) as a special case. It utilizes a mirror potential to influence the implicit bias of the training algorithm. In this paper we explore the performance of the SMD iterates on mean-field ensemble models. Our results generalize earlier ones obtained for SGD on such models. The evolution of the distribution of parameters is mapped to a continuous-time process in the space of probability distributions. Our main result gives a nonlinear partial differential equation to which the continuous-time process converges in the asymptotic regime of large networks. The impact of the mirror potential appears through a multiplicative term that is equal to the inverse of its Hessian and which can be interpreted as defining a gradient flow over an appropriately defined Riemannian manifold. We provide numerical simulations which allow us to study and characterize the effect of the mirror potential on the performance of networks trained with SMD for some binary classification problems.
https://authors.library.caltech.edu/records/xp0fs-v1t65

Feedback capacity of Gaussian channels with memory
https://resolver.caltech.edu/CaltechAUTHORS:20221222-234257392
Authors: Sabag, Oron; Kostina, Victoria; Hassibi, Babak
Year: 2022
DOI: 10.48550/arXiv.2207.10580
We consider the feedback capacity of a MIMO channel whose channel output is given by a linear state-space model driven by the channel inputs and a Gaussian process. The generality of our state-space model subsumes all previously studied models, such as additive channels with colored Gaussian noise and channels with an arbitrary dependence on previous channel inputs or outputs. The main result is a computable feedback capacity expression that is given as a convex optimization problem subject to a detectability condition. We demonstrate the capacity result on the auto-regressive Gaussian noise channel, where we show that even a single-time-instant delay in the feedback reduces the feedback capacity significantly in the stationary regime. On the other hand, for large regression parameters (in the non-stationary regime), the feedback capacity can be approached with delayed feedback. Finally, we show that the detectability condition is satisfied for scalar models and conjecture that it holds for MIMO models.
https://authors.library.caltech.edu/records/dfkyw-qq111

A construction of entropic vectors
https://resolver.caltech.edu/CaltechAUTHORS:20150209-072805888
Authors: Hassibi, Babak; Shadbakht, Sormeh
Year: 2023
The problem of determining the region of entropic vectors is a central one in information theory. Recently, there has been a great deal of interest in the development of non-Shannon information inequalities, which provide outer bounds to the aforementioned region; however, there has been less recent work on developing inner bounds. This paper develops an inner bound that applies to any number of random variables and is tight for 2 and 3 random variables (the only cases where the entropy region is known). The construction is based on probability distributions generated by a lattice. The region is shown to be a polytope generated by a set of linear inequalities. It can therefore be used to compute an inner bound on the information-theoretic capacity region for a wide class of network problems using linear programming.
https://authors.library.caltech.edu/records/9y8mk-f5k22
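As a small companion to the entropy-region record above, a sketch that computes the entropic vector of three random variables, the 7-dimensional object whose region the paper bounds (the dict-based pmf representation is illustrative):

```python
from itertools import combinations
from math import log2

def entropic_vector(pmf, n):
    """Joint entropy H(X_S) for every nonempty subset S of n variables,
    given a pmf as a dict {outcome tuple: probability}."""
    vec = {}
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            marg = {}                  # marginal distribution of X_S
            for outcome, p in pmf.items():
                key = tuple(outcome[i] for i in S)
                marg[key] = marg.get(key, 0.0) + p
            vec[S] = -sum(p * log2(p) for p in marg.values() if p > 0)
    return vec

# three i.i.d. fair bits: every subset of size r has entropy exactly r
uniform3 = {(a, b, c): 0.125 for a in (0, 1) for b in (0, 1) for c in (0, 1)}
v = entropic_vector(uniform3, 3)
print(v[(0,)], v[(0, 1)], v[(0, 1, 2)])  # 1.0 2.0 3.0
```

The lattice-generated distributions in the record above produce a family of such 7-vectors (for n = 3) whose convex hull is the polytope inner bound.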