Monograph records
https://feeds.library.caltech.edu/people/Owhadi-H/monograph.rss
A Caltech Library Repository Feed (RSS 2.0: http://www.rssboard.org/rss-specification; generator: python-feedgen; language: en; last build date: Tue, 16 Apr 2024 14:05:11 +0000)

Optimal Uncertainty Quantification
https://resolver.caltech.edu/CaltechAUTHORS:20111012-113158874
Authors: H. Owhadi (ORCID 0000-0002-5677-1600), C. Scovel (ORCID 0000-0001-7757-3411), T. J. Sullivan, M. McKerns, M. Ortiz (ORCID 0000-0001-5877-4824)
Year: 2011
DOI: 10.7907/TTW6-QD19
We propose a rigorous framework for Uncertainty Quantification (UQ) in which
the UQ objectives and the assumptions/information set are brought to the forefront.
This framework, which we call Optimal Uncertainty Quantification (OUQ), is based
on the observation that, given a set of assumptions and information about the problem,
there exist optimal bounds on uncertainties: these are obtained as extreme
values of well-defined optimization problems corresponding to extremizing probabilities
of failure, or of deviations, subject to the constraints imposed by the scenarios
compatible with the assumptions and information. In particular, this framework
does not implicitly impose inappropriate assumptions, nor does it repudiate relevant
information.
Although OUQ optimization problems are extremely large, we show that under
general conditions, they have finite-dimensional reductions. As an application,
we develop Optimal Concentration Inequalities (OCI) of Hoeffding and McDiarmid
type. Surprisingly, contrary to the classical sensitivity analysis paradigm, these results
show that uncertainties in input parameters do not necessarily propagate to
output uncertainties.
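To make "optimal bounds as extreme values of optimization problems" concrete, here is a toy sketch (our illustration, not the monograph's): if the only information about a random variable X with values in [0, 1] is its mean m, the sharpest upper bound on P(X ≥ t) is min(1, m/t), i.e. Markov's inequality is optimal over this information set, and the bound is attained by a two-point measure. A brute-force search over two-point measures (the function name and grid resolution are our own choices) recovers it.

```python
import itertools

def optimal_failure_bound(mean, t, grid=200):
    """Sup of P(X >= t) over two-point measures on [0, 1] with E[X] = mean.
    Toy illustration of the OUQ reduction: extremizers of such problems
    concentrate on finitely many support points."""
    best = 0.0
    pts = [i / grid for i in range(grid + 1)]
    for x1, x2 in itertools.combinations(pts, 2):
        if not (x1 <= mean <= x2):
            continue  # no admissible two-point measure supported on {x1, x2}
        p2 = (mean - x1) / (x2 - x1)  # weight on x2 that fixes the mean
        best = max(best, (1.0 - p2) * (x1 >= t) + p2 * (x2 >= t))
    return best

# Markov's bound m/t = 0.25/0.5 is attained by mass on {0, 0.5}
print(optimal_failure_bound(0.25, 0.5))  # -> 0.5
```

The same search over richer information sets (e.g. McDiarmid-type constraints) is what the reduction theorems mentioned above make tractable.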
In addition, a general algorithmic framework is developed for OUQ and is tested
on the Caltech surrogate model for hypervelocity impact, suggesting the feasibility
of the framework for important complex systems.
https://authors.library.caltech.edu/records/5j8b4-b5n05

Non-intrusive and structure preserving multiscale integration of stiff ODEs, SDEs and Hamiltonian systems with hidden slow dynamics via flow averaging
https://resolver.caltech.edu/CaltechAUTHORS:20111012-110532817
Authors: Molei Tao, Houman Owhadi (ORCID 0000-0002-5677-1600), Jerrold E. Marsden
Year: 2011
DOI: 10.7907/QZNP-SR14
We introduce a new class of integrators for stiff ODEs as well as SDEs. An
example of a subclass of systems that we treat are ODEs and SDEs that are sums of
two terms, one of which has large coefficients. These integrators are (i) Multiscale:
they are based on flow averaging and so do not resolve the fast variables but rather
employ step-sizes determined by slow variables. (ii) Versatile: the method is based on
averaging the flow of the given dynamical system (which may have hidden slow and
fast processes) instead of averaging the instantaneous drift of assumed separated
slow and fast processes. This bypasses the need for identifying explicitly (or numerically)
the slow or fast variables. (iii) Non-intrusive: a pre-existing numerical
scheme resolving the microscopic time scale can be used as a black box and turned
into one of the integrators in this paper by simply turning the large coefficients on
over a microscopic timescale and off during a mesoscopic timescale. (iv) Convergent
over two scales: strongly over slow processes and in the sense of measures over fast
ones. We introduce the related notion of two-scale flow convergence and analyze
the convergence of these integrators under the induced topology. (v) Structure preserving: For stiff Hamiltonian systems (possibly on manifolds), they are symplectic,
time-reversible, and symmetric (under the group action leaving the Hamiltonian invariant)
in all variables. They are explicit and apply to arbitrary stiff potentials
(that need not be quadratic). Their application to the Fermi-Pasta-Ulam problems
shows accuracy and stability over 4 orders of magnitude of time scales. For
stiff Langevin equations, they are symmetric (under a group action), time-reversible
and Boltzmann-Gibbs reversible, quasi-symplectic on all variables and conformally
symplectic with isotropic friction.
https://authors.library.caltech.edu/records/vgmnz-ees44

Localized bases for finite dimensional homogenization approximations with non-separated scales and high contrast
https://resolver.caltech.edu/CaltechAUTHORS:20111012-113719601
Authors: Houman Owhadi (ORCID 0000-0002-5677-1600), Lei Zhang (ORCID 0000-0001-9031-4318)
Year: 2011
We construct finite-dimensional approximations of solution spaces of divergence
form operators with L^∞-coefficients. Our method does not rely on concepts of
ergodicity or scale separation, but on the property that the solution space of
these operators is compactly embedded in H^1 if source terms are in the unit ball
of L^2 instead of the unit ball of H^−1. Approximation spaces are generated by
solving elliptic PDEs on localized sub-domains with source terms corresponding to
approximation bases for H^2. The H^1-error estimates show that O(h^−d)-dimensional
spaces with basis elements localized to sub-domains of diameter O(h^α ln 1/h) (with α ∈ [1/2, 1)) result in an O(h^(2−2α)) accuracy for elliptic, parabolic and hyperbolic problems.
For high-contrast media, the accuracy of the method is preserved provided that
localized sub-domains contain buffer zones of width O(h^α ln 1/h ) where the contrast
of the medium remains bounded. The proposed method can naturally be generalized
to vectorial equations (such as elasto-dynamics).
https://authors.library.caltech.edu/records/2nmzy-qwc92

Flux Norm Approach to Homogenization Problems with non-separated Scales
https://resolver.caltech.edu/CaltechAUTHORS:20111012-105135181
Authors: Leonid Berlyand, Houman Owhadi (ORCID 0000-0002-5677-1600)
Year: 2011
DOI: 10.7907/T5DC-SN48
We consider linear divergence-form scalar elliptic equations and vectorial equations for elasticity with rough (L^∞(Ω), Ω ⊂ ℝ^d ) coefficients a(x) that, in particular,
model media with non-separated scales and high contrast in material properties.
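For orientation, a classical one-dimensional fact that this line of work generalizes (our added illustration, not a result of this record): in 1D the homogenized coefficient of −(a u′)′ = f for a rapidly oscillating coefficient a is the harmonic mean of a. A small finite-difference experiment, with scheme and parameters our own, confirms it.

```python
import numpy as np

def solve_1d(a_cell, f=1.0):
    """Solve -(a u')' = f on (0,1), u(0)=u(1)=0, with a flux-form
    finite-difference scheme; a_cell[i] is the coefficient on cell i."""
    n = len(a_cell)            # number of cells; n-1 interior nodes
    h = 1.0 / n
    A = np.zeros((n - 1, n - 1))
    for i in range(n - 1):     # node i+1 sits between cells i and i+1
        A[i, i] = a_cell[i] + a_cell[i + 1]
        if i > 0:
            A[i, i - 1] = -a_cell[i]
        if i < n - 2:
            A[i, i + 1] = -a_cell[i + 1]
    return np.linalg.solve(A / h**2, np.full(n - 1, f))

n = 400                        # coefficient oscillates with period 2 cells
a1, a2 = 1.0, 4.0
a_osc = np.array([a1 if i % 2 == 0 else a2 for i in range(n)])
a_eff = 2.0 / (1.0 / a1 + 1.0 / a2)   # harmonic mean = 1.6
u_osc = solve_1d(a_osc)
u_hom = solve_1d(np.full(n, a_eff))
mid = n // 2 - 1                       # index of the node at x = 0.5
print(abs(u_osc[mid] - u_hom[mid]) / u_hom[mid])  # small: O(period)
```

With 400 cells the oscillatory solution matches the harmonic-mean solution at the midpoint to within a few percent, and the agreement improves as the oscillation period shrinks; the flux-norm approach above targets exactly the regimes (no periodicity, no scale separation) where this classical picture breaks down.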
While the homogenization of PDEs with periodic or ergodic coefficients and well
separated scales is now well understood, we consider here the most general case
of arbitrary bounded coefficients. For such problems we introduce explicit finite
dimensional approximations of solutions with controlled error estimates, which we
refer to as homogenization approximations. In particular, this approach allows one
to analyze a given medium directly without introducing the mathematical concept
of an ε-family of media as in classical periodic homogenization. We define the flux
norm as the L^2 norm of the potential part of the fluxes of solutions, which is equivalent to the usual H^1-norm. We show that in the flux norm, the error associated with
approximating, in a properly defined finite-dimensional space, the set of solutions
of the aforementioned PDEs with rough coefficients is equal to the error associated
with approximating the set of solutions of the same type of PDEs with smooth coefficients in a standard space (e.g., piecewise polynomial). We refer to this property
as the transfer property. A simple application of this property is the construction
of finite dimensional approximation spaces with errors independent of the regularity
and contrast of the coefficients and with optimal and explicit convergence rates.
This transfer property also provides an alternative to the global harmonic change
of coordinates for the homogenization of elliptic operators that can be extended to
elasticity equations. The proofs of these homogenization results are based on a new
class of elliptic inequalities which play the same role in our approach as the div-curl
lemma in classical homogenization.
https://authors.library.caltech.edu/records/ewe01-61q87

Discrete Geometric Structures in Homogenization and Inverse Homogenization with Application to EIT
https://resolver.caltech.edu/CaltechAUTHORS:20111011-163848887
Authors: Mathieu Desbrun (ORCID 0000-0003-3424-6079), Roger D. Donaldson, Houman Owhadi (ORCID 0000-0002-5677-1600)
Year: 2011
DOI: 10.7907/XR8W-EA85
We introduce a new geometric approach for the homogenization and
inverse homogenization of the divergence form elliptic operator with rough
conductivity coefficients σ(x) in dimension two. We show that conductivity coefficients are in one-to-one correspondence with divergence-free
matrices and convex functions s(x) over the domain Ω. Although homogenization is a non-linear and non-injective operator when applied directly
to conductivity coefficients, homogenization becomes a linear interpolation operator over triangulations of
Ω when re-expressed using convex
functions, and is a volume averaging operator when re-expressed with
divergence-free matrices. We explicitly give the transformations which
map conductivity coefficients into divergence-free matrices and convex
functions, as well as their respective inverses. Using optimal weighted Delaunay triangulations for linearly interpolating convex functions, we apply
this geometric framework to obtain an optimally robust homogenization
algorithm for arbitrary rough coefficients, extending the global optimality of Delaunay triangulations with respect to a discrete Dirichlet energy
to weighted Delaunay triangulations. Next, we consider inverse homogenization, that is, the recovery of the microstructure from macroscopic
information, a problem which is known to be both non-linear and severely
ill-posed. We show how to decompose this reconstruction into a linear ill-posed problem and a well-posed non-linear problem. We apply this new
geometric approach to Electrical Impedance Tomography (EIT) in dimension two. It is known that the EIT problem admits at most one isotropic
solution. If an isotropic solution exists, we show how to compute it from
any conductivity having the same boundary
Dirichlet-to-Neumann map.
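A discrete analogue may help fix ideas (our illustration, not the paper's algorithm): for a resistor network, the Dirichlet-to-Neumann map of the weighted graph Laplacian is its Schur complement onto the boundary nodes, and EIT-type inverse problems ask what this boundary map determines about the interior conductances.

```python
import numpy as np

def dtn_map(L, boundary):
    """Dirichlet-to-Neumann map of a weighted graph Laplacian L:
    the Schur complement of L onto the boundary index set."""
    n = L.shape[0]
    b = np.array(boundary)
    i = np.array([k for k in range(n) if k not in boundary])
    Lbb, Lbi = L[np.ix_(b, b)], L[np.ix_(b, i)]
    Lib, Lii = L[np.ix_(i, b)], L[np.ix_(i, i)]
    return Lbb - Lbi @ np.linalg.solve(Lii, Lib)

def laplacian(edges, n):
    """Weighted graph Laplacian from (node, node, conductance) triples."""
    L = np.zeros((n, n))
    for u, v, c in edges:
        L[u, u] += c; L[v, v] += c
        L[u, v] -= c; L[v, u] -= c
    return L

# Star network: boundary nodes 0-3 joined to interior node 4.
# Eliminating node 4 gives Lam = diag(c) - c c^T / sum(c).
edges = [(0, 4, 1.0), (1, 4, 2.0), (2, 4, 3.0), (3, 4, 4.0)]
Lam = dtn_map(laplacian(edges, 5), [0, 1, 2, 3])
```

The resulting map is symmetric with zero row sums (constants are harmonic), the discrete counterparts of the structural properties the continuum DtN map enjoys.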
This is of practical importance since the EIT problem always admits a
unique solution in the space of divergence-free matrices and is stable with
respect to G-convergence in that space (this property fails for isotropic
matrices). As such, we suggest that the space of convex functions is the
natural space to use to parameterize solutions of the EIT problem.
https://authors.library.caltech.edu/records/36dys-2ya70

The optimal uncertainty algorithm in the mystic framework
https://resolver.caltech.edu/CaltechAUTHORS:20160224-080348129
Authors: M. McKerns, H. Owhadi (ORCID 0000-0002-5677-1600), C. Scovel (ORCID 0000-0001-7757-3411), T. J. Sullivan, M. Ortiz (ORCID 0000-0001-5877-4824)
Year: 2016
DOI: 10.48550/arXiv.1202.1055
We have recently proposed a rigorous framework for Uncertainty Quantification (UQ) in which UQ objectives and the assumption/information set are brought to
the forefront, providing a framework for the communication and comparison of UQ
results. In particular, this framework does not implicitly impose inappropriate assumptions nor does it repudiate relevant information.
This framework, which we call Optimal Uncertainty Quantification (OUQ), is
based on the observation that given a set of assumptions and information, there
exist bounds on uncertainties obtained as values of optimization problems and that
these bounds are optimal. It provides a uniform environment for the optimal solution of the problems of validation, certification, experimental design, reduced order
modeling, prediction, and extrapolation, all under aleatoric and epistemic uncertainties.
OUQ optimization problems are extremely large, and even though under general
conditions they have finite-dimensional reductions, they must often be solved numerically. This general algorithmic framework for OUQ has been implemented in the
mystic optimization framework. We describe this implementation, and demonstrate
its use in the context of the Caltech surrogate model for hypervelocity impact.
https://authors.library.caltech.edu/records/qffgv-kme54

Temperature and Friction Accelerated Sampling of Boltzmann-Gibbs Distribution
https://resolver.caltech.edu/CaltechAUTHORS:20160224-090906859
Authors: Molei Tao, Houman Owhadi (ORCID 0000-0002-5677-1600), Jerrold E. Marsden
Year: 2016
DOI: 10.48550/arXiv.1007.0995
This paper is concerned with tuning friction and temperature in Langevin dynamics for fast sampling from the canonical ensemble. We show that near-optimal acceleration is achieved by choosing friction so that the local quadratic approximation of the Hamiltonian is a critically damped oscillator. The system is also over-heated and cooled down to its final temperature. The performance of different cooling schedules is analyzed as a function of total simulation time.
https://authors.library.caltech.edu/records/js4rg-7dp08

Structure preserving Stochastic Impulse Methods for stiff Langevin systems with a uniform global error of order 1 or 1/2 on position
https://resolver.caltech.edu/CaltechAUTHORS:20160224-085934570
Authors: Molei Tao, Houman Owhadi (ORCID 0000-0002-5677-1600), Jerrold E. Marsden
Year: 2016
DOI: 10.48550/arXiv.1006.4657
Impulse methods are generalized to a family of integrators for Langevin systems with quadratic stiff potentials and arbitrary soft potentials. Uniform error bounds (independent of the stiff parameters) are obtained on integrated positions, allowing for coarse integration steps. The resulting integrators are explicit and structure preserving (quasi-symplectic for Langevin systems).
https://authors.library.caltech.edu/records/0qces-vpr96

Stochastic Variational Partitioned Runge-Kutta Integrators for Constrained Systems
https://resolver.caltech.edu/CaltechAUTHORS:20160224-100431574
Authors: Nawaf Bou-Rabee, Houman Owhadi (ORCID 0000-0002-5677-1600)
Year: 2016
DOI: 10.48550/arXiv.0709.2222
Stochastic variational integrators for constrained, stochastic mechanical systems are developed in this paper. The main results of the paper are twofold: an equivalence is established between a stochastic Hamilton-Pontryagin (HP) principle in generalized coordinates and constrained coordinates via Lagrange multipliers, and variational partitioned Runge-Kutta (VPRK) integrators are extended to this class of systems. Among these integrators are first- and second-order strongly convergent RATTLE-type integrators. We prove strong order of accuracy of the methods provided. The paper also reviews the deterministic treatment of VPRK integrators from the HP viewpoint.
https://authors.library.caltech.edu/records/je0wb-97q10

Metric based up-scaling
https://resolver.caltech.edu/CaltechAUTHORS:20160224-094627612
Authors: Houman Owhadi (ORCID 0000-0002-5677-1600), Lei Zhang (ORCID 0000-0001-9031-4318)
Year: 2016
DOI: 10.48550/arXiv.0505223
We consider divergence form elliptic operators in dimension n ≥ 2 with L^∞ coefficients. Although solutions of these operators are only Hölder continuous, we show that they are differentiable (C^(1,α)) with respect to harmonic coordinates. It follows that numerical homogenization can be extended to situations where the medium has no ergodicity at small scales and is characterized by a continuum of scales, by transferring a new metric (in addition to traditional averaged, homogenized quantities) from subgrid scales into computational scales; error bounds can be given. This numerical homogenization method can also be used as a compression tool for differential operators.
https://authors.library.caltech.edu/records/v5dr0-dv971

Ergodicity of Langevin Processes with Degenerate Diffusion in Momentums
https://resolver.caltech.edu/CaltechAUTHORS:20160224-103320707
Authors: Nawaf Bou-Rabee, Houman Owhadi (ORCID 0000-0002-5677-1600)
Year: 2016
DOI: 10.48550/arXiv.0710.4259
This paper introduces a geometric method for proving ergodicity of degenerate noise-driven stochastic processes. The driving noise is assumed to be an arbitrary Lévy process with non-degenerate diffusion component (but that may be applied to a single degree of freedom of the system). The geometric conditions are the approximate controllability of the process and the fact that there exists a point in the phase space where the interior of the image of a point via a secondarily randomized version of the driving noise is non-void. The paper applies the method to prove ergodicity of a sliding disk governed by Langevin-type equations (a simple stochastic rigid body system). The paper shows that a key feature of this Langevin process is that even though the diffusion and drift matrices associated to the momentums are degenerate, the system is still at uniform temperature.
https://authors.library.caltech.edu/records/ska9y-fwf53

Equivalence of concentration inequalities for linear and non-linear functions
https://resolver.caltech.edu/CaltechAUTHORS:20160224-082333411
Authors: T. J. Sullivan, H. Owhadi (ORCID 0000-0002-5677-1600)
Year: 2016
DOI: 10.48550/arXiv.1009.4913
We consider a random variable X that takes values in a (possibly infinite-dimensional) topological vector space X. We show that, with respect to an appropriate "normal distance" on X, concentration inequalities for linear and non-linear functions of X are equivalent. This normal distance corresponds naturally to the concentration rate in classical concentration results such as Gaussian concentration and concentration on the Euclidean and Hamming cubes. Under suitable assumptions on the roundness of the sets of interest, the concentration inequalities so obtained are asymptotically optimal in the high-dimensional limit.
https://authors.library.caltech.edu/records/vbh75-mwe93

Conditioning Gaussian measure on Hilbert space
https://resolver.caltech.edu/CaltechAUTHORS:20160224-065740350
Authors: Houman Owhadi (ORCID 0000-0002-5677-1600), Clint Scovel (ORCID 0000-0001-7757-3411)
Year: 2016
DOI: 10.48550/arXiv.1506.04208
For a Gaussian measure on a separable Hilbert space with covariance operator C, we show that the family of conditional measures associated with conditioning on a closed subspace S^⊥ is Gaussian with covariance operator the short S(C) of the operator C to S. We provide two proofs. The first uses the theory of Gaussian Hilbert spaces and a characterization of the shorted operator by Anderson and Trapp. The second uses recent developments by Corach, Maestripieri and Stojanoff on the relationship between the shorted operator and C-symmetric oblique projections onto S^⊥. To obtain the assertion when such projections do not exist, we develop an approximation result for the shorted operator by showing, for any positive operator A, how to construct a sequence of approximating operators A^n which possess A^n-symmetric oblique projections onto S^⊥ such that the sequence of shorted operators S(A^n) converges to S(A) in the weak operator topology. This result, combined with the martingale convergence of random variables associated with the corresponding approximations C^n, establishes the main assertion in general. Moreover, it in turn strengthens the approximation theorem for the shorted operator when the operator is trace class; then the sequence of shorted operators S(A^n) converges to S(A) in trace norm.
https://authors.library.caltech.edu/records/r4cp9-5jy65

Brittleness of Bayesian inference and new Selberg formulas
https://resolver.caltech.edu/CaltechAUTHORS:20160224-073833523
Authors: Houman Owhadi (ORCID 0000-0002-5677-1600), Clint Scovel (ORCID 0000-0001-7757-3411)
Year: 2016
DOI: 10.48550/arXiv.1304.7046
The incorporation of priors in the Optimal Uncertainty Quantification (OUQ) framework reveals brittleness in Bayesian inference; a model may share an arbitrarily
large number of finite-dimensional marginals with, or be arbitrarily close (in Prokhorov or total variation metrics) to, the data-generating distribution and still make the largest possible prediction error after conditioning on an arbitrarily large number of samples. The initial purpose of this paper is to unwrap this brittleness mechanism by providing (i) a quantitative version of the Brittleness Theorem of and (ii) a detailed and comprehensive analysis of its application to the revealing example of estimating the mean of a random variable on the unit interval [0, 1] using priors that exactly capture the distribution of an arbitrarily large number of Hausdorff moments. However, in doing so, we discovered that the free parameter associated with Markov and Kreĭn's canonical representations of truncated Hausdorff moments generates reproducing
kernel identities corresponding to reproducing kernel Hilbert spaces of polynomials. Furthermore, these reproducing identities lead to biorthogonal systems of Selberg integral formulas.
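The Hausdorff moment space can be seen concretely in the smallest nontrivial case (our example, not the paper's computation): for probability measures on [0, 1], the admissible pairs of first two moments (m1, m2) form the region m1² ≤ m2 ≤ m1, whose area is 1/6; the Karlin-Shapley volume computation cited below generalizes this to n moments.

```python
from fractions import Fraction
import random

# Exact area of the order-2 Hausdorff moment space
# {(m1, m2) : m1**2 <= m2 <= m1} for measures on [0, 1]
area = Fraction(1, 2) - Fraction(1, 3)  # integral of (m1 - m1**2) dm1
print(area)  # -> 1/6

# Moments of random two-point measures always land in this region
# (and two-point measures already fill it).
random.seed(0)
for _ in range(1000):
    x1, x2, p = random.random(), random.random(), random.random()
    m1 = p * x1 + (1 - p) * x2
    m2 = p * x1 ** 2 + (1 - p) * x2 ** 2
    assert m1 * m1 - 1e-12 <= m2 <= m1 + 1e-12
```

The lower boundary m2 = m1² comes from Jensen's inequality; the upper boundary m2 = m1 from x² ≤ x on [0, 1].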
This process of discovery appears to be generic: whereas Karlin and Shapley used Selberg's integral formula to first compute the volume of the Hausdorff moment space
(the polytope defined by the first n moments of a probability measure on the interval [0, 1]), we observe that the computation of that volume along with higher order moments of the uniform measure on the moment space, using different finite-dimensional representations of subsets of the infinite-dimensional set of probability measures on [0, 1] representing the first n moments, leads to families of equalities corresponding to classical and new Selberg identities.
https://authors.library.caltech.edu/records/bgn30-19p88

Ballistic Transport at Uniform Temperature
https://resolver.caltech.edu/CaltechAUTHORS:20160224-101438724
Authors: Nawaf Bou-Rabee, Houman Owhadi (ORCID 0000-0002-5677-1600)
Year: 2016
DOI: 10.48550/arXiv.0710.1565
A paradigm for isothermal, mechanical rectification of stochastic fluctuations is introduced in this paper. The central idea is to transform energy injected by random
perturbations into rigid-body rotational kinetic energy. The prototype considered in this paper is a mechanical system consisting of a set of rigid bodies in interaction
through magnetic fields. The system is stochastically forced by white noise and dissipative through mechanical friction. The Gibbs-Boltzmann distribution at a specific
temperature defines the unique invariant measure under the
flow of this stochastic process and allows us to define "the temperature" of the system. This measure is also
ergodic and strongly mixing. Although the system does not exhibit global directed motion, it is shown that global ballistic motion is possible (the mean-squared displacement
grows like t^2). More precisely, although work cannot be extracted from thermal energy by the second law of thermodynamics, it is shown that ballistic transport from thermal energy is possible. In particular, the dynamics is characterized by a meta-stable state in which the system exhibits directed motion over random time scales. This phenomenon is caused by the interaction of three attributes of the system: a non-flat (yet bounded) potential energy landscape, a rigid body effect (coupling translational momentum and angular momentum through friction) and the degeneracy of the noise/friction tensor on the momentums (the fact that noise is not applied to all degrees of freedom).
https://authors.library.caltech.edu/records/0171g-e5p13

Universal Scalable Robust Solvers from Computational Information Games and fast eigenspace adapted Multiresolution Analysis
https://resolver.caltech.edu/CaltechAUTHORS:20170710-085210757
Authors: Houman Owhadi (ORCID 0000-0002-5677-1600), Clint Scovel (ORCID 0000-0001-7757-3411)
Year: 2017
DOI: 10.48550/arXiv.1703.10761
We show how the discovery of robust scalable numerical solvers for arbitrary bounded linear operators can be automated as a Game Theory problem by reformulating the process of computing with partial information and limited resources as that of playing underlying hierarchies of adversarial information games. When the solution space is a Banach space B endowed with a quadratic norm ∥⋅∥, the optimal measure (mixed strategy) for such games (e.g. the adversarial recovery of u ∈ B, given partial measurements [ϕ_i, u] with ϕ_i ∈ B^∗, using relative error in ∥⋅∥-norm as a loss) is a centered Gaussian field ξ solely determined by the norm ∥⋅∥, whose conditioning (on measurements) produces optimal bets. When measurements are hierarchical, the process of conditioning this Gaussian field produces a hierarchy of elementary bets (gamblets). These gamblets generalize the notion of Wavelets and Wannier functions in the sense that they are adapted to the norm ∥⋅∥ and induce a multi-resolution decomposition of B that is adapted to the eigensubspaces of the operator defining the norm ∥⋅∥. When the operator is localized, we show that the resulting gamblets are localized both in space and frequency and introduce the Fast Gamblet Transform (FGT) with rigorous accuracy and (near-linear) complexity estimates. As the FFT can be used to solve and diagonalize arbitrary PDEs with constant coefficients, the FGT can be used to decompose a wide range of continuous linear operators (including arbitrary continuous linear bijections from H^s_0 to H^(−s) or to L^2) into a sequence of independent linear systems with uniformly bounded condition numbers and leads to O(N polylog N) solvers and eigenspace adapted Multiresolution Analysis (resulting in near-linear complexity approximation of all eigensubspaces).
https://authors.library.caltech.edu/records/hkp94-1wn28

On testing the simulation theory
https://resolver.caltech.edu/CaltechAUTHORS:20190109-112815344
Authors: Tom Campbell, Houman Owhadi (ORCID 0000-0002-5677-1600), Joe Sauvageauz, David Watkinson
Year: 2019
DOI: 10.48550/arXiv.1703.00058
Can the theory that reality is a simulation be tested? We investigate this question based on the assumption that if the system performing the simulation is finite (i.e. has limited resources), then to achieve low computational complexity, such a system would, as in a video game, render content (reality) only at the moment that information becomes available for observation by a player and not at the moment of detection by a machine (that would be part of the simulation and whose detection would also be part of the internal computation performed by the Virtual Reality server before rendering content to the player). Guided by this principle we describe conceptual wave/particle duality experiments aimed at testing the simulation theory.
https://authors.library.caltech.edu/records/91xq4-c7r09

Kernel Mode Decomposition and programmable/interpretable regression networks
https://resolver.caltech.edu/CaltechAUTHORS:20190923-153747161
Authors: Houman Owhadi (ORCID 0000-0002-5677-1600), Clint Scovel (ORCID 0000-0001-7757-3411), Gene Ryan Yoo (ORCID 0000-0002-5319-5599)
Year: 2019
DOI: 10.48550/arXiv.1907.08592
Mode decomposition is a prototypical pattern recognition problem that can be addressed from the (a priori distinct) perspectives of numerical approximation, statistical inference and deep learning. Could its analysis through these combined perspectives be used as a Rosetta stone for deciphering mechanisms at play in deep learning? Motivated by this question we introduce programmable and interpretable regression networks for pattern recognition and address mode decomposition as a prototypical problem. The programming of these networks is achieved by assembling elementary modules decomposing and recomposing kernels and data. These elementary steps are repeated across levels of abstraction and interpreted from the equivalent perspectives of optimal recovery, game theory and Gaussian process regression (GPR). The prototypical mode/kernel decomposition module produces an optimal approximation (w₁, w₂, …, w_m) of an element (v₁, v₂, …, v_m) of a product of Hilbert subspaces of a common Hilbert space from the observation of the sum v := v₁ + ⋯ + v_m. The prototypical mode/kernel recomposition module performs partial sums of the recovered modes w_i based on the alignment between each recovered mode w_i and the data v. We illustrate the proposed framework by programming regression networks approximating the modes v_i = a_i(t)y_i(θ_i(t)) of a (possibly noisy) signal ∑_i v_i when the amplitudes a_i, instantaneous phases θ_i and periodic waveforms y_i may all be unknown, and show near machine precision recovery under regularity and separation assumptions on the instantaneous amplitudes a_i and frequencies θ_i. The structure of some of these networks shares intriguing similarities with convolutional neural networks while being interpretable, programmable and amenable to theoretical analysis.
https://authors.library.caltech.edu/records/rsq9f-6w197

Competitive Mirror Descent
https://resolver.caltech.edu/CaltechAUTHORS:20201106-120218966
Authors: Florian Schäfer, Anima Anandkumar, Houman Owhadi (ORCID 0000-0002-5677-1600)
Year: 2020
DOI: 10.48550/arXiv.2006.10179
Constrained competitive optimization involves multiple agents trying to minimize conflicting objectives, subject to constraints. This is a highly expressive modeling language that subsumes most of modern machine learning. In this work we propose competitive mirror descent (CMD): a general method for solving such problems based on first order information that can be obtained by automatic differentiation. First, by adding Lagrange multipliers, we obtain a simplified constraint set with an associated Bregman potential. At each iteration, we then solve for the Nash equilibrium of a regularized bilinear approximation of the full problem to obtain a direction of movement of the agents. Finally, we obtain the next iterate by following this direction according to the dual geometry induced by the Bregman potential. By using the dual geometry we obtain feasible iterates despite only solving a linear system at each iteration, eliminating the need for projection steps while still accounting for the global nonlinear structure of the constraint set. As a special case we obtain a novel competitive multiplicative weights algorithm for problems on the positive cone.
https://authors.library.caltech.edu/records/tpvre-65819

Learning dynamical systems from data: a simple cross-validation perspective
https://resolver.caltech.edu/CaltechAUTHORS:20201109-155527819
Authors: {'items': [{'id': 'Hamzi-Boumediene', 'name': {'family': 'Hamzi', 'given': 'Boumediene'}}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2020
DOI: 10.48550/arXiv.2007.05074
Regressing the vector field of a dynamical system from a finite number of observed states is a natural way to learn surrogate models for such systems. We present variants of cross-validation (Kernel Flows [31] and variations based on Maximum Mean Discrepancy and Lyapunov exponents) as simple approaches for learning the kernel used in these emulators.
https://authors.library.caltech.edu/records/sshn9-4d642

Do ideas have shape? Plato's theory of forms as the continuous limit of artificial neural networks
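The cross-validation premise behind Kernel Flows — a kernel is good if interpolating with half the data loses little — can be sketched as a grid search over an RBF length scale. The paper uses gradient-based optimization instead, and the sine samples below are a hypothetical stand-in for observed vector-field values.

```python
import numpy as np

# Sketch of the Kernel Flows criterion under illustrative assumptions:
# rho(kernel) = 1 - |interpolant of half the data|^2 / |interpolant of
# all the data|^2 (squared RKHS norms); a good kernel makes rho small.
def rbf(a, b, ell):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * ell**2))

def kf_loss(x, y, ell, jitter=1e-8):
    def rkhs_norm_sq(xs, ys):
        K = rbf(xs, xs, ell) + jitter * np.eye(xs.size)
        return ys @ np.linalg.solve(K, ys)
    half = slice(0, x.size, 2)                    # keep every other sample
    return 1.0 - rkhs_norm_sq(x[half], y[half]) / rkhs_norm_sq(x, y)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 60))
y = np.sin(x)                                     # stand-in vector-field samples
losses = {ell: kf_loss(x, y, ell) for ell in (0.05, 0.2, 0.5, 1.0)}
```

Length scales too short to generalize from half the data incur a large loss; smoother, well-matched kernels incur almost none.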
https://resolver.caltech.edu/CaltechAUTHORS:20201109-155524397
Authors: {'items': [{'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2020
DOI: 10.48550/arXiv.2008.03920
We show that ResNets converge, in the infinite depth limit, to a generalization of image registration algorithms. In this generalization, images are replaced by abstractions (ideas) living in high-dimensional reproducing kernel Hilbert spaces (RKHS), and material points are replaced by data points. Whereas computational anatomy aligns images via deformations of the material space, this generalization aligns ideas via transformations of their RKHS. This identification of ResNets as idea registration algorithms has several remarkable consequences. The search for good architectures can be reduced to that of good kernels, and we show that the composition of idea registration blocks with reduced equivariant multi-channel kernels (introduced here) recovers and generalizes CNNs to arbitrary spaces and groups of transformations. Minimizers of L2-regularized ResNets satisfy a discrete least action principle implying the near preservation of the norm of weights and biases across layers. The parameters of trained ResNets can be identified as solutions of an autonomous Hamiltonian system defined by the activation function and the architecture of the ANN. Momenta variables provide a sparse representation of the parameters of a ResNet. The registration regularization strategy provides a provably robust alternative to Dropout for ANNs. Pointwise RKHS error estimates lead to deterministic error estimates for ANNs.
https://authors.library.caltech.edu/records/5mydb-afj74

Decision Theoretic Bootstrapping
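The infinite-depth limit in the "Do ideas have shape?" record can be illustrated in one dimension: a ResNet whose residual blocks are scaled by 1/depth is a forward-Euler discretization of an ODE, so its output stabilizes as depth grows. The tanh vector field here is a hypothetical stand-in for a trained block.

```python
import numpy as np

# Depth-limit illustration: x_{l+1} = x_l + f(x_l)/L is forward Euler for
# x'(t) = f(x(t)) on [0, 1], so a deep ResNet approximates the ODE flow
# map and its output converges as the number of layers L grows.
def resnet(x, depth):
    for _ in range(depth):
        x = x + np.tanh(x) / depth   # residual block with 1/depth scaling
    return x

shallow, deep = resnet(0.5, 10), resnet(0.5, 10000)
```

The gap between successive depths shrinks like 1/L, the usual Euler discretization error.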
https://resolver.caltech.edu/CaltechAUTHORS:20210323-130821498
Authors: {'items': [{'id': 'Tavallali-Peyman', 'name': {'family': 'Tavallali', 'given': 'Peyman'}, 'orcid': '0000-0001-7166-5489'}, {'id': 'Hamze-Bajgiran-Hamed', 'name': {'family': 'Hamze Bajgiran', 'given': 'Hamed'}}, {'id': 'Esaid-Danial-J', 'name': {'family': 'Esaid', 'given': 'Danial J.'}}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2021
DOI: 10.48550/arXiv.2103.09982
The design and testing of supervised machine learning models combine two fundamental distributions: (1) the training data distribution and (2) the testing data distribution. Although these two distributions are identical and identifiable when the data set is infinite, they are imperfectly known (and possibly distinct) when the data is finite (and possibly corrupted), and this uncertainty must be taken into account for robust Uncertainty Quantification (UQ). We present a general decision-theoretic bootstrapping solution to this problem: (1) partition the available data into a training subset and a UQ subset; (2) take m subsampled subsets of the training set and train m models; (3) partition the UQ set into n sorted subsets and take a random fraction of them to define n corresponding empirical distributions μ_j; (4) consider the adversarial game where Player I selects a model i∈{1,…,m}, Player II selects the UQ distribution μ_j, and Player I receives a loss defined by evaluating the model i against data points sampled from μ_j; (5) identify optimal mixed strategies (probability distributions over models and UQ distributions) for both players. These randomized optimal mixed strategies provide optimal model mixtures and UQ estimates given the adversarial uncertainty of the training and testing distributions represented by the game. The proposed approach provides (1) some degree of robustness to distributional shift in both the distribution of training data and that of the testing data, and (2) conditional probability distributions on the output space forming aleatory representations of the uncertainty on the output as a function of the input variable.
https://authors.library.caltech.edu/records/1b23n-q5002

Uncertainty Quantification of the 4th kind; optimal posterior accuracy-uncertainty tradeoff with the minimum enclosing ball
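Step (5) of the bootstrapping procedure is a finite zero-sum game. On a hypothetical 2×2 loss matrix, the optimal mixed strategies can be approximated by multiplicative-weights self-play, whose averaged iterates converge to the Nash equilibrium.

```python
import numpy as np

# Toy zero-sum game: L[i, j] = loss of model i on UQ distribution j.
# Player I (models) minimizes; Player II (distributions) maximizes.
# The matrix is hypothetical; any m x n loss table works the same way.
L = np.array([[0.2, 0.8],
              [0.7, 0.3]])
m, n = L.shape
p, q = np.ones(m) / m, np.ones(n) / n
P, Q = np.zeros(m), np.zeros(n)
eta, T = 0.05, 20000
for _ in range(T):
    p = p * np.exp(-eta * (L @ q)); p /= p.sum()    # models shift to low loss
    q = q * np.exp(eta * (L.T @ p)); q /= q.sum()   # distributions to high loss
    P += p; Q += q
P, Q = P / T, Q / T          # averaged iterates approximate the Nash mixtures
value = P @ L @ Q            # game value: the robust risk estimate
```

For this matrix the equilibrium mixes the models as (0.4, 0.6) against a (0.5, 0.5) mixture of UQ distributions, with value 0.5.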
https://resolver.caltech.edu/CaltechAUTHORS:20220524-180308552
Authors: {'items': [{'id': 'Hamze-Bajgiran-Hamed', 'name': {'family': 'Hamze Bajgiran', 'given': 'Hamed'}}, {'id': 'Batlle-Franch-Pau', 'name': {'family': 'Batlle Franch', 'given': 'Pau'}}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}, {'id': 'Scovel-Clint', 'name': {'family': 'Scovel', 'given': 'Clint'}, 'orcid': '0000-0001-7757-3411'}, {'id': 'Shirdel-Mahdy', 'name': {'family': 'Shirdel', 'given': 'Mahdy'}}, {'id': 'Stanley-Michael', 'name': {'family': 'Stanley', 'given': 'Michael'}}, {'id': 'Tavallali-Peyman', 'name': {'family': 'Tavallali', 'given': 'Peyman'}, 'orcid': '0000-0001-7166-5489'}]}
Year: 2022
DOI: 10.48550/arXiv.2108.10517
There are essentially three kinds of approaches to Uncertainty Quantification (UQ): (A) robust optimization, (B) Bayesian inference, and (C) decision theory. Although (A) is robust, it is unfavorable with respect to accuracy and data assimilation. (B) requires a prior, is generally brittle, and its posterior estimation can be slow. Although (C) leads to the identification of an optimal prior, its approximation suffers from the curse of dimensionality, and its notion of risk is one averaged with respect to the distribution of the data. We introduce a 4th kind which is a hybrid between (A), (B), (C), and hypothesis testing. It can be summarized as, after observing a sample x, (1) defining a likelihood region through the relative likelihood and (2) playing a minmax game in that region to define optimal estimators and their risk. The resulting method has several desirable properties: (a) an optimal prior is identified after measuring the data, and the notion of risk is a posterior one; (b) the determination of the optimal estimate and its risk can be reduced to computing the minimum enclosing ball of the image of the likelihood region under the quantity-of-interest map (which is fast and not subject to the curse of dimensionality). The method is characterized by a parameter in [0,1] acting as an assumed lower bound on the rarity of the observed data (the relative likelihood). When that parameter is near 1, the method produces a posterior distribution concentrated around a maximum likelihood estimate with tight but low-confidence UQ estimates. When that parameter is near 0, the method produces a maximal-risk posterior distribution with high-confidence UQ estimates.
In addition to navigating the accuracy-uncertainty tradeoff, the proposed method addresses the brittleness of Bayesian inference by navigating the robustness-accuracy tradeoff associated with data assimilation.
https://authors.library.caltech.edu/records/v0cp5-ykc35

Learning dynamical systems from data: A simple cross-validation perspective, part III: Irregularly-Sampled Time Series
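Step (2) of the "4th kind" record reduces the minimax estimate to a minimum enclosing ball: the center is the optimal estimate and the radius its posterior risk. A minimal sketch over a hypothetical discretized image of the likelihood region, using the simple Badoiu–Clarkson iteration:

```python
import numpy as np

# Badoiu-Clarkson iteration: repeatedly step toward the farthest point;
# the iterate converges to the center of the minimum enclosing ball.
def min_enclosing_ball(pts, iters=20000):
    c = pts.mean(axis=0)
    for k in range(1, iters + 1):
        far = pts[np.argmax(np.linalg.norm(pts - c, axis=1))]
        c = c + (far - c) / (k + 1)
    r = np.linalg.norm(pts - c, axis=1).max()
    return c, r   # center = optimal estimate, radius = posterior risk

# Hypothetical quantity-of-interest values over the likelihood region.
pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
center, radius = min_enclosing_ball(pts)
```

The iteration's cost grows only linearly in the number of points and the ambient dimension, which is the sense in which the reduction escapes the curse of dimensionality.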
https://resolver.caltech.edu/CaltechAUTHORS:20220524-180315371
Authors: {'items': [{'id': 'Lee-Jonghyeon', 'name': {'family': 'Lee', 'given': 'Jonghyeon'}}, {'id': 'De-Brouwer-Edward', 'name': {'family': 'De Brouwer', 'given': 'Edward'}, 'orcid': '0000-0003-0608-0155'}, {'id': 'Hamzi-Boumediene', 'name': {'family': 'Hamzi', 'given': 'Boumediene'}, 'orcid': '0000-0002-9446-2614'}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2022
DOI: 10.48550/arXiv.2111.13037
A simple and interpretable way to learn a dynamical system from data is to interpolate its vector field with a kernel. In particular, this strategy is highly efficient (both in terms of accuracy and complexity) when the kernel is data-adapted using Kernel Flows (KF) [34] (which uses gradient-based optimization to learn a kernel based on the premise that a kernel is good if there is no significant loss in accuracy when half of the data is used for interpolation). Despite its previous successes, this strategy (based on interpolating the vector field driving the dynamical system) breaks down when the observed time series is not regularly sampled in time. In this work, we propose to address this problem by directly approximating the vector field of the dynamical system, incorporating time differences between observations in the KF data-adapted kernels. We compare our approach with the classical one over different benchmark dynamical systems and show that it significantly improves the forecasting accuracy while remaining simple, fast, and robust.
https://authors.library.caltech.edu/records/vhc39-wjz49

Data-driven geophysical forecasting: Simple, low-cost, and accurate baselines with kernel methods
https://resolver.caltech.edu/CaltechAUTHORS:20220524-180305206
Authors: {'items': [{'id': 'Hamzi-Boumediene', 'name': {'family': 'Hamzi', 'given': 'Boumediene'}, 'orcid': '0000-0002-9446-2614'}, {'id': 'Maulik-Romit', 'name': {'family': 'Maulik', 'given': 'Romit'}, 'orcid': '0000-0001-9731-8936'}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2022
DOI: 10.48550/arXiv.2103.10935
Modeling geophysical processes as low-dimensional dynamical systems and regressing their vector field from data is a promising approach for learning emulators of such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), then the resulting data-driven models are not only faster than equation-based models but are also easier to train than neural networks such as the long short-term memory neural network, and more accurate and predictive than the latter. When trained on geophysical observational data, for example the weekly averaged global sea-surface temperature, the proposed technique also yields considerable gains over classical partial differential equation-based models in terms of forecast computational cost and accuracy. When trained on publicly available re-analysis data for the daily temperature of the North American continent, we see significant improvements over classical baselines such as climatology and persistence-based forecast techniques. Although our experiments concern specific examples, the proposed approach is general, and our results support the viability of kernel methods (with learned kernels) for interpretable and computationally efficient geophysical forecasting for a large diversity of processes.
https://authors.library.caltech.edu/records/kmksv-qsg95

Aggregation of Pareto optimal models
https://resolver.caltech.edu/CaltechAUTHORS:20220524-180318744
Authors: {'items': [{'id': 'Hamze-Bajgiran-Hamed', 'name': {'family': 'Hamze Bajgiran', 'given': 'Hamed'}, 'orcid': '0000-0002-6246-2783'}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2022
DOI: 10.48550/arXiv.2112.04161
In statistical decision theory, a model is said to be Pareto optimal (or admissible) if no other model carries less risk for at least one state of nature while presenting no more risk for others. How can you rationally aggregate/combine a finite set of Pareto optimal models while preserving Pareto efficiency? This question is nontrivial because weighted model averaging does not, in general, preserve Pareto efficiency. This paper presents an answer in four logical steps: (1) A rational aggregation rule should preserve Pareto efficiency. (2) Due to the complete class theorem, Pareto optimal models must be Bayesian, i.e., they minimize a risk where the true state of nature is averaged with respect to some prior. Therefore each Pareto optimal model can be associated with a prior, and Pareto efficiency can be maintained by aggregating Pareto optimal models through their priors. (3) A prior can be interpreted as a preference ranking over models: prior π prefers model A over model B if the average risk of A is lower than the average risk of B. (4) A rational/consistent aggregation rule should preserve this preference ranking: if both priors π and π′ prefer model A over model B, then the prior obtained by aggregating π and π′ must also prefer A over B. Under these four steps, we show that all rational/consistent aggregation rules are as follows: give each individual Pareto optimal model a weight, introduce a weak order/ranking over the set of Pareto optimal models, and aggregate a finite set of models S as the model associated with the prior obtained as the weighted average of the priors of the highest-ranked models in S. This result shows that all rational/consistent aggregation rules must follow a generalization of hierarchical Bayesian modeling. Following our main result, we present applications to kernel smoothing, time-depreciating models, and voting mechanisms.
https://authors.library.caltech.edu/records/aeg7b-hbw77

Aggregation of Models, Choices, Beliefs, and Preferences
https://resolver.caltech.edu/CaltechAUTHORS:20220524-180312022
Authors: {'items': [{'id': 'Hamze-Bajgiran-Hamed', 'name': {'family': 'Hamze Bajgiran', 'given': 'Hamed'}}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}]}
Year: 2022
DOI: 10.48550/arXiv.2111.11630
A natural notion of rationality/consistency for aggregating models is that, for all (possibly aggregated) models A and B, if the output of model A is f(A) and the output of model B is f(B), then the output of the model obtained by aggregating A and B must be a weighted average of f(A) and f(B). Similarly, a natural notion of rationality for aggregating preferences of ensembles of experts is that, for all (possibly aggregated) experts A and B, and all possible choices x and y, if both A and B prefer x over y, then the expert obtained by aggregating A and B must also prefer x over y. Rational aggregation is an important element of uncertainty quantification, and it lies behind many seemingly different results in economic theory, spanning social choice, belief formation, and individual decision making. Three examples of rational aggregation rules are as follows. (1) Give each individual model (expert) a weight (a score) and use weighted averaging to aggregate individual or finite ensembles of models (experts). (2) Order/rank individual models (experts) and let the aggregation of a finite ensemble of individual models (experts) be the highest-ranked individual model (expert) in that ensemble. (3) Give each individual model (expert) a weight, introduce a weak order/ranking over the set of models/experts, and aggregate A and B as the weighted average of the highest-ranked models (experts) in A or B. Note that (1) and (2) are particular cases of (3). In this paper, we show that all rational aggregation rules are of the form (3). This result unifies aggregation procedures across different economic environments. Following the main representation result, we present applications and extensions in several separate economic topics, such as belief formation, choice theory, and social welfare economics.
https://authors.library.caltech.edu/records/g0k6z-gvr57
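Aggregation rule (3) of the record above can be sketched directly: weight each model, rank models by a weak order, and aggregate an ensemble as the weighted average of the outputs of its highest-ranked members; rules (1) and (2) are the constant-rank and singleton-top special cases. All names and values below are illustrative.

```python
import numpy as np

# Sketch of aggregation rule (3): weight each model, rank models by a
# weak order, and aggregate an ensemble as the weighted average of the
# outputs of its highest-ranked members. Weights, ranks, and outputs
# here are hypothetical.
def aggregate(ensemble, weight, rank, output):
    top = max(rank[m] for m in ensemble)
    best = [m for m in ensemble if rank[m] == top]     # highest-ranked members
    w = np.array([weight[m] for m in best], dtype=float)
    outs = np.array([output[m] for m in best], dtype=float)
    return (w[:, None] * outs).sum(axis=0) / w.sum()   # weighted average

weight = {"A": 1.0, "B": 3.0, "C": 5.0}
rank = {"A": 2, "B": 2, "C": 1}            # C is ranked below A and B
output = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.5, 0.5]}
agg = aggregate(["A", "B", "C"], weight, rank, output)
```

With all ranks equal this reduces to plain weighted averaging (rule 1); with strict ranks and singleton top classes it reduces to picking the highest-ranked model (rule 2).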