CaltechAUTHORS: Article

Distances and diameters in concentration inequalities: from geometry to optimal assignment of sampling resources

Year: 2002 DOI: 10.1615/Int.J.UncertaintyQuantification.2011003433 This note reviews, compares and contrasts three notions of "distance" or "size" that arise often in concentration-of-measure inequalities. We review Talagrand′s convex distance and McDiarmid′s diameter, and consider in particular the normal distance on a topological vector space

Super-diffusivity in a shear flow model from perpetual homogenization

Year: 2002 DOI: 10.1007/s002200200640 This paper is concerned with the asymptotic behavior solutions of stochastic differential equations dy_t =dω_ t −∇Γ(y_t ) dt, y_0=0 and d=2. Γ is a 2 x 2 skew-symmetric matrix associated to a shear flow characterized by an infinite number of spatial scales Γ_(12) = −Γ_(21) = h(x_1), with h(x_1) = ∑_(n =0)^∞γ_n h^n (x_1/R_n ), where h^n are smooth functions of period 1, h^n (0)=0, γ_ n and R_n grow exponentially fast with n. We can show that y_t has an anomalous fast behavior (?[|y_t |^2]∼t^(1+ν) with ν > 0) and obtain quantitative estimates on the anomaly using and developing the tools of homogenization.

Multi-scale homogenization with bounded ratios and Anomalous Slow Diffusion

Year: 2003 DOI: 10.1002/cpa.10053 We show that the effective diffusivity matrix D(V^n) for the heat operator ∂_t − (Δ/2 − ∇V^n∇) in a periodic potential V^n = Σ^n_(k=0)U_k(x/R_k) obtained as a superposition of Hölder-continuous periodic potentials U_k (of period T^d:= ℝ^d/ℤ^d, d ∈ ℕ^*, U_k(0) = 0) decays exponentially fast with the number of scales when the scale ratios R_(k+1)/R_k are bounded above and below. From this we deduce the anomalous slow behavior for a Brownian motion in a potential obtained as a superposition of an infinite number of scales, dy_t = dω_t − ∇V^∞(yt)dt.

Approximation of the effective conductivity of ergodic media by periodization

Year: 2003 DOI: 10.1007/s00440-002-0240-4 This paper is concerned with the approximation of the effective conductivity σ(A, μ) associated to an elliptic operator ∇_xA(x,η)∇_x where for xЄℝ^d,d ≥ 1, A(x,η) is a bounded elliptic random symmetric d×d matrix and η takes value in an ergodic probability space (X, μ). Writing A^N (x, η) the periodization of A(x, η) on the torus T^d_N of dimension d and side N we prove that for μ-almost all η lim ^(N→+∞) σ(A^N, η) = σ(A,μ) We extend this result to non-symmetric operators ∇_x (a+E(x, η))∇_x corresponding to diffusions in ergodic divergence free flows (a is d×d elliptic symmetric matrix and E(x, η) an ergodic skew-symmetric matrix); and to discrete operators corresponding to random walks on ℤ^d with ergodic jump rates. The core of our result is to show that the ergodic Weyl decomposition associated to L^2(X,μ) can almost surely be approximated by periodic Weyl decompositions with increasing periods, implying that semi-continuous variational formulae associated to L^2(X,μ) can almost surely be approximated by variational formulae minimizing on periodic potential and solenoidal functions.

Anomalous slow diffusion from perpetual homogenization

Year: 2003 DOI: 10.1214/aop/1068646372 This paper is concerned with the asymptotic behavior of solutions of stochastic differential equations $dy_t=d\omega_t -\nabla V(y_t)\, dt$, $y_0=0$. When $d=1$ and V is not periodic but obtained as a superposition of an infinite number of periodic potentials with geometrically increasing periods [$V(x) = \sum_{k=0}^\infty U_k(x/R_k)$, where $U_k$ are smooth functions of period 1, $U_k(0)=0$, and $R_k$ grows exponentially fast with k] we can show that $y_t$ has an anomalous slow behavior and we obtain quantitative estimates on the anomaly using and developing the tools of homogenization. Pointwise estimates are based on a new analytical inequality for subharmonic functions. When $d\geq 1$ and V is periodic, quantitative estimates are obtained on the heat kernel of $y_t$, showing the rate at which homogenization takes place. The latter result proves Davies' conjecture and is based on a quantitative estimate for the Laplace transform of martingales that can be used to obtain similar results for periodic elliptic generators.

Averaging Versus Chaos in Turbulent Transport?

Year: 2004 DOI: 10.1007/s00220-004-1069-8 In this paper we analyze the transport of passive tracers by deterministic stationary incompressible flows which can be decomposed over an infinite number of spatial scales without separation between them. It appears that a low order dynamical system related to local Peclet numbers can be extracted from these flows and it controls their transport properties. Its analysis shows that these flows are strongly self-averaging and super-diffusive: the delay τ (r) for any finite number of passive tracers initially close to separate till a distance r is almost surely anomalously fast (τ (r)∼ r^(2–ν) , with ν > 0). This strong self-averaging property is such that the dissipative power of the flow compensates its convective power at every scale. However as the circulation increases in the eddies the transport behavior of the flow may (discontinuously) bifurcate and become ruled by deterministic chaos: the self-averaging property collapses and advection dominates dissipation. When the flow is anisotropic a new formula describing turbulent conductivity is identified.

From a market of dreamers to economical shocks

Year: 2004 DOI: 10.1016/j.physa.2004.05.078 Over the past years an intense work has been undertaken to understand the origin of the crashes and bubbles of financial markets. The explanations of these crashes have been grounded on the hypothesis of behavioral and social correlations between the agents in interacting particle models or on a feedback of the stock prices on trading behaviors in mean-field models (here bubbles and crashes are seen as collective hysteria). In this paper, we will introduce a market model as a particle system with no other interaction between the agents than the fact that to be able to sell, somebody must be willing to buy and no feedback of the price on their trading behavior. We will show that this model crashes in finite estimable time. Although the age of the market does not appear in the price dynamic the population of traders taken as a whole system is maturing towards collapse. The wealth distribution among the agents follows the second law of thermodynamics and with probability one an agent (or a minority of agents) will accumulate a large portion of the total wealth, at some point this disproportion in the wealth distribution becomes unbearable for the market leading to its collapse. We believe that the origin of the collapse in our model could be of some relevance in understanding long-term economic cycles such as the Kondratiev cycle.

Homogenization of Parabolic Equations with a Continuum of Space and Time Scales

Year: 2007 DOI: 10.1137/060670420 This paper addresses the issue of the homogenization of linear divergence form parabolic operators in situations where no ergodicity and no scale separation in time or space are available. Namely, we consider divergence form linear parabolic operators in $\Omega \subset \mathbb{R}^n$ with $L^\infty(\Omega \times (0,T))$-coefficients. It appears that the inverse operator maps the unit ball of $L^2(\Omega\times (0,T))$ into a space of functions which at small (time and space) scales are close in $H^1$ norm to a functional space of dimension $n$. It follows that once one has solved these equations at least $n$ times it is possible to homogenize them both in space and in time, reducing the number of operation counts necessary to obtain further solutions. In practice we show under a Cordes-type condition that the first order time derivatives and second order space derivatives of the solution of these operators with respect to caloric coordinates are in $L^2$ (instead of $H^{-1}$ with Euclidean coordinates). If the medium is time-independent, then it is sufficient to solve $n$ times the associated elliptic equation in order to homogenize the parabolic equation.

Bistable Equilibrium Points of Mercury Body Burden

Year: 2008 DOI: 10.1142/S0218339008002459 In the last century, mercury levels in the global environment have tripled as a result of increased pollution from industrial, occupational, medicinal and domestic uses. Glutathione is known to be the main agent responsible for the excretion of mercury (Refs. 2 to 4). It has also been shown that mercury inhibits glutathione synthetase (an enzyme acting in the synthesization of glutathione), therefore leading to decreased glutathione levels (Refs. 5 to 7). Mercury also interferes with the production of heme in the porphyrin pathway. Heme is needed for biological energy production and ability to detox organic toxins via the P450 enzymes. The purpose of this paper is to show that the body's response to mercury exposure is hysteretic, i.e. when this feedback of mercury on its main detoxifying agents is strong enough, then mercury body burden has two points of equilibrium: one with normal abilities to detoxify and low levels of mercury and one with inhibited abilities to detoxify and high levels of mercury. Furthermore, a small increase of the body's mercury burden may not be sufficient to trigger observable neurotoxic effects but it may be sufficient to act as a switch leading to an accumulation of mercury in the body through environmental exposure until its toxicity is manifested.

Rigorous verification, validation, uncertainty quantification and certification through concentration-of-measure inequalities

Year: 2008 DOI: 10.1016/j.cma.2008.06.008 We apply concentration-of-measure inequalities to the quantification of uncertainties in the performance of engineering systems. Specifically, we envision uncertainty quantification in the context of certification, i.e., as a tool for deciding whether a system is likely to perform safely and reliably within design specifications. We show that concentration-of-measure inequalities rigorously bound probabilities of failure and thus supply conservative certification criteria. In addition, they supply unambiguous quantitative definitions of terms such as margins, epistemic and aleatoric uncertainties, verification and validation measures, confidence factors, and others, as well as providing clear procedures for computing these quantities by means of concerted simulation and experimental campaigns. We also investigate numerically the tightness of concentration-of-measure inequalities with the aid of an imploding ring example. Our numerical tests establish the robustness and viability of concentration-of-measure inequalities as a basis for certification in that particular example of application.

Numerical homogenization of the acoustic wave equations with a continuum of scales

Year: 2008 DOI: 10.1016/j.cma.2008.08.012 In this paper, we consider numerical homogenization of acoustic wave equations with heterogeneous coefficients, namely, when the bulk modulus and the density of the medium are only bounded. We show that under a Cordes type condition the second order derivatives of the solution with respect to harmonic coordinates are L^2 (instead H^-1 with respect to Euclidean coordinates) and the solution itself is in L∞(0,T,H^2(Ω)) (instead of L∞(0,T,H^1(Ω)) with respect to Euclidean coordinates). Then, we propose an implicit time stepping method to solve the resulted linear system on coarse spatial scales, and present error estimates of the method. It follows that by pre-computing the associated harmonic coordinates, it is possible to numerically homogenize the wave equation without assumptions of scale separation or ergodicity.

Stochastic variational integrators

Year: 2009 DOI: 10.1093/imanum/drn018 This paper presents a continuous and discrete Lagrangian theory for stochastic Hamiltonian systems on manifolds, akin to the Ornstein–Uhlenbeck theory of Brownian motion in a force field. The main result is to derive governing SDEs for such systems from a critical point of a stochastic action. Using this result, the paper derives Langevin-type equations for constrained mechanical systems and implements a stochastic analogue of Lagrangian reduction. These are easy consequences of the fact that the stochastic action is intrinsically defined. Stochastic variational integrators (SVIs) are developed using a discrete variational principle. The paper shows that the discrete flow of an SVI is almost surely symplectic and in the presence of symmetry almost surely momentum-map preserving. A first-order mean-squared convergent SVI for mechanical systems on Lie groups is introduced. As an application of the theory, SVIs are exhibited for multiple, randomly forced and torqued rigid bodies interacting via a potential.

Numerical coarsening of inhomogeneous elastic materials

Year: 2009 DOI: 10.1145/1531326.1531357 We propose an approach for efficiently simulating elastic objects made of non-homogeneous, non-isotropic materials. Based on recent developments in homogenization theory, a methodology is introduced to approximate a deformable object made of arbitrary fine structures of various linear elastic materials with a dynamically-similar coarse model. This numerical coarsening of the material properties allows for simulation of fine, heterogeneous structures on very coarse grids while capturing the proper dynamics of the original dynamical system, thus saving orders of magnitude in computational time. Examples including inhomogeneous and/or anisotropic materials can be realistically simulated in realtime with a numerically-coarsened model made of a few mesh elements.

Global Energy Matching Method for Atomistic-to-Continuum Modeling of Self-Assembling Biopolymer Aggregates

Year: 2010 DOI: 10.1137/090781619 This paper studies mathematical models of biopolymer supramolecular aggregates that are formed by the self-assembly of single monomers. We develop a new multiscale numerical approach to model the structural properties of such aggregates. This theoretical approach establishes micro-macro relations between the geometrical and mechanical properties of the monomers and supramolecular aggregates. Most atomistic-to-continuum methods are constrained by a crystalline order or a periodic setting and therefore cannot be directly applied to modeling of soft matter. By contrast, the energy matching method developed in this paper does not require crystalline order and, therefore, can be applied to general microstructures with strongly variable spatial correlations. In this paper we use this method to compute the shape and the bending stiffness of their supramolecular aggregates from known chiral and amphiphilic properties of the short chain peptide monomers. Numerical implementation of our approach demonstrates consistency with results obtained by molecular dynamics simulations.

Long-Run Accuracy of Variational Integrators in the Stochastic Context

Year: 2010 DOI: 10.1137/090758842 This paper presents a Lie–Trotter splitting for inertial Langevin equations (geometric Langevin algorithm) and analyzes its long-time statistical properties. The splitting is defined as a composition of a variational integrator with an Ornstein–Uhlenbeck flow. Assuming that the exact solution and the splitting are geometrically ergodic, the paper proves the discrete invariant measure of the splitting approximates the invariant measure of inertial Langevin equations to within the accuracy of the variational integrator in representing the Hamiltonian. In particular, if the variational integrator admits no energy error, then the method samples the invariant measure of inertial Langevin equations without error. Numerical validation is provided using explicit variational integrators with first-, second-, and fourth-order accuracy.

Uncertainty quantification via codimension one domain partitioning and a new concentration inequality

Year: 2010 DOI: 10.1016/j.sbspro.2010.05.211 In [LOO08], it was proposed that a concentration-of-measure inequality known as Mc-Diarmid's inequality [McD89] be used to provide upper bounds on the failure probability of a system of interest, the response of which depends on a collection of independent random inputs. McDiarmid's inequality has the advantage of providing an upper bound in terms of only the mean response of the system, the failure threshold, and measures of system spread known as the McDiarmid subdiameters. A disadvantage of McDiarmid's inequality is that it that takes a global view of the response function: even if the response function exhibits large plateaus of success with only small, localized regions of failure, McDiarmid's inequality is unable to use this to any advantage. We propose a partitioning algorithm that uses McDiarmid diameters to generate "good" sequences of partitions, on which McDiarmid's inequality can be applied to each partition element, yielding arbitrarily tight upper bounds. We also investigate some new concentration-of-measure inequalities that arise if mean performance is known only through sampling.

Optimal Control Strategies for Robust Certification

Year: 2010 DOI: 10.1115/1.4001375 We present an optimal control methodology, which we refer to as concentration-of-measure optimal control (COMOC), that seeks to minimize a concentration-of-measure upper bound on the probability of failure of a system. The systems under consideration are characterized by a single performance measure that depends on random inputs through a known response function. For these systems, concentration-of-measure upper bound on the probability of failure of a system can be formulated in terms of the mean performance measure and a system diameter that measures the uncertainty in the operation of the system. COMOC then seeks to determine the optimal controls that maximize the confidence in the safe operation of the system, defined as the ratio of the design margin, which is measured by the difference between the mean performance and the design threshold, to the system uncertainty, which is measured by the system diameter. This strategy has been assessed in the case of a robot-arm maneuver for which the performance measure of interest is assumed to be the placement accuracy of the arm tip. The ability of COMOC to significantly increase the design confidence in that particular example of application is demonstrated.

Nonintrusive and Structure Preserving Multiscale Integration of Stiff ODEs, SDEs, and Hamiltonian Systems with Hidden Slow Dynamics via Flow Averaging

Year: 2010 DOI: 10.1137/090771648 We introduce a new class of integrators for stiff ODEs as well as SDEs. Examples of subclasses of systems that we treat are ODEs and SDEs that are sums of two terms, one of which has large coefficients. These integrators are as follows: (i) Multiscale: They are based on flow averaging and thus do not fully resolve the fast variables and have a computational cost determined by slow variables. (ii) Versatile: The method is based on averaging the flows of the given dynamical system (which may have hidden slow and fast processes) instead of averaging the instantaneous drift of assumed separated slow and fast processes. This bypasses the need for identifying explicitly (or numerically) the slow or fast variables. (iii) Nonintrusive: A pre-existing numerical scheme resolving the microscopic time scale can be used as a black box and easily turned into one of the integrators in this paper by turning the large coefficients on over a microscopic time scale and off during a mesoscopic time scale. (iv) Convergent over two scales: They converge strongly over slow processes and in the sense of measures over fast ones. We introduce the related notion of two-scale flow convergence and analyze the convergence of these integrators under the induced topology. (v) Structure preserving: They inherit the structure preserving properties of the legacy integrators from which they are derived. Therefore, for stiff Hamiltonian systems (possibly on manifolds), they can be made to be symplectic, time-reversible, and symmetry preserving (symmetries are group actions that leave the system invariant) in all variables. They are explicit and applicable to arbitrary stiff potentials (that need not be quadratic). Their application to the Fermi–Pasta–Ulam problems shows accuracy and stability over four orders of magnitude of time scales. For stiff Langevin equations, they are symmetry preserving, time-reversible, Boltzmann–Gibbs-reversible, quasi-symplectic on all variables, and conformally symplectic with isotropic friction.

Flux Norm Approach to Finite Dimensional Homogenization Approximations with Non-Separated Scales and High Contrast

Year: 2010 DOI: 10.1007/s00205-010-0302-1 We consider linear divergence-form scalar elliptic equations and vectorial equations for elasticity with rough (L∞(Ω)Ω⊂R^d) coefficients a(x) that, in particular, model media with non-separated scales and high contrast in material properties. While the homogenization of PDEs with periodic or ergodic coefficients and well separated scales is now well understood, we consider here the most general case of arbitrary bounded coefficients. For such problems, we introduce explicit and optimal finite dimensional approximations of solutions that can be viewed as a theoretical Galerkin method with controlled error estimates, analogous to classical homogenization approximations. In particular, this approach allows one to analyze a given medium directly without introducing the mathematical concept of an ε family of media as in classical homogenization. We define the flux norm as the L^2 norm of the potential part of the fluxes of solutions, which is equivalent to the usual H^1-norm. We show that in the flux norm, the error associated with approximating, in a properly defined finite-dimensional space, the set of solutions of the aforementioned PDEs with rough coefficients is equal to the error associated with approximating the set of solutions of the same type of PDEs with smooth coefficients in a standard space (for example, piecewise polynomial). We refer to this property as the transfer property. A simple application of this property is the construction of finite dimensional approximation spaces with errors independent of the regularity and contrast of the coefficients and with optimal and explicit convergence rates. This transfer property also provides an alternative to the global harmonic change of coordinates for the homogenization of elliptic operators that can be extended to elasticity equations. The proofs of these homogenization results are based on a new class of elliptic inequalities. These inequalities play the same role in our approach as the div-curl lemma in classical homogenization.

A cutoff phenomenon in accelerated stochastic simulations of chemical kinetics via flow averaging (FLAVOR-SSA)

Year: 2010 DOI: 10.1063/1.3518419 We present a simple algorithm for the simulation of stiff, discrete-space, continuous-time Markov processes. The algorithm is based on the concept of flow averaging for the integration of stiff ordinary and stochastic differential equations and ultimately leads to a straightforward variation of the the well-known stochastic simulation algorithm (SSA). The speedup that can be achieved by the present algorithm [flow averaging integrator SSA (FLAVOR-SSA)] over the classical SSA comes naturally at the expense of its accuracy. The error of the proposed method exhibits a cutoff phenomenon as a function of its speed-up, allowing for optimal tuning. Two numerical examples from chemical kinetics are provided to illustrate the efficiency of the method.

From efficient symplectic exponentiation of matrices to symplectic integration of high-dimensional Hamiltonian systems with slowly varying quadratic stiff potentials

Year: 2011 DOI: 10.1093/amrx/abr008 We present a multiscale integrator for Hamiltonian systems with slowly varying quadratic stiff potentials that uses coarse timesteps (analogous to what the impulse method uses for constant quadratic stiff potentials). This method is based on the highly-non-trivial introduction of two efficient symplectic schemes for exponentiations of matrices that only require O(n) matrix multiplications operations at each coarse time step for a preset small number n. The proposed integrator is shown to be (i) uniformly convergent on positions; (ii) symplectic in both slow and fast variables; (iii) well adapted to high dimensional systems. Our framework also provides a general method for iteratively exponentiating a slowly varying sequence of (possibly high dimensional) matrices in an efficient way.

Space-time FLAVORS: finite difference, multisymlectic, and pseudospectral integrators for multiscale PDEs

Year: 2011 DOI: 10.48550/arXiv.1104.0272 We present a new class of integrators for stiff PDEs. These integrators are generalizations of FLow AVeraging integratORS (FLAVORS) for stiff ODEs and SDEs introduced in [32] with the following properties: (i) Multiscale: they are based on flow averaging and have a computational cost determined by mesoscopic steps in space and time instead of microscopic steps in space and time; (ii) Versatile: the method is based on averaging the flows of the given PDEs (which may have hidden slow and fast processes). This bypasses the need for identifying explicitly (or numerically) the slow variables or reduced effective PDEs; (iii) Nonintrusive: A pre-existing numerical scheme resolving the microscopic time scale can be used as a black box and easily turned into one of the integrators in this paper by turning the large coefficients on over a microscopic timescale and off during a mesoscopic timescale; (iv) Convergent over two scales: strongly over slow processes and in the sense of measures over fast ones; (v) Structure-preserving: for stiff Hamiltonian PDEs (possibly on manifolds), they can be made to be multi-symplectic, symmetry-preserving (symmetries are group actions that leave the system invariant) in all variables and variational.

Uncertainty quantification via codimension-one partitioning

Year: 2011 DOI: 10.1002/nme.3030 We consider uncertainty quantification in the context of certification, i.e. showing that the probability of some 'failure' event is acceptably small. In this paper, we derive a new method for rigorous uncertainty quantification and conservative certification by combining McDiarmid's inequality with input domain partitioning and a new concentration-of-measure inequality. We show that arbitrarily sharp upper bounds on the probability of failure can be obtained by partitioning the input parameter space appropriately; in contrast, the bound provided by McDiarmid's inequality is usually not sharp. We prove an error estimate for the method (Proposition 3.2); we define a codimension-one recursive partitioning scheme and prove its convergence properties (Theorem 4.1); finally, we apply a new concentration-of-measure inequality to give confidence levels when empirical means are used in place of exact ones (Section 5).

A non-adapted sparse approximation of PDEs with stochastic inputs

Year: 2011 DOI: 10.1016/j.jcp.2011.01.002 We propose a method for the approximation of solutions of PDEs with stochastic coefficients based on the direct, i.e., non-adapted, sampling of solutions. This sampling can be done by using any legacy code for the deterministic problem as a black box. The method converges in probability (with probabilistic error bounds) as a consequence of sparsity and a concentration of measure phenomenon on the empirical correlation between samples. We show that the method is well suited for truly high-dimensional problems.

Rigorous uncertainty quantification without integral testing

Year: 2011 DOI: 10.1016/j.ress.2010.07.013 We describe a rigorous approach for certifying the safe operation of complex systems that bypasses the need for integral testing. We specifically consider systems that have a modular structure. These systems are composed of subsystems, or components, that interact through unidirectional interfaces. We show that, for systems that have the structure of an acyclic graph, it is possible to obtain rigorous upper bounds on the probability of failure of the entire system from an uncertainty analysis of the individual components and their interfaces and without the need for integral testing. Certification is then achieved if the probability of failure upper bound is below an acceptable failure tolerance. We demonstrate the approach by means of an example concerned with the performance of a fractal electric circuit.

Localized Bases for Finite-Dimensional Homogenization Approximations with Nonseparated Scales and High Contrast

Year: 2011 DOI: 10.1137/100813968 We construct finite-dimensional approximations of solution spaces of divergence-form operators with L^∞-coefficients. Our method does not rely on concepts of ergodicity or scale-separation, but on the property that the solution space of these operators is compactly embedded in H^1 if source terms are in the unit ball of L^2 instead of the unit ball of H^(−1). Approximation spaces are generated by solving elliptic PDEs on localized subdomains with source terms corresponding to approximation bases for H^2. The H^1-error estimates show that O(h^(−d))-dimensional spaces with basis elements localized to subdomains of diameter O(hα ln math) (with α ∊ [½,1)) result in an O(h^(2−2α)) accuracy for elliptic, parabolic, and hyperbolic problems. For high-contrast media, the accuracy of the method is preserved, provided that localized subdomains contain buffer zones of width O(h^α ln 1/4), where the contrast of the medium remains bounded. The proposed method can naturally be generalized to vectorial equations (such as elasto-dynamics).

Rigorous model-based uncertainty quantification with application to terminal ballistics—Part II. Systems with uncontrollable inputs and large scatter

Year: 2012 DOI: 10.1016/j.jmps.2011.12.002 This Part II of this series is concerned with establishing the feasibility of an extended data-on-demand (XDoD) uncertainty quantification (UQ) protocol based on concentration-of-measure inequalities and martingale theory. Specific aims are to establish the feasibility of the protocol and its basic properties, including the tightness of the predictions afforded by the protocol. The assessment is based on an application to terminal ballistics and a specific system configuration consisting of 6061-T6 aluminum plates struck by spherical 440c stainless steel projectiles at ballistic impact speeds in the range of 2.4–2.8 km/s. The system's inputs are the plate thickness, plate obliquity and impact velocity. The perforation area is chosen as the sole performance measure of the system. The objective of the UQ analysis is to certify the lethality of the projectile, i.e., that the projectile perforates the plate with high probability over a prespecified range of impact velocities, plate thicknesses and plate obliquities. All tests were conducted at Caltech's Small Particle Hypervelocity Range (SPHIR), which houses a two-stage gas gun. A feature of this facility is that the impact velocity, while amenable to precise measurement, cannot be controlled precisely but varies randomly according to a known probability density function. In addition, due to a competition between petalling and plugging mechanisms for the material system under consideration, the measured perforation area exhibits considerable scatter. The analysis establishes the feasibility of the XDoD UQ protocol as a rigorous yet practical approach for model-based certification of complex systems characterized by uncontrollable inputs and noisy experimental data.

Rigorous model-based uncertainty quantification with application to terminal ballistics, part I: Systems with controllable inputs and small scatter

Year: 2012 DOI: 10.1016/j.jmps.2011.12.001 This work is concerned with establishing the feasibility of a data-on-demand (DoD) uncertainty quantification (UQ) protocol based on concentration-of-measure inequalities. Specific aims are to establish the feasibility of the protocol and its basic properties, including the tightness of the predictions afforded by the protocol. The assessment is based on an application to terminal ballistics and a specific system configuration consisting of 6061-T6 aluminum plates struck by spherical S-2 tool steel projectiles at ballistic impact speeds. The system's inputs are the plate thickness and impact velocity and the perforation area is chosen as the sole performance measure of the system. The objective of the UQ analysis is to certify the lethality of the projectile, i.e., that the projectile perforates the plate with high probability over a prespecified range of impact velocities and plate thicknesses. The net outcome of the UQ analysis is an M/U ratio, or confidence factor, of 2.93, indicative of a small probability of no perforation of the plate over its entire operating range. The high-confidence (>99.9%) in the successful operation of the system afforded the analysis and the small number of tests (40) required for the determination of the modeling-error diameter, establishes the feasibility of the DoD UQ protocol as a rigorous yet practical approach for model-based certification of complex systems.

Control of a model of DNA division via parametric resonance

Year: 2013 DOI: 10.1063/1.4790835 We study the internal resonance, energy transfer, activation mechanism, and control of a model of DNA division via parametric resonance. While the system is robust to noise, this study shows that it is sensitive to specific fine scale modes and frequencies that could be targeted by low intensity electro-magnetic fields for triggering and controlling the division. The DNA model is a chain of pendula in a Morse potential. While the (possibly parametrically excited) system has a large number of degrees of freedom and a large number of intrinsic time scales, global and slow variables can be identified by (1) first reducing its dynamic to two modes exchanging energy between each other and (2) averaging the dynamic of the reduced system with respect to the phase of the fastest mode. Surprisingly, the global and slow dynamic of the system remains Hamiltonian (despite the parametric excitation) and the study of its associated effective potential shows how parametric excitation can turn the unstable open state into a stable one. Numerical experiments support the accuracy of the time-averaged reduced Hamiltonian in capturing the global and slow dynamic of the full system.

Optimal Uncertainty Quantification

Year: 2013 DOI: 10.1137/10080782X We propose a rigorous framework for uncertainty quantification (UQ) in which the UQ objectives and its assumptions/information set are brought to the forefront. This framework, which we call optimal uncertainty quantification (OUQ), is based on the observation that, given a set of assumptions and information about the problem, there exist optimal bounds on uncertainties: these are obtained as values of well-defined optimization problems corresponding to extremizing probabilities of failure, or of deviations, subject to the constraints imposed by the scenarios compatible with the assumptions and information. In particular, this framework does not implicitly impose inappropriate assumptions, nor does it repudiate relevant information. Although OUQ optimization problems are extremely large, we show that under general conditions they have finite-dimensional reductions. As an application, we develop optimal concentration inequalities (OCI) of Hoeffding and McDiarmid type. Surprisingly, these results show that uncertainties in input parameters, which propagate to output uncertainties in the classical sensitivity analysis paradigm, may fail to do so if the transfer functions (or probability distributions) are imperfectly known. We show how, for hierarchical structures, this phenomenon may lead to the nonpropagation of uncertainties or information across scales. In addition, a general algorithmic framework is developed for OUQ and is tested on the Caltech surrogate model for hypervelocity impact and on the seismic safety assessment of truss structures, suggesting the feasibility of the framework for important complex systems. The introduction of this paper provides both an overview of the paper and a self-contained minitutorial on the basic concepts and issues of UQ.

Variational integrators for electric circuits

Year: 2013 DOI: 10.1016/j.jcp.2013.02.006 In this contribution, we develop a variational integrator for the simulation of (stochastic and multiscale) electric circuits. When considering the dynamics of an electric circuit, one is faced with three special situations: 1. The system involves external (control) forcing through external (controlled) voltage sources and resistors. 2. The system is constrained via the Kirchhoff current (KCL) and voltage laws (KVL). 3. The Lagrangian is degenerate. Based on a geometric setting, an appropriate variational formulation is presented to model the circuit from which the equations of motion are derived. A time-discrete variational formulation provides an iteration scheme for the simulation of the electric circuit. Dependent on the discretization, the intrinsic degeneracy of the system can be canceled for the discrete variational scheme. In this way, a variational integrator is constructed that gains several advantages compared to standard integration tools for circuits; in particular, a comparison to BDF methods (which are usually the method of choice for the simulation of electric circuits) shows that even for simple LCR circuits, a better energy behavior and frequency spectrum preservation can be observed using the developed variational integrator.

On the equilibrium of simplicial masonry structures

Year: 2013 DOI: 10.1145/2461912.2461932 We present a novel approach for the analysis and design of self-supporting simplicial masonry structures. A finite-dimensional formulation of their compressive stress field is derived, offering a new interpretation of thrust networks through numerical homogenization theory. We further leverage geometric properties of the resulting force diagram to identify a set of reduced coordinates characterizing the equilibrium of simplicial masonry. We finally derive computational form-finding tools that improve over previous work in efficiency, accuracy, and scalability.

Optimal Uncertainty Quantification for Legacy Data Observations of Lipschitz Functions

Year: 2013 DOI: 10.1051/m2an/2013083 We consider the problem of providing optimal uncertainty quantification (UQ) – and hence rigorous certification – for partially-observed functions. We present a UQ framework within which the observations may be small or large in number, and need not carry information about the probability distribution of the system in operation. The UQ objectives are posed as optimization problems, the solutions of which are optimal bounds on the quantities of interest; we consider two typical settings, namely parameter sensitivities (McDiarmid diameters) and output deviation (or failure) probabilities. The solutions of these optimization problems depend non-trivially (even non-monotonically and discontinuously) upon the specified legacy data. Furthermore, the extreme values are often determined by only a few members of the data set; in our principal physically-motivated example, the bounds are determined by just 2 out of 32 data points, and the remainder carry no information and could be neglected without changing the final answer. We propose an analogue of the simplex algorithm from linear programming that uses these observations to offer efficient and rigorous UQ for high-dimensional systems with high-cardinality legacy data. These findings suggest natural methods for selecting optimal (maximally informative) next experiments.

Preface

Year: 2014 DOI: 10.1051/m2an/2013108 The mathematical study of "multiscale problems" has grown remarkably since the seventies beyond the asymptotic analysis of PDE's governing the behavior of heterogeneous media. The search for sharp bounds on the effective moduli of composites and homogenization approximation errors has led investigators to derive as much information as possible about fields in composites, and the behavior of correctors in periodic and stochastic environments.

Polyharmonic homogenization, rough polyharmonic splines and sparse super-localization

Year: 2014 DOI: 10.1051/m2an/2013118 We introduce a new variational method for the numerical homogenization of divergence form elliptic, parabolic and hyperbolic equations with arbitrary rough (L^∞) coefficients. Our method does not rely on concepts of ergodicity or scale-separation but on compactness properties of the solution space and a new variational approach to homogenization. The approximation space is generated by an interpolation basis (over scattered points forming a mesh of resolution H) minimizing the L^2 norm of the source terms; its (pre-)computation involves minimizing O(H^(-d)) quadratic (cell) problems on (super-)localized sub-domains of size O(H ln(1/H)). The resulting localized linear systems remain sparse and banded. The resulting interpolation basis functions are biharmonic for d ≤ 3, and polyharmonic for d ≥ 4, for the operator -div(a∇.) and can be seen as a generalization of polyharmonic splines to differential operators with arbitrary rough coefficients. The accuracy of the method (O(H)) in energy norm and independent from aspect ratios of the mesh formed by the scattered points) is established via the introduction of a new class of higher-order Poincaré inequalities. The method bypasses (pre-)computations on the full domain and naturally generalizes to time dependent problems, it also provides a natural solution to the inverse problem of recovering the solution of a divergence form elliptic equation from a finite number of point measurements.

Optimal uncertainty quantification with model uncertainty and legacy data

Year: 2014 DOI: 10.1016/j.jmps.2014.07.007 We present an optimal uncertainty quantification (OUQ) protocol for systems that are characterized by an existing physics-based model and for which only legacy data is available, i.e., no additional experimental testing of the system is possible. Specifically, the OUQ strategy developed in this work consists of using the legacy data to establish, in a probabilistic sense, the level of error of the model, or modeling error, and to subsequently use the validated model as a basis for the determination of probabilities of outcomes. The quantification of modeling uncertainty specifically establishes, to a specified confidence, the probability that the actual response of the system lies within a certain distance of the model. Once the extent of model uncertainty has been established in this manner, the model can be conveniently used to stand in for the actual or empirical response of the system in order to compute probabilities of outcomes. To this end, we resort to the OUQ reduction theorem of Owhadi et al. (2013) in order to reduce the computation of optimal upper and lower bounds on probabilities of outcomes to a finite-dimensional optimization problem. We illustrate the resulting UQ protocol by means of an application concerned with the response to hypervelocity impact of 6061-T6 Aluminum plates by Nylon 6/6 impactors at impact velocities in the range of 5–7 km/s. The ability of the legacy OUQ protocol to process diverse information on the system and its ability to supply rigorous bounds on system performance under realistic—and less than ideal—scenarios demonstrated by the hypervelocity impact application is remarkable.

On the Brittleness of Bayesian Inference

Year: 2015 DOI: 10.1137/130938633 With the advent of high-performance computing, Bayesian methods are becoming increasingly popular tools for the quantification of uncertainty throughout science and industry. Since these methods can impact the making of sometimes critical decisions in increasingly complicated contexts, the sensitivity of their posterior conclusions with respect to the underlying models and prior beliefs is a pressing question to which there currently exist positive and negative answers. We report new results suggesting that, although Bayesian methods are robust when the number of possible outcomes is finite or when only a finite number of marginals of the data-generating distribution are unknown, they could be generically brittle when applied to continuous systems (and their discretizations) with finite information on the data-generating distribution. If closeness is defined in terms of the total variation (TV) metric or the matching of a finite system of generalized moments, then (1) two practitioners who use arbitrarily close models and observe the same (possibly arbitrarily large amount of) data may reach opposite conclusions; and (2) any given prior and model can be slightly perturbed to achieve any desired posterior conclusion. The mechanism causing brittleness/robustness suggests that learning and robustness are antagonistic requirements, which raises the possibility of a missing stability condition when using Bayesian inference in a continuous world under finite information.

Brittleness of Bayesian inference under finite information in a continuous world

Year: 2015 DOI: 10.1214/15-EJS989 We derive, in the classical framework of Bayesian sensitivity analysis, optimal lower and upper bounds on posterior values obtained from Bayesian models that exactly capture an arbitrarily large number of finite-dimensional marginals of the data-generating distribution and/or that are as close as desired to the data-generating distribution in the Prokhorov or total variation metrics; these bounds show that such models may still make the largest possible prediction error after conditioning on an arbitrarily large number of sample data measured at finite precision. These results are obtained through the development of a reduction calculus for optimization problems over measures on spaces of measures. We use this calculus to investigate the mechanisms that generate brittleness/robustness and, in particular, we observe that learning and robustness are antagonistic properties. It is now well understood that the numerical resolution of PDEs requires the satisfaction of specific stability conditions. Is there a missing stability condition for using Bayesian inference in a continuous world under finite information?

Convex Optimal Uncertainty Quantification

Year: 2015 DOI: 10.1137/13094712X Optimal uncertainty quantification (OUQ) is a framework for numerical extreme-case analysis of stochastic systems with imperfect knowledge of the underlying probability distribution. This paper presents sufficient conditions under which an OUQ problem can be reformulated as a finite-dimensional convex optimization problem, for which efficient numerical solutions can be obtained. The sufficient conditions include that the objective function is piecewise concave and the constraints are piecewise convex. In particular, we show that piecewise concave objective functions may appear in applications where the objective is defined by the optimal value of a parameterized linear program.

Bayesian Numerical Homogenization

Year: 2015 DOI: 10.1137/140974596 Numerical homogenization, i.e., the finite-dimensional approximation of solution spaces of PDEs with arbitrary rough coefficients, requires the identification of accurate basis elements. These basis elements are oftentimes found after a laborious process of scientific investigation and plain guesswork. Can this identification problem be facilitated? Is there a general recipe/decision framework for guiding the design of basis elements? We suggest that the answer to the above questions could be positive based on the reformulation of numerical homogenization as a Bayesian inference problem in which a given PDE with rough coefficients (or multiscale operator) is excited with noise (random right-hand side/source term) and one tries to estimate the value of the solution at a given point based on a finite number of observations. We apply this reformulation to the identification of bases for the numerical homogenization of arbitrary integro-differential equations and show that these bases have optimal recovery properties. In particular we show how rough polyharmonic splines can be rediscovered as the optimal solution of a Gaussian filtering problem.

Temporal homogenization of linear ODEs, with applications to parametric super-resonance and energy harvest

Year: 2015 DOI: 10.1007/s00205-015-0932-4 We consider the temporal homogenization of linear ODEs of the form ẋ =Ax+ϵP(t)x+f(t), where P(t) is periodic and ϵ is small. Using a 2-scale expansion approach, we obtain the long-time approximation x(t)≈exp(At) (Ω(t)+∫^t_0exp(−Aτ)f(τ)dτ), where Ω solves the cell problem Ω=ϵBΩ+ϵF(t) with an effective matrix B and an explicitly-known F(t). We provide necessary and sufficient conditions for the accuracy of the approximation (over a O(ϵ^(−1)) time-scale), and show how B can be computed (at a cost independent of ϵ). As a direct application, we investigate the possibility of using RLC circuits to harvest the energy contained in small scale oscillations of ambient electromagnetic fields (such as Schumann resonances). Although a RLC circuit parametrically coupled to the field may achieve such energy extraction via parametric resonance, its resistance R needs to be smaller than a threshold κ proportional to the fluctuations of the field, thereby limiting practical applications. We show that if n RLC circuits are appropriately coupled via mutual capacitances or inductances, then energy extraction can be achieved when the resistance of each circuit is smaller than nκ. Hence, if the resistance of each circuit has a non-zero fixed value, energy extraction can be made possible through the coupling of a sufficiently large number n of circuits (n≈1000 for the first mode of Schumann resonances and contemporary values of capacitances, inductances and resistances). The theory is also applied to the control of the oscillation amplitude of a (damped) oscillator.

Variational and linearly implicit integrators, with applications

Year: 2016 DOI: 10.1093/imanum/dru064 We show that symplectic and linearly implicit integrators proposed by Zhang & Skeel (1997, Cheap implicit symplectic integrators. Appl. Numer. Math., 25, 297–302) are variational linearizations of Newmark methods. When used in conjunction with penalty methods (i.e., methods that replace constraints by stiff potentials), these integrators permit coarse time-stepping of holonomically constrained mechanical systems and bypass the resolution of nonlinear systems. Although penalty methods are widely employed, an explicit link to Lagrange multiplier approaches appears to be lacking; such a link is now provided (in the context of two-scale flow convergence (Tao, M., Owhadi, H. & Marsden, J. E. (2010) Nonintrusive and structure-preserving multiscale integration of stiff ODEs, SDEs and Hamiltonian systems with hidden slow dynamics via flow averaging. Multiscale Model. Simul., 8, 1269–1324). The variational formulation also allows efficient simulations of mechanical systems on Lie groups.

Extreme points of a ball about a measure with finite support

Year: 2017 DOI: 10.4310/CMS.2017.v15.n1.a4 We show that, for the space of Borel probability measures on a Borel subset of a Polish metric space, the extreme points of the Prokhorov, Monge–Wasserstein and Kantorovich metric balls about a measure whose support has at most n points, consist of measures whose supports have at most n+2 points. Moreover, we use the Strassen and Kantorovich–Rubinstein duality theorems to develop representations of supersets of the extreme points based on linear programming, and then develop these representations towards the goal of their efficient computation.

Multigrid with rough coefficients and Multiresolution operator decomposition from Hierarchical Information Games

Year: 2017 DOI: 10.1137/15M1013894 We introduce a near-linear complexity (geometric and meshless/algebraic) multigrid/multiresolution method for PDEs with rough (L∞) coefficients with rigorous a priori accuracy and performance estimates. The method is discovered through a decision/game theory formulation of the problems of (1) identifying restriction and interpolation operators, (2) recovering a signal from incomplete measurements based on norm constraints on its image under a linear operator, and (3) gambling on the value of the solution of the PDE based on a hierarchy of nested measurements of its solution or source term. The resulting elementary gambles form a hierarchy of (deterministic) basis functions of H^1_0 (Ω) (gamblets) that (1) are orthogonal across subscales/subbands with respect to the scalar product induced by the energy norm of the PDE, (2) enable sparse compression of the solution space in H^1_0 (Ω), and (3) induce an orthogonal multiresolution operator decomposition. The operating diagram of the multigrid method is that of an inverted pyramid in which gamblets are computed locally (by virtue of their exponential decay) and hierarchically (from fine to coarse scales) and the PDE is decomposed into a hierarchy of independent linear systems with uniformly bounded condition numbers. The resulting algorithm is parallelizable both in space (via localization) and in bandwidth/subscale (subscales can be computed independently from each other). Although the method is deterministic, it has a natural Bayesian interpretation under the measure of probability emerging (as a mixed strategy) from the information game formulation, and multiresolution approximations form a martingale with respect to the filtration induced by the hierarchy of nested measurements.

Separability of reproducing kernel Hilbert spaces

Year: 2017 DOI: 10.1090/proc/13354 We demonstrate that a reproducing kernel Hilbert or Banach space of functions on a separable absolute Borel space or an analytic subset of a Polish space is separable if it possesses a Borel measurable feature map.

Self-Powered Dynamic Systems in the Framework of Optimal Uncertainty Quantification

Year: 2017 DOI: 10.1115/1.4036367 The energy that is needed for operating a self-powered device is provided by the energy excess in the system in the form of kinetic energy, or a combination of regenerative and renewable energy. This paper addresses the energy exchange issues pertaining to regenerative and renewable energy in the development of a self-powered dynamic system. A rigorous framework that explores the supply and demand of energy for self-powered systems is developed, which considers uncertainties and optimal bounds, in the context of optimal uncertainty quantification. Examples of regenerative and solar-powered systems are given, and the analysis of self-powered feedback control for developing a fully self-powered dynamic system is discussed.

Gamblets for opening the complexity-bottleneck of implicit schemes for hyperbolic and parabolic ODEs/PDEs with rough coefficients

Year: 2017 DOI: 10.1016/j.jcp.2017.06.037 Implicit schemes are popular methods for the integration of time dependent PDEs such as hyperbolic and parabolic PDEs. However the necessity to solve corresponding linear systems at each time step constitutes a complexity bottleneck in their application to PDEs with rough coefficients. We present a generalization of gamblets introduced in [62] enabling the resolution of these implicit systems in near-linear complexity and provide rigorous a-priori error bounds on the resulting numerical approximations of hyperbolic and parabolic PDEs. These generalized gamblets induce a multiresolution decomposition of the solution space that is adapted to both the underlying (hyperbolic and parabolic) PDE (and the system of ODEs resulting from space discretization) and to the time-steps of the numerical scheme.

Qualitative Robustness in Bayesian Inference

Year: 2017 DOI: 10.1051/ps/2017014 The practical implementation of Bayesian inference requires numerical approximation when closed-form expressions are not available. What types of accuracy (convergence) of the numerical approximations guarantee robustness and what types do not? In particular, is the recursive application of Bayes' rule robust when subsequent data or posteriors are approximated? When the prior is the push forward of a distribution by the map induced by the solution of a PDE, in which norm should that solution be approximated? Motivated by such questions, we investigate the sensitivity of the distribution of posterior distributions (i.e. of posterior distribution-valued random variables, randomized through the data) with respect to perturbations of the prior and data-generating distributions in the limit when the number of data points grows towards infinity.

Operator-adapted wavelets for finite-element differential forms

Year: 2019 DOI: 10.1016/j.jcp.2019.02.018 We introduce in this paper an operator-adapted multiresolution analysis for finite-element differential forms. From a given continuous, linear, bijective, and self-adjoint positive-definite operator L, a hierarchy of basis functions and associated wavelets for discrete differential forms is constructed in a fine-to-coarse fashion and in quasilinear time. The resulting wavelets are L-orthogonal across all scales, and can be used to derive a Galerkin discretization of the operator such that its stiffness matrix becomes block-diagonal, with uniformly well-conditioned and sparse blocks. Because our approach applies to arbitrary differential p-forms, we can derive both scalar-valued and vector-valued wavelets block-diagonalizing a prescribed operator. We also discuss the generality of the construction by pointing out that it applies to various types of computational grids, offers arbitrary smoothness orders of basis functions and wavelets, and can accommodate linear differential constraints such as divergence-freeness. Finally, we demonstrate the benefits of the corresponding operator-adapted multiresolution decomposition for coarse-graining and model reduction of linear and non-linear partial differential equations.

Kernel Flows: From learning kernels from data into the abyss

Year: 2019 DOI: 10.1016/j.jcp.2019.03.040 Learning can be seen as approximating an unknown function by interpolating the training data. Although Kriging offers a solution to this problem, it requires the prior specification of a kernel and it is not scalable to large datasets. We explore a numerical approximation approach to kernel selection/construction based on the simple premise that a kernel must be good if the number of interpolation points can be halved without significant loss in accuracy (measured using the intrinsic RKHS norm ∥·∥ associated with the kernel). We first test and motivate this idea on a simple problem of recovering the Green's function of an elliptic PDE (with inhomogeneous coefficients) from the sparse observation of one of its solutions. Next we consider the problem of learning non-parametric families of deep kernels of the form K_1(F_n(x), F_n(x')) with F_(n+1) = (I_d + ϵG_(n+1)) ◦ F_n and G_(n+1) ∈ span{K_1(F_n(x_i), ·)}. With the proposed approach constructing the kernel becomes equivalent to integrating a stochastic data driven dynamical system, which allows for the training of very deep (bottomless) networks and the exploration of their properties. These networks learn by constructing flow maps in the kernel and input spaces via incremental data-dependent deformations/perturbations (appearing as the cooperative counterpart of adversarial examples) and, at profound depths, they (1) can achieve accurate classification from only one data point per class (2) appear to learn archetypes of each class (3) expand distances between points that are in different classes and contract distances between points in the same class. For kernels parameterized by the weights of Convolutional Neural Networks, minimizing approximation errors incurred by halving random subsets of interpolation points, appears to outperform training (the same CNN architecture) with relative entropy and dropout.

Multiresolution Operator Decomposition for Flow Simulation in Fractured Porous Media

Year: 2019 DOI: 10.1016/j.jcp.2018.12.032 Fractures should be simulated accurately given their significant effects on whole flow patterns in porous media. But such high-resolution simulations impose severe computational challenges to numerical methods in the applications. Therefore, the demand for accurate and efficient coarse-graining techniques is increasing. In this work, a near-linear complexity multiresolution operator decomposition method is proposed for solving and coarse graining flow problems in fractured porous media. We use the Discrete Fracture Model (DFM) to describe fractures, in which the fractures are explicitly represented as -dimensional elements. Using operator adapted wavelets, the solution space is decomposed into subspaces where DFM subsolutions can be computed by solving sparse and well-conditioned linear systems. By keeping only the coarse-scale part of the solution space, we furthermore obtain a reduced order model. We provide numerical experiments that investigate the accuracy of the reduced order model for different resolutions and different choices of medium.

Material-adapted refinable basis functions for elasticity simulation

Year: 2019 DOI: 10.1145/3355089.3356567 In this paper, we introduce a hierarchical construction of material-adapted refinable basis functions and associated wavelets to offer efficient coarse-graining of linear elastic objects. While spectral methods rely on global basis functions to restrict the number of degrees of freedom, our basis functions are locally supported; yet, unlike typical polynomial basis functions, they are adapted to the material inhomogeneity of the elastic object to better capture its physical properties and behavior. In particular, they share spectral approximation properties with eigenfunctions, offering a good compromise between computational complexity and accuracy. Their construction involves only linear algebra and follows a fine-to-coarse approach, leading to a block-diagonalization of the stiffness matrix where each block corresponds to an intermediate scale space of the elastic object. Once this hierarchy has been precomputed, we can simulate an object at runtime on very coarse resolution grids and still capture the correct physical behavior, with orders of magnitude speedup compared to a fine simulation. We show on a variety of heterogeneous materials that our approach outperforms all previous coarse-graining methods for elasticity.

De-noising by thresholding operator adapted wavelets

Year: 2019 DOI: 10.1007/s11222-019-09893-x Donoho and Johnstone (Ann Stat 26(3):879–921, 1998) proposed a method from reconstructing an unknown smooth function u from noisy data u+ζ by translating the empirical wavelet coefficients of u+ζ towards zero. We consider the situation where the prior information on the unknown function u may not be the regularity of u but that of Lu where L is a linear operator (such as a PDE or a graph Laplacian). We show that the approximation of u obtained by thresholding the gamblet (operator adapted wavelet) coefficients of u+ζ is near minimax optimal (up to a multiplicative constant), and with high probability, its energy norm (defined by the operator) is bounded by that of u up to a constant depending on the amplitude of the noise. Since gamblets can be computed in O(NpolylogN) complexity and are localized both in space and eigenspace, the proposed method is of near-linear complexity and generalizable to nonhomogeneous noise.

Fast eigenpairs computation with operator adapted wavelets and hierarchical subspace correction

Year: 2019 DOI: 10.1137/18M1194079 We present a method for the fast computation of the eigenpairs of a bijective positive symmetric linear operator L. The method is based on a combination of operator adapted wavelets (gamblets) with hierarchical subspace correction. First, gamblets provide a raw but fast approximation of the eigensubspaces of L by block-diagonalizing L into sparse and well-conditioned blocks. Next, the hierarchical subspace correction method computes the eigenpairs associated with the Galerkin restriction of L to a coarse (low-dimensional) gamblet subspace and then corrects those eigenpairs by solving a hierarchy of linear problems in the finer gamblet subspaces (from coarse to fine, using multigrid iteration). The proposed algorithm is robust to the presence of multiple (a continuum of) scales and is shown to be of near-linear complexity when L is an (arbitrary local, e.g., differential) operator mapping H^s₀(Ω) to H^(−s)(Ω) (e.g., an elliptic PDE with rough coefficients).

Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

Year: 2021 DOI: 10.1137/19M129526X Dense kernel matrices Θ∈R^(N×N) obtained from point evaluations of a covariance function G at locations {x_i}_(1≤i≤N)⊂ ℝ^d arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously distributed sampling points, we show how to identify a subset S⊂{1,…,N}², with #S=O(N log(N)log^d(N/ϵ)), such that the zero fill-in incomplete Cholesky factorization of the sparse matrix Θ_(i,j)1_((i,j)∈S) is an ϵ-approximation of Θ. This factorization can provably be obtained in complexity O(N log(N) log^d(N/ϵ)) in space and O(N log²(N) log^(2d))(N/ϵ)) in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that d can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the x_i and does not require an analytic representation of G. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity O (N log^d(N/ϵ)) in space and O (N log^(2d)(N/ϵ)) in time, improving upon the state of the art for general elliptic operators.

Sparse Cholesky Factorization by Kullback-Leibler Minimization

Year: 2021 DOI: 10.1137/20M1336254 We propose to compute a sparse approximate inverse Cholesky factor L of a dense covariance matrix Θ by minimizing the Kullback--Leibler divergence between the Gaussian distributions N(0,Θ) and N(0,L^(−⊤)L⁻¹), subject to a sparsity constraint. Surprisingly, this problem has a closed-form solution that can be computed efficiently, recovering the popular Vecchia approximation in spatial statistics. Based on recent results on the approximate sparsity of inverse Cholesky factors of Θ obtained from pairwise evaluation of Green's functions of elliptic boundary-value problems at points {x_i}_(1≤i≤N) ⊂ ℝ^d, we propose an elimination ordering and sparsity pattern that allows us to compute ϵ-approximate inverse Cholesky factors of such Θ in computational complexity O(N log(N/ϵ)^d) in space and O(N log(N/ϵ)^(2d)) in time. To the best of our knowledge, this is the best asymptotic complexity for this class of problems. Furthermore, our method is embarrassingly parallel, automatically exploits low-dimensional structure in the data, and can perform Gaussian-process regression in linear (in N) space complexity. Motivated by its optimality properties, we propose applying our method to the joint covariance of training and prediction points in Gaussian-process regression, greatly improving stability and computational cost. Finally, we show how to apply our method to the important setting of Gaussian processes with additive noise, compromising neither accuracy nor computational complexity.

Consistency of empirical Bayes and kernel flow for hierarchical parameter estimation

Year: 2021 DOI: 10.1090/mcom/3649 Gaussian process regression has proven very powerful in statistics, machine learning and inverse problems. A crucial aspect of the success of this methodology, in a wide range of applications to complex and real-world problems, is hierarchical modeling and learning of hyperparameters. The purpose of this paper is to study two paradigms of learning hierarchical parameters: one is from the probabilistic Bayesian perspective, in particular, the empirical Bayes approach that has been largely used in Bayesian statistics; the other is from the deterministic and approximation theoretic view, and in particular the kernel flow algorithm that was proposed recently in the machine learning literature. Analysis of their consistency in the large data limit, as well as explicit identification of their implicit bias in parameter learning, are established in this paper for a Matérn-like model on the torus. A particular technical challenge we overcome is the learning of the regularity parameter in the Matérn-like field, for which consistency results have been very scarce in the spatial statistics literature. Moreover, we conduct extensive numerical experiments beyond the Matérn-like model, comparing the two algorithms further. These experiments demonstrate learning of other hierarchical parameters, such as amplitude and lengthscale; they also illustrate the setting of model misspecification in which the kernel flow approach could show superior performance to the more traditional empirical Bayes approach.

Learning dynamical systems from data: A simple cross-validation perspective, part I: Parametric kernel flows

Year: 2021 DOI: 10.1016/j.physd.2020.132817 Regressing the vector field of a dynamical system from a finite number of observed states is a natural way to learn surrogate models for such systems. We present variants of cross-validation (Kernel Flows (Owhadi and Yoo, 2019) and its variants based on Maximum Mean Discrepancy and Lyapunov exponents) as simple approaches for learning the kernel used in these emulators.

Simple, low-cost and accurate data-driven geophysical forecasting with learned kernels

Year: 2021 DOI: 10.1098/rspa.2021.0326 Modelling geophysical processes as low-dimensional dynamical systems and regressing their vector field from data is a promising approach for learning emulators of such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), then the resulting data-driven models are not only faster than equation-based models but are easier to train than neural networks such as the long short-term memory neural network. In addition, they are also more accurate and predictive than the latter. When trained on geophysical observational data, for example the weekly averaged global sea-surface temperature, considerable gains are also observed by the proposed technique in comparison with classical partial differential equation-based models in terms of forecast computational cost and accuracy. When trained on publicly available re-analysis data for the daily temperature of the North American continent, we see significant improvements over classical baselines such as climatology and persistence-based forecast techniques. Although our experiments concern specific examples, the proposed approach is general, and our results support the viability of kernel methods (with learned kernels) for interpretable and computationally efficient geophysical forecasting for a large diversity of processes.

Deep regularization and direct training of the inner layers of Neural Networks with Kernel Flows

Year: 2021 DOI: 10.1016/j.physd.2021.132952 We introduce a new regularization method for Artificial Neural Networks (ANNs) based on the Kernel Flow (KF) algorithm. The algorithm was introduced in Owhadi and Yoo (2019) as a method for kernel selection in regression/kriging based on the minimization of the loss of accuracy incurred by halving the number of interpolation points in random batches of the dataset. Writing f_θ(x) = (f^((n))_(θn)∘f^((n−1))_(θn−1)∘⋯∘f^((1))_(θ₁))(x) for the functional representation of compositional structure of the ANN (where θ_i are the weights and biases of the layer i), the inner layers outputs h^((i))(x) = (f^((i))_(θi)∘f^((i−1))_(θi−1)∘⋯∘f^((1))_(θ1))(x) define a hierarchy of feature maps and a hierarchy of kernels k^((i))(x,x′) = exp(−γ_i∥h^((i))(x)−h^((i))(x′)∥²₂). When combined with a batch of the dataset, these kernels produce KF losses e(i)₂ (defined as the L² regression error incurred by using a random half of the batch to predict the other half) depending on the parameters of the inner layers θ₁,…,θ_i (and γ_i). The proposed method simply consists of aggregating (as a weighted sum) a subset of these KF losses with a classical output loss (e.g., cross-entropy). We test the proposed method on Convolutional Neural Networks (CNNs) and Wide Residual Networks (WRNs) without alteration of their structure nor their output classifier and report reduced test errors, decreased generalization gaps, and increased robustness to distribution shift without a significant increase in computational complexity relative to standard CNN and WRN training (with Drop Out and Batch Normalization). We suspect that these results might be explained by the fact that while conventional training only employs a linear functional (a generalized moment) of the empirical distribution defined by the dataset and can be prone to trapping in the Neural Tangent Kernel regime (under over-parameterizations), the proposed loss function (defined as a nonlinear functional of the empirical distribution) effectively trains the underlying kernel defined by the CNN beyond regressing the data with that kernel.

Solving and learning nonlinear PDEs with Gaussian processes

Year: 2021 DOI: 10.1016/j.jcp.2021.110668 We introduce a simple, rigorous, and unified framework for solving nonlinear partial differential equations (PDEs), and for solving inverse problems (IPs) involving the identification of parameters in PDEs, using the framework of Gaussian processes. The proposed approach: (1) provides a natural generalization of collocation kernel methods to nonlinear PDEs and IPs; (2) has guaranteed convergence for a very general class of PDEs, and comes equipped with a path to compute error bounds for specific PDE approximations; (3) inherits the state-of-the-art computational complexity of linear solvers for dense kernel matrices. The main idea of our method is to approximate the solution of a given PDE as the maximum a posteriori (MAP) estimator of a Gaussian process conditioned on solving the PDE at a finite number of collocation points. Although this optimization problem is infinite-dimensional, it can be reduced to a finite-dimensional one by introducing additional variables corresponding to the values of the derivatives of the solution at collocation points; this generalizes the representer theorem arising in Gaussian process regression. The reduced optimization problem has the form of a quadratic objective function subject to nonlinear constraints; it is solved with a variant of the Gauss–Newton method. The resulting algorithm (a) can be interpreted as solving successive linearizations of the nonlinear PDE, and (b) in practice is found to converge in a small number of iterations (2 to 10), for a wide range of PDEs. Most traditional approaches to IPs interleave parameter updates with numerical solution of the PDE; our algorithm solves for both parameter and PDE solution simultaneously. Experiments on nonlinear elliptic PDEs, Burgers' equation, a regularized Eikonal equation, and an IP for permeability identification in Darcy flow illustrate the efficacy and scope of our framework.

Computational graph completion

Year: 2022 DOI: 10.1007/s40687-022-00320-8 We introduce a framework for generating, organizing, and reasoning with computational knowledge. It is motivated by the observation that most problems in Computational Sciences and Engineering (CSE) can be described as that of completing (from data) a computational graph (or hypergraph) representing dependencies between functions and variables. In that setting nodes represent variables and edges (or hyperedges) represent functions (or functionals). Functions and variables may be known, unknown, or random. Data come in the form of observations of distinct values of a finite number of subsets of the variables of the graph (satisfying its functional dependencies). The underlying problem combines a regression problem (approximating unknown functions) with a matrix completion problem (recovering unobserved variables in the data). Replacing unknown functions by Gaussian processes and conditioning on observed data provides a simple but efficient approach to completing such graphs. Since the proposed framework is highly expressive, it has a vast potential application scope. Since the completion process can be automatized, as one solves √√2+√3 on a pocket calculator without thinking about it, one could, with the proposed framework, solve a complex CSE problem by drawing a diagram. Compared to traditional regression/kriging, the proposed framework can be used to recover unknown functions with much scarcer data by exploiting interdependencies between multiple functions and variables. The computational graph completion (CGC) problem addressed by the proposed framework could therefore also be interpreted as a generalization of that of solving linear systems of equations to that of approximating unknown variables and functions with noisy, incomplete, and nonlinear dependencies. Numerous examples illustrate the flexibility, scope, efficacy, and robustness of the CGC framework and show how it can be used as a pathway to identifying simple solutions to classical CSE problems. These examples include the seamless CGC representation of known methods (for solving/learning PDEs, surrogate/multiscale modeling, mode decomposition, deep learning) and the discovery of new ones (digital twin modeling, dimension reduction).

Mean-field limits of trained weights in deep learning: A dynamical systems perspective

Year: 2022 DOI: 10.14658/pupj-drna-2022-3-12 Training a residual neural network with L^2 regularization on weights and biases is equivalent to minimizing a discrete least action principle and to controlling a discrete Hamiltonian system representing the propagation of input data across layers. The kernel/feature map analysis of this Hamiltonian system suggests a mean-field limit for trained weights and biases as the number of data points goes to infinity. The purpose of this paper is to investigate this mean-field limit and illustrate its existence through numerical experiments and analysis (for simple kernels).

Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics

Year: 2022 DOI: 10.1016/j.jcp.2022.111595 This paper introduces algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques. We adopt the setting of kernel method solutions in ad hoc functional spaces, namely Reproducing Kernel Hilbert Spaces (RKHS), to solve the problem of approximating a regular target function given observations of it, i.e. supervised learning. A first class of algorithms is kernel flow, which was introduced in the context of classification in machine learning. It can be seen as a cross-validation procedure whereby a "best" kernel is selected such that the loss of accuracy incurred by removing some part of the dataset (typically half of it) is minimized. A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal in the associated RKHS. Within Mercer's theorem framework, we obtain an explicit construction of that "best" kernel in terms of the main features of the target function. Both approaches of learning kernels from data are illustrated by numerical examples on synthetic test functions, and on a classical test case in turbulence modeling validation for transonic flows about a two-dimensional airfoil.

Learning dynamical systems from data: A simple cross-validation perspective, Part III: Irregularly-sampled time series

Year: 2023 DOI: 10.1016/j.physd.2022.133546 A simple and interpretable way to learn a dynamical system from data is to interpolate its vector-field with a kernel. In particular, this strategy is highly efficient (both in terms of accuracy and complexity) when the kernel is data-adapted using Kernel Flows (KF) (Owhadi and Yoo, 2019) (which uses gradient-based optimization to learn a kernel based on the premise that a kernel is good if there is no significant loss in accuracy if half of the data is used for interpolation). Despite its previous successes, this strategy (based on interpolating the vector field driving the dynamical system) breaks down when the observed time series is not regularly sampled in time. In this work, we propose to address this problem by approximating a generalization of the flow map of the dynamical system by incorporating time differences between observations in the (KF) data-adapted kernels. We compare our approach with the classical one over different benchmark dynamical systems and show that it significantly improves the forecasting accuracy while remaining simple, fast, and robust.

One-shot learning of stochastic differential equations with data adapted kernels

Year: 2023 DOI: 10.1016/j.physd.2022.133583 We consider the problem of learning Stochastic Differential Equations of the form dXₜ = f (Xₜ)dₜ + σ(Xₜ)dWₜ from one sample trajectory. This problem is more challenging than learning deterministic dynamical systems because one sample trajectory only provides indirect information on the unknown functions f, σ, and stochastic process dWₜ representing the drift, the diffusion, and the stochastic forcing terms, respectively. We propose a method that combines Computational Graph Completion [1] and data adapted kernels learned via a new variant of cross validation. Our approach can be decomposed as follows: (1) Represent the time-increment map Xₜ → X_(t+dt) as a Computational Graph in which f, σ and dWₜ appear as unknown functions and random variables. (2) Complete the graph (approximate unknown functions and random variables) via Maximum a Posteriori Estimation (given the data) with Gaussian Process (GP) priors on the unknown functions. (3) Learn the covariance functions (kernels) of the GP priors from data with randomized cross-validation. Numerical experiments illustrate the efficacy, robustness, and scope of our method.

Do ideas have shape? Idea registration as the continuous limit of artificial neural networks

Year: 2023 DOI: 10.1016/j.physd.2022.133592 We introduce a Gaussian Process (GP) generalization of ResNets (with unknown functions of the network replaced by GPs and identified via MAP estimation), which includes ResNets (trained with L₂ regularization on weights and biases) as a particular case (when employing particular kernels). We show that ResNets (and their warping GP regression extension) converge, in the infinite depth limit, to a generalization of image registration variational algorithms. In this generalization, images are replaced by functions mapping input/output spaces to a space of unexpressed abstractions (ideas), and material points are replaced by data points. Whereas computational anatomy aligns images via warping of the material space, this generalization aligns ideas (or abstract shapes as in Plato's theory of forms) via the warping of the Reproducing Kernel Hilbert Space (RKHS) of functions mapping the input space to the output space. While the Hamiltonian interpretation of ResNets is not new, it was based on an Ansatz. We do not rely on this Ansatz and present the first rigorous proof of convergence of ResNets with trained weights and biases towards a Hamiltonian dynamics driven flow. Since our proof is constructive and based on discrete and continuous mechanics, it reveals several remarkable properties of ResNets and their GP generalization. ResNets regressors are kernel regressors with data-dependent warping kernels. Minimizers of L₂ regularized ResNets satisfy a discrete least action principle implying the near preservation of the norm of weights and biases across layers. The trained weights of ResNets with scaled/strong L² regularization can be identified by solving an autonomous Hamiltonian system. The trained ResNet parameters are unique up to (a function of) the initial momentum, and the initial momentum representation of those parameters is generally sparse. The kernel (nugget) regularization strategy provides a provably robust alternative to Dropout for ANNs. We introduce a functional generalization of GPs and show that pointwise GP/RKHS error estimates lead to probabilistic and deterministic generalization error estimates for ResNets. When performed with feature maps, the proposed analysis identifies the (EPDiff) mean fields limit of trained ResNet parameters as the number of data points goes to infinity. The search for good architectures can be reduced to that of good kernels, and we show that the composition of warping regression blocks with reduced equivariant multichannel kernels (introduced here) recovers and generalizes CNNs to arbitrary spaces and groups of transformations.

A note on microlocal kernel design for some slow–fast stochastic differential equations with critical transitions and application to EEG signals

Year: 2023 DOI: 10.1016/j.physa.2023.128583 This technical note presents an extension of kernel model decomposition (KMD) for detecting critical transitions in some fast–slow random dynamical systems. The approach rests upon modifying KMD for reconstructing an observable by using a novel data-based time-frequency-phase kernel that allows to approximate signals with critical transitions. In particular, we apply the developed method for approximating the solution and detecting critical transitions in some prototypical slow–fast SDEs with critical transitions. We also apply it to detecting seizures in a multi-scale mesoscale nine-dimensional SDE model of brain activity.

Gaussian process hydrodynamics

Year: 2023 DOI: 10.1007/s10483-023-2990-9 We present a Gaussian process (GP) approach, called Gaussian process hydrodynamics (GPH) for approximating the solution to the Euler and Navier-Stokes (NS) equations. Similar to smoothed particle hydrodynamics (SPH), GPH is a Lagrangian particle-based approach that involves the tracking of a finite number of particles transported by a flow. However, these particles do not represent mollified particles of matter but carry discrete/partial information about the continuous flow. Closure is achieved by placing a divergence-free GP prior ξ on the velocity field and conditioning it on the vorticity at the particle locations. Known physics (e.g., the Richardson cascade and velocity increment power laws) is incorporated into the GP prior by using physics-informed additive kernels. This is equivalent to expressing ξ as a sum of independent GPs ξl, which we call modes, acting at different scales (each mode ξl self-activates to represent the formation of eddies at the corresponding scales). This approach enables a quantitative analysis of the Richardson cascade through the analysis of the activation of these modes, and enables us to analyze coarse-grain turbulence statistically rather than deterministically. Because GPH is formulated by using the vorticity equations, it does not require solving a pressure equation. By enforcing incompressibility and fluid-structure boundary conditions through the selection of a kernel, GPH requires significantly fewer particles than SPH. Because GPH has a natural probabilistic interpretation, the numerical results come with uncertainty estimates, enabling their incorporation into an uncertainty quantification (UQ) pipeline and adding/removing particles (quanta of information) in an adapted manner. The proposed approach is suitable for analysis because it inherits the complexity of state-of-the-art solvers for dense kernel matrices and results in a natural definition of turbulence as information loss. Numerical experiments support the importance of selecting physics-informed kernels and illustrate the major impact of such kernels on the accuracy and stability. Because the proposed approach uses a Bayesian interpretation, it naturally enables data assimilation and predictions and estimations by mixing simulation data and experimental data.

Bridging Algorithmic Information Theory and Machine Learning: A new approach to kernel learning

Year: 2024 DOI: 10.1016/j.physd.2024.134153

Machine Learning (ML) and Algorithmic Information Theory (AIT) look at Complexity from different points of view. We explore the interface between AIT and Kernel Methods (that are prevalent in ML) by adopting an AIT perspective on the problem of learning kernels from data, in kernel ridge regression, through the method of Sparse Kernel Flows. In particular, by looking at the differences and commonalities between Minimal Description Length (MDL) and Regularization in Machine Learning (RML), we prove that the method of Sparse Kernel Flows is the natural approach to adopt to learn kernels from data. This approach aligns naturally with the MDL principle, offering a more robust theoretical basis than the existing reliance on cross-validation. The study reveals that deriving Sparse Kernel Flows does not require a statistical approach; instead, one can directly engage with code-lengths and complexities, concepts central to AIT. Thereby, this approach opens the door to reformulating algorithms in machine learning using tools from AIT, with the aim of providing them a more solid theoretical foundation.

Error analysis of kernel/GP methods for nonlinear and parametric PDEs

Year: 2025 DOI: 10.1016/j.jcp.2024.113488

We introduce a priori Sobolev-space error estimates for the solution of arbitrary nonlinear, and possibly parametric, PDEs that are defined in the strong sense, using Gaussian process and kernel based methods. The primary assumptions are: (1) a continuous embedding of the reproducing kernel Hilbert space of the kernel into a Sobolev space of sufficient regularity; and (2) the stability of the differential operator and the solution map of the PDE between corresponding Sobolev spaces. The proof is articulated around Sobolev norm error estimates for kernel interpolants and relies on the minimizing norm property of the solution. The error estimates demonstrate dimension-benign convergence rates if the solution space of the PDE is smooth enough. We illustrate these points with applications to high-dimensional nonlinear elliptic PDEs and parametric PDEs. Although some recent machine learning methods have been presented as breaking the curse of dimensionality in solving high-dimensional PDEs, our analysis suggests a more nuanced picture: there is a trade-off between the regularity of the solution and the presence of the curse of dimensionality. Therefore, our results are in line with the understanding that the curse is absent when the solution is regular enough.

Operator learning with Gaussian processes

Year: 2025 DOI: 10.1016/j.cma.2024.117581

Operator learning focuses on approximating mappings between infinite-dimensional spaces of functions, such as and . This makes it particularly suitable for solving parametric nonlinear partial differential equations (PDEs). Recent advancements in machine learning (ML) have brought operator learning to the forefront of research. While most progress in this area has been driven by variants of deep neural networks (NNs), recent studies have demonstrated that Gaussian process (GP)/kernel-based methods can also be competitive. These methods offer advantages in terms of interpretability and provide theoretical and computational guarantees. In this article, we introduce a hybrid GP/NN-based framework for operator learning, leveraging the strengths of both deep neural networks and kernel methods. Instead of directly approximating the function-valued operator , we use a GP to approximate its associated real-valued bilinear form This bilinear form is defined by the dual pairing $≔$ which allows us to recover the operator through The mean function of the GP can be set to zero or parameterized by a neural operator and for each setting we develop a robust and scalable training mechanism based on maximum likelihood estimation (MLE) that can optionally leverage the physics involved. Numerical benchmarks demonstrate our method’s scope, scalability, efficiency, and robustness; showing that (1) it enhances the performance of a base neural operator by using it as the mean function of a GP, and (2) it enables the construction of zero-shot data-driven models that can make accurate predictions without any prior training. Additionally, our framework (a) naturally extends to cases where maps into a vector of functions, and (b) benefits from computational speed-ups achieved through product kernel structures and Kronecker product matrix representations of the underlying kernel matrices.