Combined Feed
https://feeds.library.caltech.edu/people/Stuart-A-M/combined.rss
A Caltech Library Repository Feed
RSS specification: http://www.rssboard.org/rss-specification
Generator: python-feedgen
Language: en
Last build: Tue, 16 Apr 2024 16:06:14 +0000

Volterra integral equations and a new Gronwall inequality (Part II: The nonlinear case)
https://resolver.caltech.edu/CaltechAUTHORS:20170612-135043115
Authors: Norbury, J.; Stuart, A. M.
Year: 1987
DOI: 10.1017/S0308210500018485
We consider nonlinear singular Volterra integral equations of the second kind. We generalise the transformation method introduced in Part I of this paper [6] to cope with both the nonlinearity and slightly more general singular kernels. We also consider a particular class of nonlinear equation for which the solution behaviour is known. Using this a priori knowledge, we propose a modification of the transformation technique which results in a numerical method with good asymptotic stability properties. Applying the general theory of Part I of this paper, we prove convergence of this scheme.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/766y1-2rb19

Volterra integral equations and a new Gronwall inequality (Part I: The linear case)
https://resolver.caltech.edu/CaltechAUTHORS:20170612-131130001
Authors: Norbury, J.; Stuart, A. M.
Year: 1987
DOI: 10.1017/S0308210500018473
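Before the abstract, a minimal numerical sketch of the discrete Gronwall idea the paper uses: a perturbed linear recurrence stays below the continuous Gronwall bound. All constants and the worst-case recurrence here are illustrative choices, not taken from the paper.

```python
import math

def discrete_gronwall_bound(h, L, c, n, e0=0.0):
    # Continuous Gronwall-style bound: e_n <= e0*exp(L*n*h) + (c/L)*(exp(L*n*h) - 1).
    t = n * h
    return e0 * math.exp(L * t) + (c / L) * (math.exp(L * t) - 1.0)

# Worst case of the discrete inequality e_{n+1} <= (1 + h*L) e_n + h*c,
# the kind of bound that arises in convergence proofs for product integration.
h, L, c, N = 0.01, 2.0, 0.5, 500
e = 0.0
for n in range(N):
    e = (1.0 + h * L) * e + h * c
# Since (1 + h*L)^n <= exp(L*n*h), the recurrence stays below the continuous bound.
print(e, discrete_gronwall_bound(h, L, c, N))
```

Because (1 + hL)^n ≤ e^{Lnh}, the printed recurrence value sits strictly below the bound, uniformly in the number of steps.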
We present a generalisation of the continuous Gronwall inequality and show its use in bounding solutions of discrete inequalities of a form that arises when analysing the convergence of product integration methods for Volterra integral equations. We then use these ideas to prove convergence of a numerical method which is effective in approximating Volterra integral equations of the second kind with weakly singular kernels.
https://authors.library.caltech.edu/records/wc6s1-sfq26

Existence of Solutions of a Two-Point Free-Boundary Problem Arising in the Theory of Porous Medium Combustion
https://resolver.caltech.edu/CaltechAUTHORS:20170612-142648576
Authors: Stuart, A. M.
Year: 1987
DOI: 10.1093/imamat/38.1.23
The existence of solutions of a two-point free-boundary problem arising from the theory of travelling combustion waves in a porous medium is examined. The problem comprises a third-order nonlinear ordinary differential equation posed on an unknown interval of finite length; four boundary conditions are given, two at either end of the interval. The equations possess a trivial solution for all values of the bifurcation parameter λ. A shooting technique is employed to prove the existence of a nontrivial solution for 0 < λ < λ_c, and nonexistence theorems are proved for λ ∉ (0, λ_c).
https://authors.library.caltech.edu/records/3b4cv-ce966

Parabolic Free Boundary Problems Arising in Porous Medium Combustion
https://resolver.caltech.edu/CaltechAUTHORS:20170613-084535886
Authors: Norbury, J.; Stuart, A. M.
Year: 1987
DOI: 10.1093/imamat/39.3.241
A parabolic partial differential equation approximating the evolution of temperature in highly exothermic porous-medium combustion at low driving velocities is examined. The equation is of reaction-diffusion type with a reaction term which is discontinuous as a function of the dependent variable.
Firstly the variation of the steady solution set with the scaled heat of the reaction is described and the related time-dependent behaviour analysed. The stability results follow from characterizing the ends of the solution branch and fold points explicitly and deducing global stability results about the whole of the continuous solution branch. The results are used to indicate the parameter regimes and temporal scales on which the small driving velocity approximation ceases to be valid.
Secondly the behaviour of the discontinuous partial differential equation is compared with that of a continuous equation which it approximates. This provides justification for the approximation of reaction terms possessing steep gradients by discontinuous functions; the large activation energy limit in porous-medium combustion involves such a process.
https://authors.library.caltech.edu/records/7ehqs-2e156

The Mathematics of Porous Medium Combustion
https://resolver.caltech.edu/CaltechAUTHORS:20170612-132031395
Authors: Stuart, A. M.
Year: 1988
DOI: 10.1007/978-1-4613-9608-6_18
Two partial differential equations arising from the theory of porous medium combustion are examined. While both equations possess a trivial steady solution, the form of the reaction rate, which is discontinuous as a function of the dependent variable, precludes bifurcation of non-trivial steady solutions from the branch of trivial solutions. A constructive approach to the existence theory for non-trivial global solution branches is developed. The method relies on finding an appropriate set of solution dependent transformations which render the problems in a form to which local bifurcation theory is directly applicable. Specifically, by taking a singular limit of the (solution dependent) transformation, an artificial trivial solution (or set of solutions) of the transformed problem is created. The (solution dependent) mapping is not invertible when evaluated at the trivial solution(s) of the transformed problem; however, for non-trivial solutions which exist arbitrarily close to the artificial trivial solution, the mapping is invertible. By applying local bifurcation theory to the transformed problem and mapping back to the original problem, a series expansion for the non-trivial solution branch is obtained.
https://authors.library.caltech.edu/records/55yhx-6v166

Travelling Combustion Waves in a Porous Medium. Part I—Existence
https://resolver.caltech.edu/CaltechAUTHORS:20170612-165322289
Authors: Norbury, J.; Stuart, A. M.
Year: 1988
DOI: 10.1137/0148007
A one-space-dimensional, time-dependent model for travelling combustion waves in a porous medium is analysed. The key variables are the temperature of the solid medium and its density and the temperature of the gaseous phase and its density. The key parameters µ, λ and a are related (respectively) to the driving gas velocity, the specific heat of the combustible solid and the ratio of consumption of oxygen to that of solid. The regions of existence of the different types of combustion waves are found in (µ, λ) parameter space, with a = 0. The types of combustion wave are classified by the switch mechanism that turns off the combustion, which occurs over a finite, but unknown, interval. Because the model is linear outside the combustion zone, the eigenvalue problem governing the existence of travelling waves may be reformulated as a two-point free boundary problem on a finite domain. Existence and nonexistence theorems are established for this unusual bifurcation problem.
https://authors.library.caltech.edu/records/mwykp-q6j88

Travelling Combustion Waves in a Porous Medium. Part II—Stability
https://resolver.caltech.edu/CaltechAUTHORS:20170612-164819129
Authors: Norbury, J.; Stuart, A. M. (ORCID: 0000-0001-9091-7266)
Year: 1988
DOI: 10.1137/0148019
The linear stability properties of the travelling combustion waves found in Part I are examined. The key parameters which determine the stability properties of the waves are found to be the (scaled) driving velocity and the solid specific heat. In particular, the destabilising influence of increasing either of these two parameters is demonstrated. The results indicate that travelling combustion waves whose reaction is turned off because the solid temperature becomes too low are always unstable, whereas travelling waves whose reaction is turned off due to depletion of solid reactant can be stable. Global techniques are employed to prove that, for large enough values of the scaled solid specific heat, combustion cannot be sustained in any form, and all initial conditions lead to extinction.
https://authors.library.caltech.edu/records/dbzy2-t2t45

Similarity Solutions of a Heat Equation with Nonlinearly Varying Heat Capacity
https://resolver.caltech.edu/CaltechAUTHORS:20170613-090149803
Authors: Stuart, Andrew
Year: 1988
DOI: 10.1093/imamat/40.3.217
A reaction-diffusion equation, coupled through variable heat capacity and source term to a temporally evolving ordinary differential equation, is examined. The model is a prototype for the study of combustion processes where the heat capacity of a composite solid medium changes significantly as the reactant within the medium is consumed.
Similarity solutions are sought by analysing the invariance of the equations to various stretching groups. The resulting two-point boundary-value problem is singular at the origin and posed on the semi-infinite domain. By employing series expansion techniques we derive a regular problem posed on a finite domain. This problem is amenable to standard numerical solution by means of Newton-Kantorovich iteration. Results of the computations are presented and interpreted in terms of the governing partial differential equation.
https://authors.library.caltech.edu/records/5x7by-n8e76

A Note on High/Low-Wave-Number Interactions in Spatially Discrete Parabolic Equations
https://resolver.caltech.edu/CaltechAUTHORS:20170613-135127282
Authors: Stuart, Andrew
Year: 1989
DOI: 10.1093/imamat/42.1.27
We describe an instability introduced by the spatial discretization of reaction-diffusion equations. The mechanism is a nonlinear interaction between high and low wave-number modes in the discrete equations. In partial differential equations which exhibit strong temporal growth, a parasitic high-wave-number mode is stimulated, through aliasing, by a physically meaningful low-wave-number mode. We analyse the interaction using phase-plane techniques and present complementary numerical results.
https://authors.library.caltech.edu/records/9n8yx-ced40

Singular Free Boundary Problems and Local Bifurcation Theory
https://resolver.caltech.edu/CaltechAUTHORS:20170613-080915333
Authors: Stuart, A. M.
Year: 1989
DOI: 10.1137/0149004
A constructive method applicable to the solution of a wide class of free boundary problems is presented. A solution-dependent transformation technique is introduced. By considering a singular limit of the transformation, a related problem, to which local bifurcation theory may be applied, is derived. By inverting the (near singular) mapping between the two problems, an expression for solutions of the original problem is obtained.
The method is illustrated by the study of a singularly perturbed elliptic equation. Approximate solutions are constructed and the validity of the approximations established by means of the Contraction Mapping Theorem.
https://authors.library.caltech.edu/records/9k7ws-n9d44

A Model for Porous-Medium Combustion
https://resolver.caltech.edu/CaltechAUTHORS:20170613-090547094
Authors: Norbury, J.; Stuart, A. M.
Year: 1989
DOI: 10.1093/qjmam/42.1.159
A model of time-dependent porous-medium combustion is presented. The model is of combustion in a three-dimensional porous medium. The typical situation envisaged is the combustion of a non-deforming porous solid medium through which a gas such as air passes. The model represents conservation of mass and energy for both the gas and solid species, whilst the fluid flow is governed by Darcy's law and the ideal-gas law. This model is highly complex and requires sophisticated computer analysis.
Consequently we derive a simplified model as a one-dimensional version of the equations, by a number of asymptotic considerations. Central to the analysis is the concept of the large-activation-energy limit. This limit is shown to have entirely different features from those which arise in conventional flame theory. This fact is a consequence of the two-stage reaction rate governing porous-medium combustion; the stages are first the diffusion of gas components between the gas mainstream and the reaction sites in the solid and secondly the conventional Arrhenius reaction. Thus the overall reaction rate is not proportional to the Arrhenius reaction rate, but is a rational function of it.
Because of this two-stage reaction rate, the limit E→∞ has a novel result not encountered in conventional flame theory. A critical switching temperature T_c, determined by A = exp(E/T_c), where A is the pre-exponential factor in the Arrhenius reaction term, arises naturally from the large-activation-energy analysis. For temperatures beneath T_c the reaction rate is negligible whereas for temperatures above T_c the reaction is controlled by the ability of the active gas components to diffuse into or out of the reaction sites in the solid. This rate of active gas-component diffusion has been shown experimentally to be proportional to a power (approximately the square) of the gas temperature. Thus, when switched on, the rate-limiting reaction rate grows algebraically with the temperature, in contrast to the explosive exponential growth of the Arrhenius term which governs the switching process.
https://authors.library.caltech.edu/records/ppmwq-n9d49

Nonlinear Instability in Dissipative Finite Difference Schemes
https://resolver.caltech.edu/CaltechAUTHORS:20170613-075253765
Authors: Stuart, Andrew
Year: 1989
DOI: 10.1137/1031048
A unified analysis of reaction-diffusion equations and their finite difference representations is presented. The parallel treatment of the two problems shows clearly when and why the finite difference approximations break down. The approach used provides a general framework for the analysis and interpretation of numerical instability in approximations of dissipative nonlinear partial differential equations.
Continuous and discrete problems are studied from the perspective of bifurcation theory, and numerical instability is shown to be associated with the bifurcation of periodic orbits in discrete systems. An asymptotic approach, due to Newell (SIAM J. Appl. Math., 33 (1977), 133–160), is used to investigate the instability phenomenon further. In particular, equations are derived that describe the interaction of the dynamics of the partial differential equation with the artefacts of the discretization.
https://authors.library.caltech.edu/records/hmmky-1jj73

Linear Instability Implies Spurious Periodic Solutions
https://resolver.caltech.edu/CaltechAUTHORS:20170612-083455306
Authors: Stuart, Andrew
Year: 1989
DOI: 10.1093/imanum/9.4.465
We analyse discrete approximations of reaction-diffusion-convection equations and show that linearized instability implies the existence of spurious periodic solutions in the fully nonlinear problem. The result is proved by using ideas from bifurcation theory. Using singularity theory we provide a precise local description of the spurious solutions. The results form the basis for an analysis of the range of discretization parameters in which spurious solutions can exist, their magnitude, and their spatial structure. We present a modified equations approach to determine criteria under which spurious periodic solutions exist for arbitrarily small values of the time-step. The theoretical results are applied to a specific example.
https://authors.library.caltech.edu/records/vphsg-wzp91

The Global Attractor Under Discretisation
https://resolver.caltech.edu/CaltechAUTHORS:20170612-140046210
Authors: Stuart, Andrew
Year: 1990
DOI: 10.1007/978-94-009-0659-4_14
The effect of temporal discretisation on dissipative differential equations is analysed. We discuss the effect of discretisation on the global attractor and survey some recent results in the area. The advantage of concentrating on ω and α limit sets (which are contained in the global attractor) is described. An analysis of spurious bifurcations in the ω and α limit sets is presented for linear multistep methods, using the time-step Δt as the bifurcation parameter. The results arising from application of local bifurcation theory are shown to hold globally, and a necessary and sufficient condition is derived for the non-existence of a particular class of spurious solutions for all Δt > 0. The class of linear multistep methods satisfying this condition is fairly restricted, since the underlying theory is very general and takes no account of any inherent structure in the underlying differential equations. Hence a method complementary to the bifurcation analysis is described, the aim being to construct methods for which spurious solutions do not exist for Δt sufficiently small; for infinite-dimensional dynamical systems the method relies on examining steady boundary value problems (which govern the existence of spurious solutions) in the singular limit corresponding to Δt → 0+. The analysis we describe is helpful in the design of schemes for long-time simulations.
https://authors.library.caltech.edu/records/an7xg-43319

On the computation of blow-up
https://resolver.caltech.edu/CaltechAUTHORS:20170612-101245620
Authors: Stuart, A. M.; Floater, M. S.
Year: 1990
DOI: 10.1017/S095679250000005X
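Before the abstract, an illustrative sketch of the re-scaled time-stepping idea (constants and the specific step choice dt = τ/y are assumptions for the demo, not the paper's exact strategy): for y' = y², y(0) = 1, the true solution blows up at t* = 1, and shrinking the step in proportion to 1/y makes the accumulated times converge to an estimate of t*.

```python
# y' = y^2, y(0) = 1 blows up at t* = 1 (exact solution y = 1/(1 - t)).
# A fixed-step Euler method reaches any finite time with a finite value,
# so it cannot locate the singularity; re-scaled steps dt_n = tau / y_n
# give y_{n+1} = (1 + tau) y_n, and the steps sum to a finite limit
# that approximates the blow-up time.
tau = 1e-3
y, t = 1.0, 0.0
for _ in range(40000):
    dt = tau / y          # step shrinks as the solution grows
    y += dt * y * y       # explicit Euler step
    t += dt
print(t)                  # estimate of the blow-up time, close to 1
```

Summing the geometric series shows t_n → 1 + τ, so the estimated blow-up time is first-order accurate in the re-scaling parameter τ.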
Numerical methods for initial-value problems which develop singularities in finite time are analyzed. The objective is to determine simple strategies which produce the correct asymptotic behaviour and give an accurate approximation of the blow-up time. Fixed step methods for scalar ordinary differential equations are studied first and it is shown that there is a natural embedding of the discrete process in a continuous one. This shows clearly how and why the fixed-step strategy fails. A class of time-stepping strategies that correspond to a time-continuous re-scaling of the underlying differential equation is then proposed; this class is analyzed and criteria established to determine suitable choices for the re-scaling. Finally the ideas are applied to a partial differential equation arising from the study of a fluid with temperature-dependent viscosity. The numerical method involves re-formulating the equation as a moving boundary problem for the peak value and applying the ODE time-stepping strategies based on this peak value.
https://authors.library.caltech.edu/records/e5k6g-8v733

Singular Limits in Free Boundary Problems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-144801379
Authors: Stuart, Andrew M.
Year: 1991
DOI: 10.1216/rmjm/1181072969
We analyze the following class of nonlinear eigenvalue problems: find (u, µ) ∈ B × R satisfying
(1) Du + µH(a·u − 1)f(u) = 0 in Ω ⊆ R^N,
(2) u = 0 on ∂Ω.
Here H(X) is the Heaviside step function, defined by H(X) = 0 for X ≤ 0 and H(X) = 1 for X > 0. B is a Banach space appropriate to the problem, and D is taken to be a (possibly nonlinear) differential operator with the property that, when µ = 0, equations (1, 2) have the unique solution u ≡ 0.
https://authors.library.caltech.edu/records/gyy0y-6fg03

A Mathematical Model for the Diffusion of Tumour Angiogenesis Factor into the Surrounding Host Tissue
https://resolver.caltech.edu/CaltechAUTHORS:20170612-104202241
Authors: Chaplain, M. A. J.; Stuart, A. M.
Year: 1991
DOI: 10.1093/imammb/8.3.191
Unless they are furnished with an adequate blood supply and a means of disposing of their waste products by a mechanism other than diffusion, solid tumours cannot grow beyond a few millimetres in diameter. It is now a well-established fact that, in order to accomplish this neovascularization, solid tumours secrete a diffusible chemical compound known as tumour angiogenesis factor (TAF) into the surrounding tissue. This stimulates nearby blood vessels to migrate towards and finally penetrate the tumour. Once provided with the new supply of nutrient, rapid growth takes place. In this paper, a mathematical model is presented for the diffusion of TAF into the surrounding tissue. The complete process of angiogenesis is made up of a sequence of several distinct events and the model is an attempt to take into account as many of these as possible. In the diffusion equation for the TAF, a decay term is included which models the loss of the chemical in the surrounding tissue itself. A threshold distance for the TAF is incorporated in an attempt to reflect the results from experiments of corneal implants in test animals. By formulating the problems in terms of a free boundary problem, the extent of the diffusion of TAF into the surrounding tissue can be monitored. Finally, by introducing a sink term representing the action of proliferating endothelial cells, the boundary of the TAF is seen to recede, and hence the position and movement of the capillaries can be indirectly followed. The changing concentration gradient observed as the boundary recedes may offer a possible explanation for the initiation of anastomosis. Several functions are considered as possible sink terms and numerical results are presented. The situation where the tumour (i.e. the source of TAF) is removed is also considered.
https://authors.library.caltech.edu/records/w9nwt-r2278

The Dynamics of the Theta Method
https://resolver.caltech.edu/CaltechAUTHORS:20170613-092411747
Authors: Stuart, A. M.; Peplow, A. T.
Year: 1991
DOI: 10.1137/0912074
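Before the abstract, a small demonstration of the spurious period-2 solutions it analyses (the logistic test equation and the value dt = 2.3 are illustrative choices, not taken from the paper): explicit Euler is the theta method with θ = 0, and for a too-large timestep it settles onto a 2-cycle the ODE does not possess.

```python
# Explicit Euler (theta = 0): y_{n+1} = y_n + dt * f(y_n).
# On the logistic equation y' = y(1 - y) the only steady states are 0 and 1,
# yet for dt = 2.3 the fixed point y = 1 loses stability and the iteration
# settles onto a spurious period-2 orbit -- an artefact of the discretisation
# (the paper proves theta = 1/2 generates no such period-2 solutions).
f = lambda y: y * (1.0 - y)
dt = 2.3
y = 0.6
for _ in range(2000):              # discard transients
    y = y + dt * f(y)
a = y                              # one point of the 2-cycle
b = a + dt * f(a)                  # the other point
print(a, b)                        # two distinct values straddling y = 1
```

One step from a gives b and one step from b returns to a, which is exactly the period-2-in-the-timestep behaviour the abstract describes.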
The dynamics of the theta method for arbitrary systems of nonlinear ordinary differential equations are analysed. Two scalar examples are presented to demonstrate the importance of spurious solutions in determining the dynamics of discretisations. A general system of differential equations is then considered. It is shown that the choice θ = ½ does not generate spurious solutions of period 2 in the timestep n. Using bifurcation theory, it is shown that for θ ≠ ½ the theta method does generate spurious solutions of period 2. The existence and form of spurious solutions are examined in the limit Δt → 0. The existence of spurious steady solutions in a predictor-corrector method is proved to be equivalent to the existence of spurious period 2 solutions in the Euler method. The theory is applied to several examples from nonlinear parabolic equations. Numerical continuation is used to trace out the spurious solutions as Δt is varied. Timestepping experiments are presented to demonstrate the effect of the spurious solutions on the dynamics and some complementary theoretical results are proved. In particular, the linear stability restriction Δt/Δx^2 ≤ ½ for the Euler method applied to the heat equation is generalised to cope with a nonlinear problem. This naturally introduces a restriction on Δt in terms of the initial data; this restriction is necessary to avoid the effect of spurious periodic solutions.
https://authors.library.caltech.edu/records/nr13k-r8557

A Unified Approach to Spurious Solutions Introduced by Time Discretisation. Part I: Basic Theory
https://resolver.caltech.edu/CaltechAUTHORS:20170612-164247464
Authors: Iserles, A.; Peplow, A. T.; Stuart, A. M.
Year: 1991
DOI: 10.1137/0728086
The asymptotic states of numerical methods for initial value problems are examined. In particular, spurious steady solutions, solutions with period 2 in the timestep, and spurious invariant curves are studied. A numerical method is considered as a dynamical system parameterised by the timestep h. It is shown that the three kinds of spurious solutions can bifurcate from genuine steady solutions of the numerical method (which are inherited from the differential equation) as h is varied. Conditions under which these bifurcations occur are derived for Runge–Kutta schemes, linear multistep methods, and a class of predictor-corrector methods in a PE(CE)^M implementation. The results are used to provide a unifying framework to various scattered results on spurious solutions which already exist in the literature. Furthermore, the implications for choice of numerical scheme are studied. In numerical simulation it is desirable to minimise the effect of spurious solutions. Classes of methods with desirable dynamical properties are described and evaluated.
https://authors.library.caltech.edu/records/ftsst-v7f46

A note on uniform in time error estimates for approximations to reaction-diffusion equations
https://resolver.caltech.edu/CaltechAUTHORS:20170612-070148270
Authors: Sanz-Serna, J. M.; Stuart, A. M.
Year: 1992
DOI: 10.1093/imanum/12.3.457
The approximation of solutions of reaction-diffusion equations that approach asymptotically stable, hyperbolic equilibria is considered. Near such equilibria trajectories of the equation contract and hence it is possible to seek error estimates that are uniformly valid in time. A technique for the derivation of such estimates is illustrated in the context of an explicit Euler finite-difference scheme.
https://authors.library.caltech.edu/records/1ert0-bat63

Numerical Wave Propagation in an Advection Equation with a Nonlinear Source Term
https://resolver.caltech.edu/CaltechAUTHORS:20170613-084043611
Authors: Griffiths, D. F.; Stuart, A. M.; Yee, H. C.
Year: 1992
DOI: 10.1137/0729074
The Cauchy and initial boundary value problems are studied for a linear advection equation with a nonlinear source term. The source term is chosen to have two equilibrium states, one unstable and the other stable as solutions of the underlying characteristic equation. The true solutions exhibit travelling waves which propagate from one equilibrium to another. The speed of propagation is dependent on the rate of decay of the initial data at infinity. A class of monotone explicit finite-difference schemes is proposed and analysed; the schemes are upwind in space for the advection term with some freedom of choice for the evaluation of the nonlinear source term. Convergence of the schemes is demonstrated and the existence of numerical waves, mimicking the travelling waves in the underlying equation, is proved. The convergence of the numerical wave-speeds to the true wave-speeds is also established. The behaviour of the scheme is studied when the monotonicity criteria are violated due to stiff source terms, and oscillations and divergence are shown to occur. The behaviour is contrasted with a split-step scheme where the solution remains monotone and bounded but where incorrect speeds of propagation are observed as the stiffness of the problem increases.
https://authors.library.caltech.edu/records/r1da7-wt631

Unified approach to spurious solutions introduced by time discretization. Part II: BDF-like methods
https://resolver.caltech.edu/CaltechAUTHORS:20170612-105155948
Authors: Iserles, A.; Stuart, A. M.
Year: 1992
DOI: 10.1093/imanum/12.4.487
It has been proved inter alia in Part I of the present paper (Iserles et al., 1991) that irreducible multistep methods for ordinary differential equations may possess period-2 solutions as asymptotic states if and only if σ(−1) ≠ 0, where the underlying method is
∑_{k=0}^{m} ρ_k y_{n+k} = h ∑_{k=0}^{m} σ_k f(y_{n+k}), with σ(z) := ∑_{k=0}^{m} σ_k z^k.
We provide an alternative proof of that statement and examine in detail properties of methods that obey σ(−1) = 0. By using a variation of the original proof of the first Dahlquist barrier (Henrici, 1962), we establish an attainable upper bound on the order of zero-stable multistep methods with the aforementioned feature. Moreover, we modify the concept of backward differentiation formulae (BDF) to require that σ(−1) = 0. A zero-stability bound on the ensuing methods is produced by extending the method of proof in (Hairer & Wanner, 1983).
https://authors.library.caltech.edu/records/0q3t2-2ns11

The Numerical Computation of Heteroclinic Connections in Systems of Gradient Partial Differential Equations
https://resolver.caltech.edu/CaltechAUTHORS:20170613-082542564
Authors: Bai, Fengshan; Spence, Alastair; Stuart, Andrew M.
Year: 1993
DOI: 10.1137/0153037
The numerical computation of heteroclinic connections in partial differential equations (PDEs) with a gradient structure, such as those arising in the modeling of phase transitions, is considered. Initially, a scalar reaction diffusion equation is studied; structural assumptions are made on the problem to ensure the existence of absorbing sets and, consequently, a global attractor. As a result of the gradient structure, it is known that, if all equilibria are hyperbolic, the global attractor comprises the set of equilibria and heteroclinic orbits connecting equilibria to one another. Thus it is natural to consider direct approximation of the set of equilibria and the connecting orbits.
Results are proved about the Fourier spanning basis for branches of equilibria and also for certain heteroclinic connections; these results exploit the oddness of the nonlinearity. The reaction-diffusion equation is then approximated by a Galerkin spectral discretization to produce a system of ordinary differential equations (ODEs). Analogous results to those holding for the PDE are proved for the ODEs—in particular, the existence and structure of the global attractor and appropriate spanning bases for the equilibria and certain heteroclinic connections, are studied. Heteroclinic connections in the system of ODEs are then computed using a generalization of known methods to cope with the gradient structure. Suitable parameterizations of the attractor are introduced and numerical continuation used to find families of connections on the attractor. Special connections, which are stable in certain Fourier spanning bases, are used as starting points for the computations.
The methods used allow the calculation of connecting orbits that are unstable as solutions of the initial value problem, and thus provide a computational tool for understanding the dynamics of dissipative problems in a manner that could not be achieved by use of standard initial value methods. Numerical results are given for the Chafee–Infante problem and for the Cahn–Hilliard equation. A one-parameter family of PDEs connecting these two problems is introduced, and it is demonstrated numerically that the global attractor for the Chafee–Infante problem can be continuously deformed into that for the Cahn–Hilliard equation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/v5f07-qxd39Blowup in a Partial Differential Equation with Conserved First Integral
https://resolver.caltech.edu/CaltechAUTHORS:20170613-080915743
Authors: {'items': [{'id': 'Budd-C', 'name': {'family': 'Budd', 'given': 'Chris'}}, {'id': 'Dold-B', 'name': {'family': 'Dold', 'given': 'Bill'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 1993
DOI: 10.1137/0153036
A reaction-diffusion equation with a nonlocal term is studied. The nonlocal term acts to conserve the spatial integral of the unknown function as time evolves. Such equations give insight into biological and chemical problems where conservation properties predominate. The aim of the paper is to understand how the conservation property affects the nature of blowup.
The equation studied has a trivial steady solution that is proved to be stable. Existence of nontrivial steady solutions is proved, and their instability established numerically. Blowup is proved for sufficiently large initial data by using a comparison principle in Fourier space. The nature of the blowup is investigated by a combination of asymptotic and numerical calculations.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/4pjdg-nfd20A model mechanism for the chemotactic response of endothelial cells to tumour angiogenesis factor
https://resolver.caltech.edu/CaltechAUTHORS:20170612-071258761
Authors: {'items': [{'id': 'Chaplain-M-A-J', 'name': {'family': 'Chaplain', 'given': 'M. A. J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1993
DOI: 10.1093/imammb/10.3.149
In order to accomplish the transition from avascular to vascular growth, solid tumours secrete a diffusible substance known as tumour angiogenesis factor (TAF) into the surrounding tissue. Endothelial cells which form the lining of neighbouring blood vessels respond to this chemotactic stimulus in a well-ordered sequence of events consisting, at minimum, of a degradation of their basement membrane, migration, and proliferation. A model mechanism is presented which includes the diffusion of the TAF into the surrounding host tissue and the response of the endothelial cells to the chemotactic stimulus. The model accounts for the main observed events associated with the endothelial cells during the process of angiogenesis (i.e. cell migration and proliferation); the numerical results compare very well with experimental observations. The situation where the tumour (i.e. the source of TAF) is removed and the vessels recede is also considered.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/1jfqj-ctp20The Global Dynamics of Discrete Semilinear Parabolic Equations
https://resolver.caltech.edu/CaltechAUTHORS:20170613-070150162
Authors: {'items': [{'id': 'Elliott-C-M', 'name': {'family': 'Elliott', 'given': 'C. M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1993
DOI: 10.1137/0730084
A class of scalar semilinear parabolic equations possessing absorbing sets, a Lyapunov functional, and a global attractor is considered. The gradient structure of the problem implies that, provided all steady states are isolated, solutions approach a steady state as t → ∞. The dynamical properties of various finite difference and finite element schemes for the equations are analysed. The existence of absorbing sets, bounded independently of the mesh size, is proved for the numerical methods. Discrete Lyapunov functions are constructed to show that, under appropriate conditions on the mesh parameters, numerical orbits approach steady state solutions as discrete time increases. However, it is shown that insufficient spatial resolution can introduce deceptively smooth spurious steady solutions and cause the stability properties of the true steady solutions to be incorrectly represented. Furthermore, it is also shown that the explicit Euler scheme introduces spurious solutions with period 2 in the timestep. As a result, the absorbing set is destroyed and there is initial data leading to blow up of the scheme, however small the mesh parameters are taken. To obtain stabilization to a steady state for this scheme, it is necessary to restrict the timestep in terms of the initial data and the space step. Implicit schemes are constructed for which absorbing sets and Lyapunov functions exist under restrictions on the timestep that are independent of initial data and of the space step; both one-step and multistep (BDF) methods are studied.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ghxzk-tba10Numerical analysis of dynamical systems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-082428693
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 1994
DOI: 10.1017/S0962492900002488
This article reviews the application of various notions from the theory of dynamical systems to the analysis of numerical approximation of initial value problems over long-time intervals. Standard error estimates comparing individual trajectories are of no direct use in this context since the error constant typically grows like the exponential of the time interval under consideration.
Instead of comparing trajectories, the effect of discretization on various sets which are invariant under the evolution of the underlying differential equation is studied. Such invariant sets are crucial in determining long-time dynamics. The particular invariant sets which are studied are equilibrium points, together with their unstable manifolds and local phase portraits, periodic solutions, quasi-periodic solutions and strange attractors.
Particular attention is paid to the development of a unified theory and to the development of an existence theory for invariant sets of the underlying differential equation which may be used directly to construct an analogous existence theory (and hence a simple approximation theory) for the numerical method.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/p4j80-k9b06Approximation of dissipative partial differential equations over long time intervals
https://resolver.caltech.edu/CaltechAUTHORS:20170612-152618052
Authors: {'items': [{'id': 'Humphries-A-R', 'name': {'family': 'Humphries', 'given': 'A. R.'}}, {'id': 'Jones-D-A', 'name': {'family': 'Jones', 'given': 'D. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1994
In this article the numerical analysis of dissipative semilinear evolution equations with sectorial linear part is reviewed. In particular the approximation theory for such equations over long time intervals is discussed. Emphasis is placed on studying the effect of approximation on certain invariant objects which play an important role in understanding long time dynamics. Specifically the existence of absorbing sets, the upper and lower semicontinuity of global attractors and the existence and convergence of attractive invariant manifolds, such as the inertial manifold and unstable manifolds of equilibrium points, are studied.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jep36-3q894Model Problems in Numerical Stability Theory for Initial Value Problems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-100013806
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Humphries-A-R', 'name': {'family': 'Humphries', 'given': 'A. R.'}}]}
Year: 1994
DOI: 10.1137/1036054
In the past, numerical stability theory for initial value problems in ordinary differential equations has been dominated by the study of problems with simple dynamics; this has been motivated by the need to study error propagation mechanisms in stiff problems, a question modeled effectively by contractive linear or nonlinear problems. While this has resulted in a coherent and self-contained body of knowledge, it has never been entirely clear to what extent this theory is relevant for problems exhibiting more complicated dynamics. Recently there have been a number of studies of numerical stability for wider classes of problems admitting more complicated dynamics. This ongoing work is unified and, in particular, striking similarities between this new developing stability theory and the classical linear and nonlinear stability theories are emphasized.
The classical theories of A, B and algebraic stability for Runge–Kutta methods are briefly reviewed; the dynamics of solutions within the classes of equations to which these theories apply—linear decay and contractive problems—are studied. Four other categories of equations—gradient, dissipative, conservative and Hamiltonian systems—are considered. Relationships and differences between the possible dynamics in each category, which range from multiple competing equilibria to chaotic solutions, are highlighted. Runge-Kutta schemes that preserve the dynamical structure of the underlying problem are sought, and indications of a strong relationship between the developing stability theory for these new categories and the classical existing stability theory for the older problems are given. Algebraic stability, in particular, is seen to play a central role.
It should be emphasized that in all cases the class of methods for which a coherent and complete numerical stability theory exists, given a structural assumption on the initial value problem, is often considerably smaller than the class of methods found to be effective in practice. Nonetheless it is arguable that it is valuable to develop such stability theories to provide a firm theoretical framework in which to interpret existing methods and to formulate goals in the construction of new methods. Furthermore, there are indications that the theory of algebraic stability may sometimes be useful in the analysis of error control codes which are not stable in a fixed step implementation; this work is described.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/tf07t-s3h46Blow-up in a System of Partial Differential Equations with Conserved First Integral. Part II: Problems with Convection
https://resolver.caltech.edu/CaltechAUTHORS:20170613-120606856
Authors: {'items': [{'id': 'Budd-C-J', 'name': {'family': 'Budd', 'given': 'C. J.'}}, {'id': 'Dold-J-W', 'name': {'family': 'Dold', 'given': 'J. W.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1994
DOI: 10.1137/S0036139992232131
A reaction-diffusion-convection equation with a nonlocal term is studied; the nonlocal operator acts to conserve the spatial integral of the unknown function as time evolves. The equations are parameterised by µ, and for µ = 1 the equation arises as a similarity solution of the Navier-Stokes equations and the nonlocal term plays the role of pressure. For µ = 0, the equation is a nonlocal reaction-diffusion problem. The aim of the paper is to determine for which values of the parameter µ blow-up occurs and to study its form. In particular, interest is focused on the three cases µ < 1/2, µ > 1/2, and µ → 1.
It is observed that, for any 0 ≤ µ ≤ 1/2, nonuniform global blow-up occurs; if 1/2 < µ < 1, then the blow-up is global and uniform, while for µ = 1 (the Navier-Stokes equations) there are exact solutions with initial data of arbitrarily large L_∞, L_2, and H^1 norms that decay to zero. Furthermore, one of these exact solutions is proved to be nonlinearly stable in L_2 for arbitrarily large supremum norm. An understanding of this transition from blow-up behaviour to decay behaviour is achieved by a combination of analysis, asymptotics, and numerical techniques.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/vn92e-et061Runge–Kutta Methods for Dissipative and Gradient Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-084043889
Authors: {'items': [{'id': 'Humphries-A-R', 'name': {'family': 'Humphries', 'given': 'A. R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1994
DOI: 10.1137/0731075
The numerical approximation of dissipative initial value problems by fixed time-stepping Runge–Kutta methods is considered and the asymptotic features of the numerical and exact solutions are compared. A general class of ordinary differential equations, for which dissipativity is induced through an inner product, is studied throughout. This class arises naturally in many finite dimensional applications (such as the Lorenz equations) and also from the spatial discretization of a variety of partial differential equations arising in applied mathematics.
It is shown that the numerical solution defined by an algebraically stable method has an absorbing set and is hence dissipative for any fixed step-size h > 0. The numerical solution is shown to define a dynamical system on the absorbing set if h is sufficiently small and hence a global attractor A_h exists; upper semicontinuity of A_h at h = 0 is established, which shows that, for h small, every point on the numerical attractor is close to a point on the true global attractor A. Under the additional assumption that the problem is globally Lipschitz, it is shown that if h is sufficiently small any method with positive weights defines a dissipative dynamical system on the whole space and upper semicontinuity of A_h at h = 0 is again established.
For gradient systems with globally Lipschitz vector fields it is shown that any Runge–Kutta method preserves the gradient structure for h sufficiently small. For general dissipative gradient systems it is shown that algebraically stable methods preserve the gradient structure within the absorbing set for h sufficiently small. Convergence of the numerical attractor is studied and, for a dissipative gradient system with hyperbolic equilibria, lower semicontinuity at h = 0 is established. Thus, for such a system, A_h converges to A in the Hausdorff metric as h → 0.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/8af11-3p877Numerical computations of coarsening in the one-dimensional Cahn-Hilliard model of phase separation
https://resolver.caltech.edu/CaltechAUTHORS:20170609-122809369
Authors: {'items': [{'id': 'Bai-Fengshan', 'name': {'family': 'Bai', 'given': 'Fengshan'}}, {'id': 'Spence-A', 'name': {'family': 'Spence', 'given': 'Alastair'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 1994
DOI: 10.1016/0167-2789(94)90112-0
Time dependent solutions of the Cahn-Hilliard equation are studied numerically. In particular heteroclinic orbits, which connect different equilibrium solutions at t = -∞ and t = +∞, are sought. Thus boundary value problems in space-time are computed. This computation requires an investigation of the stability of equilibria, since projections onto the stable and unstable manifolds determine the boundary conditions at t = -∞ and t = +∞. This stability analysis is then followed by solution of the appropriate boundary value problem in space-time. The results obtained cannot be found by standard initial value simulations. By specifying the two steady states at t = ±∞ appropriately it is possible to find orbits reflecting a given degree of coarsening over the time evolution. This gives a clear picture of the dynamic coarsening admissible in the equation. It also provides an understanding of orbits on the global attractor for the equation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/mkmzy-r1j95Perturbation Theory for Infinite Dimensional Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-133150018
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 1995
When considering the effect of perturbations on initial value problems over long time intervals it is not possible, in general, to uniformly approximate individual trajectories. This is because well-posed initial value problems allow exponential divergence of trajectories and this fact is reflected in the error bound relating trajectories of the perturbed and unperturbed problems. In order to interpret data obtained from numerical simulations over long time intervals, and from other forms of perturbations, it is hence often necessary to ask different questions concerning the behavior as the approximation is
refined. One possibility, which we concentrate on in this review, is to study the effect of perturbation on sets which are invariant under the evolution equation. Such sets include equilibria, periodic solutions, stable and unstable manifolds, phase portraits, inertial manifolds and attractors; they are crucial to the understanding of long-time dynamics.
An abstract semilinear evolution equation in a Hilbert space X is considered, yielding a semigroup S(t) acting on a subspace V of X. A general class of perturbed semigroups S^h(t) is considered which are C^1 close to S(t) uniformly on bounded subsets of V and time intervals [t_1, t_2] with 0 < t_1 < t_2 < ∞. A variety of perturbed problems are shown to satisfy these approximation properties. Examples include a Galerkin method based on the eigenfunctions of the linear part of the abstract sectorial evolution equation, a backward Euler approximation of the same equation and a singular perturbation of the Cahn-Hilliard equation arising from the phase-field model of phase transitions. The invariant sets of S(t) and S^h(t) are compared and convergence properties established.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ca33n-4w564The viscous Cahn-Hilliard equation. I. Computations
https://resolver.caltech.edu/CaltechAUTHORS:20170612-135245414
Authors: {'items': [{'id': 'Bai-Fengshan', 'name': {'family': 'Bai', 'given': 'F.'}}, {'id': 'Elliott-C-M', 'name': {'family': 'Elliott', 'given': 'C. M.'}}, {'id': 'Gardiner-A', 'name': {'family': 'Gardiner', 'given': 'A.'}}, {'id': 'Spence-A', 'name': {'family': 'Spence', 'given': 'A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1995
DOI: 10.1088/0951-7715/8/2/002
The viscous Cahn-Hilliard equation arises as a singular limit of the phase-field model of phase transitions. It contains both the Cahn-Hilliard and Allen-Cahn equations as particular limits. The equation is in gradient form and possesses a compact global attractor A, comprising heteroclinic orbits between equilibria. Two classes of computation are described. First, heteroclinic orbits on the global attractor are computed; by using the viscous Cahn-Hilliard equation to perform a homotopy, these results show that the orbits, and hence the geometry of the attractors, are remarkably insensitive to whether the Allen-Cahn or Cahn-Hilliard equation is studied. Second, initial-value computations are described; these computations emphasize three differing mechanisms by which interfaces in the equation propagate for the case of very small penalization of interfacial energy. Furthermore, convergence to an appropriate free boundary problem is demonstrated numerically.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/f6r7t-9px92Discrete Gevrey regularity, attractors and upper semicontinuity for a finite difference approximation to the Ginzburg–Landau equation
https://resolver.caltech.edu/CaltechAUTHORS:20170613-143108777
Authors: {'items': [{'id': 'Lord-G-J', 'name': {'family': 'Lord', 'given': 'Gabriel J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 1995
DOI: 10.1080/01630569508816658
A semi-discrete spatial finite difference approximation to the complex Ginzburg-Landau equation with cubic non-linearity is considered. Using the fractional powers of a sectorial operator, discrete versions of the Sobolev spaces H^s and of the Gevrey classes of regularity G are introduced. Discrete versions of some standard Sobolev space norm inequalities are proved.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/g3emq-my122The rate of error growth in Hamiltonian-conserving integrators
https://resolver.caltech.edu/CaltechAUTHORS:20170613-104839336
Authors: {'items': [{'id': 'Estep-D-J', 'name': {'family': 'Estep', 'given': 'Donald J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 1995
DOI: 10.1007/BF01003559
In this note, we consider numerical methods for a class of Hamiltonian systems that preserve the Hamiltonian. We show that the rate of growth of error is at most linear in time when such methods are applied to problems with period uniquely determined by the value of the Hamiltonian. This contrasts with generic numerical schemes, for which the rate of error growth is superlinear. Asymptotically, the rate of error growth for symplectic schemes is also linear. Hence, Hamiltonian-conserving schemes are competitive with symplectic schemes in this respect. The theory is illustrated with a computation performed on Kepler's problem for the interaction of two bodies.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/9bvrq-wet43The Essential Stability of Local Error Control for Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-084044146
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Humphries-A-R', 'name': {'family': 'Humphries', 'given': 'A. R.'}}]}
Year: 1995
DOI: 10.1137/0732087
Although most adaptive software for initial value problems is designed with an accuracy requirement—control of the local error—it is frequently observed that stability is imparted by the adaptation. This relationship between local error control and numerical stability is given a firm theoretical underpinning.
The dynamics of numerical methods with local error control are studied for three classes of ordinary differential equations: dissipative, contractive, and gradient systems. Dissipative dynamical systems are characterised by having a bounded absorbing set B which all trajectories eventually enter and remain inside. The exponentially contractive problems studied have a unique, globally exponentially attracting equilibrium point and thus they are also dissipative since the absorbing set B may be chosen to be a ball of arbitrarily small radius around the equilibrium point. The gradient systems studied are those for which the set of equilibria comprises isolated points and all trajectories are bounded so that each trajectory converges to an equilibrium point as t → ∞. If the set of equilibria is bounded then the gradient systems are also dissipative. Conditions under which numerical methods with local error control replicate these large-time dynamical features are described. The results are proved without recourse to asymptotic expansions for the truncation error.
Standard embedded Runge–Kutta pairs are analysed together with several nonstandard error control strategies. Both error per step and error per unit step strategies are considered. Certain embedded pairs are identified for which the sequence generated can be viewed as coming from a small perturbation of an algebraically stable scheme, with the size of the perturbation proportional to the tolerance τ. Such embedded pairs are defined to be essentially algebraically stable and explicit essentially stable pairs are identified. Conditions on the tolerance τ are identified under which appropriate discrete analogues of the properties of the underlying differential equation may be proved for certain essentially stable embedded pairs. In particular, it is shown that for dissipative problems the discrete dynamical system has an absorbing set B_τ and is hence dissipative. For exponentially contractive problems the radius of B_τ is proved to be proportional to τ. For gradient systems the numerical solution enters and remains in a small ball about one of the equilibria and the radius of the ball is proportional to τ. Thus the local error control mechanisms confer desirable global properties on the numerical solution. It is shown that for error per unit step strategies the conditions on the tolerance τ are independent of initial data while for error per step strategies the conditions are initial-data dependent. Thus error per unit step strategies are considerably more robust.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/fnrx8-awp14Attractive Invariant Manifolds under Approximation. Inertial Manifolds
https://resolver.caltech.edu/CaltechAUTHORS:20170612-064633625
Authors: {'items': [{'id': 'Jones-D-A', 'name': {'family': 'Jones', 'given': 'D. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1995
DOI: 10.1006/jdeq.1995.1174
A class of nonlinear dissipative partial differential equations that possess finite dimensional attractive invariant manifolds is considered. An existence and perturbation theory is developed which unifies the cases of unstable manifolds and inertial manifolds into a single framework. It is shown that certain approximations of these equations, such as those arising from spectral or finite element methods in space, one-step time-discretization or a combination of both, also have attractive invariant manifolds. Convergence of the approximate manifolds to the true manifolds is established as the approximation is refined. In this part of the paper applications to the behavior of inertial manifolds under approximation are considered. From this analysis deductions about the structure of the attractor and the flow on the attractor under discretization can be made.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/g6te7-8sz17Dynamical Systems and Numerical Analysis
https://resolver.caltech.edu/CaltechAUTHORS:20161110-163922626
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Humphries-A-R', 'name': {'family': 'Humphries', 'given': 'A. R.'}}]}
Year: 1996
This book unites the study of dynamical systems and numerical solution of differential equations. The first three chapters contain the elements of the theory of dynamical systems and the numerical solution of initial-value problems. In the remaining chapters, numerical methods are formulated as dynamical systems and the convergence and stability properties of the methods are examined. Topics studied include the stability of numerical methods for contractive, dissipative, gradient and Hamiltonian systems together with the convergence properties of equilibria, periodic solutions and strange attractors under numerical approximation. This book will be an invaluable tool for graduate students and researchers in the fields of numerical analysis and dynamical systems.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/5w6jy-4k716Viscous Cahn–Hilliard Equation II. Analysis
https://resolver.caltech.edu/CaltechAUTHORS:20170609-125856997
Authors: {'items': [{'id': 'Elliott-C-M', 'name': {'family': 'Elliott', 'given': 'C. M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1996
DOI: 10.1006/jdeq.1996.0101
The viscous Cahn–Hilliard equation may be viewed as a singular limit of the phase-field equations for phase transitions. It contains both the Allen–Cahn and Cahn–Hilliard models of phase separation as particular cases; by specific choices of parameters it may be formulated as a one-parameter (say α) homotopy connecting the Cahn–Hilliard (α=0) and Allen–Cahn (α=1) models. The limit α=0 is singular in the sense that the smoothing property of the analytic semigroup changes from being of the type associated with second order operators to the type associated with fourth order operators. The properties of the gradient dynamical system generated by the viscous Cahn–Hilliard equation are studied as α varies in [0, 1]. Continuity of the phase portraits near equilibria is established independently of α ∈ [0, 1] and, using this, a piecewise, uniform in time, perturbation result is proved for trajectories. Finally the continuity of the attractor is established and, in one dimension, the existence and continuity of inertial manifolds shown and the flow on the attractor detailed.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/z4gbs-k8746Convergence and stability in the numerical approximation of dynamical systems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-141546199
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 1997
In this article we give an overview of the application of theories from dynamical systems to the analysis of numerical methods for initial-value problems. We start by describing the classical viewpoints of numerical analysis and of dynamical systems and then indicate how the two viewpoints can be merged to provide a framework for both the interpretation of data obtained from numerical simulations and the design of efficient numerical methods. This is done in Section 2.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/zppe8-bm791Probabilistic and deterministic convergence proofs for software for initial value problems
https://resolver.caltech.edu/CaltechAUTHORS:20170613-132616648
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 1997
DOI: 10.1023/A:1019169114976
The numerical solution of initial value problems for ordinary differential equations is frequently performed by means of adaptive algorithms with user-input tolerance τ. The time-step is then chosen according to an estimate, based on small time-step heuristics, designed to try to ensure that an approximation to the local error committed is bounded by τ. A question of natural interest is to determine how the global error behaves with respect to the tolerance τ. This has obvious practical interest and also leads to an interesting problem in mathematical analysis. The primary difficulties arising in the analysis are that: (i) the time-step selection mechanisms used in practice are discontinuous as functions of the specified data; (ii) the small time-step heuristics underlying the control of the local error can break down in some cases. In this paper an analysis is presented which incorporates these two difficulties.
For a mathematical model of an error per unit step or error per step adaptive Runge–Kutta algorithm, it may be shown that in a certain probabilistic sense, with respect to a measure on the space of initial data, the small time-step heuristics are valid with probability one, leading to a probabilistic convergence result for the global error as τ → 0. The probabilistic approach is only valid in dimension m > 1; this observation is consistent with recent analysis concerning the existence of spurious steady solutions of software codes which highlights the difference between the cases m = 1 and m > 1. The breakdown of the small time-step heuristics can be circumvented by making minor modifications to the algorithm, leading to a deterministic convergence proof for the global error of such algorithms as τ → 0. An underlying theory is developed and the deterministic and probabilistic convergence results proved as particular applications of this theory.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/v2414-vse17Waveform relaxation as a dynamical system
https://resolver.caltech.edu/CaltechAUTHORS:20170612-145717225
Authors: {'items': [{'id': 'Bjørhus-M', 'name': {'family': 'Bjørhus', 'given': 'Morten'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 1997
DOI: 10.1090/S0025-5718-97-00847-8
In this paper the properties of waveform relaxation are studied when applied to the dynamical system generated by an autonomous ordinary differential equation. In particular, the effect of the waveform relaxation on the invariant sets of the flow is analysed. Windowed waveform relaxation is studied, whereby the iterative technique is applied on successive time intervals of length T and a fixed, finite number of iterations is taken on each window. This process does not generate a dynamical system on R+ since two different applications of the waveform algorithm over different time intervals do not, in general, commute. In order to generate a dynamical system it is necessary to consider the time T map generated by the relaxation process. This is done, and C^1-closeness of the resulting map to the time T map of the underlying ordinary differential equation is established. Using this, various results from the theory of dynamical systems are applied, and the results are discussed.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/0ksz3-zzy68Analysis of the dynamics of local error control via a piecewise continuous residual
https://resolver.caltech.edu/CaltechAUTHORS:20170613-110046810
Authors: {'items': [{'id': 'Higham-D-J', 'name': {'family': 'Higham', 'given': 'D. J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1998
DOI: 10.1007/BF02510916
Positive results are obtained about the effect of local error control in numerical simulations of ordinary differential equations. The results are cast in terms of the local error tolerance. Under the assumption that a local error control strategy is successful, it is shown that a continuous interpolant through the numerical solution exists that satisfies the differential equation to within a small, piecewise continuous residual. The assumption is known to hold for the MATLAB ode23 algorithm [10] when applied to a variety of problems.
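The interpolant-and-residual construction can be made concrete. The sketch below is an illustration rather than the paper's analysis: fixed forward Euler steps stand in for the accepted steps of an adaptive method, the piecewise-linear interpolant through the numerical solution is formed, and the residual r(t) = u'(t) - f(u(t)) is measured at the step midpoints.

```python
import numpy as np

def f(y):
    return -y                       # test problem y' = -y, y(0) = 1

ts = np.linspace(0.0, 1.0, 101)     # fixed steps stand in for accepted steps
h = ts[1] - ts[0]
ys = np.empty_like(ts)
ys[0] = 1.0
for n in range(len(ts) - 1):        # forward Euler
    ys[n + 1] = ys[n] + h * f(ys[n])

# On each step the piecewise-linear interpolant u(t) has constant derivative
# (ys[n+1] - ys[n]) / h = f(ys[n]); the residual r(t) = u'(t) - f(u(t)) is
# evaluated here at the step midpoints, where it is largest for this problem.
slopes = (ys[1:] - ys[:-1]) / h
mid_u = 0.5 * (ys[:-1] + ys[1:])
residual = np.abs(slopes - f(mid_u))
```

For this linear problem the midpoint residual is of size h/2, so it shrinks with the step size, mirroring the role the tolerance plays in the paper.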
Using the smallness of the residual, it follows that at any finite time the continuous interpolant converges to the true solution as the error tolerance tends to zero. By studying the perturbed differential equation it is also possible to prove discrete analogs of the long-time dynamical properties of the equation: dissipative, contractive, and gradient systems are analysed in this way.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jy9w6-e3569Persistence of Invariant Sets for Dissipative Evolution Equations
https://resolver.caltech.edu/CaltechAUTHORS:20170609-123316540
Authors: {'items': [{'id': 'Jones-D-A', 'name': {'family': 'Jones', 'given': 'Don A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Titi-E-S', 'name': {'family': 'Titi', 'given': 'Edriss S.'}}]}
Year: 1998
DOI: 10.1006/jmaa.1997.5847
We show that results concerning the persistence of invariant sets of ordinary differential equations under perturbation may be applied directly to a certain class of partial differential equations. Our framework is particularly well-suited to encompass numerical approximations of these partial differential equations. Specifically, we show that for a class of PDEs with a C^1 inertial form, certain natural numerical approximations possess an inertial form close to that of the underlying PDE in the C^1 norm.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/z63zr-1jh21On the Solution of Convection-Diffusion Boundary Value Problems Using Equidistributed Grids
https://resolver.caltech.edu/CaltechAUTHORS:20170612-143618623
Authors: {'items': [{'id': 'Budd-C-J', 'name': {'family': 'Budd', 'given': 'C. J.'}}, {'id': 'Koomullil-G-P', 'name': {'family': 'Koomullil', 'given': 'G. P.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 1998
DOI: 10.1137/S1064827595280454
The effect of using grid adaptation on the numerical solution of model convection-diffusion equations with a conservation form is studied. The grid adaptation technique studied is based on moving a fixed number of mesh points to equidistribute a generalization of the arc-length of the solution. In particular, a parameter-dependent monitor function is introduced which incorporates fixed meshes, approximate arc-length equidistribution, and equidistribution of the absolute value of the solution, in a single framework. Thus the resulting numerical method is a coupled nonlinear system of equations for the mesh spacings and the nodal values. A class of singularly perturbed problems, including Burgers' equation in the limit of small viscosity, is studied. Singular perturbation and bifurcation techniques are used to analyze the solution of the discretized equations, and numerical results are compared with the results from the analysis. Computation of the bifurcation diagram of the system is performed numerically using a continuation method and the results are used to illustrate the theory. It is shown that equidistribution does not remove spurious solutions present on a fixed mesh and that, furthermore, the spurious solutions can be stable for an appropriate moving mesh method.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/j3185-35a16Space-Time Continuous Analysis of Waveform Relaxation for the Heat Equation
https://resolver.caltech.edu/CaltechAUTHORS:20170612-134545090
Authors: {'items': [{'id': 'Gander-M-J', 'name': {'family': 'Gander', 'given': 'Martin J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 1998
DOI: 10.1137/S1064827596305337
Waveform relaxation algorithms for partial differential equations (PDEs) are traditionally obtained by discretizing the PDE in space and then splitting the discrete operator using matrix splittings. For the semidiscrete heat equation one can show linear convergence on unbounded time intervals and superlinear convergence on bounded time intervals by this approach. However, the bounds depend in general on the mesh parameter and convergence rates deteriorate as one refines the mesh.
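The traditional matrix-splitting approach referred to above can be sketched for the semidiscrete heat equation. This is a minimal illustration with an assumed Jacobi splitting and forward Euler time integration, not the paper's domain-decomposition algorithm: each sweep integrates u' = M u + N u_prev(t) over the whole window, using the previous iterate as forcing.

```python
import numpy as np

m, T, nt = 4, 0.05, 500            # interior grid points, window length, time steps
dx, dt = 1.0 / (m + 1), T / nt
A = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1)) / dx**2
M = np.diag(np.diag(A))            # Jacobi matrix splitting A = M + N
N = A - M
u0 = np.sin(np.pi * dx * np.arange(1, m + 1))

def sweep(u_prev):
    """One waveform relaxation sweep: integrate u' = M u + N u_prev(t)
    over the whole window with forward Euler, keeping u(0) = u0."""
    u = np.empty((nt + 1, m))
    u[0] = u0
    for n in range(nt):
        u[n + 1] = u[n] + dt * (M @ u[n] + N @ u_prev[n])
    return u

u = np.tile(u0, (nt + 1, 1))       # initial guess: waveform constant in time
errs = []
for _ in range(10):
    u_new = sweep(u)
    errs.append(np.abs(u_new - u).max())
    u = u_new
# on a bounded window the successive differences decay superlinearly
```

As the abstract notes, the contraction rate of this splitting iteration deteriorates under mesh refinement, which motivates the overlapping domain decomposition studied in the paper.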
Motivated by the original development of waveform relaxation in circuit simulation, where the circuits are split in the physical domain into subcircuits, we split the PDE by using overlapping domain decomposition. We prove linear convergence of the algorithm in the continuous case on an infinite time interval, at a rate depending on the size of the overlap. This result remains valid after discretization in space and the convergence rates are robust with respect to mesh refinement. The algorithm is in the class of waveform relaxation algorithms based on overlapping multisplittings. Our analysis quantifies the empirical observation by Jeltsch and Pohl [SIAM J. Sci. Comput., 16 (1995), pp. 40--49] that the convergence rate of a multisplitting algorithm depends on the overlap.
Numerical results are presented which support the convergence theory.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/j6fbw-vgy61Convergence results for the MATLAB ODE23 routine
https://resolver.caltech.edu/CaltechAUTHORS:20170609-154149149
Authors: {'items': [{'id': 'Lamba-H', 'name': {'family': 'Lamba', 'given': 'H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1998
DOI: 10.1007/BF02510413
We prove convergence results on finite time intervals, as the user-defined tolerance τ→0, for a class of adaptive timestepping ODE solvers that includes the ode23 routine supplied in MATLAB Version 4.2. In contrast to existing theories, these convergence results hold with error constants that are uniform in the neighbourhood of equilibria; such uniformity is crucial for the derivation of results concerning the numerical approximation of dynamical systems. For linear problems the error estimates are uniform on compact sets of initial data. The analysis relies upon the identification of explicit embedded Runge-Kutta pairs for which all but the leading order terms of the expansion of the local error estimate are O(∥f(u)∥^2).https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/5k2bc-fam50Qualitative properties of modified equations
https://resolver.caltech.edu/CaltechAUTHORS:20170609-164025962
Authors: {'items': [{'id': 'Gonzalez-O', 'name': {'family': 'Gonzalez', 'given': 'O.'}}, {'id': 'Higham-D-J', 'name': {'family': 'Higham', 'given': 'D. J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1999
DOI: 10.1093/imanum/19.2.169
Suppose that a consistent one-step numerical method of order r is applied to a smooth system of ordinary differential equations. Given any integer m ⩾ 1, the method may be shown to be of order r + m as an approximation to a certain modified equation. If the method and the system have a particular qualitative property then it is important to determine whether the modified equations inherit this property. In this article, a technique is introduced for proving that the modified equations inherit qualitative properties from the method and the underlying system. The technique uses a straightforward contradiction argument applicable to arbitrary one-step methods and does not rely on the detailed structure of associated power series expansions. Hence the conclusions apply, but are not restricted, to the case of Runge–Kutta methods. The new approach unifies and extends results of this type that have been derived by other means: results are presented for integral preservation, reversibility, inheritance of fixed points, Hamiltonian problems, and volume preservation. The technique also applies when the system has an integral that the method preserves not exactly, but to order greater than r. Finally, a negative result is obtained by considering a gradient system and gradient numerical method possessing a global property that is not shared by the associated modified equations.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/wcj2z-qr303Ergodicity of Dissipative Differential Equations Subject to Random Impulses
https://resolver.caltech.edu/CaltechAUTHORS:20170609-124028394
Authors: {'items': [{'id': 'Sanz-Serna-J-M', 'name': {'family': 'Sanz-Serna', 'given': 'J. M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 1999
DOI: 10.1006/jdeq.1998.3594
Differential equations subject to random impulses are studied. Randomness is introduced both through the time between impulses, which is distributed exponentially, and through the sign of the impulses, which are fixed in amplitude and orientation. Such models are particular instances of piecewise deterministic Markov processes and they arise naturally in the study of a number of physical phenomena, particularly impacting systems. The underlying deterministic semigroup is assumed to be dissipative and a general theorem which establishes the existence of invariant measures for the randomly forced problem is proved. Further structure is then added to the deterministic semigroup, which enables the proof of ergodic theorems. Characteristic functions are used for the case when the deterministic component forms a damped linear problem and irreducibility measures are employed for the study of a randomly forced damped double-well nonlinear oscillator with a gradient structure.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/hwz0s-hr185Analysis and Experiments for a Computational Model of a Heat Bath
https://resolver.caltech.edu/CaltechAUTHORS:20170609-161129844
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Warren-J-O', 'name': {'family': 'Warren', 'given': 'J. O.'}}]}
Year: 1999
DOI: 10.1023/A:1004667325896
A question of some interest in computational statistical mechanics is whether macroscopic quantities can be accurately computed without detailed resolution of the fastest scales in the problem. To address this question a simple model for a distinguished particle immersed in a heat bath is studied (due to Ford and Kac). The model yields a Hamiltonian system of dimension 2N+2 for the distinguished particle and the degrees of freedom describing the bath. It is proven that, in the limit of an infinite number of particles in the heat bath (N→∞), the motion of the distinguished particle is governed by a stochastic differential equation (SDE) of dimension 2. Numerical experiments are then conducted on the Hamiltonian system of dimension 2N+2 (N≫1) to investigate whether the motion of the distinguished particle is accurately computed (i.e., whether it is close to the solution of the SDE) when the time step is small relative to the natural time scale of the distinguished particle, but the product of the fastest frequency in the heat bath and the time step is not small—the underresolved regime in which many computations are performed. It is shown that certain methods accurately compute the limiting behavior of the distinguished particle, while others do not. Those that do not are shown to compute a different, incorrect, macroscopic limit.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ryqrq-mjj54Convergence Proofs for Numerical IVP Software
https://resolver.caltech.edu/CaltechAUTHORS:20170613-142949630
Authors: {'items': [{'id': 'Lamba-H', 'name': {'family': 'Lamba', 'given': 'Harbir'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2000
DOI: 10.1007/978-1-4612-1274-4_6
The study of the running times of algorithms in computer science can be broken down into two broad types: worst-case and average-case analyses. For many problems this distinction is very important as the orders of magnitude (in terms of some measure of the problem size) of the running times may differ significantly in each case, providing useful information about the merits of the algorithm. Historically, average-case analyses were first done with respect to a measure on the input data; to counter the argument that it is often difficult to find a natural measure on the data, randomised algorithms were then developed.
In this paper similar questions are studied for adaptive software used to integrate initial value problems for ODEs. In the worst case these algorithms may fail completely, giving O(1) errors. We consider the probability of failure for generic vector fields with random initial data chosen from a ball and perform average-case and worst-case analyses. We then perform a different average-case analysis where, having fixed the initial data, it is the algorithm that is chosen at random from some suitable class. This last analysis suggests a modified deterministic algorithm which cannot fail for generic vector fields.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/bfwg1-6pg46A Perturbation Theory for Ergodic Markov Chains and Application to Numerical Approximations
https://resolver.caltech.edu/CaltechAUTHORS:20170613-080747440
Authors: {'items': [{'id': 'Shardlow-T', 'name': {'family': 'Shardlow', 'given': 'T.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2000
DOI: 10.1137/S0036142998337235
Perturbations to Markov chains and Markov processes are considered. The unperturbed problem is assumed to be geometrically ergodic in the sense usually established through the use of Foster--Lyapunov drift conditions. The perturbations are assumed to be uniform, in a weak sense, on bounded time intervals. The long-time behavior of the perturbed chain is studied. Applications are given to numerical approximations of a randomly impulsed ODE, an Itô stochastic differential equation (SDE), and a parabolic stochastic partial differential equation (SPDE) subject to space-time Brownian noise. Existing perturbation theories for geometrically ergodic Markov chains are not readily applicable to these situations since they require very stringent hypotheses on the perturbations.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/tadm8-xe767Statistics From Computations
https://resolver.caltech.edu/CaltechAUTHORS:20170614-073434784
Authors: {'items': [{'id': 'Sigurgeirsson-H', 'name': {'family': 'Sigurgeirsson', 'given': 'Hersir'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2001
The study of numerical methods for initial value problems by considering their approximation properties from a dynamical systems viewpoint is now a well-established field; a substantial body of knowledge, developed over the past two decades, can be found in the literature. Nonetheless many open questions remain concerning the meaning of long-time simulations performed by approximating dynamical systems. In recent years various attempts to analyse the statistical content of these long-time simulations have emerged, and the purpose of this article is to review some of that work. The subject area is far from complete; nonetheless a certain unity can be seen in what has been achieved to date and it is therefore of value to give an overview of the field. Some mathematical background concerning the propagation of probability measures by discrete and continuous time dynamical systems or Markov chains will be given. In particular the Frobenius-Perron and Fokker-Planck operators will be described. Using the notion of ergodicity two different approaches, direct and indirect, will be outlined. The majority of the review is concerned with indirect methods, where the initial value problem is simulated from a single initial condition and the statistical content of this trajectory studied. Three classes of problems will be studied: deterministic problems in fixed finite dimension, stochastic problems in fixed finite dimension, and deterministic problems with random data in dimension n → ∞; in the latter case ideas from statistical mechanics can be exploited to analyse or interpret numerical schemes.
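The indirect approach described above can be sketched on the simplest possible example. The following illustration (not taken from the article) simulates a single long trajectory of an Ornstein-Uhlenbeck SDE with Euler-Maruyama and reads off an invariant-measure average as a time average along that trajectory.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n = 0.01, 200_000
noise = np.sqrt(2.0 * dt) * rng.standard_normal(n)
x = 0.0
acc = 0.0
for i in range(n):
    # Euler-Maruyama step for the OU process dX = -X dt + sqrt(2) dW,
    # whose invariant measure is the standard Gaussian N(0, 1)
    x += -x * dt + noise[i]
    acc += x * x
time_average = acc / n   # ergodic (indirect) estimate of E[X^2] = 1
```

By ergodicity the time average converges to the invariant-measure average as the trajectory lengthens; the questions reviewed in the article concern when, and how accurately, the discretized dynamics reproduces such statistics.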
Throughout, the ideas are illustrated by simple numerical experiments. The emphasis is on understanding underlying concepts at a high level and mathematical detail will not be given a high priority in this review.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/00a3b-d0w60Stiff Oscillatory Systems, Delta Jumps and White Noise
https://resolver.caltech.edu/CaltechAUTHORS:20170613-130016747
Authors: {'items': [{'id': 'Cano-B', 'name': {'family': 'Cano', 'given': 'B.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Süli-E', 'name': {'family': 'Süli', 'given': 'E.'}}, {'id': 'Warren-J-O', 'name': {'family': 'Warren', 'given': 'J. O.'}}]}
Year: 2001
DOI: 10.1007/s10208001002
Two model problems for stiff oscillatory systems are introduced. Both comprise a linear superposition of N ≫ 1 harmonic oscillators used as a forcing term for a scalar ODE. In the first case the initial conditions are chosen so that the forcing term approximates a delta function as N → ∞ and in the second case so that it approximates white noise. In both cases the fastest natural frequency of the oscillators is O(N). The model problems are integrated numerically in the stiff regime where the time-step Δt satisfies NΔt = O(1). The convergence of the algorithms is studied in this case in the limit N → ∞ and Δt → 0. For the white noise problem both strong and weak convergence are considered. Order reduction phenomena are observed numerically and proved theoretically.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/cwbqy-seq30Underresolved Simulations of Heat Baths
https://resolver.caltech.edu/CaltechAUTHORS:20170609-132504945
Authors: {'items': [{'id': 'Cano-B', 'name': {'family': 'Cano', 'given': 'B.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2001
DOI: 10.1006/jcph.2001.6722
Some recent numerical and theoretical studies indicate that it is possible to accurately simulate the macroscopic motion of a particle in a heat bath, comprising coupled oscillators, without accurately resolving the fast frequencies in the heat bath itself. Here we study this issue further by performing numerical experiments on a wide variety of mechanical heat bath models, all generalizations of the Ford–Kac oscillator model. The results indicate that the nature of the particle-bath damping in the macroscopic limit crucially affects the ability of underresolved simulations to correctly predict macroscopic behaviour. In particular, problems for which the damping is local in time pose more severe problems for approximation. The root cause is that local damping typically arises from the degeneration of a memory kernel to a delta singularity in the macroscopic limit. The approximation of such singularities is a more delicate issue than the approximation of smoother memory kernels.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/v95v4-b0889Algorithms for Particle-Field Simulations with Collisions
https://resolver.caltech.edu/CaltechAUTHORS:20170612-063817274
Authors: {'items': [{'id': 'Sigurgeirsson-H', 'name': {'family': 'Sigurgeirsson', 'given': 'Hersir'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Wan-Wing-Lok', 'name': {'family': 'Wan', 'given': 'Wing-Lok'}}]}
Year: 2001
DOI: 10.1006/jcph.2001.6858
We develop an efficient algorithm for detecting collisions among a large number of particles moving in a velocity field, when the field itself is possibly coupled to the particle motions. We build on ideas from molecular dynamics simulations and, as a byproduct, give a literature survey of methods for hard sphere molecular dynamics. We analyze the complexity of the algorithm in detail and present several experimental results on performance which corroborate the analysis. An optimal algorithm for collision detection has cost scaling at least like the total number of collisions detected. We argue, both theoretically and experimentally, that with the appropriate parameter choice and when the number of collisions grows with the number of particles at least as fast as for billiards, the algorithm we recommend is optimal.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ysfda-hn059Geometric Ergodicity of Some Hypo-Elliptic Diffusions for Particle Motions
https://resolver.caltech.edu/CaltechAUTHORS:20170613-125012320
Authors: {'items': [{'id': 'Mattingly-J-C', 'name': {'family': 'Mattingly', 'given': 'J. C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2002
Two degenerate SDEs arising in statistical physics are studied. The first is a Langevin equation with state-dependent noise and damping. The second is the equation of motion for a particle obeying Stokes' law in a Gaussian random field; this field is chosen to mimic certain features of turbulence. Both equations are hypo-elliptic and smoothness of probability densities may be established. By developing appropriate Lyapunov functions and by studying the necessary control problems, geometric ergodicity is proved.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/3qxjb-xaq09Deterministic and random dynamical systems: theory and numerics
https://resolver.caltech.edu/CaltechAUTHORS:20170613-125750895
Authors: {'items': [{'id': 'Humphries-A-R', 'name': {'family': 'Humphries', 'given': 'A. R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2002
DOI: 10.1007/978-94-010-0510-4_6
The theory of (random) dynamical systems is a framework for the analysis of large time behaviour of time-evolving systems (driven by noise). These notes contain an elementary introduction to the theory of both dynamical and random dynamical systems. The subject matter is made accessible by means of very simple examples and highlights relationships between the deterministic and the random theories.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/85fr2-jrr64Inertial Particles in a Random Field
https://resolver.caltech.edu/CaltechAUTHORS:20170612-073927763
Authors: {'items': [{'id': 'Sigurgeirsson-H', 'name': {'family': 'Sigurgeirsson', 'given': 'H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2002
DOI: 10.1142/S021949370200042X
The motion of an inertial particle in a Gaussian random field is studied. This is a model for the phenomenon of preferential concentration, whereby inertial particles in a turbulent flow can correlate significantly. Mathematically the motion is described by Newton's second law for a particle on a 2-D torus, with force proportional to the difference between a background fluid velocity and the particle velocity itself. The fluid velocity is defined through a linear stochastic PDE of Ornstein–Uhlenbeck type. The properties of the model are studied in terms of the covariance of the noise which drives the stochastic PDE.
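The equation of motion described above, Newton's second law with Stokes drag on a 2-D torus, can be sketched as follows. For illustration only, a steady cellular flow stands in for the Ornstein-Uhlenbeck random velocity field of the model, and the value of the particle time constant is an arbitrary choice.

```python
import numpy as np

def fluid_velocity(x):
    """Steady cellular flow used here as a stand-in for the random
    Ornstein-Uhlenbeck velocity field of the model (purely illustrative)."""
    return np.array([-np.sin(x[0]) * np.cos(x[1]),
                      np.cos(x[0]) * np.sin(x[1])])

tau_p = 0.5                          # particle time constant (illustrative)
dt, nsteps = 0.01, 2000
x = np.array([1.0, 1.0])             # position on the 2-D torus [0, 2*pi)^2
v = np.zeros(2)                      # particle velocity
for _ in range(nsteps):
    # Newton's second law with Stokes drag: the force is proportional to the
    # difference between the background fluid velocity and particle velocity
    a = (fluid_velocity(x) - v) / tau_p
    v = v + dt * a
    x = (x + dt * v) % (2.0 * np.pi)
```

Because the drag term is a convex relaxation of the particle velocity towards the fluid velocity, the particle speed stays bounded by the maximum fluid speed, which is what makes long-time simulations of such models cheap.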
Sufficient conditions are found for almost sure existence and uniqueness of particle paths, and for a random dynamical system with a global random attractor. The random attractor is illustrated by means of a numerical experiment, and the relevance of the random attractor for the understanding of particle distributions is highlighted.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ybb6p-xmq84The dynamical behavior of the discontinuous Galerkin method and related difference schemes
https://resolver.caltech.edu/CaltechAUTHORS:20170609-164822143
Authors: {'items': [{'id': 'Estep-D-J', 'name': {'family': 'Estep', 'given': 'Donald J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2002
DOI: 10.1090/S0025-5718-01-01364-3
We study the dynamical behavior of the discontinuous Galerkin finite element method for initial value problems in ordinary differential equations. We make two different assumptions which guarantee that the continuous problem defines a dissipative dynamical system. We show that, under certain conditions, the discontinuous Galerkin approximation also defines a dissipative dynamical system and we study the approximation properties of the associated discrete dynamical system. We also study the behavior of difference schemes obtained by applying a quadrature formula to the integrals defining the discontinuous Galerkin approximation and construct two kinds of discrete finite element approximations that share the dissipativity properties of the original method.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/emy9z-ptf84Strong Convergence of Euler-Type Methods for Nonlinear Stochastic Differential Equations
https://resolver.caltech.edu/CaltechAUTHORS:20170613-133526351
Authors: {'items': [{'id': 'Higham-D-J', 'name': {'family': 'Higham', 'given': 'Desmond J.'}}, {'id': 'Mao-Xuerong', 'name': {'family': 'Mao', 'given': 'Xuerong'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2002
DOI: 10.1137/S0036142901389530
Traditional finite-time convergence theory for numerical methods applied to stochastic differential equations (SDEs) requires a global Lipschitz assumption on the drift and diffusion coefficients. In practice, many important SDE models satisfy only a local Lipschitz property and, since Brownian paths can make arbitrarily large excursions, the global Lipschitz-based theory is not directly relevant. In this work we prove strong convergence results under less restrictive conditions. First, we give a convergence result for Euler-Maruyama requiring only that the SDE is locally Lipschitz and that the pth moments of the exact and numerical solution are bounded for some p > 2. As an application of this general theory we show that an implicit variant of Euler-Maruyama converges if the diffusion coefficient is globally Lipschitz, but the drift coefficient satisfies only a one-sided Lipschitz condition; this is achieved by showing that the implicit method has bounded moments and may be viewed as an Euler-Maruyama approximation to a perturbed SDE of the same form. Second, we show that the optimal rate of convergence can be recovered if the drift coefficient is also assumed to behave like a polynomial.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/1wrtf-vyz04Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise
https://resolver.caltech.edu/CaltechAUTHORS:20170609-125050787
Authors: {'items': [{'id': 'Mattingly-J-C', 'name': {'family': 'Mattingly', 'given': 'J. C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Higham-D-J', 'name': {'family': 'Higham', 'given': 'D. J.'}}]}
Year: 2002
DOI: 10.1016/S0304-4149(02)00150-3
The ergodic properties of SDEs, and various time discretizations for SDEs, are studied. The ergodicity of SDEs is established by using techniques from the theory of Markov chains on general state spaces, such as that expounded by Meyn–Tweedie. Application of these Markov chain results leads to straightforward proofs of geometric ergodicity for a variety of SDEs, including problems with degenerate noise and for problems with locally Lipschitz vector fields. Applications where this theory can be usefully applied include damped-driven Hamiltonian problems (the Langevin equation), the Lorenz equation with degenerate noise and gradient systems.
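The failure of explicit methods for locally Lipschitz vector fields can be seen on a one-dimensional example. The sketch below (illustrative, not drawn from the paper) applies explicit Euler-Maruyama and a drift-implicit backward Euler step to dX = -X^3 dt + dW with a deliberately large time step: the explicit iterates can blow up, while the implicit map remains bounded.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, nsteps = 0.5, 1000               # deliberately large time step
x_exp, x_imp = 2.0, 2.0
exploded = False
for _ in range(nsteps):
    dw = np.sqrt(dt) * rng.standard_normal()
    # Explicit Euler-Maruyama for dX = -X^3 dt + dW: the cubic drift can
    # overshoot for large |x|, and iterates may blow up
    x_exp = x_exp + dt * (-x_exp**3) + dw
    if abs(x_exp) > 1e6:
        exploded = True
        x_exp = 2.0                  # restart after blow-up, just to continue
    # Drift-implicit (backward Euler) step: solve y + dt*y^3 = x_imp + dw.
    # The left-hand side is strictly increasing, so the real root is unique.
    b = x_imp + dw
    r = np.roots([dt, 0.0, 1.0, -b])
    x_imp = float(r[np.abs(r.imag) < 1e-8].real[0])
```

The implicit update satisfies |y| ≤ |b| by monotonicity, which is the one-dimensional shadow of the Lyapunov condition that, per the abstract, explicit methods fail to inherit while suitable implicit discretizations do.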
The same Markov chain theory is then used to study time-discrete approximations of these SDEs. The two primary ingredients for ergodicity are a minorization condition and a Lyapunov condition. It is shown that the minorization condition is robust under approximation. For globally Lipschitz vector fields this is also true of the Lyapunov condition. However in the locally Lipschitz case the Lyapunov condition fails for explicit methods such as Euler–Maruyama; for pathwise approximations it is, in general, only inherited by specially constructed implicit discretizations. Examples of such discretization based on backward Euler methods are given, and approximation of the Langevin equation studied in some detail.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jdabe-vzf97A model for preferential concentration
https://resolver.caltech.edu/CaltechAUTHORS:20170609-152746245
Authors: {'items': [{'id': 'Sigurgeirsson-H', 'name': {'family': 'Sigurgeirsson', 'given': 'H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2002
DOI: 10.1063/1.1517603
The preferential concentration of inertial particles in a turbulent velocity field occurs when the particle and fluid time constants are commensurate. We propose a straightforward mathematical model for this phenomenon and use the model to study various scaling limits of interest and to study numerically the effect of interparticle collisions. The model comprises Stokes' law for the particle motions, and a Gaussian random field for the velocity. The primary advantages of the model are its amenability to mathematical analysis in various interesting scaling limits and the speed at which numerical simulations can be performed. The scaling limits corroborate experimental evidence about the lack of preferential concentration for large and small Stokes numbers, make new predictions about the possibility of preferential concentration at large times, and lead to stochastic differential equations governing this phenomenon. The effect of collisions is found to be negligible for the most part, although in some cases they have an interesting antidiffusive effect.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/8bjhp-4ga86Long-Term Behavior of Large Mechanical Systems with Random Initial Data
https://resolver.caltech.edu/CaltechAUTHORS:20170612-072429575
Authors: {'items': [{'id': 'Kupferman-R', 'name': {'family': 'Kupferman', 'given': 'R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Terry-J-R', 'name': {'family': 'Terry', 'given': 'J. R.'}}, {'id': 'Tupper-P-F', 'name': {'family': 'Tupper', 'given': 'P. F.'}}]}
Year: 2002
DOI: 10.1142/S0219493702000571
We study the long-time behaviour of large systems of ordinary differential equations with random data. Our main focus is a Hamiltonian system which describes a distinguished particle attached to a large collection of heat bath particles by springs. In the limit where the size of the heat bath tends to infinity, the trajectory of the distinguished particle can be weakly approximated, on finite time intervals, by a Langevin stochastic differential equation. We examine the long-term behaviour of these trajectories, both analytically and numerically. We find ergodic behaviour manifest in both the long-time empirical measures and in the resulting auto-correlation functions.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/4h8ry-5f831
Exponential Mean-Square Stability of Numerical Solutions to Stochastic Differential Equations
https://resolver.caltech.edu/CaltechAUTHORS:20170612-133950435
Authors: {'items': [{'id': 'Higham-D-J', 'name': {'family': 'Higham', 'given': 'Desmond J.'}}, {'id': 'Mao-Xuerong', 'name': {'family': 'Mao', 'given': 'Xuerong'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2003
DOI: 10.1112/S1461157000000462
Positive results are proved here about the ability of numerical simulations to reproduce the exponential mean-square stability of stochastic differential equations (SDEs). The first set of results applies under finite-time convergence conditions on the numerical method. Under these conditions, the exponential mean-square stability of the SDE and that of the method (for sufficiently small step sizes) are shown to be equivalent, and the corresponding second-moment Lyapunov exponent bounds can be taken to be arbitrarily close. The required finite-time convergence conditions hold for the class of stochastic theta methods on globally Lipschitz problems. It is then shown that exponential mean-square stability for non-globally Lipschitz SDEs is not inherited, in general, by numerical methods. However, for a class of SDEs that satisfy a one-sided Lipschitz condition, positive results are obtained for two implicit methods. These results highlight the fact that for long-time simulation on nonlinear SDEs, the choice of numerical method can be crucial.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/zf7r4-f9p79
Extracting macroscopic stochastic dynamics: Model problems
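The stochastic theta methods mentioned in the abstract above can be tested directly on the linear test equation dX = λX dt + μX dW, which is mean-square stable exactly when 2λ + μ² < 0. A small Monte Carlo sketch (the parameter values and the deliberately large step size are my own illustrative choices): with this step the explicit method (θ = 0) violates the discrete mean-square stability condition |1 + hλ|² + hμ² < 1, while the drift-implicit method (θ = 1) remains stable.

```python
import numpy as np

lam, mu = -4.0, 1.0                  # dX = lam*X dt + mu*X dW; MS-stable since 2*lam + mu**2 < 0
h, nsteps, npaths = 1.0, 50, 20000   # deliberately large step size

def theta_step(x, dw, theta):
    # One step of the stochastic theta method on the linear test equation.
    return (x + (1.0 - theta) * h * lam * x + mu * x * dw) / (1.0 - theta * h * lam)

rng = np.random.default_rng(1)
x_em = np.full(npaths, 1.0)   # theta = 0: explicit Euler-Maruyama
x_bi = np.full(npaths, 1.0)   # theta = 1: drift-implicit
for _ in range(nsteps):
    dw = np.sqrt(h) * rng.standard_normal(npaths)
    x_em = theta_step(x_em, dw, 0.0)
    x_bi = theta_step(x_bi, dw, 1.0)

print(np.mean(x_em**2), np.mean(x_bi**2))  # second moment explodes vs. decays
```

Shrinking h below 2·|λ|/(λ² ...) restores stability for the explicit method, illustrating the abstract's point that stability is equivalent only for sufficiently small step sizes.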
https://resolver.caltech.edu/CaltechAUTHORS:20170609-144239769
Authors: {'items': [{'id': 'Huisinga-W', 'name': {'family': 'Huisinga', 'given': 'Wilhelm'}}, {'id': 'Schütte-C', 'name': {'family': 'Schütte', 'given': 'Christof'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2003
DOI: 10.1002/cpa.10057
The purpose of this work is to shed light on an algorithm designed to extract effective macroscopic models from detailed microscopic simulations. The particular algorithm we study is a recently developed transfer operator approach due to Schütte et al. [20]. The investigations involve the formulation, and subsequent numerical study, of a class of model problems. The model problems are ordinary differential equations constructed to have the property that, when projected onto a low-dimensional subspace, the dynamics is approximately that of a stochastic differential equation exhibiting a finite-state-space Markov chain structure. The numerical studies show that the transfer operator approach can accurately extract finite-state Markov chain behavior embedded within high-dimensional ordinary differential equations. In so doing the studies lend considerable weight to existing applications of the algorithm to the complex systems arising in applications such as molecular dynamics. The algorithm is predicated on the assumption of Markovian input data; further numerical studies probe the role of memory effects. Although preliminary, these studies of memory indicate interesting avenues for further development of the transfer operator methodology.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/7dy8d-jh373
White Noise Limits for Inertial Particles in a Random Field
https://resolver.caltech.edu/CaltechAUTHORS:20170609-165633126
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2003
DOI: 10.1137/S1540345903421076
In this paper we present a rigorous analysis of a scaling limit related to the motion of an inertial particle in a Gaussian random field. The mathematical model comprises Stokes's law for the particle motion and an infinite dimensional Ornstein-Uhlenbeck process for the fluid velocity field. The scaling limit studied leads to a white noise limit for the fluid velocity, which balances particle inertia and the friction term. Strong convergence methods are used to justify the limiting equations. The rigorously derived limiting equations are of physical interest for the concrete problem under investigation and facilitate the study of two-point motions in the white noise limit. Furthermore, the methodology developed may also prove useful in the study of various other asymptotic problems for stochastic differential equations in infinite dimensions.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/38qj0-9bk43
Itô versus Stratonovich white-noise limits for systems with inertia and colored multiplicative noise
https://resolver.caltech.edu/CaltechAUTHORS:20170612-132002885
Authors: {'items': [{'id': 'Kupferman-R', 'name': {'family': 'Kupferman', 'given': 'R.'}}, {'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2004
DOI: 10.1103/PhysRevE.70.036120
We consider the dynamics of systems in the presence of inertia and colored multiplicative noise. We study the limit where the particle relaxation time and the correlation time of the noise both tend to zero. We show that the limiting equation for the particle position depends on the magnitude of the particle relaxation time relative to the noise correlation time. In particular, the limiting equation should be interpreted either in the Itô or Stratonovich sense, with a crossover occurring when the two fast-time scales are of comparable magnitude. At the crossover the limiting stochastic differential equation is neither of Itô nor of Stratonovich type. This means that, after adiabatic elimination, the governing equations have different drift fields, leading to different physical behavior depending on the relative magnitude of the two fast-time scales. Our findings are supported by numerical simulations.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/e9afs-ap084
Extracting macroscopic dynamics: model problems and algorithms
https://resolver.caltech.edu/CaltechAUTHORS:20170609-153344149
Authors: {'items': [{'id': 'Givon-D', 'name': {'family': 'Givon', 'given': 'Dror'}}, {'id': 'Kupferman-R', 'name': {'family': 'Kupferman', 'given': 'Raz'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2004
DOI: 10.1088/0951-7715/17/6/R01
In many applications, the primary objective of numerical simulation of time-evolving systems is the prediction of coarse-grained, or macroscopic, quantities. The purpose of this review is twofold: first, to describe a number of simple model systems where the coarse-grained or macroscopic behaviour of a system can be explicitly determined from the full, or microscopic, description; and second, to overview some of the emerging algorithmic approaches that have been introduced to extract effective, lower-dimensional, macroscopic dynamics.
The model problems we describe may be either stochastic or deterministic in both their microscopic and macroscopic behaviour, leading to four possibilities in the transition from microscopic to macroscopic descriptions. Model problems are given which illustrate all four situations, and mathematical tools for their study are introduced. These model problems are useful in the evaluation of algorithms. We use specific instances of the model problems to illustrate these algorithms. As the subject of algorithm development and analysis is, in many cases, in its infancy, the primary purpose here is to attempt to unify some of the emerging ideas so that individuals new to the field have a structured access to the literature. Furthermore, by discussing the algorithms in the context of the model problems, a platform for understanding existing algorithms and developing new ones is built.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/s3yvm-6xj15
Conditional Path Sampling of SDEs and the Langevin MCMC Method
https://resolver.caltech.edu/CaltechAUTHORS:20170612-144819276
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'Jochen'}, 'orcid': '0000-0001-7740-8811'}, {'id': 'Wilberg-P', 'name': {'family': 'Wilberg', 'given': 'Petter'}}]}
Year: 2004
DOI: 10.4310/CMS.2004.v2.n4.a7
We introduce a stochastic PDE based approach to sampling paths of SDEs, conditional on observations. The SPDEs are derived by generalising the Langevin MCMC method to infinite dimensions. Various applications are described, including sampling paths subject to two end-point conditions (bridges) and nonlinear filters/smoothers.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/degwz-8cg45
Fitting SDE models to nonlinear Kac–Zwanzig heat bath models
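The finite-dimensional Langevin MCMC method that the abstract above generalises can be sketched in a few lines. Below is a Metropolis-adjusted Langevin (MALA) sampler for a double-well target π(x) ∝ exp(-(x²-1)²); the target density, step size, and sample count are illustrative assumptions of mine, not taken from the paper:

```python
import numpy as np

def log_pi(x):
    # Double-well target, log-density up to an additive constant.
    return -(x**2 - 1.0)**2

def grad_log_pi(x):
    return -4.0 * x * (x**2 - 1.0)

def mala(n, h, rng):
    # Proposal = one Euler step of the Langevin SDE dX = grad(log pi) dt + sqrt(2) dW,
    # corrected by a Metropolis accept/reject step so the chain targets pi exactly.
    x, out = 0.0, np.empty(n)
    for i in range(n):
        y = x + h * grad_log_pi(x) + np.sqrt(2.0 * h) * rng.standard_normal()
        lq_xy = -((y - x - h * grad_log_pi(x))**2) / (4.0 * h)  # log q(y|x)
        lq_yx = -((x - y - h * grad_log_pi(y))**2) / (4.0 * h)  # log q(x|y)
        if np.log(rng.uniform()) < log_pi(y) - log_pi(x) + lq_yx - lq_xy:
            x = y
        out[i] = x
    return out

samples = mala(20000, 0.2, np.random.default_rng(4))
print(samples.mean(), (samples**2).mean())
```

The paper's construction replaces the state x by an entire SDE path and the Langevin SDE by an SPDE, but the accept/reject-free Langevin dynamics above is the finite-dimensional prototype.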
https://resolver.caltech.edu/CaltechAUTHORS:20170609-143006730
Authors: {'items': [{'id': 'Kupferman-R', 'name': {'family': 'Kupferman', 'given': 'R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2004
DOI: 10.1016/j.physd.2004.04.011
We study a class of "particle in a heat bath" models, which are a generalization of the well-known Kac–Zwanzig class of models, but where the coupling between the distinguished particle and the n heat bath particles is through nonlinear springs. The heat bath particles have random initial data drawn from an equilibrium Gibbs density. The primary objective is to approximate the forces exerted by the heat bath—which we do not want to resolve—by a stochastic process. By means of the central limit theorem for Gaussian processes, and heuristics based on linear response theory, we demonstrate conditions under which it is natural to expect that the trajectories of the distinguished particle can be weakly approximated, as n→∞, by the solution of a Markovian SDE. The quality of this approximation is verified by numerical calculations with parameters chosen according to the linear response theory. Alternatively, the parameters of the effective equation can be chosen using time series analysis. This is done and agreement with linear response theory is shown to be good.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/6qjxh-dpa95
Analysis of SPDEs arising in path sampling. Part I: The Gaussian case
https://resolver.caltech.edu/CaltechAUTHORS:20170612-141658808
Authors: {'items': [{'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'J.'}, 'orcid': '0000-0001-7740-8811'}, {'id': 'Wiberg-P', 'name': {'family': 'Wiberg', 'given': 'P.'}}]}
Year: 2005
DOI: 10.4310/CMS.2005.v3.n4.a8
In many applications it is important to be able to sample paths of SDEs conditional on observations of various kinds. This paper studies SPDEs which solve such sampling problems. The SPDE may be viewed as an infinite dimensional analogue of the Langevin SDE used in finite dimensional sampling. Here the theory is developed for conditioned Gaussian processes for which the resulting SPDE is linear. Applications include the Kalman-Bucy filter/smoother. A companion paper studies the nonlinear case, building on the linear analysis provided here.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/m0ncv-2k754
Periodic homogenization for inertial particles
https://resolver.caltech.edu/CaltechAUTHORS:20170609-143518545
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2005
DOI: 10.1016/j.physd.2005.04.011
We study the problem of homogenization for inertial particles moving in a periodic velocity field, and subject to molecular diffusion. We show that, under appropriate assumptions on the velocity field, the large scale, long time behavior of the inertial particles is governed by an effective diffusion equation for the position variable alone. To achieve this we use a formal multiple scale expansion in the scale parameter. This expansion relies on the hypo-ellipticity of the underlying diffusion. An expression for the diffusivity tensor is found and various of its properties are studied. In particular, an expansion in terms of the non-dimensional particle relaxation time τ (the Stokes number) is shown to coincide with the known result for passive (non-inertial) tracers in the singular limit τ→0. This requires the solution of a singular perturbation problem, achieved by means of a formal multiple scales expansion in τ. Incompressible and potential fields are studied, as well as fields which are neither, and the theoretical findings are supported by numerical simulations.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/qgpd9-31t64
Analysis of White Noise Limits for Stochastic Systems with Two Fast Relaxation Times
https://resolver.caltech.edu/CaltechAUTHORS:20170612-130002475
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2005
DOI: 10.1137/040610507
In this paper we present a rigorous asymptotic analysis for stochastic systems with two fast relaxation times. The mathematical model analyzed in this paper consists of a Langevin equation for the particle motion with time-dependent force constructed through an infinite dimensional Gaussian noise process. We study the limit as the particle relaxation time as well as the correlation time of the noise tend to zero, and we obtain the limiting equations under appropriate assumptions on the Gaussian noise. We show that the limiting equation depends on the relative magnitude of the two fast time scales of the system. In particular, we prove that in the case where the two relaxation times converge to zero at the same rate there is a drift correction, in addition to the limiting Itô integral, which is not of Stratonovich type. If, on the other hand, the colored noise is smooth on the scale of particle relaxation, then the drift correction is the standard Stratonovich correction. If the noise is rough on this scale, then there is no drift correction. Strong (i.e., pathwise) techniques are used for the proof of the convergence theorems.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/pgq5p-e9448
Monte Carlo Studies of Effective Diffusivities for Inertial Particles
https://resolver.caltech.edu/CaltechAUTHORS:20170613-140425580
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Band-L', 'name': {'family': 'Band', 'given': 'L.'}}]}
Year: 2006
DOI: 10.1007/3-540-31186-6_26
The transport of inertial particles in incompressible flows and subject to molecular diffusion is studied through direct numerical simulations. It was shown in recent work [9, 15] that the long time behavior of inertial particles, with motion governed by Stokes' law in a periodic velocity field and in the presence of molecular diffusion, is diffusive. The effective diffusivity is defined through the solution of a degenerate elliptic partial differential equation. In this paper we study the dependence of the effective diffusivity on the non-dimensional parameters of the problem, as well as on the streamline topology, for a class of two dimensional periodic incompressible flows.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/ft7hh-p3059
The Moment Map: Nonlinear Dynamics of Density Evolution via a Few Moments
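The direct-simulation setting described in the abstract above can be illustrated with a crude mean-squared-displacement estimate for inertial particles in a Taylor–Green cellular flow, with motion governed by Stokes' law plus molecular noise. All parameter values below (relaxation time, noise strength, integration horizon) are illustrative assumptions, and the MSD/(2dT) estimator is a cheap proxy for, not a replacement of, the degenerate-elliptic-PDE computation of the effective diffusivity used in the paper:

```python
import numpy as np

def velocity(x):
    # Taylor-Green cellular flow: incompressible and 2*pi-periodic in both directions.
    return np.stack([np.sin(x[:, 0]) * np.cos(x[:, 1]),
                     -np.cos(x[:, 0]) * np.sin(x[:, 1])], axis=1)

tau, sigma = 0.1, 0.5             # particle relaxation time and molecular noise strength
h, nsteps, npart = 1e-2, 4000, 500
T = h * nsteps

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 2.0 * np.pi, size=(npart, 2))  # initial positions
v = np.zeros((npart, 2))
x0 = x.copy()

for _ in range(nsteps):
    # Euler-Maruyama for Stokes' law: tau*dv = (u(x) - v) dt + sigma dW,  dx = v dt
    dw = np.sqrt(h) * rng.standard_normal((npart, 2))
    v = v + (h * (velocity(x) - v) + sigma * dw) / tau
    x = x + h * v

D_eff = np.mean(np.sum((x - x0)**2, axis=1)) / (4.0 * T)  # MSD / (2*d*T) with d = 2
print(D_eff)
```

Sweeping tau and sigma in this sketch gives a rough picture of the parametric dependence that the paper studies systematically.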
https://resolver.caltech.edu/CaltechAUTHORS:20170612-131025773
Authors: {'items': [{'id': 'Barkley-D', 'name': {'family': 'Barkley', 'given': 'D.'}}, {'id': 'Kevrekidis-I-G', 'name': {'family': 'Kevrekidis', 'given': 'I. G.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2006
DOI: 10.1137/050638667
We explore situations in which certain stochastic and high-dimensional deterministic systems behave effectively as low-dimensional dynamical systems. We define and study moment maps, maps on spaces of low-order moments of evolving distributions, as a means of understanding equation-free multiscale algorithms for these systems. The moment map itself is deterministic and attempts to capture the implied probability distribution of the dynamics. By choosing situations where the low-dimensional dynamics can be understood a priori, we evaluate the moment map. Despite requiring the evolution of an ensemble to define the map, this can be an efficient numerical tool, as the map opens up the possibility of bifurcation analyses and other high level tasks being performed on the system. We demonstrate how nonlinearity arises in these maps and how this results in the stabilization of metastable states. Examples are shown for a hierarchy of models, ranging from simple stochastic differential equations to molecular dynamics simulations of a particle in contact with a heat bath.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/xzsyn-gr236
Homogenization for inertial particles in a random flow
https://resolver.caltech.edu/CaltechAUTHORS:20161108-174342361
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Zygalakis-K-C', 'name': {'family': 'Zygalakis', 'given': 'K. C.'}}]}
Year: 2007
DOI: 10.4310/CMS.2007.v5.n3.a1
We study the problem of homogenization for inertial particles moving in a time-dependent random velocity field and subject to molecular diffusion. We show that, under appropriate assumptions on the velocity field, the large-scale, long-time behavior of the inertial particles is governed by an effective diffusion equation for the position variable alone. This is achieved by the use of a formal multiple scales expansion in the scale parameter. The expansion relies on the hypoellipticity of the underlying diffusion. An expression for the diffusivity tensor is found and various of its properties are studied. The results of the formal multiscale analysis are justified rigorously by the use of the martingale central limit theorem. Our theoretical findings are supported by numerical investigations where we study the parametric dependence of the effective diffusivity on the various non-dimensional parameters of the problem.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/1wjtb-qw782
Analysis of SPDEs arising in path sampling part II: The nonlinear case
https://resolver.caltech.edu/CaltechAUTHORS:20170613-080132206
Authors: {'items': [{'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'J.'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2007
DOI: 10.1214/07-AAP441
In many applications, it is important to be able to sample paths of SDEs conditional on observations of various kinds. This paper studies SPDEs which solve such sampling problems. The SPDE may be viewed as an infinite-dimensional analogue of the Langevin equation used in finite-dimensional sampling. In this paper, conditioned nonlinear SDEs, leading to nonlinear SPDEs for the sampling, are studied. In addition, a class of preconditioned SPDEs is studied, found by applying a Green's operator to the SPDE in such a way that the invariant measure remains unchanged; such infinite-dimensional evolution equations are important for the development of practical algorithms for sampling infinite-dimensional problems.
The resulting SPDEs present several significant challenges in the theory of SPDEs. The two primary ones are the presence of nonlinear boundary conditions, involving first-order derivatives, and a loss of the smoothing property in the case of the preconditioned SPDEs. These challenges are overcome, and a theory of existence, uniqueness and ergodicity is developed in sufficient generality to subsume the sampling problems of interest to us. The Gaussian theory developed in Part I of this paper considers Gaussian SDEs, leading to linear Gaussian SPDEs for sampling. This Gaussian theory is used as the basis for deriving nonlinear SPDEs which effect the desired sampling in the nonlinear case, via a change of measure.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/jqwew-m8p37
Parameter Estimation for Multiscale Diffusions
https://resolver.caltech.edu/CaltechAUTHORS:20170613-124705885
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2007
DOI: 10.1007/s10955-007-9300-6
We study the problem of parameter estimation for time-series possessing two, widely separated, characteristic time scales. The aim is to understand situations where it is desirable to fit a homogenized single-scale model to such multiscale data. We demonstrate, numerically and analytically, that if the data is sampled too finely then the parameter fit will fail, in that the correct parameters in the homogenized model are not identified. We also show, numerically and analytically, that if the data is subsampled at an appropriate rate then it is possible to estimate the coefficients of the homogenized model correctly.
The ideas are studied in the context of thermally activated motion in a two-scale potential. However, the ideas may be expected to transfer to other situations where it is desirable to fit an averaged or homogenized equation to multiscale data.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/26wzb-8xj05
Sampling the posterior: An approach to non-Gaussian data assimilation
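The subsampling phenomenon described in the abstract above can be reproduced with a quadratic-variation estimator on simulated paths in a two-scale potential V(x) = x²/2 + cos(x/ε). Everything below (parameter values, path lengths, the subsampling stride) is an illustrative assumption: sampling at the finest resolution returns the unhomogenized diffusion coefficient σ, while subsampling at a rate between the fast scale ε² and O(1) moves the estimate toward the smaller homogenized value.

```python
import numpy as np

eps, sigma = 0.1, 0.5             # scale separation and unhomogenized diffusion coefficient
h, nsteps, npaths = 5e-4, 100_000, 50
T = h * nsteps

def drift(x):
    # Gradient flow in the two-scale potential V(x) = x**2/2 + cos(x/eps).
    return -x + np.sin(x / eps) / eps

rng = np.random.default_rng(5)
X = np.empty((nsteps + 1, npaths))
X[0] = 0.0
for n in range(nsteps):
    dw = np.sqrt(h) * rng.standard_normal(npaths)
    X[n + 1] = X[n] + h * drift(X[n]) + np.sqrt(2.0 * sigma) * dw

def qv_estimate(stride):
    # Quadratic-variation estimate of the diffusion coefficient at spacing stride*h.
    inc = np.diff(X[::stride], axis=0)
    return np.sum(inc**2) / (2.0 * T * npaths)

D_fine = qv_estimate(1)    # spacing h << eps**2: recovers sigma, the wrong (unhomogenized) value
D_sub = qv_estimate(100)   # spacing 0.05, between eps**2 and 1: closer to the homogenized value
print(D_fine, D_sub)
```

For this potential the homogenized coefficient is σ/I₀(1/σ)² ≈ 0.10, so the subsampled estimate should sit well below the fine-scale estimate of ≈ 0.5.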
https://resolver.caltech.edu/CaltechAUTHORS:20170609-130839195
Authors: {'items': [{'id': 'Apte-A', 'name': {'family': 'Apte', 'given': 'A.'}}, {'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'J.'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2007
DOI: 10.1016/j.physd.2006.06.009
The viewpoint taken in this paper is that data assimilation is fundamentally a statistical problem and that this problem should be cast in a Bayesian framework. In the absence of model error, the correct solution to the data assimilation problem is to find the posterior distribution implied by this Bayesian setting. Methods for dealing with data assimilation should then be judged by their ability to probe this distribution. In this paper we propose a range of techniques for probing the posterior distribution, based around the Langevin equation; and we compare these new techniques with existing methods.
When the underlying dynamics is deterministic, the posterior distribution is on the space of initial conditions leading to a sampling problem over this space. When the underlying dynamics is stochastic the posterior distribution is on the space of continuous time paths. By writing down a density, and conditioning on observations, it is possible to define a range of Markov Chain Monte Carlo (MCMC) methods which sample from the desired posterior distribution, and thereby solve the data assimilation problem. The basic building-blocks for the MCMC methods that we concentrate on in this paper are Langevin equations which are ergodic and whose invariant measures give the desired distribution; in the case of path space sampling these are stochastic partial differential equations (SPDEs).
Two examples are given to show how data assimilation can be formulated in a Bayesian fashion. The first is weather prediction, and the second is Lagrangian data assimilation for oceanic velocity fields. Furthermore, the relationship between the Bayesian approach outlined here and the commonly used Kalman filter based techniques, prevalent in practice, is discussed. Two simple pedagogical examples are studied to illustrate the application of Bayesian sampling to data assimilation concretely. Finally, a range of open mathematical and computational issues, arising from the Bayesian approach, are outlined.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/6qwyn-zjz86
An adaptive Euler-Maruyama scheme for SDEs: convergence and stability
https://resolver.caltech.edu/CaltechAUTHORS:20170612-132345468
Authors: {'items': [{'id': 'Lamba-H', 'name': {'family': 'Lamba', 'given': 'H.'}}, {'id': 'Mattingly-J-C', 'name': {'family': 'Mattingly', 'given': 'J. C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2007
DOI: 10.1093/imanum/drl032
The understanding of adaptive algorithms for stochastic differential equations (SDEs) is an open area, where many issues related to both convergence and stability (long-time behaviour) of algorithms are unresolved. This paper considers a very simple adaptive algorithm, based on controlling only the drift component of a time step. Both convergence and stability are studied. The primary issue in the convergence analysis is that the adaptive method does not necessarily drive the time steps to zero with the user-input tolerance. This possibility must be quantified and shown to have low probability. The primary issue in the stability analysis is ergodicity. It is assumed that the noise is nondegenerate, so that the diffusion process is elliptic, and the drift is assumed to satisfy a coercivity condition. The SDE is then geometrically ergodic (averages converge to statistical equilibrium exponentially quickly). If the drift is not linearly bounded, then explicit fixed time step approximations, such as the Euler–Maruyama scheme, may fail to be ergodic. In this work, it is shown that the simple adaptive time-stepping strategy cures this problem. In addition to proving ergodicity, an exponential moment bound is also proved, generalizing a result known to hold for the SDE itself.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/hsv2z-dpc95
Multiscale Methods: Averaging and Homogenization
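A toy version of drift-controlled adaptive time-stepping, in the spirit of the abstract above, can be written in a few lines. The SDE dX = (X - X³) dt + dW, the tolerance, and the step-halving rule are illustrative assumptions (the paper's actual algorithm and analysis are more refined); the point is that the fixed-step explicit scheme explodes from a moderately large initial condition while the adaptive scheme stays bounded.

```python
import numpy as np

def drift(x):
    return x - x**3          # not globally Lipschitz; fixed-step EM can explode

def adaptive_em(x0, t_end, hmax, tol, rng):
    # Halve the step until h*|drift| <= tol: a simple drift-controlled step choice.
    t, x = 0.0, x0
    while t < t_end:
        h = hmax
        while h * abs(drift(x)) > tol:
            h /= 2.0
        h = min(h, t_end - t)    # do not overshoot the final time
        x = x + h * drift(x) + np.sqrt(h) * rng.standard_normal()
        t += h
    return x

def fixed_em(x0, t_end, h, rng):
    # Fixed-step Euler-Maruyama for comparison.
    x = x0
    for _ in range(int(t_end / h)):
        x = x + h * drift(x) + np.sqrt(h) * rng.standard_normal()
    return x

x_adapt = adaptive_em(3.0, 10.0, 0.5, 0.5, np.random.default_rng(3))
with np.errstate(over='ignore', invalid='ignore'):
    x_fixed = fixed_em(3.0, 10.0, 0.5, np.random.default_rng(3))
print(x_adapt, x_fixed)  # adaptive iterate is O(1); fixed-step iterate has blown up
```

Because the drift is superlinear, the fixed step h = 0.5 produces doubly exponential growth from x₀ = 3, whereas the adaptive rule shrinks the step exactly when |drift| is large, mirroring the ergodicity cure described in the abstract.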
https://resolver.caltech.edu/CaltechAUTHORS:20161110-162102417
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'Grigorios A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2008
DOI: 10.1007/978-0-387-73829-1
This introduction to multiscale methods gives readers a broad overview of the many uses and applications of the methods. The book begins by setting the theoretical foundations of the subject area, and moves on to develop a unified approach to the simplification of a wide range of problems which possess multiple scales, via perturbation expansions; differential equations and stochastic processes are studied in one unified framework. The book concludes with an overview of a range of theoretical tools used to justify the simplified models derived via the perturbation expansions.
The presentation of the material is particularly suited to the range of mathematicians, scientists and engineers who want to exploit multiscale methods in applications. Extensive use of examples shows how to apply multiscale methods to solving a variety of problems. Exercises then enable readers to build their own skills and put them into practice.
Extensions and generalizations of the results presented in the book, as well as references to the literature, are provided in the Discussion and Bibliography section at the end of each chapter. All of the twenty-one chapters are supplemented with exercises.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/q2sa8-wy532
A First Course in Continuum Mechanics
https://resolver.caltech.edu/CaltechAUTHORS:20161110-163303162
Authors: {'items': [{'id': 'Gonzalez-O', 'name': {'family': 'Gonzalez', 'given': 'Oscar'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2008
DOI: 10.1017/CBO9780511619571
A concise account of various classic theories of fluids and solids, this book is for courses in continuum mechanics for graduate students and advanced undergraduates. Thoroughly class-tested in courses at Stanford University and the University of Warwick, it is suitable for both applied mathematicians and engineers. The only prerequisites are an introductory undergraduate knowledge of basic linear algebra and differential equations. Unlike most existing works at this level, this book covers both isothermal and thermal theories. The theories are derived in a unified manner from the fundamental balance laws of continuum mechanics. Intended both for classroom use and for self-study, each chapter contains a wealth of exercises, with fully worked solutions to odd-numbered questions. A complete solutions manual is available to instructors upon request. Short bibliographies appear at the end of each chapter, pointing to material which underpins or expands upon the material discussed.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/qr46p-gdt32
A Bayesian approach to Lagrangian data assimilation
https://resolver.caltech.edu/CaltechAUTHORS:20161108-173409695
Authors: {'items': [{'id': 'Apte-A', 'name': {'family': 'Apte', 'given': 'A.'}}, {'id': 'Jones-C-K-R-T', 'name': {'family': 'Jones', 'given': 'C. K. R. T.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2008
DOI: 10.1111/j.1600-0870.2007.00295.x
Lagrangian data arise from instruments that are carried by the flow in a fluid field. Assimilation of such data into ocean models presents a challenge due to the potential complexity of Lagrangian trajectories in relatively simple flow fields. We adopt a Bayesian perspective on this problem and thereby take account of the fully non-linear features of the underlying model.
In the perfect model scenario, the posterior distribution for the initial state of the system contains all the information that can be extracted from a given realization of observations and the model dynamics. We work in the smoothing context in which the posterior on the initial conditions is determined by future observations. This posterior distribution gives the optimal ensemble to be used in data assimilation. The issue then is sampling this distribution. We develop, implement, and test sampling methods, based on Markov-chain Monte Carlo (MCMC), which are particularly well suited to the low-dimensional, but highly non-linear, nature of Lagrangian data. We compare these methods to the well-established ensemble Kalman filter (EnKF) approach. It is seen that the MCMC based methods correctly sample the desired posterior distribution whereas the EnKF may fail due to infrequent observations or non-linear structures in the underlying flow.
https://authors.library.caltech.edu
https://authors.library.caltech.edu/records/8q6fs-w7e62
Data assimilation: Mathematical and statistical perspectives
https://resolver.caltech.edu/CaltechAUTHORS:20160805-165730529
Authors: {'items': [{'id': 'Apte-A', 'name': {'family': 'Apte', 'given': 'A.'}}, {'id': 'Jones-C-K-R-T', 'name': {'family': 'Jones', 'given': 'C. K. R. T.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'J.'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2008
DOI: 10.1002/fld.1698
The bulk of this paper contains a concise mathematical overview of the subject of data assimilation, highlighting three primary ideas: (i) the standard optimization approaches of 3DVAR, 4DVAR and weak constraint 4DVAR are described and their interrelations explained; (ii) statistical analogues of these approaches are then introduced, leading to filtering (generalizing 3DVAR) and a form of smoothing (generalizing 4DVAR and weak constraint 4DVAR) and the optimization methods are shown to be maximum a posteriori estimators for the probability distributions implied by these statistical approaches; and (iii) by taking a general dynamical systems perspective on the subject it is shown that the incorporation of Lagrangian data can be handled by a straightforward extension of the preceding concepts.
We argue that the smoothing approach to data assimilation, based on statistical analogues of 4DVAR and weak constraint 4DVAR, provides the optimal solution to the assimilation of space–time distributed data into a model. The optimal solution obtained is a probability distribution on the relevant class of functions (initial conditions or time-dependent solutions). The approach is a useful one in the first instance because it clarifies the notion of what is the optimal solution, thereby providing a benchmark against which existing approaches can be evaluated. In the longer term it also provides the potential for new methods to create ensembles of solutions to the model, incorporating the available data in an optimal fashion.
Two examples are given illustrating this approach to data assimilation, both in the context of Lagrangian data, one based on statistical 4DVAR and the other on weak constraint statistical 4DVAR. The former is compared with the ensemble Kalman filter, which is thereby shown to be inaccurate in a variety of scenarios.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/bx265-r0m24MCMC Methods for Diffusion Bridges
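The 3DVAR optimization approach surveyed above has a simple closed form when the model is linear-Gaussian. The following sketch (all matrices and sizes are illustrative toy choices, not taken from the paper) computes the 3DVAR analysis x_a = x_b + B Hᵀ (H B Hᵀ + R)⁻¹ (y − H x_b):

```python
import numpy as np

# Hedged sketch of the 3DVAR analysis step: minimising |x - x_b|^2 in the
# B^{-1} norm plus |y - Hx|^2 in the R^{-1} norm has the closed form
# x_a = x_b + B H^T (H B H^T + R)^{-1} (y - H x_b).
# All quantities below are illustrative toy choices.

def three_dvar_analysis(xb, y, H, B, R):
    # Kalman-type gain of the variational minimiser.
    S = H @ B @ H.T + R
    K = B @ H.T @ np.linalg.inv(S)
    return xb + K @ (y - H @ xb)

xb = np.array([1.0, 2.0])   # background (prior mean)
y = np.array([3.0, 0.0])    # observation
H = np.eye(2)               # observe the full state
B = np.eye(2)               # background covariance
R = np.eye(2)               # observation covariance
xa = three_dvar_analysis(xb, y, H, B, R)
```

With B = R and H = I the analysis is the midpoint of background and observation, which makes the regularizing role of B visible at a glance.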
https://resolver.caltech.edu/CaltechAUTHORS:20160805-165106874
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Roberts-G', 'name': {'family': 'Roberts', 'given': 'Gareth'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'Jochen'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2008
DOI: 10.1142/S0219493708002378
We present and study a Langevin MCMC approach for sampling nonlinear diffusion bridges. The method is based on recent theory concerning stochastic partial differential equations (SPDEs) reversible with respect to the target bridge, derived by applying the Langevin idea on the bridge pathspace. In the process, a Random-Walk Metropolis algorithm and an Independence Sampler are also obtained. The novel algorithmic idea of the paper is that proposed moves for the MCMC algorithm are determined by discretising the SPDEs in the time direction using an implicit scheme, parametrised by θ ∈ [0,1]. We show that the resulting infinite-dimensional MCMC sampler is well-defined only if θ = 1/2, when the MCMC proposals have the correct quadratic variation. Previous Langevin-based MCMC methods used explicit schemes, corresponding to θ = 0. The significance of the choice θ = 1/2 is inherited by the finite-dimensional approximation of the algorithm used in practice. We present numerical results illustrating the phenomenon and the theory that explains it. Diffusion bridges (with additive noise) are representative of the family of laws defined as a change of measure from Gaussian distributions on arbitrary separable Hilbert spaces; the analysis in this paper can be readily extended to target laws from this family and an example from signal processing illustrates this fact.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/7j7cr-ygm03Green's Functions by Monte Carlo
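In finite dimensions, the θ = 1/2 (Crank–Nicolson) proposal underlying the diffusion-bridge sampler above reduces to a preconditioned random-walk move that preserves the Gaussian reference measure. A minimal sketch, assuming a toy quartic change of measure from a Brownian-bridge prior (an illustrative choice, not the bridge target of the paper):

```python
import numpy as np

# Hedged sketch: a theta = 1/2 (Crank-Nicolson) random-walk proposal for a
# target given by a density change exp(-phi) from a Gaussian reference
# N(0, C).  The quartic potential phi is a hypothetical illustrative choice.

def brownian_bridge_cov(d):
    # Covariance of a Brownian bridge on [0, 1] at d interior grid points.
    t = np.linspace(0.0, 1.0, d + 2)[1:-1]
    return np.minimum.outer(t, t) - np.outer(t, t)

def phi(x):
    # Change-of-measure potential (hypothetical choice for illustration).
    return 0.25 * np.sum(x ** 4)

def pcn_chain(n_steps, beta, d, rng):
    C = brownian_bridge_cov(d)
    L = np.linalg.cholesky(C + 1e-12 * np.eye(d))
    x = np.zeros(d)
    accepts = 0
    for _ in range(n_steps):
        xi = L @ rng.standard_normal(d)
        # Crank-Nicolson proposal: leaves the Gaussian reference invariant.
        y = np.sqrt(1.0 - beta ** 2) * x + beta * xi
        # Acceptance depends only on phi, not on the Gaussian reference.
        if np.log(rng.uniform()) < phi(x) - phi(y):
            x = y
            accepts += 1
    return x, accepts / n_steps

rng = np.random.default_rng(0)
x, acc = pcn_chain(2000, 0.3, 20, rng)
```

Per the abstract, the θ = 1/2 choice is what keeps the proposal's quadratic variation correct as the discretization is refined; an explicit (θ = 0) proposal loses this property.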
https://resolver.caltech.edu/CaltechAUTHORS:20161111-113156226
Authors: {'items': [{'id': 'White-D', 'name': {'family': 'White', 'given': 'David'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2009
DOI: 10.1007/978-3-642-04107-5_41
We describe a new numerical technique to estimate Green's functions of elliptic differential operators on bounded open sets. The algorithm utilizes SPDE-based function space sampling techniques in conjunction with Metropolis-Hastings MCMC. The key idea is that neither the proposal nor the acceptance probability require the evaluation of a Dirac measure. The method allows Green's functions to be estimated via ergodic averaging. Numerical examples in both 1D and 2D, with second and fourth order elliptic PDEs, are presented to validate this methodology.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/y5x3s-0nq75Computational Complexity of Metropolis-Hastings Methods in High Dimensions
https://resolver.caltech.edu/CaltechAUTHORS:20170612-102025036
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2009
DOI: 10.1007/978-3-642-04107-5_4
This article contains an overview of the literature concerning the computational complexity of Metropolis-Hastings based MCMC methods for sampling probability measures on ℝ^d, when the dimension d is large. The material is structured in three parts addressing, in turn, the following questions: (i) what are sensible assumptions to make on the family of probability measures indexed by d? (ii) what is known concerning computational complexity for Metropolis-Hastings methods applied to these families? (iii) what remains open in this area?https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ft5ke-79y61Parameter estimation for partially observed hypoelliptic diffusions
https://resolver.caltech.edu/CaltechAUTHORS:20160805-155341773
Authors: {'items': [{'id': 'Pokern-Yvo', 'name': {'family': 'Pokern', 'given': 'Yvo'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Wiberg-P', 'name': {'family': 'Wiberg', 'given': 'Petter'}}]}
Year: 2009
DOI: 10.1111/j.1467-9868.2008.00689.x
Hypoelliptic diffusion processes can be used to model a variety of phenomena in applications ranging from molecular dynamics to audio signal analysis. We study parameter estimation for such processes in situations where we observe some components of the solution at discrete times. Since exact likelihoods for the transition densities are typically not known, approximations are used that are expected to work well in the limit of small intersample times Δt and large total observation times N Δt. Hypoellipticity together with partial observation leads to ill-conditioning, requiring a judicious combination of approximate likelihoods for the various parameters to be estimated. We combine these in a deterministic scan Gibbs sampler alternating between missing data in the unobserved solution components and parameters. Numerical experiments illustrate asymptotic consistency of the method when applied to simulated data. The paper concludes with an application of the Gibbs sampler to molecular dynamics data.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jsypz-0xz96Parameter estimation for partially observed hypo-elliptic diffusions
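The deterministic scan Gibbs structure used above can be illustrated on a toy target; here the two blocks are the coordinates of a correlated bivariate normal rather than missing data and parameters (an assumption made purely for illustration, not the hypoelliptic setting of the paper):

```python
import numpy as np

# Hedged toy sketch of a deterministic-scan Gibbs sampler: each sweep updates
# the blocks in a fixed order from their full conditionals.  For a standard
# bivariate normal with correlation rho, x | y ~ N(rho*y, 1 - rho^2).

def gibbs_bivariate_normal(rho, n_steps, rng):
    x, y = 0.0, 0.0
    s = np.sqrt(1.0 - rho ** 2)
    samples = np.empty((n_steps, 2))
    for i in range(n_steps):
        # Deterministic scan: block 1 then block 2, every iteration.
        x = rho * y + s * rng.standard_normal()
        y = rho * x + s * rng.standard_normal()
        samples[i] = (x, y)
    return samples

rng = np.random.default_rng(1)
samples = gibbs_bivariate_normal(0.8, 20000, rng)
corr = np.corrcoef(samples.T)[0, 1]
```

The empirical correlation of the chain should approach the target value 0.8, confirming that the scan leaves the joint law invariant.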
https://resolver.caltech.edu/CaltechAUTHORS:20161108-165631300
Authors: {'items': [{'id': 'Pokern-Yvo', 'name': {'family': 'Pokern', 'given': 'Yvo'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Wiberg-P', 'name': {'family': 'Wiberg', 'given': 'Petter'}}]}
Year: 2009
DOI: 10.1111/j.1467-9868.2008.00689.x
Hypoelliptic diffusion processes can be used to model a variety of phenomena in applications ranging from molecular dynamics to audio signal analysis. We study parameter estimation for such processes in situations where we observe some components of the solution at discrete times. Since exact likelihoods for the transition densities are typically not known, approximations are used that are expected to work well in the limit of small intersample times Δt and large total observation times N Δt. Hypoellipticity together with partial observation leads to ill-conditioning, requiring a judicious combination of approximate likelihoods for the various parameters to be estimated. We combine these in a deterministic scan Gibbs sampler alternating between missing data in the unobserved solution components and parameters. Numerical experiments illustrate asymptotic consistency of the method when applied to simulated data. The paper concludes with an application of the Gibbs sampler to molecular dynamics data.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/95y96-saq13Calculating effective diffusivities in the limit of vanishing molecular diffusion
https://resolver.caltech.edu/CaltechAUTHORS:20160805-155749338
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Zygalakis-K-C', 'name': {'family': 'Zygalakis', 'given': 'K. C.'}}]}
Year: 2009
DOI: 10.1016/j.jcp.2008.10.014
In this paper we study the problem of the numerical calculation (by Monte Carlo methods) of the effective diffusivity for a particle moving in a periodic divergence-free velocity field, in the limit of vanishing molecular diffusion. In this limit traditional numerical methods typically fail, since they do not represent accurately the geometry of the underlying deterministic dynamics. We propose a stochastic splitting method that takes into account the volume-preserving property of the equations of motion in the absence of noise, and when inertial effects can be neglected. An extension of the method is then proposed for the cases where the noise has a non-trivial time-correlation structure and when inertial effects cannot be neglected. The method of modified equations is used to explain failings of Euler-based methods. The new stochastic geometric integrators are shown to outperform standard Euler-based integrators. Various asymptotic limits of physical interest are investigated by means of numerical experiments, using the new integrators.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/mtps1-kab57Sampling conditioned diffusions
https://resolver.caltech.edu/CaltechAUTHORS:20170614-080019284
Authors: {'items': [{'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'Martin'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'Jochen'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2009
DOI: 10.1017/CBO9781139107020.009
For many practical problems it is useful to be able to sample conditioned diffusions on a computer (e.g. in filtering/smoothing to sample from the conditioned distribution of the unknown signal given the known observations). We present a recently developed, SPDE-based method to tackle this problem. The method is an infinite-dimensional generalization of the Langevin sampling technique.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/pxfmt-jy760Optimal scalings for local Metropolis–Hastings chains on nonproduct targets in high dimensions
https://resolver.caltech.edu/CaltechAUTHORS:20160805-153017689
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Roberts-G', 'name': {'family': 'Roberts', 'given': 'Gareth'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2009
DOI: 10.1214/08-AAP563
We investigate local MCMC algorithms, namely the random-walk Metropolis and the Langevin algorithms, and identify the optimal choice of the local step-size as a function of the dimension n of the state space, asymptotically as n→∞. We consider target distributions defined as a change of measure from a product law. Such structures arise, for instance, in inverse problems or Bayesian contexts when a product prior is combined with the likelihood. We state analytical results on the asymptotic behavior of the algorithms under general conditions on the change of measure. Our theory is motivated by applications on conditioned diffusion processes and inverse problems related to the 2D Navier–Stokes equation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/zbbg8-pa277MCMC methods for sampling function space
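For the baseline product-target case that motivates the nonproduct theory above, the random-walk Metropolis step size scales as n^(-1/2) and the classical optimal acceptance rate is approximately 0.234 at the scaling constant ℓ ≈ 2.38. A hedged sketch on a standard Gaussian product target (dimension, chain length and constants are illustrative choices):

```python
import numpy as np

# Hedged sketch of random-walk Metropolis with the dimension-dependent
# step-size scaling l / sqrt(n) on a product N(0, I_n) target.  This is the
# classical product-case baseline, not the nonproduct setting of the paper.

def rwm_acceptance(n_dim, l, n_steps, rng):
    logp = lambda z: -0.5 * np.dot(z, z)   # standard normal log-density
    x = rng.standard_normal(n_dim)         # start in (exact) stationarity
    lp = logp(x)
    step = l / np.sqrt(n_dim)              # the n-dependent local step size
    accepts = 0
    for _ in range(n_steps):
        y = x + step * rng.standard_normal(n_dim)
        lpy = logp(y)
        if np.log(rng.uniform()) < lpy - lp:
            x, lp = y, lpy
            accepts += 1
    return accepts / n_steps

rng = np.random.default_rng(3)
acc = rwm_acceptance(100, 2.38, 5000, rng)
```

The observed acceptance rate should sit near the theoretical optimum of about 0.234 for this scaling.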
https://resolver.caltech.edu/CaltechAUTHORS:20170612-093904953
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2009
DOI: 10.4171/056-1/16
Applied mathematics is concerned with developing models with predictive capability, and with probing those models to obtain qualitative and quantitative insight into the phenomena being modelled. Statistics is data-driven and is aimed at the development of methodologies to optimize the information derived from data. The increasing complexity of phenomena that scientists and engineers wish to model, together with our increased ability to gather, store and interrogate data, mean that the subjects of applied mathematics and statistics are increasingly required to work in conjunction in order to significantly progress understanding. This article is concerned with a research program at the interface between these two disciplines, aimed at problems in differential equations where profusion of data and the sophisticated model combine to produce the mathematical problem of obtaining information from a probability measure on function space. In this context there is an array of problems with a common mathematical structure, namely that the probability measure in question is a change of measure from a Gaussian. We illustrate the wide-ranging applicability of this structure. For problems whose solution is determined by a probability measure on function space, information about the solution can be obtained by sampling from this probability measure. One way to do this is through the use of Markov chain Monte-Carlo (MCMC) methods. We show how the common mathematical structure of the aforementioned problems can be exploited in the design of effective MCMC methods.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/72177-s9k74Maximum likelihood drift estimation for multiscale diffusions
https://resolver.caltech.edu/CaltechAUTHORS:20160805-153633492
Authors: {'items': [{'id': 'Papavasiliou-A', 'name': {'family': 'Papavasiliou', 'given': 'A.'}}, {'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2009
DOI: 10.1016/j.spa.2009.05.003
We study the problem of parameter estimation using maximum likelihood for fast/slow systems of stochastic differential equations. Our aim is to shed light on the problem of model/data mismatch at small scales. We consider two classes of fast/slow problems for which a closed coarse-grained equation for the slow variables can be rigorously derived, which we refer to as averaging and homogenization problems. We ask whether, given data from the slow variable in the fast/slow system, we can correctly estimate parameters in the drift of the coarse-grained equation for the slow variable, using maximum likelihood. We show that, whereas the maximum likelihood estimator is asymptotically unbiased for the averaging problem, for the homogenization problem maximum likelihood fails unless we subsample the data at an appropriate rate. An explicit formula for the asymptotic error in the log-likelihood function is presented. Our theory is applied to two simple examples from molecular dynamics.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/h0mxs-m0695Remarks on Drift Estimation for Diffusion Processes
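The maximum likelihood drift estimator discussed above can be written down explicitly for a toy single-scale Ornstein–Uhlenbeck process dX = −θX dt + dW: θ̂ = −Σ X_k ΔX_k / (Δt Σ X_k²). The sketch below checks this on simulated data; it does not reproduce the paper's fast/slow setting, where subsampling becomes essential:

```python
import numpy as np

# Hedged toy illustration of maximum likelihood drift estimation for a
# single-scale OU process dX = -theta*X dt + dW.  The paper's subtle point
# (failure without subsampling in the homogenization setting) requires a
# genuine fast/slow system, which this sketch deliberately omits.

def simulate_ou(theta, dt, n, rng):
    # Exact OU transition: X_{k+1} = a X_k + s * N(0, 1).
    a = np.exp(-theta * dt)
    s = np.sqrt((1.0 - a ** 2) / (2.0 * theta))
    x = np.empty(n)
    x[0] = 0.0
    for k in range(n - 1):
        x[k + 1] = a * x[k] + s * rng.standard_normal()
    return x

def drift_mle(x, dt):
    # Discretized Girsanov MLE for the drift parameter theta.
    dx = np.diff(x)
    return -np.sum(x[:-1] * dx) / (dt * np.sum(x[:-1] ** 2))

rng = np.random.default_rng(2)
x = simulate_ou(1.0, 0.01, 200_000, rng)
theta_hat = drift_mle(x, 0.01)
```

With θ = 1, small Δt and a long observation window, the estimate should land close to the true value, illustrating the asymptotic regime the abstract describes.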
https://resolver.caltech.edu/CaltechAUTHORS:20160805-152349945
Authors: {'items': [{'id': 'Pokern-Yvo', 'name': {'family': 'Pokern', 'given': 'Yvo'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Vanden-Eijnden-E', 'name': {'family': 'Vanden-Eijnden', 'given': 'Eric'}}]}
Year: 2009
DOI: 10.1137/070694806
In applications such as molecular dynamics it is of interest to fit Smoluchowski and Langevin equations to data. Practitioners often achieve this by a variety of seemingly ad hoc procedures such as fitting to the empirical measure generated by the data and fitting to properties of autocorrelation functions. Statisticians, on the other hand, often use estimation procedures, which fit diffusion processes to data by applying the maximum likelihood principle to the path-space density of the desired model equations, and through knowledge of the properties of quadratic variation. In this paper we show that the procedures used by practitioners and statisticians to fit drift functions are, in fact, closely related and can be thought of as two alternative ways to regularize the (singular) likelihood function for the drift. We also present the results of numerical experiments which probe the relative efficacy of the two approaches to model identification and compare them with other methods such as the minimum distance estimator.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/8b0r1-4et26Bayesian inverse problems for functions and applications to fluid mechanics
https://resolver.caltech.edu/CaltechAUTHORS:20160805-151904215
Authors: {'items': [{'id': 'Cotter-S-L', 'name': {'family': 'Cotter', 'given': 'S. L.'}}, {'id': 'Dashti-M', 'name': {'family': 'Dashti', 'given': 'M.'}}, {'id': 'Robinson-J-C', 'name': {'family': 'Robinson', 'given': 'J. C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2009
DOI: 10.1088/0266-5611/25/11/115008
In this paper we establish a mathematical framework for a range of inverse problems for functions, given a finite set of noisy observations. The problems are hence underdetermined and are often ill-posed. We study these problems from the viewpoint of Bayesian statistics, with the resulting posterior probability measure being defined on a space of functions. We develop an abstract framework for such problems which facilitates application of an infinite-dimensional version of Bayes theorem, leads to a well-posedness result for the posterior measure (continuity in a suitable probability metric with respect to changes in data), and also leads to a theory for the existence of maximum a posteriori (MAP) estimators for such Bayesian inverse problems on function space. A central idea underlying these results is that continuity properties and bounds on the forward model guide the choice of the prior measure for the inverse problem, leading to the desired results on well-posedness and MAP estimators; the PDE analysis and probability theory required are thus clearly delineated, allowing a straightforward derivation of results. We show that the abstract theory applies to some concrete applications of interest by studying problems arising from data assimilation in fluid mechanics. The objective is to make inference about the underlying velocity field, on the basis of either Eulerian or Lagrangian observations. We study problems without model error, in which case the inference is on the initial condition, and problems with model error in which case the inference is on the initial condition and on the driving noise process or, equivalently, on the entire time-dependent velocity field. In order to undertake a relatively uncluttered mathematical analysis we consider the two-dimensional Navier–Stokes equation on a torus. The case of Eulerian observations—direct observations of the velocity field itself—is then a model for weather forecasting.
The case of Lagrangian observations—observations of passive tracers advected by the flow—is then a model for data arising in oceanography. The methodology which we describe herein may be applied to many other inverse problems in which it is of interest to find, given observations, an infinite-dimensional object, such as the initial condition for a PDE. A similar approach might be adopted, for example, to determine an appropriate mathematical setting for the inverse problem of determining an unknown tensor arising in a constitutive law for a PDE, given observations of the solution. The paper is structured so that the abstract theory can be read independently of the particular problems in fluid mechanics which are subsequently studied by application of the theory.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/stfqy-jsk92Approximation of Bayesian Inverse Problems for PDEs
https://resolver.caltech.edu/CaltechAUTHORS:20160804-170531840
Authors: {'items': [{'id': 'Cotter-S-L', 'name': {'family': 'Cotter', 'given': 'S. L.'}}, {'id': 'Dashti-M', 'name': {'family': 'Dashti', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2010
DOI: 10.1137/090770734
Inverse problems are often ill posed, with solutions that depend sensitively on data. In any numerical approach to the solution of such problems, regularization of some form is needed to counteract the resulting instability. This paper is based on an approach to regularization, employing a Bayesian formulation of the problem, which leads to a notion of well posedness for inverse problems, at the level of probability measures. The stability which results from this well posedness may be used as the basis for quantifying the approximation, in finite dimensional spaces, of inverse problems for functions. This paper contains a theory which utilizes this stability property to estimate the distance between the true and approximate posterior distributions, in the Hellinger metric, in terms of error estimates for approximation of the underlying forward problem. This is potentially useful as it allows for the transfer of estimates from the numerical analysis of forward problems into estimates for the solution of the related inverse problem. It is noteworthy that, when the prior is a Gaussian random field model, controlling differences in the Hellinger metric leads to control on the differences between expected values of polynomially bounded functions and operators, including the mean and covariance operator. The ideas are applied to some non-Gaussian inverse problems where the goal is determination of the initial condition for the Stokes or Navier–Stokes equation from Lagrangian and Eulerian observations, respectively.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/h1tbs-p0y57Inverse problems: A Bayesian perspective
https://resolver.caltech.edu/CaltechAUTHORS:20161111-112136150
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2010
DOI: 10.1017/S0962492910000061
The subject of inverse problems in differential equations is of enormous practical importance, and has also generated substantial mathematical and computational innovation. Typically some form of regularization is required to ameliorate ill-posed behaviour. In this article we review the Bayesian approach to regularization, developing a function space viewpoint on the subject. This approach allows for a full characterization of all possible solutions, and their relative probabilities, whilst simultaneously forcing significant modelling issues to be addressed in a clear and precise fashion. Although expensive to implement, this approach is starting to lie within the range of the available computational resources in many application areas. It also allows for the quantification of uncertainty and risk, something which is increasingly demanded by these applications. Furthermore, the approach is conceptually important for the understanding of simpler, computationally expedient approaches to inverse problems.
We demonstrate that, when formulated in a Bayesian fashion, a wide range of inverse problems share a common mathematical framework, and we highlight a theory of well-posedness which stems from this. The well-posedness theory provides the basis for a number of stability and approximation results which we describe. We also review a range of algorithmic approaches which are used when adopting the Bayesian approach to inverse problems. These include MCMC methods, filtering and the variational approach.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/fk9j9-5tq15Convergence of numerical time-averaging and stationary measures via Poisson equations
https://resolver.caltech.edu/CaltechAUTHORS:20160804-165401982
Authors: {'items': [{'id': 'Mattingly-J-C', 'name': {'family': 'Mattingly', 'given': 'J. C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Tretyakov-M', 'name': {'family': 'Tretyakov', 'given': 'M.'}}]}
Year: 2010
DOI: 10.1137/090770527
Numerical approximation of the long time behavior of a stochastic differential equation (SDE) is considered. Error estimates for time-averaging estimators are obtained and then used to show that the stationary behavior of the numerical method converges to that of the SDE. The error analysis is based on using an associated Poisson equation for the underlying SDE. The main advantages of this approach are its simplicity and universality. It works equally well for a range of explicit and implicit schemes, including those with simple simulation of random variables, and for hypoelliptic SDEs. To simplify the exposition, we consider only the case where the state space of the SDE is a torus, and we study only smooth test functions. However, we anticipate that the approach can be applied more widely. An analogy between our approach and Stein's method is indicated. Some practical implications of the results are discussed.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jmycg-8bx11Transition paths in molecules at finite temperature
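The time-averaging estimators analysed above can be demonstrated on a scalar toy problem: for dX = −X dt + √2 dW the stationary law is N(0, 1), so an Euler–Maruyama time average of X² should approach 1 up to O(Δt) bias (an unbounded example, unlike the torus setting of the paper):

```python
import numpy as np

# Hedged toy check of the time-averaging idea: a long Euler-Maruyama
# trajectory of the ergodic SDE dX = -X dt + sqrt(2) dW, whose stationary
# law is N(0, 1), gives a time average of a test function that approximates
# the stationary expectation up to discretization bias.

def euler_time_average(f, dt, n_steps, rng):
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        # Euler-Maruyama step for dX = -X dt + sqrt(2) dW.
        x = x - x * dt + np.sqrt(2.0 * dt) * rng.standard_normal()
        total += f(x)
    return total / n_steps

rng = np.random.default_rng(4)
avg = euler_time_average(lambda z: z * z, 0.01, 500_000, rng)
```

The stationary value of E[X²] is 1, so the time average should be close to 1 for small Δt and long trajectories.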
https://resolver.caltech.edu/CaltechAUTHORS:20161108-161530502
Authors: {'items': [{'id': 'Pinski-F-J', 'name': {'family': 'Pinski', 'given': 'F. J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2010
DOI: 10.1063/1.3391160
In the zero temperature limit, it is well known that in systems evolving via Brownian dynamics, the most likely transition path between reactant and product may be found as a minimizer of the Freidlin–Wentzell action functional. An analog for finite temperature transitions is given by the Onsager–Machlup functional. The purpose of this work is to investigate properties of Onsager–Machlup minimizers. We study transition paths for thermally activated molecules governed by the Langevin equation in the overdamped limit of Brownian dynamics. Using gradient descent in pathspace, we minimize the Onsager–Machlup functional for a range of model problems in one and two dimensions and then for some simple atomic models including Lennard-Jones seven-atom and 38-atom clusters, as well as for a model of vacancy diffusion in a planar crystal. Our results demonstrate interesting effects, which can occur at nonzero temperature, showing transition paths that could not be predicted on the basis of the zero temperature limit. However the results also demonstrate unphysical features associated with such Onsager–Machlup minimizers. As there is a growing literature that addresses transition path sampling by related techniques, these insights add a potentially useful perspective into the interpretation of this body of work.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/cenf4-fn069Random-weight particle filtering of continuous time processes
https://resolver.caltech.edu/CaltechAUTHORS:20160804-164847166
Authors: {'items': [{'id': 'Fearnhead-P', 'name': {'family': 'Fearnhead', 'given': 'Paul'}}, {'id': 'Papaspiliopoulos-O', 'name': {'family': 'Papaspiliopoulos', 'given': 'Omiros'}}, {'id': 'Roberts-G-O', 'name': {'family': 'Roberts', 'given': 'Gareth O.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2010
DOI: 10.1111/j.1467-9868.2010.00744.x
It is possible to implement importance sampling, and particle filter algorithms, where the importance sampling weight is random. Such random-weight algorithms have been shown to be efficient for inference for a class of diffusion models, as they enable inference without any (time discretization) approximation of the underlying diffusion model. One difficulty of implementing such random-weight algorithms is the requirement to have weights that are positive with probability 1. We show how Wald's identity for martingales can be used to ensure positive weights. We apply this idea to analysis of diffusion models from high frequency data. For a class of diffusion models we show how to implement a particle filter, which uses all the information in the data, but whose computational cost is independent of the frequency of the data. We use the Wald identity to implement a random-weight particle filter for these models which avoids time discretization error.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/n0qw4-vze39Random-weight particle filtering of continuous time processes
https://resolver.caltech.edu/CaltechAUTHORS:20170612-075052519
Authors: {'items': [{'id': 'Fearnhead-P', 'name': {'family': 'Fearnhead', 'given': 'Paul'}}, {'id': 'Papaspiliopoulos-O', 'name': {'family': 'Papaspiliopoulos', 'given': 'Omiros'}}, {'id': 'Roberts-G-O', 'name': {'family': 'Roberts', 'given': 'Gareth O.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2010
DOI: 10.1111/j.1467-9868.2010.00744.x
It is possible to implement importance sampling, and particle filter algorithms, where the importance sampling weight is random. Such random-weight algorithms have been shown to be efficient for inference for a class of diffusion models, as they enable inference without any (time discretization) approximation of the underlying diffusion model. One difficulty of implementing such random-weight algorithms is the requirement to have weights that are positive with probability 1. We show how Wald's identity for martingales can be used to ensure positive weights. We apply this idea to analysis of diffusion models from high frequency data. For a class of diffusion models we show how to implement a particle filter, which uses all the information in the data, but whose computational cost is independent of the frequency of the data. We use the Wald identity to implement a random-weight particle filter for these models which avoids time discretization error.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/34d4q-7w428Multiscale modelling and inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20161111-110328030
Authors: {'items': [{'id': 'Nolen-J', 'name': {'family': 'Nolen', 'given': 'James'}}, {'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'Grigorios A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2011
DOI: 10.1007/978-3-642-22061-6_1
The need to blend observational data and mathematical models arises in many applications and leads naturally to inverse problems. Parameters appearing in the model, such as constitutive tensors, initial conditions, boundary conditions, and forcing can be estimated on the basis of observed data. The resulting inverse problems are usually ill-posed and some form of regularization is required. These notes discuss parameter estimation in situations where the unknown parameters vary across multiple scales. We illustrate the main ideas using a simple model for groundwater flow.
We will highlight various approaches to regularization for inverse problems, including Tikhonov and Bayesian methods. We illustrate three ideas that arise when considering inverse problems in the multiscale context. The first idea is that the choice of space or set in which to seek the solution to the inverse problem is intimately related to whether a homogenized or full multiscale solution is required. This is a choice of regularization. The second idea is that, if a homogenized solution to the inverse problem is what is desired, then this can be recovered from carefully designed observations of the full multiscale system. The third idea is that the theory of homogenization can be used to improve the estimation of homogenized coefficients from multiscale data.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/bcezy-zr077Signal processing problems on function space: Bayesian formulation, stochastic PDEs and effective MCMC methods
https://resolver.caltech.edu/CaltechAUTHORS:20161111-111346824
Authors: {'items': [{'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'J.'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2011
[No abstract]https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/evv21-g9q71Sampling conditioned hypoelliptic diffusions
https://resolver.caltech.edu/CaltechAUTHORS:20160804-162713014
Authors: {'items': [{'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'Martin'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'Jochen'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2011
DOI: 10.1214/10-AAP708
A series of recent articles introduced a method to construct stochastic partial differential equations (SPDEs) which are invariant with respect to the distribution of a given conditioned diffusion. These works are restricted to the case of elliptic diffusions where the drift has a gradient structure and the resulting SPDE is of second-order parabolic type.
The present article extends this methodology to allow the construction of SPDEs which are invariant with respect to the distribution of a class of hypoelliptic diffusion processes, subject to a bridge conditioning, leading to SPDEs which are of fourth-order parabolic type. This allows the treatment of more realistic physical models; for example, one can use the resulting SPDE to study transitions between meta-stable states in mechanical systems with friction and noise. In this situation the restriction that the drift be a gradient can also be lifted.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/7zrak-pkh86A note on diffusion limits of chaotic skew-product flows
https://resolver.caltech.edu/CaltechAUTHORS:20160804-164518594
Authors: {'items': [{'id': 'Melbourne-I', 'name': {'family': 'Melbourne', 'given': 'I.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2011
DOI: 10.1088/0951-7715/24/4/018
We provide an explicit rigorous derivation of a diffusion limit—a stochastic differential equation (SDE) with additive noise—from a deterministic skew-product flow. This flow is assumed to exhibit time-scale separation and has the form of a slowly evolving system driven by a fast chaotic flow. Under mild assumptions on the fast flow, we prove convergence to an SDE as the time-scale separation grows. In contrast to existing work, we do not require the flow to have good mixing properties. As a consequence, our results incorporate a large class of fast flows, including the classical Lorenz equations.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/fww1c-y4b35Kalman filtering and smoothing for linear wave equations with model error
https://resolver.caltech.edu/CaltechAUTHORS:20160801-175538072
Authors: {'items': [{'id': 'Lee-Wongjung', 'name': {'family': 'Lee', 'given': 'Wonjung'}}, {'id': 'McDougall-D', 'name': {'family': 'McDougall', 'given': 'D.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2011
DOI: 10.1088/0266-5611/27/9/095008
Filtering is a widely used methodology for the incorporation of observed data into time-evolving systems. It provides an online approach to state estimation inverse problems when data are acquired sequentially. The Kalman filter plays a central role in many applications because it is exact for linear systems subject to Gaussian noise, and because it forms the basis for many approximate filters which are used in high-dimensional systems. The aim of this paper is to study the effect of model error on the Kalman filter, in the context of linear wave propagation problems. A consistency result is proved when no model error is present, showing recovery of the true signal in the large data limit. This result, however, is not robust: it is also proved that arbitrarily small model error can lead to inconsistent recovery of the signal in the large data limit. If the model error is in the form of a constant shift to the velocity, the filtering and smoothing distributions only recover a partial Fourier expansion, a phenomenon related to aliasing. On the other hand, for a class of wave velocity model errors which are time dependent, it is possible to recover the filtering distribution exactly, but not the smoothing distribution. Numerical results are presented which corroborate the theory, and also propose a computational approach which overcomes the inconsistency in the presence of model error, by relaxing the model.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/hq80a-n2420Hybrid Monte Carlo on Hilbert spaces
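For orientation, the exact linear-Gaussian analysis step that underlies the Kalman filter discussed in this abstract can be sketched as follows (a minimal NumPy illustration; the function and variable names are ours, not the paper's):

```python
import numpy as np

def kalman_update(m, C, H, R, y):
    """Condition the Gaussian prior N(m, C) on an observation y = H x + eta, eta ~ N(0, R)."""
    S = H @ C @ H.T + R                     # innovation covariance
    K = C @ H.T @ np.linalg.inv(S)          # Kalman gain
    m_post = m + K @ (y - H @ m)            # posterior mean
    C_post = (np.eye(len(m)) - K @ H) @ C   # posterior covariance
    return m_post, C_post
```

For example, a scalar prior N(0, 1) observed directly with unit observation noise and datum y = 2 yields the posterior N(1, 0.5).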
https://resolver.caltech.edu/CaltechAUTHORS:20160804-150437807
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'A.'}}, {'id': 'Pinski-F-J', 'name': {'family': 'Pinski', 'given': 'F. J.'}}, {'id': 'Sanz-Serna-J-M', 'name': {'family': 'Sanz-Serna', 'given': 'J. M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2011
DOI: 10.1016/j.spa.2011.06.003
The Hybrid Monte Carlo (HMC) algorithm provides a framework for sampling from complex, high-dimensional target distributions. In contrast with standard Markov chain Monte Carlo (MCMC) algorithms, it generates nonlocal, nonsymmetric moves in the state space, alleviating random walk type behaviour for the simulated trajectories. However, similarly to algorithms based on random walk or Langevin proposals, the number of steps required to explore the target distribution typically grows with the dimension of the state space. We define a generalized HMC algorithm which overcomes this problem for target measures arising as finite-dimensional approximations of measures π which have density with respect to a Gaussian measure on an infinite-dimensional Hilbert space. The key idea is to construct an MCMC method which is well defined on the Hilbert space itself.
We successively address the following issues in the infinite-dimensional setting of a Hilbert space: (i) construction of a probability measure Π in an enlarged phase space having the target π as a marginal, together with a Hamiltonian flow that preserves Π; (ii) development of a suitable geometric numerical integrator for the Hamiltonian flow; and (iii) derivation of an accept/reject rule to ensure preservation of Π when using the above numerical integrator instead of the actual Hamiltonian flow. Experiments are reported that compare the new algorithm with standard HMC and with a version of the Langevin MCMC method defined on a Hilbert space.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/kbprr-5m424Uncertainty Quantification and Weak Approximation of an Elliptic Inverse Problem
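For context, a standard finite-dimensional HMC step (leapfrog integration of the Hamiltonian flow followed by an accept/reject correction) can be sketched as below. This is the classical algorithm that the paper generalizes to Hilbert space, not the Hilbert-space variant itself, and all names are illustrative:

```python
import numpy as np

def hmc_step(q, log_pi, grad_log_pi, eps, n_leap, rng):
    """One standard HMC step: leapfrog integration, then Metropolis accept/reject."""
    p = rng.standard_normal(q.shape)                 # fresh Gaussian momentum
    q_new, p_new = q.copy(), p.copy()
    p_new = p_new + 0.5 * eps * grad_log_pi(q_new)   # half step for momentum
    for step in range(n_leap):
        q_new = q_new + eps * p_new                  # full step for position
        if step < n_leap - 1:
            p_new = p_new + eps * grad_log_pi(q_new)
    p_new = p_new + 0.5 * eps * grad_log_pi(q_new)   # final half step
    h_cur = -log_pi(q) + 0.5 * float(p @ p)          # Hamiltonian at current state
    h_prop = -log_pi(q_new) + 0.5 * float(p_new @ p_new)
    return q_new if np.log(rng.uniform()) < h_cur - h_prop else q
```

The accept/reject rule corrects for the leapfrog integrator's error, so the target density is preserved exactly; item (iii) of the abstract establishes the analogous rule on Hilbert space.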
https://resolver.caltech.edu/CaltechAUTHORS:20160728-160946614
Authors: {'items': [{'id': 'Dashti-M', 'name': {'family': 'Dashti', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2011
DOI: 10.1137/100814664
We consider the inverse problem of determining the permeability from the pressure in a Darcy model of flow in a porous medium. Mathematically the problem is to find the diffusion coefficient for a linear uniformly elliptic partial differential equation in divergence form, in a bounded domain in dimension d ≤ 3, from measurements of the solution in the interior. We adopt a Bayesian approach to the problem. We place a prior random field measure on the log permeability, specified through the Karhunen–Loève expansion of its draws. We consider Gaussian measures constructed this way, and study the regularity of functions drawn from them. We also study the Lipschitz properties of the observation operator mapping the log permeability to the observations. Combining these regularity and continuity estimates, we show that the posterior measure is well defined on a suitable Banach space. Furthermore the posterior measure is shown to be Lipschitz with respect to the data in the Hellinger metric, giving rise to a form of well posedness of the inverse problem. Determining the posterior measure, given the data, solves the problem of uncertainty quantification for this inverse problem. In practice the posterior measure must be approximated in a finite dimensional space. We quantify the errors incurred by employing a truncated Karhunen–Loève expansion to represent this measure. In particular we study weak convergence of a general class of locally Lipschitz functions of the log permeability, and apply this general theory to estimate errors in the posterior mean of the pressure and the pressure covariance, under refinement of the finite-dimensional Karhunen–Loève truncation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/2ypvx-7z310Parameter estimation for multiscale diffusions: an overview
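A truncated Karhunen–Loève expansion of the kind used above to specify the prior can be illustrated in one dimension as follows (a sketch with assumed sine eigenfunctions and algebraic eigenvalue decay; the paper's construction is more general, and the names here are ours):

```python
import numpy as np

def kl_draw(n_modes, n_grid, alpha, rng):
    """Draw a random field on [0, 1] from a truncated Karhunen-Loeve expansion:
    u(x) = sum_j sqrt(lambda_j) * xi_j * phi_j(x), with xi_j i.i.d. N(0, 1)."""
    x = np.linspace(0.0, 1.0, n_grid)
    xi = rng.standard_normal(n_modes)                  # random KL coefficients
    u = np.zeros(n_grid)
    for j in range(1, n_modes + 1):
        lam = j ** (-alpha)                            # assumed algebraic eigenvalue decay
        phi = np.sqrt(2.0) * np.sin(j * np.pi * x)     # assumed sine eigenfunctions
        u += np.sqrt(lam) * xi[j - 1] * phi
    return x, u
```

Truncating at n_modes is exactly the finite-dimensional approximation whose error the abstract quantifies; faster eigenvalue decay (larger alpha) gives smoother draws and smaller truncation error.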
https://resolver.caltech.edu/CaltechAUTHORS:20161111-105740878
Authors: {'items': [{'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'Grigorios A.'}}, {'id': 'Pokern-Yvo', 'name': {'family': 'Pokern', 'given': 'Yvo'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2012
DOI: 10.1201/b12126-8
There are many applications where it is desirable to fit reduced stochastic descriptions (e.g. SDEs) to data. These include molecular dynamics (Schlick (2000), Frenkel and Smit (2002)), atmosphere/ocean science (Majda and Kramer (1999)), cellular biology (Alberts et al. (2002)) and econometrics (Dacorogna, Gençay, Müller, Olsen, and Pictet (2001)). The data arising in these problems often has a multiscale character and may not be compatible with the desired diffusion at small scales (see Givon, Kupferman, and Stuart (2004), Majda, Timofeyev, and Vanden-Eijnden (1999), Kepler and Elston (2001), Zhang, Mykland, and Aït-Sahalia (2005) and Olhede, Sykulski, and Pavliotis (2009)). The question then arises as to how to optimally employ such data to find a useful diffusion approximation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/j3zq6-wwb24Variational data assimilation using targetted random walks
https://resolver.caltech.edu/CaltechAUTHORS:20160728-160352524
Authors: {'items': [{'id': 'Cotter-S-L', 'name': {'family': 'Cotter', 'given': 'S. L.'}}, {'id': 'Dashti-M', 'name': {'family': 'Dashti', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2012
DOI: 10.1002/fld.2510
The variational approach to data assimilation is a widely used methodology for both online prediction and reanalysis. In either of these scenarios, it can be important to assess uncertainties in the assimilated state. Ideally, it is desirable to have complete information concerning the Bayesian posterior distribution for the unknown state given data. We show that complete computational probing of this posterior distribution is now within reach in the offline situation. We introduce a Markov chain Monte Carlo (MCMC) method which enables us to directly sample from the Bayesian posterior distribution on the unknown functions of interest given observations. Since these methods are currently too computationally expensive to use in an online filtering scenario, we frame this in the context of offline reanalysis. Using a simple random walk-type MCMC method, we are able to characterize the posterior distribution using only evaluations of the forward model of the problem, and of the model and data mismatch. No adjoint model is required for the method we use; however, more sophisticated MCMC methods are available which exploit derivative information. For simplicity of exposition, we consider the problem of assimilating data, either Eulerian or Lagrangian, into a low Reynolds number flow in a two-dimensional periodic geometry. We show that in many cases it is possible to recover the initial condition and model error (which we describe as unknown forcing to the model) from data, and that with increasing amounts of informative data, the uncertainty in our estimates is reduced.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/9nb6e-w7y37Γ-Limit for Transition Paths of Maximal Probability
https://resolver.caltech.edu/CaltechAUTHORS:20160728-155454420
Authors: {'items': [{'id': 'Pinski-F-J', 'name': {'family': 'Pinski', 'given': 'F. J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Theil-F', 'name': {'family': 'Theil', 'given': 'F.'}}]}
Year: 2012
DOI: 10.1007/s10955-012-0443-8
Chemical reactions can be modeled via diffusion processes conditioned to make a transition between specified molecular configurations representing the state of the system before and after the chemical reaction. In particular the model of Brownian dynamics—gradient flow subject to additive noise—is frequently used. If the chemical reaction is specified to take place on a given time interval, then the most likely path taken by the system is a minimizer of the Onsager-Machlup functional. The Γ-limit of this functional is determined explicitly in the case where the temperature is small and the transition time scales as the inverse temperature.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/bvzrt-2r877Sparse deterministic approximation of Bayesian inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20160728-155039916
Authors: {'items': [{'id': 'Schwab-C', 'name': {'family': 'Schwab', 'given': 'C.'}, 'orcid': '0000-0002-4046-987X'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2012
DOI: 10.1088/0266-5611/28/4/045003
We present a parametric deterministic formulation of Bayesian inverse problems with an input parameter from infinite-dimensional, separable Banach spaces. In this formulation, the forward problems are parametric, deterministic elliptic partial differential equations, and the inverse problem is to determine the unknown, parametric deterministic coefficients from noisy observations comprising linear functionals of the solution. We prove a generalized polynomial chaos representation of the posterior density with respect to the prior measure, given noisy observational data. We analyze the sparsity of the posterior density in terms of the summability of the input data's coefficient sequence. The first step in this process is to estimate the fluctuations in the prior. We exhibit sufficient conditions on the prior model in order for approximations of the posterior density to converge at a given algebraic rate, in terms of the number N of unknowns appearing in the parametric representation of the prior measure. Similar sparsity and approximation results are also exhibited for the solution and covariance of the elliptic partial differential equation under the posterior. These results then form the basis for efficient uncertainty quantification, in the presence of data with noise.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/cgv9x-q7443Besov priors for Bayesian inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20160728-153255881
Authors: {'items': [{'id': 'Dashti-M', 'name': {'family': 'Dashti', 'given': 'Masoumeh'}}, {'id': 'Harris-S-J', 'name': {'family': 'Harris', 'given': 'Stephen J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2012
DOI: 10.3934/ipi.2012.6.183
We consider the inverse problem of estimating a function u from noisy, possibly nonlinear, observations. We adopt a Bayesian approach to the problem. This approach has a long history for inversion, dating back to 1970, and has, over the last decade, gained importance as a practical tool. However most of the existing theory has been developed for Gaussian prior measures. Recently Lassas, Saksman and Siltanen (Inv. Prob. Imag. 2009) showed how to construct Besov prior measures, based on wavelet expansions with random coefficients, and used these prior measures to study linear inverse problems. In this paper we build on this development of Besov priors to include the case of nonlinear measurements. In doing so a key technical tool, established here, is a Fernique-like theorem for Besov measures. This theorem enables us to identify appropriate conditions on the forward solution operator which, when matched to properties of the prior Besov measure, imply the well-definedness and well-posedness of the posterior measure. We then consider the application of these results to the inverse problem of finding the diffusion coefficient of an elliptic partial differential equation, given noisy measurements of its solution.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/kc225-txc26Diffusion limits of the random walk Metropolis algorithm in high dimensions
https://resolver.caltech.edu/CaltechAUTHORS:20160728-154635836
Authors: {'items': [{'id': 'Mattingly-J-C', 'name': {'family': 'Mattingly', 'given': 'Jonathan C.'}}, {'id': 'Pillai-N-S', 'name': {'family': 'Pillai', 'given': 'Natesh S.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2012
DOI: 10.1214/10-AAP754
Diffusion limits of MCMC methods in high dimensions provide a useful theoretical tool for studying computational complexity. In particular, they lead directly to precise estimates of the number of steps required to explore the target measure, in stationarity, as a function of the dimension of the state space. However, to date such results have mainly been proved for target measures with a product structure, severely limiting their applicability. The purpose of this paper is to study diffusion limits for a class of naturally occurring high-dimensional measures found from the approximation of measures on a Hilbert space which are absolutely continuous with respect to a Gaussian reference measure. The diffusion limit of a random walk Metropolis algorithm to an infinite-dimensional Hilbert space valued SDE (or SPDE) is proved, facilitating understanding of the computational complexity of the algorithm.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/6tg0z-6jv09Nonparametric estimation of diffusions: a differential equations approach
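In its finite-dimensional form, the random walk Metropolis algorithm analyzed here is only a few lines; the sketch below is a generic illustration (our own naming) rather than the paper's Hilbert-space construction:

```python
import numpy as np

def rwm_chain(log_pi, x0, step, n, rng):
    """Random walk Metropolis: isotropic Gaussian proposals of size `step`."""
    x = np.array(x0, dtype=float)
    lp = log_pi(x)
    out = np.empty((n, x.size))
    for i in range(n):
        prop = x + step * rng.standard_normal(x.size)
        lp_prop = log_pi(prop)
        if np.log(rng.uniform()) < lp_prop - lp:     # Metropolis accept/reject
            x, lp = prop, lp_prop
        out[i] = x
    return out
```

The diffusion limit in the paper describes the behaviour of exactly this chain, suitably interpolated and rescaled, as the dimension of x grows.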
https://resolver.caltech.edu/CaltechAUTHORS:20160728-152807075
Authors: {'items': [{'id': 'Papaspiliopoulos-O', 'name': {'family': 'Papaspiliopoulos', 'given': 'Omiros'}}, {'id': 'Pokern-Yvo', 'name': {'family': 'Pokern', 'given': 'Yvo'}}, {'id': 'Roberts-G-O', 'name': {'family': 'Roberts', 'given': 'Gareth O.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2012
DOI: 10.1093/biomet/ass034
We consider estimation of scalar functions that determine the dynamics of diffusion processes. It has been recently shown that nonparametric maximum likelihood estimation is ill-posed in this context. We adopt a probabilistic approach to regularize the problem by the adoption of a prior distribution for the unknown functional. A Gaussian prior measure is chosen in the function space by specifying its precision operator as an appropriate differential operator. We establish that a Bayesian–Gaussian conjugate analysis for the drift of one-dimensional nonlinear diffusions is feasible using high-frequency data, by expressing the loglikelihood as a quadratic function of the drift, with sufficient statistics given by the local time process and the end points of the observed path. Computationally efficient posterior inference is carried out using a finite element method. We embed this technology in partially observed situations and adopt a data augmentation approach whereby we iteratively generate missing data paths and draws from the unknown functional. Our methodology is applied to estimate the drift of models used in molecular dynamics and financial econometrics using high- and low-frequency observations. We discuss extensions to other partially observed schemes and connections to other types of nonparametric inference.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/2y8qv-w2d18Evaluating Data Assimilation Algorithms
https://resolver.caltech.edu/CaltechAUTHORS:20160728-150615482
Authors: {'items': [{'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'K. J. H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2012
DOI: 10.1175/MWR-D-11-00257.1
Data assimilation leads naturally to a Bayesian formulation in which the posterior probability distribution of the system state, given all the observations on a time window of interest, plays a central conceptual role. The aim of this paper is to use this Bayesian posterior probability distribution as a gold standard against which to evaluate various commonly used data assimilation algorithms.
A key aspect of geophysical data assimilation is the high dimensionality and limited predictability of the computational model. This paper examines the two-dimensional Navier–Stokes equations in a periodic geometry, which has these features and yet is tractable for explicit and accurate computation of the posterior distribution by state-of-the-art statistical sampling techniques. The commonly used algorithms that are evaluated, as quantified by the relative error in reproducing moments of the posterior, are four-dimensional variational data assimilation (4DVAR) and a variety of sequential filtering approximations based on three-dimensional variational data assimilation (3DVAR) and on extended and ensemble Kalman filters.
The primary conclusions are that, under the assumption of a well-defined posterior probability distribution, (i) with appropriate parameter choices, approximate filters can perform well in reproducing the mean of the desired probability distribution, (ii) they do not perform as well in reproducing the covariance, and (iii) the error is compounded by the need to modify the covariance, in order to induce stability. Thus, filters can be a useful tool in predicting mean behavior but should be viewed with caution as predictors of uncertainty. These conclusions are intrinsic to the algorithms when assumptions underlying them are not valid and will not change if the model complexity is increased.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/km7gz-z3586Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions
https://resolver.caltech.edu/CaltechAUTHORS:20160728-150141693
Authors: {'items': [{'id': 'Pillai-N-S', 'name': {'family': 'Pillai', 'given': 'Natesh S.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Thiéry-A-H', 'name': {'family': 'Thiéry', 'given': 'Alexandre H.'}}]}
Year: 2012
DOI: 10.1214/11-AAP828
The Metropolis-adjusted Langevin (MALA) algorithm is a sampling algorithm which makes local moves by incorporating information about the gradient of the logarithm of the target density. In this paper we study the efficiency of MALA on a natural class of target measures supported on an infinite dimensional Hilbert space. These natural measures have density with respect to a Gaussian random field measure and arise in many applications such as Bayesian nonparametric statistics and the theory of conditioned diffusions. We prove that, started in stationarity, a suitably interpolated and scaled version of the Markov chain corresponding to MALA converges to an infinite dimensional diffusion process. Our results imply that, in stationarity, the MALA algorithm applied to an N-dimensional approximation of the target will take O(N^(1/3)) steps to explore the invariant measure, comparing favorably with the Random Walk Metropolis which was recently shown to require O(N) steps when applied to the same class of problems. As a by-product of the diffusion limit, it also follows that the MALA algorithm is optimized at an average acceptance probability of 0.574. Previous results were proved only for targets which are products of one-dimensional distributions, or for variants of this situation, limiting their applicability. The correlation in our target means that the rescaled MALA algorithm converges weakly to an infinite dimensional Hilbert space valued diffusion, and the limit cannot be described through analysis of scalar diffusions. The limit theorem is proved by showing that a drift-martingale decomposition of the Markov chain, suitably scaled, closely resembles a weak Euler–Maruyama discretization of the putative limit. 
An invariance principle is proved for the martingale, and a continuous mapping argument is used to complete the proof.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/qa1at-v7t47Posterior consistency via precision operators for Bayesian nonparametric drift estimation in SDEs
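A single finite-dimensional MALA step, combining a discretized Langevin proposal with a Metropolis-Hastings correction, can be sketched as follows (illustrative names and a generic target; the paper studies the N-dimensional approximation of a Hilbert-space measure):

```python
import numpy as np

def mala_step(x, log_pi, grad_log_pi, h, rng):
    """One MALA step: Euler discretization of Langevin dynamics as the proposal,
    corrected by a Metropolis-Hastings accept/reject."""
    mean_fwd = x + 0.5 * h * grad_log_pi(x)
    prop = mean_fwd + np.sqrt(h) * rng.standard_normal(x.shape)
    mean_bwd = prop + 0.5 * h * grad_log_pi(prop)
    log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2.0 * h)   # forward proposal density
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2.0 * h)      # reverse proposal density
    log_alpha = log_pi(prop) - log_pi(x) + log_q_bwd - log_q_fwd
    return prop if np.log(rng.uniform()) < log_alpha else x
```

The scaling result of the abstract concerns how the step size h must shrink with dimension N for this chain to retain a nontrivial acceptance probability, yielding the O(N^(1/3)) complexity and the optimal acceptance rate 0.574.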
https://resolver.caltech.edu/CaltechAUTHORS:20160727-175235615
Authors: {'items': [{'id': 'Pokern-Yvo', 'name': {'family': 'Pokern', 'given': 'Y.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'van-Zanten-J-H', 'name': {'family': 'van Zanten', 'given': 'J. H.'}}]}
Year: 2013
DOI: 10.1016/j.spa.2012.08.010
We study a Bayesian approach to nonparametric estimation of the periodic drift function of a one-dimensional diffusion from continuous-time data. Rewriting the likelihood in terms of local time of the process, and specifying a Gaussian prior with precision operator of differential form, we show that the posterior is also Gaussian, with precision operator also of differential form. The resulting expressions are explicit and lead to algorithms which are readily implementable. Using new functional limit theorems for the local time of diffusions on the circle, we bound the rate at which the posterior contracts around the true drift function.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/vygge-1st17Accuracy and stability of filters for dissipative PDEs
https://resolver.caltech.edu/CaltechAUTHORS:20160727-180601953
Authors: {'items': [{'id': 'Brett-C-E-A', 'name': {'family': 'Brett', 'given': 'C. E. A.'}}, {'id': 'Lam-K-F', 'name': {'family': 'Lam', 'given': 'K. F.'}}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'K. J. H.'}}, {'id': 'McCormick-D-S', 'name': {'family': 'McCormick', 'given': 'D. S.'}}, {'id': 'Scott-M-R', 'name': {'family': 'Scott', 'given': 'M. R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2013
DOI: 10.1016/j.physd.2012.11.005
Data assimilation methodologies are designed to incorporate noisy observations of a physical system into an underlying model in order to infer the properties of the state of the system. Filters refer to a class of data assimilation algorithms designed to update the estimation of the state in an on-line fashion, as data is acquired sequentially. For linear problems subject to Gaussian noise, filtering can be performed exactly using the Kalman filter. For nonlinear systems filtering can be approximated in a systematic way by particle filters. However in high dimensions these particle filtering methods can break down. Hence, for the large nonlinear systems arising in applications such as oceanography and weather forecasting, various ad hoc filters are used, mostly based on making Gaussian approximations. The purpose of this work is to study the accuracy and stability properties of these ad hoc filters. We work in the context of the 2D incompressible Navier–Stokes equation, although the ideas readily generalize to a range of dissipative partial differential equations (PDEs). By working in this infinite dimensional setting we provide an analysis which is useful for the understanding of high dimensional filtering, and is robust to mesh-refinement. We describe theoretical results showing that, in the small observational noise limit, the filters can be tuned to perform accurately in tracking the signal itself (filter accuracy), provided the system is observed in a sufficiently large low dimensional space; roughly speaking this space should be large enough to contain the unstable modes of the linearized dynamics. The tuning corresponds to what is known as variance inflation in the applied literature. Numerical results are given which illustrate the theory. 
The positive results herein concerning filter stability complement recent numerical studies which demonstrate that the ad hoc filters can perform poorly in reproducing statistical variation about the true signal.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/598h1-ne145Ensemble Kalman methods for inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20160727-180147046
Authors: {'items': [{'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'Kody J. H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2013
DOI: 10.1088/0266-5611/29/4/045001
The ensemble Kalman filter (EnKF) was introduced by Evensen in 1994 (Evensen 1994 J. Geophys. Res. 99 10143–62) as a novel method for data assimilation: state estimation for noisily observed time-dependent problems. Since that time it has had enormous impact in many application domains because of its robustness and ease of implementation, and numerical evidence of its accuracy. In this paper we propose the application of an iterative ensemble Kalman method for the solution of a wide class of inverse problems. In this context we show that the estimate of the unknown function that we obtain with the ensemble Kalman method lies in a subspace A spanned by the initial ensemble. Hence the resulting error may be bounded above by the error found from the best approximation in this subspace. We provide numerical experiments which compare the error incurred by the ensemble Kalman method for inverse problems with the error of the best approximation in A, and with variants on traditional least-squares approaches, restricted to the subspace A. In so doing we demonstrate that the ensemble Kalman method for inverse problems provides a derivative-free optimization method with comparable accuracy to that achieved by traditional least-squares approaches. Furthermore, we also demonstrate that the accuracy is of the same order of magnitude as that achieved by the best approximation. Three examples are used to demonstrate these assertions: inversion of a compact linear operator; inversion of piezometric head to determine hydraulic conductivity in a Darcy model of groundwater flow; and inversion of Eulerian velocity measurements at positive times to determine the initial condition in an incompressible fluid.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/8rzck-1rz12Evaluation of Gaussian approximations for data assimilation in reservoir models
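One iteration of an ensemble Kalman method for inverse problems, in perturbed-observation form, can be sketched as below (a schematic implementation under our own naming; the paper's iterative scheme may differ in detail):

```python
import numpy as np

def eki_step(U, G, y, Gamma, rng):
    """One perturbed-observation ensemble Kalman iteration for y = G(u) + noise.
    U has shape (J, d); each updated member stays in the span of the ensemble."""
    J = U.shape[0]
    GU = np.array([G(u) for u in U])              # forward map on each member, (J, k)
    Ubar, Gbar = U.mean(axis=0), GU.mean(axis=0)
    Cug = (U - Ubar).T @ (GU - Gbar) / J          # parameter-data cross-covariance
    Cgg = (GU - Gbar).T @ (GU - Gbar) / J         # data covariance
    K = Cug @ np.linalg.inv(Cgg + Gamma)          # Kalman-type gain
    Y = y + rng.multivariate_normal(np.zeros(len(y)), Gamma, size=J)  # perturbed data
    return U + (Y - GU) @ K.T
```

The update requires only evaluations of G, never its derivative, which is the sense in which the abstract describes the method as derivative-free optimization; the subspace property (estimates lie in the span of the initial ensemble) is visible in the form of the gain.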
https://resolver.caltech.edu/CaltechAUTHORS:20160727-153428298
Authors: {'items': [{'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'Kody'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2013
DOI: 10.1007/s10596-013-9359-x
The Bayesian framework is the standard approach for data assimilation in reservoir modeling. This framework involves characterizing the posterior distribution of geological parameters in terms of a given prior distribution and data from the reservoir dynamics, together with a forward model connecting the space of geological parameters to the data space. Since the posterior distribution quantifies the uncertainty in the geologic parameters of the reservoir, the characterization of the posterior is fundamental for the optimal management of reservoirs. Unfortunately, due to the large-scale highly nonlinear properties of standard reservoir models, characterizing the posterior is computationally prohibitive. Instead, more affordable ad hoc techniques, based on Gaussian approximations, are often used for characterizing the posterior distribution. Evaluating the performance of those Gaussian approximations is typically conducted by assessing their ability to reproduce the truth within the confidence interval provided by the ad hoc technique under consideration. This has the disadvantage of mixing up the approximation properties of the history matching algorithm employed with the information content of the particular observations used, making it hard to evaluate the effect of the ad hoc approximations alone. In this paper, we avoid this disadvantage by comparing the ad hoc techniques with a fully resolved state-of-the-art probing of the Bayesian posterior distribution. The ad hoc techniques whose performance we assess are based on (1) linearization around the maximum a posteriori estimate, (2) randomized maximum likelihood, and (3) ensemble Kalman filter-type methods. In order to fully resolve the posterior distribution, we implement a state-of-the-art Markov chain Monte Carlo (MCMC) method that scales well with respect to the dimension of the parameter space, enabling us to study realistic forward models, in two space dimensions, at a high level of grid refinement.
Our implementation of the MCMC method provides the gold standard against which the aforementioned Gaussian approximations are assessed. We present numerical synthetic experiments where we quantify the capability of each of the ad hoc Gaussian approximations in reproducing the mean and the variance of the posterior distribution (characterized via MCMC) associated with a data assimilation problem. Both single-phase and two-phase (oil–water) reservoir models are considered so that fundamental differences in the resulting forward operators are highlighted. The main objective of our controlled experiments was to exhibit the substantial discrepancies in the approximation properties of standard ad hoc Gaussian approximations. Numerical investigations of the type we present here will lead to a greater understanding of the cost-efficient, but ad hoc, Bayesian techniques used for data assimilation in petroleum reservoirs and hence ultimately to improved techniques with more accurate uncertainty quantification.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/xf4ex-ef595Accuracy and stability of the continuous-time 3DVAR filter for the Navier–Stokes equation
https://resolver.caltech.edu/CaltechAUTHORS:20160726-141854847
Authors: {'items': [{'id': 'Blömker-D', 'name': {'family': 'Blömker', 'given': 'D.'}}, {'id': 'Law-K', 'name': {'family': 'Law', 'given': 'K.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Zygalakis-K-C', 'name': {'family': 'Zygalakis', 'given': 'K. C.'}}]}
Year: 2013
DOI: 10.1088/0951-7715/26/8/2193
The 3DVAR filter is prototypical of methods used to combine observed data with a dynamical system, online, in order to improve estimation of the state of the system. Such methods are used for high dimensional data assimilation problems, such as those arising in weather forecasting. To gain understanding of filters in applications such as these, it is hence of interest to study their behaviour when applied to infinite dimensional dynamical systems. This motivates the study of the problem of accuracy and stability of 3DVAR filters for the Navier–Stokes equation.
We work in the limit of high frequency observations and derive continuous time filters. This leads to a stochastic partial differential equation (SPDE) for state estimation, in the form of a damped-driven Navier–Stokes equation, with mean-reversion to the signal, and spatially-correlated time-white noise. Both forward and pullback accuracy and stability results are proved for this SPDE, showing in particular that when enough low Fourier modes are observed, and when the model uncertainty is larger than the data uncertainty in these modes (variance inflation), then the filter can lock on to a small neighbourhood of the true signal, recovering from order one initial error, if the error in the observed modes is small. Numerical examples are given to illustrate the theory.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ft4dq-e9d51Complexity analysis of accelerated MCMC methods for Bayesian inversion
https://resolver.caltech.edu/CaltechAUTHORS:20160727-163339156
Authors: {'items': [{'id': 'Hoang-Viet-Ha', 'name': {'family': 'Hoang', 'given': 'Viet Ha'}}, {'id': 'Schwab-C', 'name': {'family': 'Schwab', 'given': 'Christoph'}, 'orcid': '0000-0002-4046-987X'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2013
DOI: 10.1088/0266-5611/29/8/085010
The Bayesian approach to inverse problems, in which the posterior probability distribution on an unknown field is sampled for the purposes of computing posterior expectations of quantities of interest, is starting to become computationally feasible for partial differential equation (PDE) inverse problems. Balancing the sources of error arising from finite-dimensional approximation of the unknown field, the PDE forward solution map and the sampling of the probability space under the posterior distribution is essential for the design of efficient computational Bayesian methods for PDE inverse problems. We study Bayesian inversion for a model elliptic PDE with an unknown diffusion coefficient. We provide complexity analyses of several Markov chain Monte Carlo (MCMC) methods for the efficient numerical evaluation of expectations under the Bayesian posterior distribution, given data δ. Particular attention is given to bounds on the overall work required to achieve a prescribed error level ε. Specifically, we first bound the computational complexity of 'plain' MCMC, based on combining MCMC sampling with linear complexity multi-level solvers for elliptic PDE. Our (new) work versus accuracy bounds show that the complexity of this approach can be quite prohibitive. Two strategies for reducing the computational complexity are then proposed and analyzed: first, a sparse, parametric and deterministic generalized polynomial chaos (gpc) 'surrogate' representation of the forward response map of the PDE over the entire parameter space, and, second, a novel multi-level Markov chain Monte Carlo strategy which utilizes sampling from a multi-level discretization of the posterior and the forward PDE. For both of these strategies, we derive asymptotic bounds on work versus accuracy, and hence asymptotic bounds on the computational complexity of the algorithms.
In particular, we provide sufficient conditions on the regularity of the unknown coefficients of the PDE and on the approximation methods used, in order for the accelerations of MCMC resulting from these strategies to lead to complexity reductions over 'plain' MCMC algorithms for the Bayesian inversion of PDEs.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/dj9k7-v6d54MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster
https://resolver.caltech.edu/CaltechAUTHORS:20160727-155941152
Authors: {'items': [{'id': 'Cotter-S-L', 'name': {'family': 'Cotter', 'given': 'S. L.'}}, {'id': 'Roberts-G-O', 'name': {'family': 'Roberts', 'given': 'G. O.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'White-D', 'name': {'family': 'White', 'given': 'D.'}}]}
Year: 2013
DOI: 10.1214/13-STS421
Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration.
The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/9kcwv-y7e77MAP estimators and their consistency in Bayesian nonparametric inverse problems
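The design principle above — proposals that exactly preserve the Gaussian reference measure — is exemplified by the preconditioned Crank–Nicolson (pCN) proposal. Below is a minimal finite-dimensional sketch (not code from the paper): the reference measure is taken as N(0, I) on ℝ^d, the target has density exp(−φ(u)) with respect to it, and the quadratic φ used here is an invented example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical negative log-likelihood; target is exp(-phi) times N(0, I).
def phi(u):
    return 0.5 * np.sum((u - 1.0) ** 2)

d, beta, n_steps = 10, 0.3, 20000
u = np.zeros(d)
samples = np.empty((n_steps, d))
for n in range(n_steps):
    # pCN proposal: preserves the N(0, I) reference measure exactly, so the
    # accept probability involves only phi, not the Gaussian part of the target.
    v = np.sqrt(1.0 - beta**2) * u + beta * rng.standard_normal(d)
    if np.log(rng.uniform()) < phi(u) - phi(v):
        u = v
    samples[n] = u

posterior_mean = samples[n_steps // 2:].mean(axis=0)   # discard burn-in
```

Because the acceptance rule never involves the Gaussian reference density, the same loop is well defined on function space, which is the source of the mesh-refinement robustness described in the abstract. (For this toy φ the posterior is N(1/2, 1/2 I) coordinatewise.)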
https://resolver.caltech.edu/CaltechAUTHORS:20160727-154930631
Authors: {'items': [{'id': 'Dashti-M', 'name': {'family': 'Dashti', 'given': 'M.'}}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'K. J. H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'J.'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2013
DOI: 10.1088/0266-5611/29/9/095017
We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map G applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μ^y. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of G(u) can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/3qtw6-tbj43Posterior contraction rates for the Bayesian approach to linear ill-posed inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20160727-163931558
Authors: {'items': [{'id': 'Agapiou-S', 'name': {'family': 'Agapiou', 'given': 'Sergios'}}, {'id': 'Larsson-S', 'name': {'family': 'Larsson', 'given': 'Stig'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2013
DOI: 10.1016/j.spa.2013.05.001
We consider a Bayesian nonparametric approach to a family of linear inverse problems in a separable Hilbert space setting with Gaussian noise. We assume Gaussian priors, which are conjugate to the model, and present a method of identifying the posterior using its precision operator. Working with the unbounded precision operator enables us to use partial differential equation (PDE) methodology to obtain rates of contraction of the posterior distribution to a Dirac measure centered on the true solution. Our methods assume a relatively weak relation between the prior covariance, noise covariance and forward operator, allowing for a wide range of applications.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/gkpcj-56y97Optimal tuning of the hybrid Monte Carlo algorithm
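The conjugacy exploited above can be made concrete in a finite-dimensional analogue (a sketch, not the paper's Hilbert-space setting): for y = Au + η with η ~ N(0, γ²I) and prior u ~ N(0, τ²I), the posterior is Gaussian with precision AᵀA/γ² + I/τ². All matrices and parameters below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical finite-dimensional analogue of the conjugate Gaussian setting.
d = 4
A = rng.standard_normal((d, d))   # stand-in for the forward operator
gamma, tau = 0.1, 1.0             # noise and prior scales
u_true = rng.standard_normal(d)
y = A @ u_true + gamma * rng.standard_normal(d)

# Gaussian conjugacy:
#   C_post^{-1} = A^T A / gamma^2 + I / tau^2
#   m_post      = C_post A^T y / gamma^2
prec = A.T @ A / gamma**2 + np.eye(d) / tau**2
C_post = np.linalg.inv(prec)
m_post = C_post @ (A.T @ y) / gamma**2
```

The posterior mean `m_post` is also the minimizer of the Tikhonov functional ‖Au − y‖²/(2γ²) + ‖u‖²/(2τ²), so its gradient there vanishes; the contraction-rate analysis in the abstract studies how such posteriors concentrate as γ → 0.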
https://resolver.caltech.edu/CaltechAUTHORS:20160726-155502558
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Pillai-N-S', 'name': {'family': 'Pillai', 'given': 'Natesh'}}, {'id': 'Roberts-G', 'name': {'family': 'Roberts', 'given': 'Gareth'}}, {'id': 'Sanz-Serna-J-M', 'name': {'family': 'Sanz-Serna', 'given': 'Jesus-Maria'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2013
DOI: 10.3150/12-BEJ414
We investigate the properties of the hybrid Monte Carlo algorithm (HMC) in high dimensions. HMC develops a Markov chain reversible with respect to a given target distribution Π using separable Hamiltonian dynamics with potential −log Π. The additional momentum variables are chosen at random from the Boltzmann distribution, and the continuous-time Hamiltonian dynamics are then discretised using the leapfrog scheme. The induced bias is removed via a Metropolis–Hastings accept/reject rule. In the simplified scenario of independent, identically distributed components, we prove that, to obtain an O(1) acceptance probability as the dimension d of the state space tends to ∞, the leapfrog step size h should be scaled as h=l×d^(−1/4). Therefore, in high dimensions, HMC requires O(d^(1/4)) steps to traverse the state space. We also identify analytically the asymptotically optimal acceptance probability, which turns out to be 0.651 (to three decimal places). This value optimally balances the cost of generating a proposal, which decreases as l increases (because fewer steps are required to reach the desired final integration time), against the cost related to the average number of proposals required to obtain acceptance, which increases as l increases.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/vp1h5-2p710Inverse Problems and Uncertainty Quantification
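The mechanism the abstract analyzes — momentum resampling, leapfrog integration of the Hamiltonian dynamics, and a Metropolis–Hastings correction of the discretisation bias — can be sketched on a one-dimensional standard Gaussian target (an invented toy case, not the paper's i.i.d.-product setting; step size and trajectory length below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy target: -log pi(q) = q^2 / 2, so grad U(q) = q.
def grad_U(q):
    return q

def leapfrog(q, p, h, L):
    # Leapfrog integration of the separable Hamiltonian H = U(q) + p^2/2.
    p = p - 0.5 * h * grad_U(q)
    for _ in range(L - 1):
        q = q + h * p
        p = p - h * grad_U(q)
    q = q + h * p
    p = p - 0.5 * h * grad_U(q)
    return q, p

def hmc(n_samples, h=0.4, L=5):
    q, out = 0.0, np.empty(n_samples)
    for i in range(n_samples):
        p = rng.standard_normal()          # momentum from the Boltzmann distribution
        q_new, p_new = leapfrog(q, p, h, L)
        # Metropolis-Hastings accept/reject removes the leapfrog bias.
        dH = (q_new**2 + p_new**2) / 2 - (q**2 + p**2) / 2
        if np.log(rng.uniform()) < -dH:
            q = q_new
        out[i] = q
    return out

draws = hmc(20000)
```

The paper's scaling result concerns how h must shrink with dimension (h = l·d^(−1/4)) to keep the energy error dH, and hence the acceptance probability, of order one.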
https://resolver.caltech.edu/CaltechAUTHORS:20161111-105218524
Authors: {'items': [{'id': 'Iglesias-M', 'name': {'family': 'Iglesias', 'given': 'Marco'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2014
Quantifying uncertainty in the solution of inverse problems is an exciting area of research in the mathematical sciences, one that raises significant challenges at the interfaces between analysis, computation, probability, and statistics. The reach in terms of applicability is enormous, with diverse problems arising in the physical, biological, and social sciences, such as weather prediction, epidemiology, and traffic flow.
Loosely speaking, inverse problems confront mathematical models with data so that we can deduce the inputs needed to run the models; knowledge of these inputs can then be used to make predictions, and even to devise control strategies based on the predictions. Both the models and the data are typically uncertain, as are the resulting deductions and predictions; as a consequence, any decisions or control strategies based on the predictions will be greatly improved if the uncertainty is made quantitative.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/sp5yy-03p83Analysis of the 3DVAR filter for the partially observed Lorenz'63 model
https://resolver.caltech.edu/CaltechAUTHORS:20160719-152032603
Authors: {'items': [{'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'Kody'}}, {'id': 'Shukla-Abishek', 'name': {'family': 'Shukla', 'given': 'Abishek'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2014
DOI: 10.3934/dcds.2014.34.1061
The problem of effectively combining data with a mathematical model constitutes a major challenge in applied mathematics. It is particularly challenging for high-dimensional dynamical systems where data is received sequentially in time and the objective is to estimate the system state in an on-line fashion; this situation arises, for example, in weather forecasting. The sequential particle filter is then impractical and ad hoc filters, which employ some form of Gaussian approximation, are widely used. Prototypical of these ad hoc filters is the 3DVAR method. The goal of this paper is to analyze the 3DVAR method, using the Lorenz '63 model to exemplify the key ideas. The situation where the data is partial and noisy is studied, and both discrete time and continuous time data streams are considered. The theory demonstrates how the widely used technique of variance inflation acts to stabilize the filter, and hence leads to asymptotic accuracy.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/4jgk4-xm142Noisy gradient flow from a random walk in Hilbert space
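The 3DVAR idea — forecast with the model, then nudge toward each incoming observation with a fixed gain — can be sketched in one dimension. Here a chaotic logistic map stands in for the Lorenz '63 dynamics of the paper (purely an invented toy; the gain, noise level, and map are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical scalar surrogate for a chaotic signal: the logistic map at r = 4.
def psi(x):
    return 4.0 * x * (1.0 - x)

n, sigma, K = 2000, 1e-3, 0.9      # steps, observation noise, 3DVAR gain
x = 0.3                            # true (unobserved) signal
m = 0.8                            # filter estimate, starting with O(1) error
err = np.empty(n)
for i in range(n):
    x = psi(x)
    y = x + sigma * rng.standard_normal()   # noisy observation of the new state
    # 3DVAR update: forecast with the model, then correct toward the data.
    m = psi(m) + K * (y - psi(m))
    err[i] = abs(m - x)

final_err = err[-100:].mean()
```

With a large enough gain the data term dominates the unstable model dynamics (here the update is 0.1·ψ(m) + 0.9·y, and 0.1·|ψ′| < 1 on the attractor), so the error contracts to the observation-noise level; this is the scalar analogue of the stabilization-by-variance-inflation mechanism the abstract describes.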
https://resolver.caltech.edu/CaltechAUTHORS:20160719-145056000
Authors: {'items': [{'id': 'Pillai-N-S', 'name': {'family': 'Pillai', 'given': 'Natesh S.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Thiéry-A-H', 'name': {'family': 'Thiéry', 'given': 'Alexandre H.'}}]}
Year: 2014
DOI: 10.1007/s40072-014-0029-3
Consider a probability measure on a Hilbert space defined via its density with respect to a Gaussian. The purpose of this paper is to demonstrate that an appropriately defined Markov chain, which is reversible with respect to the measure in question, exhibits a diffusion limit to a noisy gradient flow, also reversible with respect to the same measure. The Markov chain is defined by applying a Metropolis–Hastings accept–reject mechanism (Tierney, Ann Appl Probab 8:1–9, 1998) to an Ornstein–Uhlenbeck (OU) proposal which is itself reversible with respect to the underlying Gaussian measure. The resulting noisy gradient flow is a stochastic partial differential equation driven by a Wiener process with spatial correlation given by the underlying Gaussian structure. There are two primary motivations for this work. The first concerns insight into Markov chain Monte Carlo (MCMC) methods for sampling of measures on a Hilbert space defined via a density with respect to a Gaussian measure. These measures must be approximated on finite dimensional spaces of dimension N in order to be sampled. A conclusion of the work herein is that MCMC methods based on prior-reversible OU proposals will explore the target measure in O(1) steps with respect to dimension N. This is to be contrasted with standard MCMC methods based on the random walk or Langevin proposals which require O(N) and O(N^(1/3)) steps respectively (Mattingly et al., Ann Appl Prob 2011; Pillai et al., Ann Appl Prob 22:2320–2356, 2012). The second motivation relates to optimization. There are many applications where it is of interest to find global or local minima of a functional defined on an infinite dimensional Hilbert space. Gradient flow or steepest descent is a natural approach to this problem, but in its basic form requires computation of a gradient which, in some applications, may be an expensive or complex task.
This paper shows that a stochastic gradient descent described by a stochastic partial differential equation can emerge from certain carefully specified Markov chains. This idea is well-known in the finite state (Kirkpatrick et al., Science 220:671–680, 1983; Cerny, J Optim Theory Appl 45:41–51, 1985) or finite dimensional context (Geman, IEEE Trans Geosci Remote Sens 1:269–276, 1985; Geman, SIAM J Control Optim 24:1031, 1986; Chiang, SIAM J Control Optim 25:737–753, 1987; J Funct Anal 83:333–347, 1989). The novelty of the work in this paper is that the emergence of the noisy gradient flow is developed on an infinite dimensional Hilbert space. In the context of global optimization, when the noise level is also adjusted as part of the algorithm, methods of the type studied here go by the name of simulated annealing; see the review (Bertsimas and Tsitsiklis, Stat Sci 8:10–15, 1993) for further references. Although we do not consider adjusting the noise-level as part of the algorithm, the noise strength is a tuneable parameter in our construction and the methods developed here could potentially be used to study simulated annealing in a Hilbert space setting. The transferable idea behind this work is that conceiving of algorithms directly in the infinite dimensional setting leads to methods which are robust to finite dimensional approximation. We emphasize that discretizing, and then applying standard finite dimensional techniques in ℝ^N, to either sample or optimize, can lead to algorithms which degenerate as the dimension N increases.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/01580-ek934Determining white noise forcing from Eulerian observations in the Navier-Stokes equation
https://resolver.caltech.edu/CaltechAUTHORS:20160719-150657732
Authors: {'items': [{'id': 'Hoang-Viet-Ha', 'name': {'family': 'Hoang', 'given': 'Viet Ha'}}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'Kody J. H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2014
DOI: 10.1007/s40072-014-0028-4
The Bayesian approach to inverse problems is of paramount importance in quantifying uncertainty about the input to, and the state of, a system of interest given noisy observations. Herein we consider the forward problem of the forced 2D Navier-Stokes equation. The inverse problem is to make inference concerning the forcing, and possibly the initial condition, given noisy observations of the velocity field. We place a prior on the forcing which is in the form of a spatially-correlated and temporally-white Gaussian process, and formulate the inverse problem for the posterior distribution. Given appropriate spatial regularity conditions, we show that the solution is a continuous function of the forcing. Hence, for appropriately chosen spatial regularity in the prior, the posterior distribution on the forcing is absolutely continuous with respect to the prior and is hence well-defined. Furthermore, it may then be shown that the posterior distribution is a continuous function of the data. We complement these theoretical results with numerical simulations showing the feasibility of computing the posterior distribution, and illustrating its properties.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/0cpst-x4k95Bayesian posterior contraction rates for linear severely ill-posed inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20160719-151308932
Authors: {'items': [{'id': 'Agapiou-S', 'name': {'family': 'Agapiou', 'given': 'Sergios'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Zhang-Yuan-Xiang', 'name': {'family': 'Zhang', 'given': 'Yuan-Xiang'}}]}
Year: 2014
DOI: 10.1515/jip-2012-0071
We consider a class of linear ill-posed inverse problems arising from inversion of a compact operator with singular values which decay exponentially to zero. We adopt a Bayesian approach, assuming a Gaussian prior on the unknown function. The observational noise is assumed to be Gaussian; as a consequence the prior is conjugate to the likelihood so that the posterior distribution is also Gaussian. We study Bayesian posterior consistency in the small observational noise limit. We assume that the forward operator and the prior and noise covariance operators commute with one another. We show how, for given smoothness assumptions on the truth, the scale parameter of the prior, which is a constant multiplier of the prior covariance operator, can be adjusted to optimize the rate of posterior contraction to the truth, and we explicitly compute the logarithmic rate.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/045fa-mk661Analysis of the Gibbs Sampler for Hierarchical Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20160719-141859444
Authors: {'items': [{'id': 'Agapiou-S', 'name': {'family': 'Agapiou', 'given': 'Sergios'}}, {'id': 'Bardsley-J-M', 'name': {'family': 'Bardsley', 'given': 'Jonathan M.'}}, {'id': 'Papaspiliopoulos-O', 'name': {'family': 'Papaspiliopoulos', 'given': 'Omiros'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2014
DOI: 10.1137/130944229
Many inverse problems arising in applications come from continuum models where the unknown parameter is a field. In practice the unknown field is discretized, resulting in a problem in ℝ^N, with an understanding that refining the discretization, that is, increasing N, will often be desirable. In the context of Bayesian inversion this situation suggests the importance of two issues: (i) defining hyperparameters in such a way that they are interpretable in the continuum limit N →∞ and so that their values may be compared between different discretization levels; and (ii) understanding the efficiency of algorithms for probing the posterior distribution as a function of large N. Here we address these two issues in the context of linear inverse problems subject to additive Gaussian noise within a hierarchical modeling framework based on a Gaussian prior for the unknown field and an inverse-gamma prior for a hyperparameter, namely the amplitude of the prior variance. The structure of the model is such that the Gibbs sampler can be easily implemented for probing the posterior distribution. Subscribing to the dogma that one should think infinite-dimensionally before implementing in finite dimensions, we present function space intuition and provide rigorous theory showing that as N increases, the component of the Gibbs sampler for sampling the amplitude of the prior variance becomes increasingly slower. We discuss a reparametrization of the prior variance that is robust with respect to the increase in dimension; we give numerical experiments which exhibit that our reparametrization prevents the slowing down.
Our intuition on the behavior of the prior hyperparameter, with and without reparametrization, is sufficiently general to include a broad class of nonlinear inverse problems as well as other families of hyperpriors.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/vhvqc-38q28Well-posedness and accuracy of the ensemble Kalman filter in discrete and continuous time
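The two-block structure that makes the Gibbs sampler easy to implement can be sketched on a conjugate toy model (an invented finite-dimensional stand-in, not the paper's hierarchy: here u | δ ~ N(0, δ⁻¹I), y = u + noise, and the prior precision δ gets a Gamma prior, so both conditionals are available in closed form):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical conjugate toy model: y = u + eta, eta ~ N(0, gamma^2 I),
# u | delta ~ N(0, delta^{-1} I_N), delta ~ Gamma(a0, rate b0).
N, gamma = 50, 0.1
y = rng.standard_normal(N)         # synthetic data for the sketch
a0, b0 = 1.0, 1.0

delta, ds = 1.0, []
for _ in range(5000):
    # Block 1: u | delta, y is Gaussian by conjugacy (coordinatewise).
    prec = 1.0 / gamma**2 + delta
    mean = (y / gamma**2) / prec
    u = mean + rng.standard_normal(N) / np.sqrt(prec)
    # Block 2: delta | u is Gamma by conjugacy (numpy uses shape, scale = 1/rate).
    delta = rng.gamma(a0 + N / 2.0, 1.0 / (b0 + 0.5 * np.sum(u**2)))
    ds.append(delta)
```

The paper's point is about what happens to the δ-block as N grows: its conditional concentrates, the chain for the hyperparameter moves less and less per sweep, and a reparametrization is needed to keep the sampler dimension-robust.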
https://resolver.caltech.edu/CaltechAUTHORS:20160719-143029648
Authors: {'items': [{'id': 'Kelly-D-T-B', 'name': {'family': 'Kelly', 'given': 'D. B. T.'}}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'K. J. H.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2014
DOI: 10.1088/0951-7715/27/10/2579
The ensemble Kalman filter (EnKF) is a method for combining a dynamical model with data in a sequential fashion. Despite its widespread use, there has been little analysis of its theoretical properties. Many of the algorithmic innovations associated with the filter, which are required to make a useable algorithm in practice, are derived in an ad hoc fashion. The aim of this paper is to initiate the development of a systematic analysis of the EnKF, in particular to do so for small ensemble size. The perspective is to view the method as a state estimator, and not as an algorithm which approximates the true filtering distribution. The perturbed observation version of the algorithm is studied, without and with variance inflation. Without variance inflation well-posedness of the filter is established; with variance inflation accuracy of the filter, with respect to the true signal underlying the data, is established. The algorithm is considered in discrete time, and also for a continuous time limit arising when observations are frequent and subject to large noise. The underlying dynamical model, and assumptions about it, are sufficiently general to include the Lorenz '63 and '96 models, together with the incompressible Navier–Stokes equation on a two-dimensional torus. The analysis is limited to the case of complete observation of the signal with additive white noise. Numerical results are presented for the Navier–Stokes equation on a two-dimensional torus for both complete and partial observations of the signal with additive white noise.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/gg38v-w5014Well-posed Bayesian geometric inverse problems arising in subsurface flow
https://resolver.caltech.edu/CaltechAUTHORS:20160719-114834043
Authors: {'items': [{'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Lin-Kui', 'name': {'family': 'Lin', 'given': 'Kui'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2014
DOI: 10.1088/0266-5611/30/11/114001
In this paper, we consider the inverse problem of determining the permeability of the subsurface from hydraulic head measurements, within the framework of a steady Darcy model of groundwater flow. We study geometrically defined prior permeability fields, which admit layered, fault and channel structures, in order to mimic realistic subsurface features; within each layer we adopt either a constant or continuous function representation of the permeability. This prior model leads to a parameter identification problem for a finite number of unknown parameters determining the geometry, together with either a finite number of permeability values (in the constant case) or a finite number of fields (in the continuous function case). We adopt a Bayesian framework showing the existence and well-posedness of the posterior distribution. We also introduce novel Markov chain Monte Carlo (MCMC) methods, which exploit the different character of the geometric and permeability parameters, and build on recent advances in function space MCMC. These algorithms provide rigorous estimates of the permeability, as well as the uncertainty associated with it, and only require forward model evaluations. No adjoint solvers are required and hence the methodology is applicable to black-box forward models. We then use these methods to explore the posterior and to illustrate the methodology with numerical experiments.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/2z58d-fny34Spectral gaps for a Metropolis–Hastings algorithm in infinite dimensions
https://resolver.caltech.edu/CaltechAUTHORS:20160719-144104557
Authors: {'items': [{'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'Martin'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Vollmer-S', 'name': {'family': 'Vollmer', 'given': 'Sebastian'}}]}
Year: 2014
DOI: 10.1214/13-AAP982
We study the problem of sampling high and infinite dimensional target measures arising in applications such as conditioned diffusions and inverse problems. We focus on those that arise from approximating measures on Hilbert spaces defined via a density with respect to a Gaussian reference measure. We consider the Metropolis–Hastings algorithm that adds an accept–reject mechanism to a Markov chain proposal in order to make the chain reversible with respect to the target measure. We focus on cases where the proposal is either a Gaussian random walk (RWM) with covariance equal to that of the reference measure or an Ornstein–Uhlenbeck proposal (pCN) for which the reference measure is invariant.
Previous results in terms of scaling and diffusion limits suggested that the pCN has a convergence rate that is independent of the dimension while the RWM method has undesirable dimension-dependent behaviour. We confirm this claim by exhibiting a dimension-independent Wasserstein spectral gap for the pCN algorithm for a large class of target measures. In our setting this Wasserstein spectral gap implies an L^2-spectral gap. We use both spectral gaps to show that the ergodic average satisfies a strong law of large numbers, the central limit theorem and nonasymptotic bounds on the mean square error, all dimension independent. In contrast, we show that the spectral gap of the RWM algorithm applied to the reference measures degenerates as the dimension tends to infinity.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/fmcnm-dh305Kullback-Leibler approximation for probability measures on infinite dimensional spaces
https://resolver.caltech.edu/CaltechAUTHORS:20160715-170335769
Authors: {'items': [{'id': 'Pinski-F-J', 'name': {'family': 'Pinski', 'given': 'F. J.'}}, {'id': 'Simpson-G', 'name': {'family': 'Simpson', 'given': 'G.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Weber-H', 'name': {'family': 'Weber', 'given': 'H.'}}]}
Year: 2015
DOI: 10.1137/140962802
In a variety of applications it is important to extract information from a probability measure μ on an infinite dimensional space. Examples include the Bayesian approach to inverse problems and (possibly conditioned) continuous time Markov processes. It may then be of interest to find a measure ν, from within a simple class of measures, which approximates μ. This problem is studied in the case where the Kullback–Leibler divergence is employed to measure the quality of the approximation. A calculus of variations viewpoint is adopted, and the particular case where ν is chosen from the set of Gaussian measures is studied in detail. Basic existence and uniqueness theorems are established, together with properties of minimizing sequences. Furthermore, parameterization of the class of Gaussians through the mean and inverse covariance is introduced, the need for regularization is explained, and a regularized minimization is studied in detail. The calculus of variations framework resulting from this work provides the appropriate underpinning for computational algorithms.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/g9mg6-h5p83Data Assimilation: New Challenges in Random and Stochastic Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20161111-104214792
Authors: {'items': [{'id': 'Reich-S', 'name': {'family': 'Reich', 'given': 'Sebastian'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2015
The seamless integration of large data sets into sophisticated computational models provides one of the central challenges for the mathematical sciences in the 21st century. When the computational model is based on dynamical systems, and the data set is time ordered, the process of combining models and data is called data assimilation. The assimilation of data into computational models serves a wide spectrum of purposes, ranging from model calibration and model comparison all the way to the validation of novel model design principles.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/r6xzh-cm264Data Assimilation: A Mathematical Introduction
https://resolver.caltech.edu/CaltechAUTHORS:20161110-161028249
Authors: {'items': [{'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'Kody'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Zygalakis-K-C', 'name': {'family': 'Zygalakis', 'given': 'Konstantinos'}}]}
Year: 2015
DOI: 10.1007/978-3-319-20325-6
This book provides a systematic treatment of the mathematical underpinnings of work in data assimilation, covering both theoretical and computational approaches. Specifically the authors develop a unified mathematical framework in which a Bayesian formulation of the problem provides the bedrock for the derivation, development and analysis of algorithms; the many examples used in the text, together with the algorithms which are introduced and discussed, are all illustrated by the MATLAB software detailed in the book and made freely available online.
The book is organized into nine chapters: the first contains a brief introduction to the mathematical tools around which the material is organized; the next four are concerned with discrete time dynamical systems and discrete time data; the last four are concerned with continuous time dynamical systems and continuous time data and are organized analogously to the corresponding discrete time chapters.
This book is aimed at mathematical researchers interested in a systematic development of this interdisciplinary field, and at researchers from the geosciences, and a variety of other scientific fields, who use tools from data assimilation to combine data with time-dependent models. The numerous examples and illustrations make understanding of the theoretical underpinnings of data assimilation accessible. Furthermore, the examples, exercises and MATLAB software, make the book suitable for students in applied mathematics, either through a lecture course, or through self-study.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/5b3bk-adp93Algorithms for Kullback--Leibler Approximation of Probability Measures in Infinite Dimensions
https://resolver.caltech.edu/CaltechAUTHORS:20160715-163821138
Authors: {'items': [{'id': 'Pinski-F-J', 'name': {'family': 'Pinski', 'given': 'F. J.'}}, {'id': 'Simpson-G', 'name': {'family': 'Simpson', 'given': 'G.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Weber-H', 'name': {'family': 'Weber', 'given': 'H.'}}]}
Year: 2015
DOI: 10.1137/14098171X
In this paper we study algorithms to find a Gaussian approximation to a target measure defined on a Hilbert space of functions; the target measure itself is defined via its density with respect to a reference Gaussian measure. We employ the Kullback–Leibler divergence as a distance and find the best Gaussian approximation by minimizing this distance. It then follows that the approximate Gaussian must be equivalent to the Gaussian reference measure, defining a natural function space setting for the underlying calculus of variations problem. We introduce a computational algorithm which is well-adapted to the required minimization, seeking to find the mean as a function, and parameterizing the covariance in two different ways: through low rank perturbations of the reference covariance and through Schrödinger potential perturbations of the inverse reference covariance. Two applications are shown: to a nonlinear inverse problem in elliptic PDEs and to a conditioned diffusion process. These Gaussian approximations also serve to provide a preconditioned proposal distribution for improved preconditioned Crank–Nicolson Markov chain Monte Carlo sampling of the target distribution. This approach is not only well-adapted to the high dimensional setting, but also behaves well with respect to small observational noise (resp., small temperatures) in the inverse problem (resp., conditioned diffusion).https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/y05qf-4d976A Multiscale Analysis of Diffusions on Rapidly Varying Surfaces
https://resolver.caltech.edu/CaltechAUTHORS:20160715-172927058
Authors: {'items': [{'id': 'Duncan-A-B', 'name': {'family': 'Duncan', 'given': 'A. B.'}}, {'id': 'Elliott-C-M', 'name': {'family': 'Elliott', 'given': 'C. M.'}}, {'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2015
DOI: 10.1007/s00332-015-9237-x
Lateral diffusion of molecules on surfaces plays a very important role in various biological processes, including lipid transport across the cell membrane, synaptic transmission, and other phenomena such as exo- and endocytosis, signal transduction, chemotaxis, and cell growth. In many cases, the surfaces can possess spatial inhomogeneities and/or be rapidly changing shape. Using a generalization of the model for a thermally excited Helfrich elastic membrane, we consider the problem of lateral diffusion on quasi-planar surfaces, possessing both spatial and temporal fluctuations. Using results from homogenization theory, we show that, under the assumption of scale separation between the characteristic length and timescales of the membrane fluctuations and the characteristic scale of the diffusing particle, the lateral diffusion process can be well approximated by a Brownian motion on the plane with constant diffusion tensor D that depends in a highly nonlinear way on the detailed properties of the surface. The effective diffusion tensor will depend on the relative scales of the spatial and temporal fluctuations, and for different scaling regimes, we prove the existence of a macroscopic limit in each case.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/mzvhv-ew061Sequential Monte Carlo methods for Bayesian elliptic inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20160715-172126693
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Jasra-A', 'name': {'family': 'Jasra', 'given': 'Ajay'}}, {'id': 'Muzaffer-E-A', 'name': {'family': 'Muzaffer', 'given': 'Ege A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2015
DOI: 10.1007/s11222-015-9556-7
In this article, we consider a Bayesian inverse problem associated to elliptic partial differential equations in two and three dimensions. This class of inverse problems is important in applications such as hydrology, but the complexity of the link function between unknown field and measurements can make it difficult to draw inference from the associated posterior. We prove that for this inverse problem a basic sequential Monte Carlo (SMC) method has a Monte Carlo rate of convergence with constants which are independent of the dimension of the discretization of the problem; indeed convergence of the SMC method is established in a function space setting. We also develop an enhancement of the SMC methods for inverse problems which were introduced in Kantas et al. (SIAM/ASA J Uncertain Quantif 2:464–489, 2014); the enhancement is designed to deal with the additional complexity of this elliptic inverse problem. The efficacy of the methodology and its desirable theoretical properties, are demonstrated for numerical examples in both two and three dimensions.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/8wh6f-trm63Long-Time Asymptotics of the Filtering Distribution for Partially Observed Chaotic Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20160715-165131732
Authors: {'items': [{'id': 'Sanz-Alonso-D', 'name': {'family': 'Sanz-Alonso', 'given': 'Daniel'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2015
DOI: 10.1137/140997336
The filtering distribution is a time-evolving probability distribution on the state of a dynamical system given noisy observations. We study the large-time asymptotics of this probability distribution for discrete-time, randomly initialized signals that evolve according to a deterministic map Ψ. The observations are assumed to comprise a low-dimensional projection of the signal, given by an operator P, subject to additive noise. We address the question of whether these observations contain sufficient information to accurately reconstruct the signal. In a general framework, we establish conditions on Ψ and P under which the filtering distributions concentrate around the signal in the small-noise, long-time asymptotic regime. Linear systems, the Lorenz '63 and '96 models, and the Navier–Stokes equation on a two-dimensional torus are within the scope of the theory. Our main findings come as a by-product of computable bounds, of independent interest, for suboptimal filters based on new variants of the 3DVAR filtering algorithm.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/1axxh-tm466A Bayesian Level Set Method for Geometric Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20161221-114630868
Authors: {'items': [{'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Lu-Yulong', 'name': {'family': 'Lu', 'given': 'Yulong'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2016
DOI: 10.4171/IFB/362
We introduce a level set based approach to Bayesian geometric inverse problems. In these problems the interface between different domains is the key unknown, and is realized as the level set of a function. This function itself becomes the object of the inference. Whilst the level set methodology has been widely used for the solution of geometric inverse problems, the Bayesian formulation that we develop here contains two significant advances: firstly it leads to a well-posed inverse problem in which the posterior distribution is Lipschitz with respect to the observed data; and secondly it leads to computationally expedient algorithms in which the level set itself is updated implicitly via the MCMC methodology applied to the level set function; no explicit velocity field is required for the level set interface. Applications are numerous and include medical imaging, modelling of subsurface formations and the inverse source problem; our theory is illustrated with computational results involving the last two applications.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/q28kh-6zd12A Function Space HMC Algorithm With Second Order Langevin Diffusion Limit
https://resolver.caltech.edu/CaltechAUTHORS:20160715-161420502
Authors: {'items': [{'id': 'Ottobre-M', 'name': {'family': 'Ottobre', 'given': 'Michela'}, 'orcid': '0000-0002-8725-4278'}, {'id': 'Pillai-N-S', 'name': {'family': 'Pillai', 'given': 'Natesh S.'}}, {'id': 'Pinski-F-J', 'name': {'family': 'Pinski', 'given': 'Frank J.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2016
DOI: 10.3150/14-BEJ621
We describe a new MCMC method optimized for the sampling of probability measures on Hilbert space which have a density with respect to a Gaussian; such measures arise in the Bayesian approach to inverse problems, and in conditioned diffusions. Our algorithm is based on two key design principles: (i) algorithms which are well defined in infinite dimensions result in methods which do not suffer from the curse of dimensionality when they are applied to approximations of the infinite dimensional target measure on R^N; (ii) nonreversible algorithms can have better mixing properties compared to their reversible counterparts. The method we introduce is based on the hybrid Monte Carlo algorithm, tailored to incorporate these two design principles. The main result of this paper states that the new algorithm, appropriately rescaled, converges weakly to a second order Langevin diffusion on Hilbert space; as a consequence the algorithm explores the approximate target measures on R^N in a number of steps which is independent of N. We also present the underlying theory for the limiting nonreversible diffusion on Hilbert space, including characterization of the invariant measure, and we describe numerical simulations demonstrating that the proposed method has favourable mixing properties as an MCMC algorithm.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/qpcwv-pnr67Mathematics, Statistics and Data Science
https://resolver.caltech.edu/CaltechAUTHORS:20161111-103206810
Authors: {'items': [{'id': 'Bühlmann-P', 'name': {'family': 'Bühlmann', 'given': 'Peter'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2016
The process of extracting information from data has a long history (see, for example, [1]) stretching back over centuries. Because of the proliferation of data over the last few decades, and projections for its continued proliferation over coming decades, the term Data Science has emerged to describe the substantial current intellectual effort around research with the same overall goal, namely that of extracting information. The type of data currently available in all sorts of application domains is often massive in size, very heterogeneous and far from being collected under designed or controlled experimental conditions. Nonetheless, it contains information, often substantial information, and data science requires new interdisciplinary approaches to make maximal use of this information. Data alone is typically not that informative and (machine) learning from data needs conceptual frameworks. Mathematics and statistics are crucial for providing such conceptual frameworks. The frameworks enhance the understanding of fundamental phenomena, highlight limitations and provide a formalism for properly founded data analysis, information extraction and quantification of uncertainty, as well as for the analysis and development of algorithms that carry out these key tasks. In this personal commentary on data science and its relations to mathematics and statistics, we highlight three important aspects of the emerging field: Models, High-Dimensionality and Heterogeneity, and then conclude with a brief discussion of where the field is now and implications for the mathematical sciences.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/fsa25-mvt68Filter accuracy for the Lorenz 96 model: fixed versus adaptive observation operators
https://resolver.caltech.edu/CaltechAUTHORS:20160715-152128338
Authors: {'items': [{'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'K. J. H.'}}, {'id': 'Sanz-Alonso-D', 'name': {'family': 'Sanz-Alonso', 'given': 'D.'}}, {'id': 'Shukla-A', 'name': {'family': 'Shukla', 'given': 'A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2016
DOI: 10.1016/j.physd.2015.12.008
In the context of filtering chaotic dynamical systems it is well-known that partial observations, if sufficiently informative, can be used to control the inherent uncertainty due to chaos. The purpose of this paper is to investigate, both theoretically and numerically, conditions on the observations of chaotic systems under which they can be accurately filtered. In particular, we highlight the advantage of adaptive observation operators over fixed ones. The Lorenz '96 model is used to exemplify our findings.
We consider discrete-time and continuous-time observations in our theoretical developments. We prove that, for fixed observation operator, the 3DVAR filter can recover the system state within a neighbourhood determined by the size of the observational noise. It is required that a sufficiently large proportion of the state vector is observed, and an explicit form for such sufficient fixed observation operator is given. Numerical experiments, where the data is incorporated by use of the 3DVAR and extended Kalman filters, suggest that less informative fixed operators than given by our theory can still lead to accurate signal reconstruction. Adaptive observation operators are then studied numerically; we show that, for carefully chosen adaptive observation operators, the proportion of the state vector that needs to be observed is drastically smaller than with a fixed observation operator. Indeed, we show that the number of state coordinates that need to be observed may even be significantly smaller than the total number of positive Lyapunov exponents of the underlying system.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/3m53r-s1x09MAP estimators for piecewise continuous inversion
https://resolver.caltech.edu/CaltechAUTHORS:20170612-142444027
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'M. M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2016
DOI: 10.1088/0266-5611/32/10/105003
We study the inverse problem of estimating a field u^a from data comprising a finite set of nonlinear functionals of u^a, subject to additive noise; we denote this observed data by y. Our interest is in the reconstruction of piecewise continuous fields u^a in which the discontinuity set is described by a finite number of geometric parameters a. Natural applications include groundwater flow and electrical impedance tomography. We take a Bayesian approach, placing a prior distribution on u^a and determining the conditional distribution on u^a given the data y. It is then natural to study maximum a posteriori (MAP) estimators. Recently (Dashti et al 2013 Inverse Problems 29 095017) it has been shown that MAP estimators can be characterised as minimisers of a generalised Onsager–Machlup functional, in the case where the prior measure is a Gaussian random field. We extend this theory to a more general class of prior distributions which allows for piecewise continuous fields. Specifically, the prior field is assumed to be piecewise Gaussian with random interfaces between the different Gaussians defined by a finite number of parameters. We also make connections with recent work on MAP estimators for linear problems and possibly non-Gaussian priors (Helin and Burger 2015 Inverse Problems 31 085009) which employs the notion of Fomin derivative. In showing applicability of our theory we focus on the groundwater flow and EIT models, though the theory holds more generally. Numerical experiments are implemented for the groundwater flow model, demonstrating the feasibility of determining MAP estimators for these piecewise continuous models, but also that the geometric formulation can lead to multiple nearby (local) MAP estimators. We relate these MAP estimators to the behaviour of output from MCMC samples of the posterior, obtained using a state-of-the-art function space Metropolis–Hastings method.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/471b1-qsp94The Bayesian formulation of EIT: Analysis and algorithms
https://resolver.caltech.edu/CaltechAUTHORS:20170113-072909521
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2016
DOI: 10.3934/ipi.2016030
We provide a rigorous Bayesian formulation of the EIT problem in an infinite dimensional setting, leading to well-posedness in the Hellinger metric with respect to the data. We focus particularly on the reconstruction of binary fields where the interface between different media is the primary unknown. We consider three different prior models: log-Gaussian, star-shaped and level set. Numerical simulations based on the implementation of MCMC are performed, illustrating the advantages and disadvantages of each type of prior in the reconstruction, in the case where the true conductivity is a binary field, and exhibiting the properties of the resulting posterior distribution.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/a4pre-m7359Matrix Analysis and Algorithms
https://resolver.caltech.edu/CaltechAUTHORS:20161110-160355297
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Voss-J', 'name': {'family': 'Voss', 'given': 'Jochen'}, 'orcid': '0000-0001-7740-8811'}]}
Year: 2016
The book contains an introduction to matrix analysis, and to the basic algorithms of numerical linear algebra. Further results can be found in many text books. The book of Horn and Johnson [HJ85] is an excellent reference for theoretical results about matrix analysis; see also [Bha97]. The subject of linear algebra, and matrix analysis in particular, is treated in an original and illuminating fashion in [Lax97]. For a general introduction to the subject of numerical linear algebra we recommend the book by Trefethen and Bau [TB97]; more theoretical treatments of the subject can be found in Demmel [Dem97], Golub and Van Loan [GL96] and in Stoer and Bulirsch [SB02]. Higham's book [Hig02] contains a wealth of information about stability and the effect of rounding errors in numerical algorithms; it is this source that we used for almost all theorems we state concerning backward error analysis. The book of Saad [Saa97] covers the subject of iterative methods for linear systems. The symmetric eigenvalue problem is analysed in Parlett [Par80].https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/mf2kc-m5915Gaussian approximations for transition paths in molecular dynamics
https://resolver.caltech.edu/CaltechAUTHORS:20161220-182307792
Authors: {'items': [{'id': 'Lu-Yulong', 'name': {'family': 'Lu', 'given': 'Yulong'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'name': {'family': 'Weber', 'given': 'Hendrik'}}]}
Year: 2016
DOI: 10.48550/arXiv.1604.06594
This paper is concerned with transition paths within the framework of the overdamped Langevin dynamics model of chemical reactions. We aim to give an efficient description of typical transition paths in the small temperature regime. We adopt a variational point of view and seek the best Gaussian approximation, with respect to Kullback-Leibler divergence, of the non-Gaussian distribution of the diffusion process. We interpret the mean of this Gaussian approximation as the "most likely path" and the covariance operator as a means to capture the typical fluctuations around this most likely path.
We give an explicit expression for the Kullback-Leibler divergence in terms of the mean and the covariance operator for a natural class of Gaussian approximations and show the existence of minimisers for the variational problem. Then the low temperature limit is studied via Γ-convergence of the associated variational problem. The limiting functional consists of two parts: the first part depends only on the mean and coincides with the Γ-limit of the Freidlin-Wentzell rate functional. The second part depends on both the mean and the covariance operator, and is minimized if the dynamics are given by a time-inhomogeneous Ornstein-Uhlenbeck process found by linearization of the Langevin dynamics around the Freidlin-Wentzell minimizer.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/53ykm-qm573The Bayesian Approach to Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20161111-104641272
Authors: {'items': [{'id': 'Dashti-Masoumeh', 'name': {'family': 'Dashti', 'given': 'Masoumeh'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2017
DOI: 10.1007/978-3-319-12385-1_7
These lecture notes highlight the mathematical and computational structure relating to the formulation of, and development of algorithms for, the Bayesian approach to inverse problems in differential equations. This approach is fundamental in the quantification of uncertainty within applications involving the blending of mathematical models with data. The finite dimensional situation is described first, along with some motivational examples. Then the development of probability measures on separable Banach space is undertaken, using a random series over an infinite set of functions to construct draws; these probability measures are used as priors in the Bayesian approach to inverse problems. Regularity of draws from the priors is studied in the natural Sobolev or Besov spaces implied by the choice of functions in the random series construction, and the Kolmogorov continuity theorem is used to extend regularity considerations to the space of Hölder continuous functions. Bayes' theorem is derived in this prior setting, and here interpreted as finding conditions under which the posterior is absolutely continuous with respect to the prior, and determining a formula for the Radon-Nikodym derivative in terms of the likelihood of the data. Having established the form of the posterior, we then describe various properties common to it in the infinite dimensional setting. These properties include well-posedness, approximation theory, and the existence of maximum a posteriori estimators. We then describe measure-preserving dynamics, again on the infinite dimensional space, including Markov chain Monte Carlo and sequential Monte Carlo methods, and measure-preserving reversible stochastic differential equations. By formulating the theory and algorithms on the underlying infinite dimensional space, we obtain a framework suitable for rigorous analysis of the accuracy of reconstructions and of computational complexity, as well as naturally constructing algorithms which perform well under mesh refinement, since they are inherently well-defined in infinite dimensions.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/nv6wx-sn328Filter Based Methods For Statistical Linear Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20161221-113147238
Authors: {'items': [{'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Lin-Kui', 'name': {'family': 'Lin', 'given': 'Kui'}}, {'id': 'Lu-Shuai', 'name': {'family': 'Lu', 'given': 'Shuai'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2017
DOI: 10.4310/CMS.2017.v15.n7.a4
Ill-posed inverse problems are ubiquitous in applications. Understanding of algorithms for their solution has been greatly enhanced by a deep understanding of the linear inverse problem. In the applied communities ensemble-based filtering methods have recently been used to solve inverse problems by introducing an artificial dynamical system. This opens up the possibility of using a range of other filtering methods, such as 3DVAR and Kalman based methods, to solve inverse problems, again by introducing an artificial dynamical system. The aim of this paper is to analyze such methods in the context of the linear inverse problem.
Statistical linear inverse problems are studied in the sense that the observational noise is assumed to be derived via realization of a Gaussian random variable. We investigate the asymptotic behavior of filter based methods for these inverse problems. Rigorous convergence rates are established for 3DVAR and for the Kalman filters, including minimax rates in some instances. Blowup of 3DVAR and a variant of its basic form is also presented, and optimality of the Kalman filter is discussed. These analyses reveal a close connection between (iterated) regularization schemes in deterministic inverse problems and filter based methods in data assimilation. Numerical experiments are presented to illustrate the theory.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/e539q-rc973Derivation and analysis of simplified filters
https://resolver.caltech.edu/CaltechAUTHORS:20161221-112623998
Authors: {'items': [{'id': 'Lee-Wongjung', 'name': {'family': 'Lee', 'given': 'Wongjung'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2017
DOI: 10.4310/CMS.2017.v15.n2.a6
Filtering is concerned with the sequential estimation of the state, and uncertainties, of a Markovian system, given noisy observations. It is particularly difficult to achieve accurate filtering in complex dynamical systems, such as those arising in turbulence, in which effective low-dimensional representation of the desired probability distribution is challenging. Nonetheless recent advances have shown considerable success in filtering based on certain carefully chosen simplifications of the underlying system, which allow closed form filters. This leads to filtering algorithms with significant, but judiciously chosen, model error. The purpose of this article is to analyze the effectiveness of these simplified filters, and to suggest modifications of them which lead to improved filtering in certain time-scale regimes. We employ a Markov switching process for the true signal underlying the data, rather than working with a fully resolved DNS PDE model. Such Markov switching models have been demonstrated to provide an excellent surrogate test-bed for the turbulent bursting phenomena which make filtering of complex physical models, such as those arising in atmospheric sciences, so challenging.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/kxas9-4vf97Geometric MCMC for Infinite-Dimensional Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20161220-181119556
Authors: {'items': [{'id': 'Beskos-A', 'name': {'family': 'Beskos', 'given': 'Alexandros'}}, {'id': 'Girolami-M', 'name': {'family': 'Girolami', 'given': 'Mark'}}, {'id': 'Lan-Shiwei', 'name': {'family': 'Lan', 'given': 'Shiwei'}, 'orcid': '0000-0002-9167-3715'}, {'id': 'Farrell-P-E', 'name': {'family': 'Farrell', 'given': 'Patrick E.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2017
DOI: 10.1016/j.jcp.2016.12.041
Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank–Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. 
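A minimal sketch of the vanilla pCN baseline named above, run on a toy two-dimensional target; the specific misfit Φ, prior, step size β, and chain length are assumptions of this sketch.

```python
import numpy as np

# Hedged sketch of the vanilla preconditioned Crank-Nicolson (pCN)
# method: for a Gaussian prior N(0, C) and negative log-likelihood Phi,
# the proposal
#     v = sqrt(1 - beta^2) u + beta xi,   xi ~ N(0, C),
# is prior-reversible, so the accept ratio involves Phi only.
# The toy misfit Phi below is illustrative, not from the paper.

def pcn_sample(phi, C_sqrt, u0, beta=0.2, n_steps=5000, rng=None):
    rng = rng or np.random.default_rng(0)
    u, phi_u = u0, phi(u0)
    samples = []
    for _ in range(n_steps):
        xi = C_sqrt @ rng.standard_normal(len(u0))
        v = np.sqrt(1.0 - beta ** 2) * u + beta * xi     # pCN proposal
        phi_v = phi(v)
        if rng.random() < np.exp(min(0.0, phi_u - phi_v)):  # accept
            u, phi_u = v, phi_v
        samples.append(u.copy())
    return np.array(samples)

# toy posterior: prior N(0, I) on R^2, datum y = 1 observed on u[0]
phi = lambda u: (u[0] - 1.0) ** 2 / (2 * 0.5)   # misfit, noise var 0.5
chain = pcn_sample(phi, np.eye(2), np.zeros(2), beta=0.3, n_steps=20000)
print(chain[:, 0].mean())   # posterior mean of u[0] is 2/3 for this toy
```

Because the proposal is reversible with respect to the prior, the acceptance probability is dimension-independent, which is the source of the mesh-independent mixing that the geometric methods above aim to accelerate.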
The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/trf67-2e570Quasi-Monte Carlo and Multilevel Monte Carlo Methods for Computing Posterior Expectations in Elliptic Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20161221-105527857
Authors: {'items': [{'id': 'Scheichl-R', 'name': {'family': 'Scheichl', 'given': 'R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Teckentrup-A-L', 'name': {'family': 'Teckentrup', 'given': 'A. L.'}}]}
Year: 2017
DOI: 10.1137/16M1061692
We are interested in computing the expectation of a functional of a PDE solution under a Bayesian posterior distribution. Using Bayes's rule, we reduce the problem to estimating the ratio of two related prior expectations. For a model elliptic problem, we provide a full convergence and complexity analysis of the ratio estimator in the case where Monte Carlo, quasi-Monte Carlo, or multilevel Monte Carlo methods are used as estimators for the two prior expectations. We show that the computational complexity of the ratio estimator to achieve a given accuracy is the same as the corresponding complexity of the individual estimators for the numerator and the denominator. We also include numerical simulations, in the context of the model elliptic problem, which demonstrate the effectiveness of the approach.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/g9542-x2246Analysis of the ensemble Kalman filter for inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20161221-112013537
Authors: {'items': [{'id': 'Schillings-C', 'name': {'family': 'Schillings', 'given': 'Claudia'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2017
DOI: 10.1137/16M105959X
The ensemble Kalman filter (EnKF) is a widely used methodology for state estimation in partially, noisily observed dynamical systems, and for parameter estimation in inverse problems. Despite its widespread use in the geophysical sciences, and its gradual adoption in many other areas of application, analysis of the method is in its infancy. Furthermore, much of the existing analysis deals with the large ensemble limit, far from the regime in which the method is typically used. The goal of this paper is to analyze the method when applied to inverse problems with fixed ensemble size. A continuous-time limit is derived and the long-time behavior of the resulting dynamical system is studied. Most of the rigorous analysis is confined to the linear forward problem, where we demonstrate that the continuous time limit of the EnKF corresponds to a set of gradient flows for the data misfit in each ensemble member, coupled through a common pre-conditioner which is the empirical covariance matrix of the ensemble. Numerical results demonstrate that the conclusions of the analysis extend beyond the linear inverse problem setting. Numerical experiments are also given which demonstrate the benefits of various extensions of the basic methodology.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/48k45-45h26Convergence Analysis of the Ensemble Kalman Filter for Inverse Problems: the Noisy Case
https://resolver.caltech.edu/CaltechAUTHORS:20170612-123329512
Authors: {'items': [{'id': 'Schillings-C', 'name': {'family': 'Schillings', 'given': 'C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2017
We present an analysis of the ensemble Kalman filter for inverse problems based on the continuous time limit of the algorithm. The analysis of the dynamical behaviour of the ensemble allows us to establish well-posedness and convergence results for a fixed ensemble size. We will build on the results presented in [Schillings, Stuart 2017] and generalise them to the case of noisy observational data; in particular, the influence of the noise on the convergence will be investigated, both theoretically and numerically.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/2t7pd-fnb55Stability of Filters for the Navier-Stokes Equation
https://resolver.caltech.edu/CaltechAUTHORS:20170613-070159561
Authors: {'items': [{'id': 'Brett-C-E-A', 'name': {'family': 'Brett', 'given': 'C. E. A.'}}, {'id': 'Lam-K-F', 'name': {'family': 'Lam', 'given': 'K. F.'}}, {'id': 'Law-K-J-H', 'name': {'family': 'Law', 'given': 'K. J. H.'}}, {'id': 'McCormick-D-S', 'name': {'family': 'McCormick', 'given': 'D. S.'}}, {'id': 'Scott-M-R', 'name': {'family': 'Scott', 'given': 'M. R.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2017
DOI: 10.48550/arXiv.1110.2527
Data assimilation methodologies are designed to incorporate noisy observations of a physical system into an underlying model in order to infer the properties of the state of the system. Filters refer to a class of data assimilation algorithms designed to update the estimation of the state in an on-line fashion, as data is acquired sequentially. For linear problems subject to Gaussian noise, filtering can be performed exactly using the Kalman filter. For nonlinear systems it can be approximated in a systematic way by particle filters. However, in high dimensions these particle filtering methods can break down. Hence, for the large nonlinear systems arising in applications such as weather forecasting, various ad hoc filters are used, mostly based on making Gaussian approximations. The purpose of this work is to study the properties of these ad hoc filters, working in the context of the 2D incompressible Navier-Stokes equation. By working in this infinite dimensional setting we provide an analysis which is useful for understanding high dimensional filtering, and is robust to mesh-refinement. We describe theoretical results showing that, in the small observational noise limit, the filters can be tuned to accurately track the signal itself (filter stability), provided the system is observed in a sufficiently large low dimensional space; roughly speaking this space should be large enough to contain the unstable modes of the linearized dynamics. Numerical results are given which illustrate the theory. In a simplified scenario we also derive, and study numerically, a stochastic PDE which determines filter stability in the limit of frequent observations, subject to large observational noise.
The positive results herein concerning filter stability complement recent numerical studies which demonstrate that the ad hoc filters perform poorly in reproducing statistical variation about the true signal.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/dk7kg-hhg14Statistical analysis of differential equations: introducing probability measures on numerical solutions
https://resolver.caltech.edu/CaltechAUTHORS:20170609-133754387
Authors: {'items': [{'id': 'Conrad-P-R', 'name': {'family': 'Conrad', 'given': 'Patrick R.'}}, {'id': 'Girolami-M', 'name': {'family': 'Girolami', 'given': 'Mark'}}, {'id': 'Särkkä-S', 'name': {'family': 'Särkkä', 'given': 'Simo'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Zygalakis-K-C', 'name': {'family': 'Zygalakis', 'given': 'Konstantinos'}}]}
Year: 2017
DOI: 10.1007/s11222-016-9671-0
In this paper, we present a formal quantification of uncertainty induced by numerical solutions of ordinary and partial differential equation models. Numerical solutions of differential equations contain inherent uncertainties due to the finite-dimensional approximation of an unknown and implicitly defined function. When statistically analysing models based on differential equations describing physical, or other naturally occurring, phenomena, it can be important to explicitly account for the uncertainty introduced by the numerical method. Doing so enables objective determination of this source of uncertainty, relative to other uncertainties, such as those caused by data contaminated with noise or model error induced by missing physical or inadequate descriptors. As ever larger scale mathematical models are being used in the sciences, often sacrificing complete resolution of the differential equation on the grids used, formally accounting for the uncertainty in the numerical method is becoming increasingly important. This paper provides the formal means to incorporate this uncertainty in a statistical model and its subsequent analysis. We show that a wide variety of existing solvers can be randomised, inducing a probability measure over the solutions of such differential equations. These measures exhibit contraction to a Dirac measure around the true unknown solution, where the rates of convergence are consistent with the underlying deterministic numerical method. Furthermore, we employ the method of modified equations to demonstrate enhanced rates of convergence to stochastic perturbations of the original deterministic problem.
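The randomisation of a solver can be sketched in a few lines: a deterministic Euler step plus a small Gaussian perturbation, which induces an ensemble of numerical solutions rather than a single one. The noise scale h**1.5 (matched to a first-order method) and the test problem u' = -u are assumptions of this sketch.

```python
import numpy as np

# Hedged sketch of a randomised one-step solver in the spirit described
# above: each forward-Euler step is perturbed by Gaussian noise whose
# scale shrinks with the step size h, inducing a probability measure
# over numerical solutions that contracts as h -> 0.

def randomised_euler(f, u0, T, h, scale=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    n = int(round(T / h))
    u = np.empty(n + 1)
    u[0] = u0
    for k in range(n):
        u[k + 1] = u[k] + h * f(u[k]) + scale * h ** 1.5 * rng.standard_normal()
    return u

f = lambda u: -u                   # test problem u' = -u, u(0) = 1
h = 0.01
ensemble = np.array([randomised_euler(f, 1.0, 1.0, h,
                                      rng=np.random.default_rng(s))[-1]
                     for s in range(200)])
print(ensemble.mean(), ensemble.std())  # concentrates near exp(-1)
```

The ensemble spread gives a direct, sample-based proxy for the solver-induced uncertainty, which can then be propagated into the statistical model alongside data noise.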
Ordinary differential equations and elliptic partial differential equations are used to illustrate the approach to quantify uncertainty in both the statistical analysis of the forward and inverse problems.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/26a7h-yj333Gaussian Approximations for Transition Paths in Brownian Dynamics
https://resolver.caltech.edu/CaltechAUTHORS:20170921-105427126
Authors: {'items': [{'id': 'Lu-Yulong', 'name': {'family': 'Lu', 'given': 'Yulong'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Weber-H', 'name': {'family': 'Weber', 'given': 'Hendrik'}}]}
Year: 2017
DOI: 10.1137/16M1071845
This paper is concerned with transition paths within the framework of the overdamped Langevin dynamics model of chemical reactions. We aim to give an efficient description of typical transition paths in the small temperature regime. We adopt a variational point of view and seek the best Gaussian approximation, with respect to Kullback–Leibler divergence, of the non-Gaussian distribution of the diffusion process. We interpret the mean of this Gaussian approximation as the "most likely path," and the covariance operator as a means to capture the typical fluctuations around this most likely path. We give an explicit expression for the Kullback–Leibler divergence in terms of the mean and the covariance operator for a natural class of Gaussian approximations and show the existence of minimizers for the variational problem. Then the low temperature limit is studied via Γ-convergence of the associated variational problem. The limiting functional consists of two parts: The first part depends only on the mean and coincides with the Γ-limit of the rescaled Freidlin–Wentzell rate functional. The second part depends on both the mean and the covariance operator and is minimized if the dynamics are given by a time-inhomogeneous Ornstein–Uhlenbeck process found by linearization of the Langevin dynamics around the Freidlin–Wentzell minimizer.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/gkt2e-t5j16Importance Sampling: Intrinsic Dimension and Computational Cost
https://resolver.caltech.edu/CaltechAUTHORS:20161221-114242057
Authors: {'items': [{'id': 'Agapiou-S', 'name': {'family': 'Agapiou', 'given': 'S.'}}, {'id': 'Papaspiliopoulos-O', 'name': {'family': 'Papaspiliopoulos', 'given': 'O.'}}, {'id': 'Sanz-Alonso-D', 'name': {'family': 'Sanz-Alonso', 'given': 'D.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2017
DOI: 10.1214/17-STS611
The basic idea of importance sampling is to use independent samples from a proposal measure in order to approximate expectations with respect to a target measure. It is key to understand how many samples are required in order to guarantee accurate approximations. Intuitively, some notion of distance between the target and the proposal should determine the computational cost of the method. A major challenge is to quantify this distance in terms of parameters or statistics that are pertinent for the practitioner. The subject has attracted substantial interest from within a variety of communities. The objective of this paper is to overview and unify the resulting literature by creating an overarching framework. A general theory is presented, with a focus on the use of importance sampling in Bayesian inverse problems and filtering.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/px3j9-d7d69Gaussian Approximations of Small Noise Diffusions in Kullback-Leibler Divergence
https://resolver.caltech.edu/CaltechAUTHORS:20161220-181911579
Authors: {'items': [{'id': 'Sanz-Alonso-D', 'name': {'family': 'Sanz-Alonso', 'given': 'Daniel'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2017
DOI: 10.4310/CMS.2017.v15.n7.a13
We study Gaussian approximations to the distribution of a diffusion. The approximations are easy to compute: they are defined by two simple ordinary differential equations for the mean and the covariance. Time correlations can also be computed via solution of a linear stochastic differential equation. We show, using the Kullback–Leibler divergence, that the approximations are accurate in the small noise regime. An analogous discrete time setting is also studied. The results provide both theoretical support for the use of Gaussian processes in the approximation of diffusions, and methodological guidance in the construction of Gaussian approximations in applications.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/1evkm-1xx07Hierarchical Bayesian level set inversion
https://resolver.caltech.edu/CaltechAUTHORS:20161109-074003000
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2017
DOI: 10.1007/s11222-016-9704-8
The level set approach has proven widely successful in the study of inverse problems for interfaces, since its systematic development in the 1990s. Recently it has been employed in the context of Bayesian inversion, allowing for the quantification of uncertainty within the reconstruction of interfaces. However, the Bayesian approach is very sensitive to the length and amplitude scales in the prior probabilistic model. This paper demonstrates how the scale-sensitivity can be circumvented by means of a hierarchical approach, using a single scalar parameter. Together with careful consideration of the development of algorithms which encode probability measure equivalences as the hierarchical parameter is varied, this leads to well-defined Gibbs-based MCMC methods found by alternating Metropolis–Hastings updates of the level set function and the hierarchical parameter. These methods demonstrably outperform non-hierarchical Bayesian level set methods.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/hf61n-p3x73Gaussian Approximations for Probability Measures on R^d
https://resolver.caltech.edu/CaltechAUTHORS:20161221-163341129
Authors: {'items': [{'id': 'Lu-Yulong', 'name': {'family': 'Lu', 'given': 'Yulong'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Weber-H', 'name': {'family': 'Weber', 'given': 'Hendrik'}}]}
Year: 2017
DOI: 10.1137/16M1105384
This paper concerns the approximation of probability measures on R^d with respect to the Kullback-Leibler divergence. Given an admissible target measure, we show the existence of the best approximation, with respect to this divergence, from certain sets of Gaussian measures and Gaussian mixtures. The asymptotic behavior of such best approximations is then studied in the small parameter limit where the measure concentrates; this asymptotic behavior is characterized using Γ-convergence. The theory developed is then applied to understand the frequentist consistency of Bayesian inverse problems in finite dimensions. For a fixed realization of additive observational noise, we show the asymptotic normality of the posterior measure in the small noise limit. Taking into account the randomness of the noise, we prove a Bernstein–von Mises type result for the posterior measure.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/n8dv2-wd802Earth System Modeling 2.0: A Blueprint for Models That Learn From Observations and Targeted High-Resolution Simulations
https://resolver.caltech.edu/CaltechAUTHORS:20171201-113659166
Authors: {'items': [{'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Lan-Shiwei', 'name': {'family': 'Lan', 'given': 'Shiwei'}, 'orcid': '0000-0002-9167-3715'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Teixeira-J', 'name': {'family': 'Teixeira', 'given': 'João'}}]}
Year: 2017
DOI: 10.1002/2017GL076101
Climate projections continue to be marred by large uncertainties, which originate in processes that need to be parameterized, such as clouds, convection, and ecosystems. But rapid progress is now within reach. New computational tools and methods from data assimilation and machine learning make it possible to integrate global observations and local high-resolution simulations in an Earth system model (ESM) that systematically learns from both and quantifies uncertainties. Here we propose a blueprint for such an ESM. We outline how parameterization schemes can learn from global observations and targeted high-resolution simulations, for example, of clouds and convection, through matching low-order statistics between ESMs, observations, and high-resolution simulations. We illustrate learning algorithms for ESMs with a simple dynamical system that shares characteristics of the climate system; and we discuss the opportunities the proposed framework presents and the challenges that remain to realize it.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/rnm7e-jst84Convergence analysis of ensemble Kalman inversion: the linear, noisy case
https://resolver.caltech.edu/CaltechAUTHORS:20180102-081954370
Authors: {'items': [{'id': 'Schillings-C', 'name': {'family': 'Schillings', 'given': 'C.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2018
DOI: 10.1080/00036811.2017.1386784
We present an analysis of ensemble Kalman inversion, based on the continuous time limit of the algorithm. The analysis of the dynamical behaviour of the ensemble allows us to establish well-posedness and convergence results for a fixed ensemble size. We will build on recent results on the convergence in the noise-free case and generalise them to the case of noisy observational data; in particular, the influence of the noise on the convergence will be investigated, both theoretically and numerically. We focus on linear inverse problems where a very complete theoretical analysis is possible.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/dv2hb-jqv30Iterative Updating of Model Error for Bayesian Inversion
https://resolver.caltech.edu/CaltechAUTHORS:20170801-155824887
Authors: {'items': [{'id': 'Calvetti-D', 'name': {'family': 'Calvetti', 'given': 'Daniela'}}, {'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Somersalo-E', 'name': {'family': 'Somersalo', 'given': 'Erkki'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2018
DOI: 10.1088/1361-6420/aaa34d
In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations.
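The iteration described above can be sketched in the linear Gaussian setting, where the model-error statistics admit closed forms. Here the accurate model A, cheap substitute A0, prior, and noise level are all illustrative assumptions of this sketch, not quantities from the paper.

```python
import numpy as np

# Hedged sketch of the iterated model-error idea in the linear Gaussian
# setting: the accurate forward model is A, the cheap substitute is A0,
# and the modelling error m = (A - A0) u is treated as Gaussian, with
# its mean and covariance re-estimated from the current posterior on
# each sweep. All matrices and the noise level are illustrative.

A = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.],
              [1., 1., 1.]])
A0 = 0.9 * A                       # approximate (reduced) forward model
u_true = np.array([1.0, 0.5, -1.0])
y = A @ u_true                     # noise-free data for illustration
Gamma = 1e-4 * np.eye(4)           # observational noise covariance
C0 = np.eye(3)                     # Gaussian prior N(0, C0)
D = A - A0

m_mean, m_cov = np.zeros(4), np.zeros((4, 4))   # model-error statistics
for _ in range(20):
    # Gaussian update with inflated noise Gamma + m_cov and shifted data
    S = A0 @ C0 @ A0.T + Gamma + m_cov
    K = C0 @ A0.T @ np.linalg.inv(S)
    mu = K @ (y - m_mean)
    C_post = C0 - K @ A0 @ C0
    # re-estimate the model-error distribution under the posterior
    m_mean, m_cov = D @ mu, D @ C_post @ D.T
print(np.round(mu, 3))             # approaches the accurate-model estimate
```

At the fixed point the shifted data y - m_mean compensates exactly for the model discrepancy, so the cheap model A0 yields the estimate one would obtain from the accurate model A, consistent with the geometric convergence claimed for the linear Gaussian case.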
We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ekz96-xk362Posterior consistency for Gaussian process approximations of Bayesian posterior distributions
https://resolver.caltech.edu/CaltechAUTHORS:20161221-104520265
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Teckentrup-A-L', 'name': {'family': 'Teckentrup', 'given': 'Aretha L.'}}]}
Year: 2018
DOI: 10.1090/mcom/3244
We study the use of Gaussian process emulators to approximate the parameter-to-observation map or the negative log-likelihood in Bayesian inverse problems. We prove error bounds on the Hellinger distance between the true posterior distribution and various approximations based on the Gaussian process emulator. Our analysis includes approximations based on the mean of the predictive process, as well as approximations based on the full Gaussian process emulator. Our results show that the Hellinger distance between the true posterior and its approximations can be bounded by moments of the error in the emulator. Numerical results confirm our theoretical findings.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/gjrna-jyx57Weak error estimates for trajectories of SPDEs for Spectral Galerkin discretization
https://resolver.caltech.edu/CaltechAUTHORS:20161221-110122611
Authors: {'items': [{'id': 'Bréhier-C-E', 'name': {'family': 'Bréhier', 'given': 'Charles-Edouard'}}, {'id': 'Hairer-M', 'name': {'family': 'Hairer', 'given': 'Martin'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2018
DOI: 10.4208/jcm.1607-m2016-0539
We consider stochastic semi-linear evolution equations which are driven by additive, spatially correlated, Wiener noise, and in particular consider problems of heat equation (analytic semigroup) and damped-driven wave equations (bounded semigroup) type. We discretize these equations by means of a spectral Galerkin projection, and we study the approximation of the probability distribution of the trajectories: test functions are regular, but depend on the values of the process on the interval [0, T].
We introduce a new approach in the context of quantitative weak error analysis for discretization of SPDEs. The weak error is formulated using a deterministic function (Itô map) of the stochastic convolution found when the nonlinear term is dropped. The regularity properties of the Itô map are exploited, and in particular second-order Taylor expansions employed, to transfer the error from spectral approximation of the stochastic convolution into the weak error of interest.
We prove that the weak rate of convergence is twice the strong rate of convergence in two situations. First, we assume that the covariance operator commutes with the generator of the semigroup: the first order term in the weak error expansion cancels out thanks to an independence property. Second, we remove the commuting assumption, and extend the previous result, thanks to the analysis of a new error term depending on a commutator.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/dan9d-jtb93Uncertainty Quantification in Graph-Based Classification of High Dimensional Data
https://resolver.caltech.edu/CaltechAUTHORS:20170712-141757416
Authors: {'items': [{'id': 'Bertozzi-A-L', 'name': {'family': 'Bertozzi', 'given': 'Andrea L.'}}, {'id': 'Luo-Xiyang', 'name': {'family': 'Luo', 'given': 'Xiyang'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Zygalakis-K-C', 'name': {'family': 'Zygalakis', 'given': 'Konstantinos C.'}}]}
Year: 2018
DOI: 10.1137/17M1134214
Classification of high dimensional data finds wide-ranging applications. In many of these applications equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based on the graph formulation of semisupervised learning. We provide a unified framework which brings together a variety of methods that have been introduced in different communities within the mathematical sciences. We study probit classification [C. K. Williams and C. E. Rasmussen, "Gaussian Processes for Regression," in Advances in Neural Information Processing Systems 8, MIT Press, 1996, pp. 514--520] in the graph-based setting, generalize the level-set method for Bayesian inverse problems [M. A. Iglesias, Y. Lu, and A. M. Stuart, Interfaces Free Bound., 18 (2016), pp. 181--217] to the classification setting, and generalize the Ginzburg--Landau optimization-based classifier [A. L. Bertozzi and A. Flenner, Multiscale Model. Simul., 10 (2012), pp. 1090--1118], [Y. Van Gennip and A. L. Bertozzi, Adv. Differential Equations, 17 (2012), pp. 1115--1180] to a Bayesian setting. We also show that the probit and level-set approaches are natural relaxations of the harmonic function approach introduced in [X. Zhu et al., "Semi-supervised Learning Using Gaussian Fields and Harmonic Functions," in ICML, Vol. 3, 2003, pp. 912--919]. We introduce efficient numerical methods, suited to large datasets, for both MCMC-based sampling and gradient-based MAP estimation. 
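The graph construction these classifiers share can be sketched briefly: build a weighted similarity graph on the feature vectors, form its Laplacian, and recover the harmonic-function label extension (Zhu et al., cited above) by solving a linear system on the unlabelled nodes; the probit and level-set models relax this construction. The Gaussian weights and toy two-cluster data are assumptions of this sketch.

```python
import numpy as np

# Hedged sketch of graph-based semisupervised labelling: Gaussian edge
# weights, graph Laplacian L, and the harmonic extension that solves
#     L_uu f_u = -L_ul f_l
# for the latent function on the unlabelled nodes.

def harmonic_labels(X, labelled, y_l, eps=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / eps)                     # Gaussian edge weights
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W            # graph Laplacian
    unlab = [i for i in range(len(X)) if i not in labelled]
    f_u = np.linalg.solve(L[np.ix_(unlab, unlab)],
                          -L[np.ix_(unlab, labelled)] @ y_l)
    f = np.empty(len(X))
    f[labelled], f[unlab] = y_l, f_u
    return np.sign(f)                         # thresholded classification

# two well-separated clusters with one revealed label in each
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)),
               rng.normal(3.0, 0.1, (10, 2))])
labels = harmonic_labels(X, labelled=[0, 10], y_l=np.array([-1.0, 1.0]))
print(labels)
```

In the Bayesian variants, the latent function f carries a Gaussian prior built from L, and posterior sampling of f (rather than the single harmonic solve here) is what supplies the uncertainty quantification.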
Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semisupervised learning algorithms.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/748ct-g3z39Parameterizations for ensemble Kalman inversion
https://resolver.caltech.edu/CaltechAUTHORS:20180413-092058450
Authors: {'items': [{'id': 'Chada-N-K', 'name': {'family': 'Chada', 'given': 'Neil K.'}, 'orcid': '0000-0002-2180-0985'}, {'id': 'Iglesias-M-A', 'name': {'family': 'Iglesias', 'given': 'Marco A.'}, 'orcid': '0000-0002-8952-717X'}, {'id': 'Roininen-L', 'name': {'family': 'Roininen', 'given': 'Lassi'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2018
DOI: 10.1088/1361-6420/aab6d9
The use of ensemble methods to solve inverse problems is attractive because it is a derivative-free methodology which is also well-adapted to parallelization. In its basic iterative form the method produces an ensemble of solutions which lie in the linear span of the initial ensemble. Choice of the parameterization of the unknown field is thus a key component of the success of the method. We demonstrate how both geometric ideas and hierarchical ideas can be used to design effective parameterizations for a number of applied inverse problems arising in electrical impedance tomography, groundwater flow and source inversion. In particular we show how geometric ideas, including the level set method, can be used to reconstruct piecewise continuous fields, and we show how hierarchical methods can be used to learn key parameters in continuous fields, such as length-scales, resulting in improved reconstructions. Geometric and hierarchical ideas are combined in the level set method to find piecewise constant reconstructions with interfaces of unknown topology.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/sezpk-a3w10Non-stationary phase of the MALA algorithm
https://resolver.caltech.edu/CaltechAUTHORS:20161220-175620681
Authors: {'items': [{'id': 'Kuntz-Juan', 'name': {'family': 'Kuntz', 'given': 'Juan'}}, {'id': 'Ottobre-Michaela', 'name': {'family': 'Ottobre', 'given': 'Michela'}, 'orcid': '0000-0002-8725-4278'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2018
DOI: 10.1007/s40072-018-0113-1
PMCID: PMC6411168
The Metropolis-Adjusted Langevin Algorithm (MALA) is a Markov Chain Monte Carlo method which creates a Markov chain reversible with respect to a given target distribution, π^N, with Lebesgue density on R^N; it can hence be used to approximately sample the target distribution. When the dimension N is large a key question is to determine the computational cost of the algorithm as a function of N. The measure of efficiency that we consider in this paper is the expected squared jumping distance (ESJD), introduced in Roberts et al. (Ann Appl Probab 7(1):110–120, 1997). To determine how the cost of the algorithm (in terms of ESJD) increases with dimension N, we adopt the widely used approach of deriving a diffusion limit for the Markov chain produced by the MALA algorithm. We study this problem for a class of target measures which is not in product form and we address the situation of practical relevance in which the algorithm is started out of stationarity. We thereby significantly extend previous works which consider either measures of product form, when the Markov chain is started out of stationarity, or non-product measures (defined via a density with respect to a Gaussian), when the Markov chain is started in stationarity. In order to work in this non-stationary and non-product setting, significant new analysis is required. In particular, our diffusion limit comprises a stochastic PDE coupled to a scalar ordinary differential equation which gives a measure of how far from stationarity the process is. The family of non-product target measures that we consider in this paper are found from discretization of a measure on an infinite dimensional Hilbert space; the discretised measure is defined by its density with respect to a Gaussian random field. 
The results of this paper demonstrate that, in the non-stationary regime, the cost of the algorithm is of O(N^(1/2)) in contrast to the stationary regime, where it is of O(N^(1/3)).https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/why6t-8q124How Deep Are Deep Gaussian Processes?
https://resolver.caltech.edu/CaltechAUTHORS:20181108-140320751
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Girolami-M-A', 'name': {'family': 'Girolami', 'given': 'Mark A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Teckentrup-A-L', 'name': {'family': 'Teckentrup', 'given': 'Aretha L.'}}]}
Year: 2018
DOI: 10.48550/arXiv.1711.11280
Recent research has shown the potential utility of deep Gaussian processes. These deep structures are probability distributions, designed through hierarchical construction, which are conditionally Gaussian. In this paper, the current published body of work is placed in a common framework and, through recursion, several classes of deep Gaussian processes are defined. The resulting samples generated from a deep Gaussian process have a Markovian structure with respect to the depth parameter, and the effective depth of the resulting process is interpreted in terms of the ergodicity, or non-ergodicity, of the resulting Markov chain. For the classes of deep Gaussian processes introduced, we provide results concerning their ergodicity and hence their effective depth. We also demonstrate how these processes may be used for inference; in particular we show how a Metropolis-within-Gibbs construction across the levels of the hierarchy can be used to derive sampling tools which are robust to the level of resolution used to represent the functions on a computer. For illustration, we consider the effect of ergodicity in some simple numerical examples.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/g83fp-50p88Mechanistic machine learning: how data assimilation leverages physiologic knowledge using Bayesian inference to forecast the future, infer the present, and phenotype
https://resolver.caltech.edu/CaltechAUTHORS:20181023-111929468
Authors: {'items': [{'id': 'Albers-D-J', 'name': {'family': 'Albers', 'given': 'David J.'}}, {'id': 'Levine-M-E', 'name': {'family': 'Levine', 'given': 'Matthew E.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Mamykina-L', 'name': {'family': 'Mamykina', 'given': 'Lena'}}, {'id': 'Gluckman-B', 'name': {'family': 'Gluckman', 'given': 'Bruce'}}, {'id': 'Hripcsak-G', 'name': {'family': 'Hripcsak', 'given': 'George'}}]}
Year: 2018
DOI: 10.1093/jamia/ocy106
PMCID: PMC6188514
We introduce data assimilation as a computational method that uses machine learning to combine data with human knowledge in the form of mechanistic models in order to forecast future states, to impute missing data from the past by smoothing, and to infer measurable and unmeasurable quantities that represent clinically and scientifically important phenotypes. We demonstrate the advantages it affords in the context of type 2 diabetes by showing how data assimilation can be used to forecast future glucose values, to impute previously missing glucose values, and to infer type 2 diabetes phenotypes. At the heart of data assimilation is the mechanistic model, here an endocrine model. Such models can vary in complexity, contain testable hypotheses about important mechanics that govern the system (eg, nutrition's effect on glucose), and, as such, constrain the model space, allowing for accurate estimation using very little data.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/1cn4k-3kh87Uncertainty quantification for semi-supervised multi-class classification in image processing and ego-motion analysis of body-worn videos
https://resolver.caltech.edu/CaltechAUTHORS:20190723-085611528
Authors: {'items': [{'id': 'Qiao-Yiling', 'name': {'family': 'Qiao', 'given': 'Yiling'}}, {'id': 'Shi-Chang', 'name': {'family': 'Shi', 'given': 'Chang'}}, {'id': 'Wang-Chenjian', 'name': {'family': 'Wang', 'given': 'Chenjian'}}, {'id': 'Li-Hao', 'name': {'family': 'Li', 'given': 'Hao'}}, {'id': 'Haberland-M', 'name': {'family': 'Haberland', 'given': 'Matt'}}, {'id': 'Luo-Xiyang', 'name': {'family': 'Luo', 'given': 'Xiyang'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Bertozzi-A-L', 'name': {'family': 'Bertozzi', 'given': 'Andrea L.'}}]}
Year: 2019
DOI: 10.2352/ISSN.2470-1173.2019.11.IPAS-264
Semi-supervised learning uses underlying relationships in data with a scarcity of ground-truth labels. In this paper, we introduce an uncertainty quantification (UQ) method for graph-based semi-supervised multi-class classification problems. We not only predict the class label for each data point, but also provide a confidence score for the prediction. We adopt a Bayesian approach and propose a graphical multi-class probit model together with an effective Gibbs sampling procedure. Furthermore, we propose a confidence measure for each data point that correlates with the classification performance. We use the empirical properties of the proposed confidence measure to guide the design of a human-in-the-loop system. The uncertainty quantification algorithm and the human-in-the-loop system are successfully applied to classification problems in image processing and ego-motion analysis of body-worn videos.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jk6xa-btv35Reconciling Bayesian and Total Variation Methods for Binary Inversion
https://resolver.caltech.edu/CaltechAUTHORS:20190404-111026312
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Elliott-C-M', 'name': {'family': 'Elliott', 'given': 'Charles M.'}}, {'id': 'Hoang-Viet-Ha', 'name': {'family': 'Hoang', 'given': 'Viet Ha'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.48550/arXiv.1706.01960
A central theme in classical algorithms for the reconstruction of discontinuous functions from observational data is perimeter regularization. On the other hand, sparse or noisy data often demands a probabilistic approach to the reconstruction of images, to enable uncertainty quantification; the Bayesian approach to inversion is a natural framework in which to carry this out. The link between Bayesian inversion methods and perimeter regularization, however, is not fully understood. In this paper two links are studied: (i) the MAP objective function of a suitably chosen phase-field Bayesian approach is shown to be closely related to a least squares plus perimeter regularization objective; (ii) sample paths of a suitably chosen Bayesian level set formulation are shown to possess finite perimeter and to have the ability to learn about the true perimeter. Furthermore, the level set approach is shown to lead to faster algorithms for uncertainty quantification than the phase field approach.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/0yvp3-rgm11Dimension-Robust MCMC in Bayesian Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20190404-111029769
Authors: {'items': [{'id': 'Chen-Victor', 'name': {'family': 'Chen', 'given': 'Victor'}}, {'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Papaspiliopoulos-O', 'name': {'family': 'Papaspiliopoulos', 'given': 'Omiros'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.48550/arXiv.1803.03344
The methodology developed in this article is motivated by a wide range of prediction and uncertainty quantification problems that arise in Statistics, Machine Learning and Applied Mathematics, such as non-parametric regression, multi-class classification and inversion of partial differential equations. One popular formulation of such problems is as Bayesian inverse problems, where a prior distribution is used to regularize inference on a high-dimensional latent state, typically a function or a field. It is common that such priors are non-Gaussian, for example piecewise-constant or heavy-tailed, and/or hierarchical, in the sense of involving a further set of low-dimensional parameters, which, for example, control the scale or smoothness of the latent state. In this formulation prediction and uncertainty quantification relies on efficient exploration of the posterior distribution of latent states and parameters. This article introduces a framework for efficient MCMC sampling in Bayesian inverse problems that capitalizes upon two fundamental ideas in MCMC, non-centred parameterisations of hierarchical models and dimension-robust samplers for latent Gaussian processes. Using a range of diverse applications we showcase that the proposed framework is dimension-robust, that is, the efficiency of the MCMC sampling does not deteriorate as the dimension of the latent state gets higher. We showcase the full potential of the machinery we develop in the article in semi-supervised multi-class classification, where our sampling algorithm is used within an active learning framework to guide the selection of input data to manually label in order to achieve high predictive accuracy with a minimal number of labelled data.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/7ezmd-00k93Data Assimilation and Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20190404-111038658
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Taeb-A', 'name': {'family': 'Taeb', 'given': 'Armeen'}, 'orcid': '0000-0002-5647-3160'}]}
Year: 2019
DOI: 10.48550/arXiv.1810.06191
These notes are designed with the aim of providing a clear and concise introduction to the subjects of Inverse Problems and Data Assimilation, and their inter-relations, together with citations to some relevant literature in this area. The first half of the notes is dedicated to studying the Bayesian framework for inverse problems. Techniques such as importance sampling and Markov Chain Monte Carlo (MCMC) methods are introduced; these methods have the desirable property that in the limit of an infinite number of samples they reproduce the full posterior distribution. Since it is often computationally intensive to implement these methods, especially in high dimensional problems, approximate techniques such as approximating the posterior by a Dirac or a Gaussian distribution are discussed. The second half of the notes covers data assimilation. This refers to a particular class of inverse problems in which the unknown parameter is the initial condition of a dynamical system, and in the stochastic dynamics case the subsequent states of the system, and the data comprises partial and noisy observations of that (possibly stochastic) dynamical system. We will also demonstrate that methods developed in data assimilation may be employed to study generic inverse problems, by introducing an artificial time to generate a sequence of probability measures interpolating from the prior to the posterior.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/38ac2-za747Hyperparameter Estimation in Bayesian MAP Estimation: Parameterizations and Consistency
https://resolver.caltech.edu/CaltechAUTHORS:20190722-134133717
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Helin-T', 'name': {'family': 'Helin', 'given': 'Tapio'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.48550/arXiv.1905.04365
The Bayesian formulation of inverse problems is attractive for three primary reasons: it provides a clear modelling framework; it provides means for uncertainty quantification; and it allows for principled learning of hyperparameters. The posterior distribution may be explored by sampling methods, but for many problems it is computationally infeasible to do so. In this situation maximum a posteriori (MAP) estimators are often sought. Whilst these are relatively cheap to compute, and have an attractive variational formulation, a key drawback is their lack of invariance under change of parameterization. This is a particularly significant issue when hierarchical priors are employed to learn hyperparameters. In this paper we study the effect of the choice of parameterization on MAP estimators when a conditionally Gaussian hierarchical prior distribution is employed. Specifically we consider the centred parameterization, the natural parameterization in which the unknown state is solved for directly, and the noncentred parameterization, which works with a whitened Gaussian as the unknown state variable, and arises when considering dimension-robust MCMC algorithms; MAP estimation is well-defined in the nonparametric setting only for the noncentred parameterization. However, we show that MAP estimates based on the noncentred parameterization are not consistent as estimators of hyperparameters; conversely, we show that limits of finite-dimensional centred MAP estimators are consistent as the dimension tends to infinity. We also consider empirical Bayesian hyperparameter estimation, show consistency of these estimates, and demonstrate that they are more robust with respect to noise than centred MAP estimates. 
An underpinning concept throughout is that hyperparameters may only be recovered up to measure equivalence, a well-known phenomenon in the context of the Ornstein-Uhlenbeck process.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/nt212-sbf55Analysis Of Momentum Methods
https://resolver.caltech.edu/CaltechAUTHORS:20190722-102107649
Authors: {'items': [{'id': 'Kovachki-N-B', 'name': {'family': 'Kovachki', 'given': 'Nikola B.'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.48550/arXiv.1906.04285
Gradient descent-based optimization methods underpin the parameter training which yields the impressive performance now found when testing neural networks. Introducing stochasticity is key to their success in practical problems, and there is some understanding of the role of stochastic gradient descent in this context. Momentum modifications of gradient descent such as Polyak's Heavy Ball method (HB) and Nesterov's method of accelerated gradients (NAG), are widely adopted. In this work, our focus is on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm; to expose the ideas simply we work in the deterministic setting. We show that, contrary to popular belief, standard implementations of fixed momentum methods do no more than act to rescale the learning rate. We achieve this by showing that the momentum method converges to a gradient flow, with a momentum-dependent time-rescaling, using the method of modified equations from numerical analysis. Further we show that the momentum method admits an exponentially attractive invariant manifold on which the dynamics reduce to a gradient flow with respect to a modified loss function, equal to the original one plus a small perturbation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/7askx-9w059Parameter estimation for macroscopic pedestrian dynamics models from microscopic data
https://resolver.caltech.edu/CaltechAUTHORS:20190719-112058516
Authors: {'items': [{'id': 'Gomes-S-N', 'name': {'family': 'Gomes', 'given': 'Susana N.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Wolfram-M-T', 'name': {'family': 'Wolfram', 'given': 'Marie-Therese'}}]}
Year: 2019
DOI: 10.1137/18M1215980
In this paper we develop a framework for parameter estimation in macroscopic pedestrian models using individual trajectories---microscopic data. We consider a unidirectional flow of pedestrians in a corridor and assume that the velocity decreases with the average density according to the fundamental diagram. Our model is formed from a coupling between a density dependent stochastic differential equation and a nonlinear partial differential equation for the density, and is hence of McKean--Vlasov type. We discuss identifiability of the parameters appearing in the fundamental diagram from trajectories of individuals, and we introduce optimization and Bayesian methods to perform the identification. We analyze the performance of the developed methodologies in various situations, such as for different in- and outflow conditions, for varying numbers of individual trajectories, and for differing channel geometries.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/j6jx4-28457Ergodicity and Accuracy of Optimal Particle Filters for Bayesian Data Assimilation
https://resolver.caltech.edu/CaltechAUTHORS:20161221-161911353
Authors: {'items': [{'id': 'Kelly-David', 'name': {'family': 'Kelly', 'given': 'David'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.1007/s11401-019-0161-5
Data assimilation refers to the methodology of combining dynamical models and observed data with the objective of improving state estimation. Most data assimilation algorithms are viewed as approximations of the Bayesian posterior (filtering distribution) on the signal given the observations. Some of these approximations are controlled, such as particle filters which may be refined to produce the true filtering distribution in the large particle number limit, and some are uncontrolled, such as ensemble Kalman filter methods which do not recover the true filtering distribution in the large ensemble limit. Other data assimilation algorithms, such as cycled 3DVAR methods, may be thought of as controlled estimators of the state, in the small observational noise scenario, but are also uncontrolled in general in relation to the true filtering distribution. For particle filters and ensemble Kalman filters it is of practical importance to understand how and why data assimilation methods can be effective when used with a fixed small number of particles, since for many large-scale applications it is not practical to deploy algorithms close to the large particle limit asymptotic. In this paper, the authors address this question for particle filters and, in particular, study their accuracy (in the small noise limit) and ergodicity (for noisy signal and observation) without appealing to the large particle number limit. The authors first overview the accuracy and minorization properties for the true filtering distribution, working in the setting of conditional Gaussianity for the dynamics-observation model. They then show that these properties are inherited by optimal particle filters for any fixed number of particles, and use the minorization to establish ergodicity of the filters. For completeness they also prove large particle number consistency results for the optimal particle filters, by writing the update equations for the underlying distributions as recursions. 
In addition to looking at the optimal particle filter with standard resampling, they derive all the above results for (what they term) the Gaussianized optimal particle filter and show that the theoretical properties are favorable for this method, when compared to the standard optimal particle filter.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/9jjf9-7mf09Ensemble Kalman Methods With Constraints
https://resolver.caltech.edu/CaltechAUTHORS:20190722-155445728
Authors: {'items': [{'id': 'Albers-D-J', 'name': {'family': 'Albers', 'given': 'David J.'}}, {'id': 'Blancquart-P-A', 'name': {'family': 'Blancquart', 'given': 'Paul-Adrien'}}, {'id': 'Levine-M-E', 'name': {'family': 'Levine', 'given': 'Matthew E.'}}, {'id': 'Seylabi-E-E', 'name': {'family': 'Seylabi', 'given': 'Elnaz Esmaeilzadeh'}, 'orcid': '0000-0003-0718-372X'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2019
DOI: 10.1088/1361-6420/ab1c09
PMCID: PMC7677878
Ensemble Kalman methods constitute an increasingly important tool in both state and parameter estimation problems. Their popularity stems from the derivative-free nature of the methodology which may be readily applied when computer code is available for the underlying state-space dynamics (for state estimation) or for the parameter-to-observable map (for parameter estimation). There are many applications in which it is desirable to enforce prior information in the form of equality or inequality constraints on the state or parameter. This paper establishes a general framework for doing so, describing a widely applicable methodology, a theory which justifies the methodology, and a set of numerical experiments exemplifying it.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/2y921-jsd59Ensemble Kalman Inversion: A Derivative-Free Technique For Machine Learning Tasks
https://resolver.caltech.edu/CaltechAUTHORS:20190404-111033209
Authors: {'items': [{'id': 'Kovachki-N-B', 'name': {'family': 'Kovachki', 'given': 'Nikola B.'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.1088/1361-6420/ab1c3a
The standard probabilistic perspective on machine learning gives rise to empirical risk-minimization tasks that are frequently solved by stochastic gradient descent (SGD) and variants thereof. We present a formulation of these tasks as classical inverse or filtering problems and, furthermore, we propose an efficient, gradient-free algorithm for finding a solution to these problems using ensemble Kalman inversion (EKI). The method is inherently parallelizable and is applicable to problems with non-differentiable loss functions, for which back-propagation is not possible. Applications of our approach include offline and online supervised learning with deep neural networks, as well as graph-based semi-supervised learning. The essence of the EKI procedure is an ensemble based approximate gradient descent in which derivatives are replaced by differences from within the ensemble. We suggest several modifications to the basic method, derived from empirically successful heuristics developed in the context of SGD. Numerical results demonstrate wide applicability and robustness of the proposed algorithm.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jezv0-rzd57Diffusion Limit For The Random Walk Metropolis Algorithm Out Of Stationarity
https://resolver.caltech.edu/CaltechAUTHORS:20161221-115035181
Authors: {'items': [{'id': 'Kuntz-J', 'name': {'family': 'Kuntz', 'given': 'Juan'}}, {'id': 'Ottobre-M', 'name': {'family': 'Ottobre', 'given': 'Michela'}, 'orcid': '0000-0002-8725-4278'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2019
DOI: 10.1214/18-AIHP929
The Random Walk Metropolis (RWM) algorithm is a Metropolis–Hastings Markov Chain Monte Carlo algorithm designed to sample from a given target distribution π^N with Lebesgue density on R^N. Like any other Metropolis–Hastings algorithm, RWM constructs a Markov chain by randomly proposing a new position (the "proposal move"), which is then accepted or rejected according to a rule which makes the chain reversible with respect to π^N. When the dimension N is large, a key question is to determine the optimal scaling with N of the proposal variance: if the proposal variance is too large, the algorithm will reject the proposed moves too often; if it is too small, the algorithm will explore the state space too slowly. Determining the optimal scaling of the proposal variance gives a measure of the cost of the algorithm as well. One approach to tackle this issue, which we adopt here, is to derive diffusion limits for the algorithm. Such an approach has been proposed in the seminal papers (Ann. Appl. Probab. 7 (1) (1997) 110–120; J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 (1) (1998) 255–268). In particular, in (Ann. Appl. Probab. 7 (1) (1997) 110–120) the authors derive a diffusion limit for the RWM algorithm under the two following assumptions: (i) the algorithm is started in stationarity; (ii) the target measure π^N is in product form. The present paper considers the situation of practical interest in which both assumptions (i) and (ii) are removed. That is (a) we study the case (which occurs in practice) in which the algorithm is started out of stationarity and (b) we consider target measures which are in non-product form. Roughly speaking, we consider target measures that admit a density with respect to Gaussian; such measures arise in Bayesian nonparametric statistics and in the study of conditioned diffusions. We prove that, out of stationarity, the optimal scaling for the proposal variance is O(N^(−1)), as it is in stationarity. 
In this optimal scaling, a diffusion limit is obtained and the cost of reaching and exploring the invariant measure scales as O(N). Notice that the optimal scaling in and out of stationarity need not be the same in general, and indeed they differ e.g. in the case of the MALA algorithm (Stoch. Partial Differ. Equ. Anal Comput. 6 (3) (2018) 446–499). More importantly, our diffusion limit is given by a stochastic PDE, coupled to a scalar ordinary differential equation; such an ODE gives a measure of how far from stationarity the process is and can therefore be taken as an indicator of convergence. In this sense, this paper contributes understanding to the long-standing problem of monitoring convergence of MCMC algorithms.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/z0p06-evb79Strong convergence rates of probabilistic integrators for ordinary differential equations
https://resolver.caltech.edu/CaltechAUTHORS:20170612-123841285
Authors: {'items': [{'id': 'Lie-Han-Cheng', 'name': {'family': 'Lie', 'given': 'Han Cheng'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}, {'id': 'Sullivan-T-J', 'name': {'family': 'Sullivan', 'given': 'T. J.'}}]}
Year: 2019
DOI: 10.1007/s11222-019-09898-6
Probabilistic integration of a continuous dynamical system is a way of systematically introducing discretisation error, at scales no larger than errors introduced by standard numerical discretisation, in order to enable thorough exploration of possible responses of the system to inputs. It is thus a potentially useful approach in a number of applications such as forward uncertainty quantification, inverse problems, and data assimilation. We extend the convergence analysis of probabilistic integrators for deterministic ordinary differential equations, as proposed by Conrad et al. (Stat Comput 27(4):1065–1082, 2017. https://doi.org/10.1007/s11222-016-9671-0), to establish mean-square convergence in the uniform norm on discrete- or continuous-time solutions under relaxed regularity assumptions on the driving vector fields and their induced flows. Specifically, we show that randomised high-order integrators for globally Lipschitz flows and randomised Euler integrators for dissipative vector fields with polynomially bounded local Lipschitz constants all have the same mean-square convergence rate as their deterministic counterparts, provided that the variance of the integration noise is not of higher order than the corresponding deterministic integrator. These and similar results are proven for probabilistic integrators where the random perturbations may be state-dependent, non-Gaussian, or non-centred random variables.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/q0xxt-e3j79Interacting Langevin Diffusions: Gradient Structure and Ensemble Kalman Sampler
https://resolver.caltech.edu/CaltechAUTHORS:20190722-103410192
Authors: {'items': [{'id': 'Garbuno-Inigo-A', 'name': {'family': 'Garbuno-Inigo', 'given': 'Alfredo'}, 'orcid': '0000-0003-3279-619X'}, {'id': 'Hoffmann-Franca', 'name': {'family': 'Hoffmann', 'given': 'Franca'}, 'orcid': '0000-0002-1182-5521'}, {'id': 'Li-Wuchen', 'name': {'family': 'Li', 'given': 'Wuchen'}, 'orcid': '0000-0002-2218-5734'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2020
DOI: 10.1137/19M1251655
Solving inverse problems without the use of derivatives or adjoints of the forward model is highly desirable in many applications arising in science and engineering. In this paper we propose a new version of such a methodology, a framework for its analysis, and numerical evidence of the practicality of the method proposed. Our starting point is an ensemble of overdamped Langevin diffusions which interact through a single preconditioner computed as the empirical ensemble covariance. We demonstrate that the nonlinear Fokker--Planck equation arising from the mean-field limit of the associated stochastic differential equation (SDE) has a novel gradient flow structure, built on the Wasserstein metric and the covariance matrix of the noisy flow. Using this structure, we investigate large time properties of the Fokker--Planck equation, showing that its invariant measure coincides with that of a single Langevin diffusion, and demonstrating exponential convergence to the invariant measure in a number of settings. We introduce a new noisy variant on ensemble Kalman inversion (EKI) algorithms found from the original SDE by replacing exact gradients with ensemble differences; this defines the ensemble Kalman sampler (EKS). Numerical results are presented which demonstrate its efficacy as a derivative-free approximate sampler for the Bayesian posterior arising from inverse problems.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/3rdrt-vry57Inverse optimal transport
https://resolver.caltech.edu/CaltechAUTHORS:20190722-082837777
Authors: {'items': [{'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Wolfram-M-T', 'name': {'family': 'Wolfram', 'given': 'Marie-Therese'}, 'orcid': '0000-0003-1133-8253'}]}
Year: 2020
DOI: 10.1137/19M1261122
Discrete optimal transportation problems arise in various contexts in engineering, the sciences, and the social sciences. Often the underlying cost criterion is unknown, or only partly known, and the observed optimal solutions are corrupted by noise. In this paper we propose a systematic approach to infer unknown costs from noisy observations of optimal transportation plans. The algorithm requires only the ability to solve the forward optimal transport problem, which is a linear program, and to generate random numbers. It has a Bayesian interpretation and may also be viewed as a form of stochastic optimization. We illustrate the developed methodologies using the example of international migration flows. Reported migration flow data captures (noisily) the number of individuals moving from one country to another in a given period of time. It can be interpreted as a noisy observation of an optimal transportation map, with costs related to the geographical position of countries. We use a graph-based formulation of the problem, with countries at the nodes of graphs and nonzero weighted adjacencies only on edges between countries which share a border. We use the proposed algorithm to estimate the weights, which represent cost of transition, and to quantify uncertainty in these weights.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/msmac-29011Neural Operator: Graph Kernel Network for Partial Differential Equations
https://resolver.caltech.edu/CaltechAUTHORS:20200402-133318521
Authors: {'items': [{'id': 'Li-Zongyi', 'name': {'family': 'Li', 'given': 'Zongyi'}, 'orcid': '0000-0003-2081-9665'}, {'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Azizzadenesheli-Kamyar', 'name': {'family': 'Azizzadenesheli', 'given': 'Kamyar'}, 'orcid': '0000-0001-8507-1868'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Anandkumar-A', 'name': {'family': 'Anandkumar', 'given': 'Anima'}, 'orcid': '0000-0002-6974-6797'}]}
Year: 2020
DOI: 10.48550/arXiv.2003.03485
The classical development of neural networks has been primarily for mappings between a finite-dimensional Euclidean space and a set of classes, or between two finite-dimensional Euclidean spaces. The purpose of this work is to generalize neural networks so that they can learn mappings between infinite-dimensional spaces (operators). The key innovation in our work is that a single set of network parameters, within a carefully designed network architecture, may be used to describe mappings between infinite-dimensional spaces and between different finite-dimensional approximations of those spaces. We formulate approximation of the infinite-dimensional mapping by composing nonlinear activation functions and a class of integral operators. The kernel integration is computed by message passing on graph networks. This approach has substantial practical consequences which we will illustrate in the context of mappings between input data to partial differential equations (PDEs) and their solutions. In this context, such learned networks can generalize among different approximation methods for the PDE (such as finite difference or finite element methods) and among approximations corresponding to different underlying levels of resolution and discretization. Experiments confirm that the proposed graph kernel network does have the desired properties and show competitive performance compared to the state of the art solvers.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/k3t18-we744Tikhonov Regularization Within Ensemble Kalman Inversion
https://resolver.caltech.edu/CaltechAUTHORS:20190719-130631059
Authors: {'items': [{'id': 'Chada-N-K', 'name': {'family': 'Chada', 'given': 'Neil K.'}, 'orcid': '0000-0002-2180-0985'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Tong-Xin-T', 'name': {'family': 'Tong', 'given': 'Xin T.'}}]}
Year: 2020
DOI: 10.1137/19M1242331
Ensemble Kalman inversion is a parallelizable methodology for solving inverse or parameter estimation problems. Although it is based on ideas from Kalman filtering, it may be viewed as a derivative-free optimization method. In its most basic form it regularizes ill-posed inverse problems through the subspace property: the solution found is in the linear span of the initial ensemble employed. In this work we demonstrate how further regularization can be imposed, incorporating prior information about the underlying unknown. In particular we study how to impose Tikhonov-like Sobolev penalties. As well as introducing this modified ensemble Kalman inversion methodology, we also study its continuous-time limit, proving ensemble collapse; in the language of multi-agent optimization this may be viewed as reaching consensus. We also conduct a suite of numerical experiments to highlight the benefits of Tikhonov regularization in the ensemble inversion context.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/5mdy0-7xc75Diffusive optical tomography in the Bayesian framework
https://resolver.caltech.edu/CaltechAUTHORS:20190722-155900728
Authors: {'items': [{'id': 'Newton-Kit', 'name': {'family': 'Newton', 'given': 'Kit'}}, {'id': 'Li-Qin', 'name': {'family': 'Li', 'given': 'Qin'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}]}
Year: 2020
DOI: 10.1137/19M1247346
Many naturally occurring models in the sciences are well approximated by simplified models using multiscale techniques. In such settings it is natural to ask about the relationship between inverse problems defined by the original problem and by the multiscale approximation. We develop an approach to this problem and exemplify it in the context of optical tomographic imaging. Optical tomographic imaging is a technique for inferring the properties of biological tissue via measurements of the incoming and outgoing light intensity; it may be used as a medical imaging methodology. Mathematically, light propagation is modeled by the radiative transfer equation (RTE), and optical tomography amounts to reconstructing the scattering and the absorption coefficients in the RTE from boundary measurements. We study this problem in the Bayesian framework, focussing on the strong scattering regime. In this regime the forward RTE is close to the diffusion equation (DE). We study the RTE in the asymptotic regime where the forward problem approaches the DE and prove convergence of the inverse RTE to the inverse DE in both nonlinear and linear settings. Convergence is proved by studying the distance between the two posterior distributions using the Hellinger metric and using the Kullback--Leibler divergence.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/emf0z-1h684Model Reduction and Neural Networks for Parametric PDEs
https://resolver.caltech.edu/CaltechAUTHORS:20200527-074228185
Authors: {'items': [{'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Hosseini-Bamdad', 'name': {'family': 'Hosseini', 'given': 'Bamdad'}}, {'id': 'Kovachki-N-B', 'name': {'family': 'Kovachki', 'given': 'Nikola B.'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2020
DOI: 10.48550/arXiv.2005.03180
We develop a general framework for data-driven approximation of input-output maps between infinite-dimensional spaces. The proposed approach is motivated by the recent successes of neural networks and deep learning, in combination with ideas from model reduction. This combination results in a neural network approximation which, in principle, is defined on infinite-dimensional spaces and, in practice, is robust to the dimension of finite-dimensional approximations of these spaces required for computation. For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology. Numerically we demonstrate the effectiveness of the method on a class of parametric elliptic PDE problems, showing convergence and robustness of the approximation scheme with respect to the size of the discretization, and compare our method with existing algorithms from the literature.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/0e45m-qwh51Site Characterization at Downhole Arrays by Joint Inversion of Dispersion Data and Acceleration Time Series
https://resolver.caltech.edu/CaltechAUTHORS:20200506-121245893
Authors: {'items': [{'id': 'Seylabi-Elnaz-E', 'name': {'family': 'Seylabi', 'given': 'Elnaz'}, 'orcid': '0000-0003-0718-372X'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Asimaki-D', 'name': {'family': 'Asimaki', 'given': 'Domniki'}, 'orcid': '0000-0002-3008-8088'}]}
Year: 2020
DOI: 10.1785/0120190256
We present a sequential data assimilation algorithm based on the ensemble Kalman inversion to estimate the near‐surface shear‐wave velocity profile and damping; this is applicable when heterogeneous data and a priori information that can be represented in the form of (physical) equality and inequality constraints in the inverse problem are available. Although noninvasive methods, such as surface‐wave testing, are efficient and cost‐effective methods for inferring a V_S profile, one should acknowledge that site characterization using inverse analyses can yield erroneous results associated with the lack of inverse problem uniqueness. One viable solution to alleviate the unsuitability of the inverse problem is to enrich the prior knowledge and/or the data space with complementary observations. In the case of noninvasive methods, the pertinent data are the dispersion curve of surface waves, typically resolved by means of active source methods at high frequencies and passive methods at low frequencies. To improve the inverse problem suitability, horizontal‐to‐vertical spectral ratio data are commonly used jointly with the dispersion data in the inversion. In this article, we show that the joint inversion of dispersion and strong‐motion downhole array data can also reduce the margins of uncertainty in the V_S profile estimation. This is because acceleration time series recorded at downhole arrays include both body and surface waves and therefore can enrich the observational data space in the inverse problem setting. We also show how the proposed algorithm can be modified to systematically incorporate physical constraints that further enhance its suitability.
We use both synthetic and real data to examine the performance of the proposed framework in estimation of the V_S profile and damping at the Garner Valley downhole array and compare them against the V_S estimations in previous studies.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/r52yq-jk734Consistency of Semi-Supervised Learning Algorithms on Graphs: Probit and One-Hot Methods
https://resolver.caltech.edu/CaltechAUTHORS:20190722-075729960
Authors: {'items': [{'id': 'Hoffmann-Franca', 'name': {'family': 'Hoffmann', 'given': 'Franca'}, 'orcid': '0000-0002-1182-5521'}, {'id': 'Hosseini-Bamdad', 'name': {'family': 'Hosseini', 'given': 'Bamdad'}}, {'id': 'Ren-Zhi', 'name': {'family': 'Ren', 'given': 'Zhi'}, 'orcid': '0000-0001-9812-0251'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2020
DOI: 10.48550/arXiv.1906.07658
Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/nggk0-ajn12Reconciling Bayesian and Perimeter Regularization for Binary Inversion
https://resolver.caltech.edu/CaltechAUTHORS:20170612-125032088
Authors: {'items': [{'id': 'Dunbar-O-R-A', 'name': {'family': 'Dunbar', 'given': 'Oliver R. A.'}, 'orcid': '0000-0001-7374-0382'}, {'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Elliott-C-M', 'name': {'family': 'Elliott', 'given': 'Charles M.'}}, {'id': 'Hoang-Viet-Ha', 'name': {'family': 'Hoang', 'given': 'Viet Ha'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2020
DOI: 10.1137/18M1179559
A central theme in classical algorithms for the reconstruction of discontinuous functions from observational data is perimeter regularization via the use of total variation. On the other hand, sparse or noisy data often demand a probabilistic approach to the reconstruction of images, to enable uncertainty quantification; the Bayesian approach to inversion, which itself introduces a form of regularization, is a natural framework in which to carry this out. In this paper the link between Bayesian inversion methods and perimeter regularization is explored. Two links are studied: (i) the maximum a posteriori objective function of a suitably chosen Bayesian phase-field approach is shown to be closely related to a least squares plus perimeter regularization objective; (ii) sample paths of a suitably chosen Bayesian level set formulation are shown to possess a finite perimeter and to have the ability to learn about the true perimeter.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/4je7e-n1w42Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms
https://resolver.caltech.edu/CaltechAUTHORS:20190404-103712251
Authors: {'items': [{'id': 'Dunlop-M-M', 'name': {'family': 'Dunlop', 'given': 'Matthew M.'}, 'orcid': '0000-0001-7718-3755'}, {'id': 'Slepčev-D', 'name': {'family': 'Slepčev', 'given': 'Dejan'}, 'orcid': '0000-0002-7600-1144'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Thorpe-M', 'name': {'family': 'Thorpe', 'given': 'Matthew'}, 'orcid': '0000-0003-2480-5404'}]}
Year: 2020
DOI: 10.1016/j.acha.2019.03.005
Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular the extension, to this graph setting, of the probit algorithm, level set and kriging methods, are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a, possibly fractional, exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through Γ−convergence, using the recently introduced TL^p metric. The small labelling noise limits of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/rxe0z-j2258Fourier Neural Operator for Parametric Partial Differential Equations
https://resolver.caltech.edu/CaltechAUTHORS:20201106-120140981
Authors: {'items': [{'id': 'Li-Zongyi', 'name': {'family': 'Li', 'given': 'Zongyi'}, 'orcid': '0000-0003-2081-9665'}, {'id': 'Kovachki-N-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Azizzadenesheli-K', 'name': {'family': 'Azizzadenesheli', 'given': 'Kamyar'}, 'orcid': '0000-0001-8507-1868'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}}, {'id': 'Anandkumar-A', 'name': {'family': 'Anandkumar', 'given': 'Anima'}}]}
Year: 2020
DOI: 10.48550/arXiv.2010.08895
The classical development of neural networks has primarily focused on learning mappings between finite-dimensional Euclidean spaces. Recently, this has been generalized to neural operators that learn mappings between function spaces. For partial differential equations (PDEs), neural operators directly learn the mapping from any functional parametric dependence to the solution. Thus, they learn an entire family of PDEs, in contrast to classical methods which solve one instance of the equation. In this work, we formulate a new neural operator by parameterizing the integral kernel directly in Fourier space, allowing for an expressive and efficient architecture. We perform experiments on Burgers' equation, Darcy flow, and the Navier-Stokes equation (including the turbulent regime). Our Fourier neural operator shows state-of-the-art performance compared to existing neural network methodologies and is up to three orders of magnitude faster than traditional PDE solvers.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/hpbg9-9ea84Learning Stochastic Closures Using Ensemble Kalman Inversion
https://resolver.caltech.edu/CaltechAUTHORS:20201109-140955956
Authors: {'items': [{'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Wu-Jin-Long', 'name': {'family': 'Wu', 'given': 'Jin-Long'}}]}
Year: 2020
DOI: 10.48550/arXiv.2004.08376
Although the governing equations of many systems, when derived from first principles, may be viewed as known, it is often too expensive to numerically simulate all the interactions within the first principles description. Therefore researchers often seek simpler descriptions that describe complex phenomena without numerically resolving all the interacting components. Stochastic differential equations (SDEs) arise naturally as models in this context. The growth in data acquisition provides an opportunity for the systematic derivation of SDE models in many disciplines. However, inconsistencies between SDEs and real data at small time scales often cause problems, when standard statistical methodology is applied to parameter estimation. The incompatibility between SDEs and real data can be addressed by deriving sufficient statistics from the time-series data and learning parameters of SDEs based on these. Following this approach, we formulate the fitting of SDEs to sufficient statistics from real data as an inverse problem and demonstrate that this inverse problem can be solved by using ensemble Kalman inversion (EKI). Furthermore, we create a framework for non-parametric learning of drift and diffusion terms by introducing hierarchical, refineable parameterizations of unknown functions, using Gaussian process regression. We demonstrate the proposed methodology for the fitting of SDE models, first in a simulation study with a noisy Lorenz 63 model, and then in other applications, including dimension reduction starting from various deterministic chaotic systems arising in the atmospheric sciences, large-scale pattern modeling in climate dynamics, and simplified models for key observables arising in molecular dynamics. 
The results confirm that the proposed methodology provides a robust and systematic approach to fitting SDE models to real data.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/jqa78-jcw61Ensemble Kalman Inversion for Sparse Learning of Dynamical Systems from Time-Averaged Data
https://resolver.caltech.edu/CaltechAUTHORS:20201109-141011032
Authors: {'items': [{'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}, {'id': 'Wu-Jin-Long', 'name': {'family': 'Wu', 'given': 'Jin-Long'}}]}
Year: 2020
DOI: 10.48550/arXiv.2007.06175
Enforcing sparse structure within learning has led to significant advances in the field of data-driven discovery of dynamical systems. However, such methods require access not only to time-series of the state of the dynamical system, but also to the time derivative. In many applications, the data are available only in the form of time-averages such as moments and autocorrelation functions. We propose a sparse learning methodology to discover the vector fields defining a (possibly stochastic or partial) differential equation, using only time-averaged statistics. Such a formulation of sparse learning naturally leads to a nonlinear inverse problem to which we apply the methodology of ensemble Kalman inversion (EKI). EKI is chosen because it may be formulated in terms of the iterative solution of quadratic optimization problems; sparsity is then easily imposed. We then apply the EKI-based sparse learning methodology to various examples governed by stochastic differential equations (a noisy Lorenz 63 system), ordinary differential equations (Lorenz 96 system and coalescence equations), and a partial differential equation (the Kuramoto-Sivashinsky equation). The results demonstrate that time-averaged statistics can be used for data-driven discovery of differential equations using sparse EKI. The proposed sparse learning methodology extends the scope of data-driven discovery of differential equations to previously challenging applications and data-acquisition scenarios.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/1h946-00r40Drift Estimation of Multiscale Diffusions Based on Filtered Data
https://resolver.caltech.edu/CaltechAUTHORS:20201109-141017891
Authors: {'items': [{'id': 'Abdulle-Assyr', 'name': {'family': 'Abdulle', 'given': 'Assyr'}, 'orcid': '0000-0002-5687-9742'}, {'id': 'Garegnani-Giacomo', 'name': {'family': 'Garegnani', 'given': 'Giacomo'}, 'orcid': '0000-0001-7700-1157'}, {'id': 'Pavliotis-G-A', 'name': {'family': 'Pavliotis', 'given': 'Grigorios A.'}, 'orcid': '0000-0002-3468-9227'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Zanoni-Andrea', 'name': {'family': 'Zanoni', 'given': 'Andrea'}}]}
Year: 2020
DOI: 10.1007/s10208-021-09541-9
We study the problem of drift estimation for two-scale continuous time series. We set ourselves in the framework of overdamped Langevin equations, for which a single-scale surrogate homogenized equation exists. In this setting, estimating the drift coefficient of the homogenized equation requires pre-processing of the data, often in the form of subsampling; this is because the two-scale equation and the homogenized single-scale equation are incompatible at small scales, generating mutually singular measures on the path space. We avoid subsampling and work instead with filtered data, found by application of an appropriate kernel function, and compute maximum likelihood estimators based on the filtered process. We show that the estimators we propose are asymptotically unbiased and demonstrate numerically the advantages of our method with respect to subsampling. Finally, we show how our filtered data methodology can be combined with Bayesian techniques and provide a full uncertainty quantification of the inference procedure.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/06dmv-szf11A Simple Modeling Framework For Prediction In The Human Glucose-Insulin System
https://resolver.caltech.edu/CaltechAUTHORS:20201109-140952547
Authors: {'items': [{'id': 'Albers-D-J', 'name': {'family': 'Albers', 'given': 'D. J.'}}, {'id': 'Levine-M-E', 'name': {'family': 'Levine', 'given': 'M. E.'}}, {'id': 'Sirlanci-Melike', 'name': {'family': 'Sirlanci', 'given': 'M.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}}]}
Year: 2020
DOI: 10.48550/arXiv.1910.14193
In this paper, we build a new, simple, and interpretable mathematical model to describe the human glucose-insulin system. Our ultimate goal is the robust control of the blood glucose (BG) level of individuals to a desired healthy range, by means of adjusting the amount of nutrition and/or external insulin appropriately. By constructing a simple yet flexible model class, with interpretable parameters, this general model can be specialized to work in different settings, such as type 2 diabetes mellitus (T2DM) and intensive care unit (ICU); different choices of appropriate model functions describing uptake of nutrition and removal of glucose differentiate between the models. In both cases, the available data is sparse and collected in clinical settings, major factors that have constrained our model choice to the simple form adopted.
The model has the form of a linear stochastic differential equation (SDE) to describe the evolution of the BG level. The model includes a term quantifying glucose removal from the bloodstream through the regulation system of the human body, and another two terms representing the effect of nutrition and externally delivered insulin. The parameters entering the equation must be learned in a patient-specific fashion, leading to personalized models. We present numerical results on patient-specific parameter estimation and future BG level forecasting in T2DM and ICU settings. The resulting model leads to the prediction of the BG level as an expected value accompanied by a band around this value which accounts for uncertainties in the prediction. Such predictions, then, have the potential for use as part of control systems which are robust to model imperfections and noisy data. Finally, a comparison of the predictive capability of the model with two different models specifically built for T2DM and ICU contexts is also performed.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/9q611-3gh94Multipole Graph Neural Operator for Parametric Partial Differential Equations
https://resolver.caltech.edu/CaltechAUTHORS:20201106-120222366
Authors: {'items': [{'id': 'Li-Zongyi', 'name': {'family': 'Li', 'given': 'Zongyi'}, 'orcid': '0000-0003-2081-9665'}, {'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Azizzadenesheli-Kamyar', 'name': {'family': 'Azizzadenesheli', 'given': 'Kamyar'}, 'orcid': '0000-0001-8507-1868'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Anandkumar-A', 'name': {'family': 'Anandkumar', 'given': 'Anima'}, 'orcid': '0000-0002-6974-6797'}]}
Year: 2020
DOI: 10.48550/arXiv.2006.09535
One of the main challenges in using deep learning-based methods for simulating physical systems and solving partial differential equations (PDEs) is formulating physics-based data in the desired structure for neural networks. Graph neural networks (GNNs) have gained popularity in this area since graphs offer a natural way of modeling particle interactions and provide a clear way of discretizing the continuum models. However, the graphs constructed for approximating such tasks usually ignore long-range interactions due to unfavorable scaling of the computational complexity with respect to the number of nodes. The errors due to these approximations scale with the discretization of the system, thereby not allowing for generalization under mesh-refinement. Inspired by the classical multipole methods, we propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity. Our multi-level formulation is equivalent to recursively adding inducing points to the kernel matrix, unifying GNNs with multi-resolution matrix factorization of the kernel. Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/gqmwj-t9b36Continuous Time Analysis of Momentum Methods
https://resolver.caltech.edu/CaltechAUTHORS:20210503-091850360
Authors: {'items': [{'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola B.'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2021
Gradient descent-based optimization methods underpin the parameter training of neural networks, and hence comprise a significant component in the impressive test results found in a number of applications. Introducing stochasticity is key to their success in practical problems, and there is some understanding of the role of stochastic gradient descent in this context. Momentum modifications of gradient descent such as Polyak's Heavy Ball method (HB) and Nesterov's method of accelerated gradients (NAG), are also widely adopted. In this work our focus is on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm. To expose the ideas simply we work in the deterministic setting. Our approach is to derive continuous time approximations of the discrete algorithms; these continuous time approximations provide insights into the mechanisms at play within the discrete algorithms. We prove three such approximations. Firstly we show that standard implementations of fixed momentum methods approximate a time-rescaled gradient descent flow, asymptotically as the learning rate shrinks to zero; this result does not distinguish momentum methods from pure gradient descent, in the limit of vanishing learning rate. We then proceed to prove two results aimed at understanding the observed practical advantages of fixed momentum methods over gradient descent, when implemented in the non-asymptotic regime with fixed small, but non-zero, learning rate. We achieve this by proving approximations to continuous time limits in which the small but fixed learning rate appears as a parameter; this is known as the method of modified equations in the numerical analysis literature, recently rediscovered as the high resolution ODE approximation in the machine learning context. 
In our second result we show that the momentum method is approximated by a continuous time gradient flow, with an additional momentum-dependent second order time-derivative correction, proportional to the learning rate; this may be used to explain the stabilizing effect of momentum algorithms in their transient phase. Furthermore in a third result we show that the momentum methods admit an exponentially attractive invariant manifold on which the dynamics reduces, approximately, to a gradient flow with respect to a modified loss function, equal to the original loss function plus a small perturbation proportional to the learning rate; this small correction provides convexification of the loss function and encodes additional robustness present in momentum methods, beyond the transient phase.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/f6qbz-xyp80Calibrate, emulate, sample
https://resolver.caltech.edu/CaltechAUTHORS:20200402-140348174
Authors: {'items': [{'id': 'Cleary-Emmet', 'name': {'family': 'Cleary', 'given': 'Emmet'}}, {'id': 'Garbuno-Inigo-Alfredo', 'name': {'family': 'Garbuno-Inigo', 'given': 'Alfredo'}, 'orcid': '0000-0003-3279-619X'}, {'id': 'Lan-Shiwei', 'name': {'family': 'Lan', 'given': 'Shiwei'}, 'orcid': '0000-0002-9167-3715'}, {'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2021
DOI: 10.1016/j.jcp.2020.109716
Many parameter estimation problems arising in applications can be cast in the framework of Bayesian inversion. This allows not only for an estimate of the parameters, but also for the quantification of uncertainties in the estimates. Often in such problems the parameter-to-data map is very expensive to evaluate, and computing derivatives of the map, or derivative-adjoints, may not be feasible. Additionally, in many applications only noisy evaluations of the map may be available. We propose an approach to Bayesian inversion in such settings that builds on the derivative-free optimization capabilities of ensemble Kalman inversion methods. The overarching approach is to first use ensemble Kalman sampling (EKS) to calibrate the unknown parameters to fit the data; second, to use the output of the EKS to emulate the parameter-to-data map; third, to sample from an approximate Bayesian posterior distribution in which the parameter-to-data map is replaced by its emulator. This results in a principled approach to approximate Bayesian inference that requires only a small number of evaluations of the (possibly noisy approximation of the) parameter-to-data map. It does not require derivatives of this map, but instead leverages the documented power of ensemble Kalman methods. Furthermore, the EKS has the desirable property that it evolves the parameter ensemble towards the regions in which the bulk of the parameter posterior mass is located, thereby locating them well for the emulation phase of the methodology. In essence, the EKS methodology provides a cheap solution to the design problem of where to place points in parameter space to efficiently train an emulator of the parameter-to-data map for the purposes of Bayesian inversion.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/grfyb-v6b07Learning Dissipative Dynamics in Chaotic Systems
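The three-stage calibrate-emulate-sample pipeline described above can be sketched on a toy scalar inverse problem. This is an illustrative reconstruction, not the paper's implementation: a perturbed-observation ensemble Kalman iteration stands in for the EKS, and a linear fit stands in for the Gaussian process emulator; all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inverse problem: recover u from y = G(u) + noise, with G(u) = 2u.
def G(u):
    return 2.0 * u

u_true, gamma = 3.0, 0.1      # true parameter, observation-noise variance
y = G(u_true)                  # noiseless data keeps the demo simple

# Stage 1 (calibrate): perturbed-observation ensemble Kalman iteration.
J = 50
ensemble = rng.normal(0.0, 2.0, J)                  # prior ensemble
for _ in range(10):
    g = G(ensemble)
    c_ug = np.mean((ensemble - ensemble.mean()) * (g - g.mean()))
    c_gg = np.mean((g - g.mean()) ** 2)
    y_pert = y + np.sqrt(gamma) * rng.normal(size=J)
    ensemble = ensemble + c_ug / (c_gg + gamma) * (y_pert - g)

# Stage 2 (emulate): fit a cheap surrogate of the parameter-to-data map
# on the calibrated ensemble (a linear fit stands in for a GP emulator).
coeffs = np.polyfit(ensemble, G(ensemble), deg=1)
G_emul = lambda u: np.polyval(coeffs, u)

# Stage 3 (sample): random-walk Metropolis on the emulated posterior,
# with a N(0, 4) prior on u; only the emulator is evaluated here.
def log_post(u):
    return -0.5 * (y - G_emul(u)) ** 2 / gamma - 0.5 * u ** 2 / 4.0

u, samples = ensemble.mean(), []
for _ in range(5000):
    prop = u + 0.3 * rng.normal()
    if np.log(rng.uniform()) < log_post(prop) - log_post(u):
        u = prop
    samples.append(u)

posterior_mean = np.mean(samples[1000:])
```

The expensive forward map is evaluated only during the small-ensemble calibration stage; the MCMC loop touches only the surrogate, which is the source of the small-evaluation-count property claimed in the abstract.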
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210135878
Authors: {'items': [{'id': 'Li-Zongyi', 'name': {'family': 'Li', 'given': 'Zongyi'}, 'orcid': '0000-0003-2081-9665'}, {'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Azizzadenesheli-Kamyar', 'name': {'family': 'Azizzadenesheli', 'given': 'Kamyar'}, 'orcid': '0000-0001-8507-1868'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Anandkumar-A', 'name': {'family': 'Anandkumar', 'given': 'Anima'}, 'orcid': '0000-0002-6974-6797'}]}
Year: 2021
DOI: 10.48550/arXiv.2106.06898
Chaotic systems are notoriously challenging to predict because of their sensitivity to perturbations and errors due to time stepping. Despite this unpredictable behavior, for many dissipative systems the statistics of the long term trajectories are governed by an invariant measure supported on a set, known as the global attractor; for many problems this set is finite dimensional, even if the state space is infinite dimensional. For Markovian systems, the statistical properties of long-term trajectories are uniquely determined by the solution operator that maps the evolution of the system over arbitrary positive time increments. In this work, we propose a machine learning framework to learn the underlying solution operator for dissipative chaotic systems, showing that the resulting learned operator accurately captures short-time trajectories and long-time statistical behavior. Using this framework, we are able to predict various statistics of the invariant measure for the turbulent Kolmogorov Flow dynamics with Reynolds numbers up to 5000.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/wm6xz-zgz78Consistency of empirical Bayes and kernel flow for hierarchical parameter estimation
https://resolver.caltech.edu/CaltechAUTHORS:20201109-141002843
Authors: {'items': [{'id': 'Chen-Yifang', 'name': {'family': 'Chen', 'given': 'Yifang'}}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2021
DOI: 10.1090/mcom/3649
Gaussian process regression has proven very powerful in statistics, machine learning and inverse problems. A crucial aspect of the success of this methodology, in a wide range of applications to complex and real-world problems, is hierarchical modeling and learning of hyperparameters. The purpose of this paper is to study two paradigms of learning hierarchical parameters: one is from the probabilistic Bayesian perspective, in particular, the empirical Bayes approach that is widely used in Bayesian statistics; the other is from the deterministic and approximation theoretic view, and in particular the kernel flow algorithm that was proposed recently in the machine learning literature. Analyses of their consistency in the large data limit, as well as explicit identification of their implicit bias in parameter learning, are established in this paper for a Matérn-like model on the torus. A particular technical challenge we overcome is the learning of the regularity parameter in the Matérn-like field, for which consistency results have been very scarce in the spatial statistics literature. Moreover, we conduct extensive numerical experiments beyond the Matérn-like model, comparing the two algorithms further. These experiments demonstrate learning of other hierarchical parameters, such as amplitude and lengthscale; they also illustrate the setting of model misspecification in which the kernel flow approach could show superior performance to the more traditional empirical Bayes approach.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/ernn8-n0t25Kernel Analog Forecasting: Multiscale Test Problems
https://resolver.caltech.edu/CaltechAUTHORS:20201109-140959408
Authors: {'items': [{'id': 'Burov-Dmitry', 'name': {'family': 'Burov', 'given': 'Dmitry'}}, {'id': 'Giannakis-Dimitrios', 'name': {'family': 'Giannakis', 'given': 'Dimitrios'}}, {'id': 'Manohar-Krithika', 'name': {'family': 'Manohar', 'given': 'Krithika'}, 'orcid': '0000-0002-1582-6767'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2021
DOI: 10.1137/20M1338289
Data-driven prediction is becoming increasingly widespread as the volume of data available grows and as algorithmic development matches this growth. The nature of the predictions made and the manner in which they should be interpreted depend crucially on the extent to which the variables chosen for prediction are Markovian or approximately Markovian. Multiscale systems provide a framework in which this issue can be analyzed. In this work kernel analog forecasting methods are studied from the perspective of data generated by multiscale dynamical systems. The problems chosen exhibit a variety of different Markovian closures, using both averaging and homogenization; furthermore, settings where scale separation is not present and the predicted variables are non-Markovian are also considered. The studies provide guidance for the interpretation of data-driven prediction methods when used in practice.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/75pwt-vdd55A Framework for Machine Learning of Model Error in Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210139286
Authors: {'items': [{'id': 'Levine-Matthew-E', 'name': {'family': 'Levine', 'given': 'Matthew E.'}, 'orcid': '0000-0002-5627-3169'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2021
DOI: 10.48550/arXiv.2107.06658
The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from data. We compare pure data-driven learning with hybrid models which incorporate imperfect domain knowledge. We cast the problem in both continuous- and discrete-time, for problems in which the model error is memoryless and in which it has significant memory, and we compare data-driven and hybrid approaches experimentally. Our formulation is agnostic to the chosen machine learning model.
Using the Lorenz '63 and Lorenz '96 multiscale systems, we find that hybrid methods substantially outperform solely data-driven approaches in terms of data hunger, demands for model complexity, and overall predictive performance. We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain-interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field cannot be resolved.
We study model error from the learning theory perspective, defining excess risk and generalization error; for a linear model of the error used to learn about ergodic dynamical systems, both errors are bounded by terms that diminish with the square-root of T. We also illustrate scenarios that benefit from modeling with memory, proving that continuous-time recurrent neural networks (RNNs) can, in principle, learn memory-dependent model error and reconstruct the original system arbitrarily well; numerical results depict challenges in representing memory by this approach. We also connect RNNs to reservoir computing and thereby relate the learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/g2d05-3sh70Neural Operator: Learning Maps Between Function Spaces
https://resolver.caltech.edu/CaltechAUTHORS:20210831-204010794
Authors: {'items': [{'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Li-Zongyi', 'name': {'family': 'Li', 'given': 'Zongyi'}, 'orcid': '0000-0003-2081-9665'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Azizzadenesheli-Kamyar', 'name': {'family': 'Azizzadenesheli', 'given': 'Kamyar'}, 'orcid': '0000-0001-8507-1868'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Anandkumar-A', 'name': {'family': 'Anandkumar', 'given': 'Anima'}}]}
Year: 2021
DOI: 10.48550/arXiv.2108.08481
The classical development of neural networks has primarily focused on learning mappings between finite dimensional Euclidean spaces or finite sets. We propose a generalization of neural networks tailored to learn operators mapping between infinite dimensional function spaces. We formulate the approximation of operators by composition of a class of linear integral operators and nonlinear activation functions, so that the composed operator can approximate complex nonlinear operators. Furthermore, we introduce four classes of operator parameterizations: graph-based operators, low-rank operators, multipole graph-based operators, and Fourier operators, and describe efficient algorithms for computing with each one. The proposed neural operators are resolution-invariant: they share the same network parameters between different discretizations of the underlying function spaces and can be used for zero-shot super-resolution. Numerically, the proposed models show superior performance compared to existing machine learning based methodologies on Burgers' equation, Darcy flow, and the Navier-Stokes equation, while being several orders of magnitude faster than conventional PDE solvers.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/h5ry0-fsp13Calibration and Uncertainty Quantification of Convective Parameters in an Idealized GCM
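As a concrete, heavily simplified illustration of the Fourier-operator parameterization named in the abstract above, the sketch below applies one untrained Fourier-type integral-operator layer to a 1-D function; the complex weights `R` are hypothetical stand-ins for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(4)

# One Fourier-type layer on a 1-D function sampled at n grid points:
# FFT, keep the lowest k_max modes, multiply by learnable complex
# weights, inverse FFT, then a pointwise nonlinearity.
n, k_max = 128, 16
R = rng.normal(size=k_max) + 1j * rng.normal(size=k_max)   # mode weights

def fourier_layer(v):
    vh = np.fft.rfft(v)
    out = np.zeros_like(vh)
    out[:k_max] = R * vh[:k_max]          # act only on retained modes
    return np.maximum(np.fft.irfft(out, n=len(v)), 0.0)    # ReLU

x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
u = fourier_layer(np.sin(x))
```

Because the layer is parameterized by retained Fourier modes rather than by the grid, the same `R` can be applied to an input sampled at a different resolution, which is the sense in which such operators are resolution-invariant.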
https://resolver.caltech.edu/CaltechAUTHORS:20210113-143919927
Authors: {'items': [{'id': 'Dunbar-Oliver-R-A', 'name': {'family': 'Dunbar', 'given': 'Oliver R. A.'}, 'orcid': '0000-0001-7374-0382'}, {'id': 'Garbuno-Inigo-Alfredo', 'name': {'family': 'Garbuno-Inigo', 'given': 'Alfredo'}, 'orcid': '0000-0003-3279-619X'}, {'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2021
DOI: 10.1029/2020MS002454
Parameters in climate models are usually calibrated manually, exploiting only small subsets of the available data. This precludes both optimal calibration and quantification of uncertainties. Traditional Bayesian calibration methods that allow uncertainty quantification are too expensive for climate models; they are also not robust in the presence of internal climate variability. For example, Markov chain Monte Carlo (MCMC) methods typically require O(10⁵) model runs and are sensitive to internal variability noise, rendering them infeasible for climate models. Here we demonstrate an approach to model calibration and uncertainty quantification that requires only O(10²) model runs and can accommodate internal climate variability. The approach consists of three stages: (a) a calibration stage uses variants of ensemble Kalman inversion to calibrate a model by minimizing mismatches between model and data statistics; (b) an emulation stage emulates the parameter-to-data map with Gaussian processes (GP), using the model runs in the calibration stage for training; (c) a sampling stage approximates the Bayesian posterior distributions by sampling the GP emulator with MCMC. We demonstrate the feasibility and computational efficiency of this calibrate-emulate-sample (CES) approach in a perfect-model setting. Using an idealized general circulation model, we estimate parameters in a simple convection scheme from synthetic data generated with the model. The CES approach generates probability distributions of the parameters that are good approximations of the Bayesian posteriors, at a fraction of the computational cost usually required to obtain them. Sampling from this approximate posterior allows the generation of climate predictions with quantified parametric uncertainties.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/5x2v8-6ma54The Random Feature Model for Input-Output Maps between Banach Spaces
https://resolver.caltech.edu/CaltechAUTHORS:20200527-073449881
Authors: {'items': [{'id': 'Nelsen-Nicholas-H', 'name': {'family': 'Nelsen', 'given': 'Nicholas H.'}, 'orcid': '0000-0002-8328-1199'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2021
DOI: 10.1137/20M133957X
Well known to the machine learning community, the random feature model is a parametric approximation to kernel interpolation or regression methods. It is typically used to approximate functions mapping a finite-dimensional input space to the real line. In this paper, we instead propose a methodology for use of the random feature model as a data-driven surrogate for operators that map an input Banach space to an output Banach space. Although the methodology is quite general, we consider operators defined by partial differential equations (PDEs); here, the inputs and outputs are themselves functions, with the input parameters being functions required to specify the problem, such as initial data or coefficients, and the outputs being solutions of the problem. Upon discretization, the model inherits several desirable attributes from this infinite-dimensional viewpoint, including mesh-invariant approximation error with respect to the true PDE solution map and the capability to be trained at one mesh resolution and then deployed at different mesh resolutions. We view the random feature model as a nonintrusive data-driven emulator, provide a mathematical framework for its interpretation, and demonstrate its ability to efficiently and accurately approximate the nonlinear parameter-to-solution maps of two prototypical PDEs arising in physical science and engineering applications: the viscous Burgers' equation and a variable coefficient elliptic equation.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/5mzbm-t1t09Posterior consistency of semi-supervised regression on graphs
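The familiar finite-dimensional version of the random feature model referenced above (functions on the real line, approximated with random Fourier features and ridge-regularized least squares) can be sketched as follows; the feature distribution and target function are illustrative choices, not the paper's PDE setting.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random Fourier features: phi_j(x) = cos(w_j x + b_j) with random,
# fixed frequencies w_j and phases b_j; only the coefficients are fit.
n_feat, lam = 200, 1e-6
w = rng.normal(0.0, 2.0, n_feat)
b = rng.uniform(0.0, 2.0 * np.pi, n_feat)

def features(x):
    # x: (n,) -> (n, n_feat) feature matrix
    return np.cos(np.outer(x, w) + b)

# Training data from the target function
x_train = np.linspace(0.0, np.pi, 50)
y_train = np.sin(x_train)

# Ridge-regularized least squares for the feature coefficients
Phi = features(x_train)
alpha = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_feat), Phi.T @ y_train)

x_test = np.linspace(0.0, np.pi, 200)
err = np.max(np.abs(features(x_test) @ alpha - np.sin(x_test)))
```

In the paper's setting the scalar inputs above are replaced by functions and the features by randomly parameterized operators, but the training step (linear least squares in the coefficients) is the same.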
https://resolver.caltech.edu/CaltechAUTHORS:20201109-141014452
Authors: {'items': [{'id': 'Bertozzi-Andrea-L', 'name': {'family': 'Bertozzi', 'given': 'Andrea L.'}, 'orcid': '0000-0003-0396-7391'}, {'id': 'Hosseini-Bamdad', 'name': {'family': 'Hosseini', 'given': 'Bamdad'}}, {'id': 'Li-Hao', 'name': {'family': 'Li', 'given': 'Hao'}}, {'id': 'Miller-Kevin-UCLA', 'name': {'family': 'Miller', 'given': 'Kevin'}, 'orcid': '0000-0003-4050-1849'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2021
DOI: 10.1088/1361-6420/ac1e80
Graph-based semi-supervised regression (SSR) involves estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices; it can be formulated as a Bayesian inverse problem. This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered vertices. We present a Bayesian formulation of SSR in which the weighted graph defines a Gaussian prior, using a graph Laplacian, and the labeled data defines a likelihood. We analyze the rate of contraction of the posterior measure around the ground truth in terms of parameters that quantify the small label error and inherent clustering in the graph. We obtain bounds on the rates of contraction and illustrate their sharpness through numerical experiments. The analysis also gives insight into the choice of hyperparameters that enter the definition of the prior.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/hshnm-bcq59Solving and learning nonlinear PDEs with Gaussian processes
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210146136
Authors: {'items': [{'id': 'Chen-Yifan', 'name': {'family': 'Chen', 'given': 'Yifan'}}, {'id': 'Hosseini-Bamdad', 'name': {'family': 'Hosseini', 'given': 'Bamdad'}}, {'id': 'Owhadi-H', 'name': {'family': 'Owhadi', 'given': 'Houman'}, 'orcid': '0000-0002-5677-1600'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2021
DOI: 10.1016/j.jcp.2021.110668
We introduce a simple, rigorous, and unified framework for solving nonlinear partial differential equations (PDEs), and for solving inverse problems (IPs) involving the identification of parameters in PDEs, using the framework of Gaussian processes. The proposed approach: (1) provides a natural generalization of collocation kernel methods to nonlinear PDEs and IPs; (2) has guaranteed convergence for a very general class of PDEs, and comes equipped with a path to compute error bounds for specific PDE approximations; (3) inherits the state-of-the-art computational complexity of linear solvers for dense kernel matrices. The main idea of our method is to approximate the solution of a given PDE as the maximum a posteriori (MAP) estimator of a Gaussian process conditioned on solving the PDE at a finite number of collocation points. Although this optimization problem is infinite-dimensional, it can be reduced to a finite-dimensional one by introducing additional variables corresponding to the values of the derivatives of the solution at collocation points; this generalizes the representer theorem arising in Gaussian process regression. The reduced optimization problem has the form of a quadratic objective function subject to nonlinear constraints; it is solved with a variant of the Gauss–Newton method. The resulting algorithm (a) can be interpreted as solving successive linearizations of the nonlinear PDE, and (b) in practice is found to converge in a small number of iterations (2 to 10), for a wide range of PDEs. Most traditional approaches to IPs interleave parameter updates with numerical solution of the PDE; our algorithm solves for both parameter and PDE solution simultaneously. 
Experiments on nonlinear elliptic PDEs, Burgers' equation, a regularized Eikonal equation, and an IP for permeability identification in Darcy flow illustrate the efficacy and scope of our framework.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/zjv0e-bhc08Spectral analysis of weighted Laplacians arising in data clustering
https://resolver.caltech.edu/CaltechAUTHORS:20200331-075759863
Authors: {'items': [{'id': 'Hoffmann-Franca', 'name': {'family': 'Hoffmann', 'given': 'Franca'}, 'orcid': '0000-0002-1182-5521'}, {'id': 'Hosseini-Bamdad', 'name': {'family': 'Hosseini', 'given': 'Bamdad'}}, {'id': 'Oberai-Assad-A', 'name': {'family': 'Oberai', 'given': 'Assad A.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2022
DOI: 10.1016/j.acha.2021.07.004
Graph Laplacians computed from weighted adjacency matrices are widely used to identify geometric structure in data, and clusters in particular; their spectral properties play a central role in a number of unsupervised and semi-supervised learning algorithms. When suitably scaled, graph Laplacians approach limiting continuum operators in the large data limit. Studying these limiting operators, therefore, sheds light on learning algorithms. This paper is devoted to the study of a parameterized family of divergence form elliptic operators that arise as the large data limit of graph Laplacians. The link between a three-parameter family of graph Laplacians and a three-parameter family of differential operators is explained. The spectral properties of these differential operators are analyzed in the situation where the data comprises two nearly separated clusters, in a sense which is made precise. In particular, we investigate how the spectral gap depends on the three parameters entering the graph Laplacian, and on a parameter measuring the size of the perturbation from the perfectly clustered case. Numerical results are presented which exemplify the analysis and which extend it in the following ways: the computations study situations in which there are two nearly separated clusters, but which violate the assumptions used in our theory; situations in which more than two clusters are present, also going beyond our theory; and situations which demonstrate the relevance of our studies of differential operators for the understanding of finite data problems via the graph Laplacian. 
The findings provide insight into parameter choices made in learning algorithms which are based on weighted adjacency matrices; they also provide the basis for analysis of the consistency of various unsupervised and semi-supervised learning algorithms, in the large data limit.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/knq5x-x3h73A learning-based multiscale method and its application to inelastic impact problems
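A minimal numerical instance of the two-cluster setting analyzed above; this uses the unnormalized graph Laplacian rather than the paper's three-parameter family, and the data and kernel bandwidth are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two nearly separated clusters on the line; weighted adjacency from a
# Gaussian kernel, then the unnormalized graph Laplacian L = D - W.
pts = np.concatenate([rng.normal(-2.0, 0.3, 20), rng.normal(2.0, 0.3, 20)])
W = np.exp(-0.5 * (pts[:, None] - pts[None, :]) ** 2)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
# The second eigenvalue (the spectral gap above 0) is small when the
# clusters are well separated, and the sign pattern of the second
# eigenvector (the Fiedler vector) recovers the two clusters.
labels = vecs[:, 1] > 0
```

The size of `vals[1]` relative to the rest of the spectrum plays the role of the spectral gap whose parameter dependence the abstract describes.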
https://resolver.caltech.edu/CaltechAUTHORS:20210225-132721680
Authors: {'items': [{'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Li-Zongyi', 'name': {'family': 'Li', 'given': 'Zongyi'}}, {'id': 'Azizzadenesheli-Kamyar', 'name': {'family': 'Azizzadenesheli', 'given': 'Kamyar'}, 'orcid': '0000-0001-8507-1868'}, {'id': 'Anandkumar-A', 'name': {'family': 'Anandkumar', 'given': 'Anima'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}]}
Year: 2022
DOI: 10.1016/j.jmps.2021.104668
The macroscopic properties of materials that we observe and exploit in engineering applications result from complex interactions between physics at multiple length and time scales: electronic, atomistic, defects, domains, etc. Multiscale modeling seeks to understand these interactions by exploiting the inherent hierarchy where the behavior at a coarser scale regulates and averages the behavior at a finer scale. This requires the repeated solution of computationally expensive finer-scale models, and often a priori knowledge of those aspects of the finer-scale behavior that affect the coarser scale (order parameters, state variables, descriptors, etc.). We address this challenge in a two-scale setting where we learn the fine-scale behavior from off-line calculations and then use the learnt behavior directly in coarse scale calculations. The approach builds on the recent success of deep neural networks by combining their approximation power in high dimensions with ideas from model reduction. It results in a neural network approximation that has high fidelity, is computationally inexpensive, requires no a priori knowledge, and can be used directly in the coarse scale calculations. We demonstrate the approach on problems involving the impact of magnesium, a promising light-weight structural and protective material.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/j2js9-txr69Ensemble-Based Experimental Design for Targeted High-Resolution Simulations to Inform Climate Models
https://resolver.caltech.edu/CaltechAUTHORS:20220119-572479000
Authors: {'items': [{'id': 'Dunbar-Oliver-R-A', 'name': {'family': 'Dunbar', 'given': 'Oliver R. A.'}, 'orcid': '0000-0001-7374-0382'}, {'id': 'Howland-Michael-F', 'name': {'family': 'Howland', 'given': 'Michael F.'}, 'orcid': '0000-0002-2878-3874'}, {'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2022
DOI: 10.1002/essoar.10510142.1
Targeted high-resolution simulations driven by a general circulation model (GCM) can be used to calibrate GCM parameterizations of processes that are globally unresolvable but can be resolved in limited-area simulations. This raises the question of where to place high-resolution simulations to be maximally informative about the uncertain parameterizations in the global model. Here we construct an ensemble-based parallel algorithm to locate regions that maximize the uncertainty reduction, or information gain, in the uncertainty quantification of GCM parameters with regional data. The algorithm is based on a Bayesian framework that exploits a quantified posterior distribution on GCM parameters as a measure of uncertainty. The algorithm is embedded in the recently developed calibrate-emulate-sample (CES) framework, which performs efficient model calibration and uncertainty quantification with only O(10²) forward model evaluations, compared with O(10⁵) forward model evaluations typically needed for traditional approaches to Bayesian calibration. We demonstrate the algorithm with an idealized GCM, with which we generate surrogates of high-resolution data. In this setting, we calibrate parameters and quantify uncertainties in a quasi-equilibrium convection scheme. We consider (i) localization in space for a statistically stationary problem, and (ii) localization in space and time for a seasonally varying problem. In these proof-of-concept applications, the calculated information gain reflects the reduction in parametric uncertainty obtained from Bayesian inference when harnessing a targeted sample of data. The largest information gain results from regions near the intertropical convergence zone (ITCZ) and indeed the algorithm automatically targets these regions for data collection.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/hye2r-0jx31Derivative-Free Bayesian Inversion Using Multiscale Dynamics
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210152979
Authors: {'items': [{'id': 'Pavliotis-Grigorios-A', 'name': {'family': 'Pavliotis', 'given': 'G. A.'}, 'orcid': '0000-0002-3468-9227'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Vaes-Urbain', 'name': {'family': 'Vaes', 'given': 'U.'}, 'orcid': '0000-0002-7629-7184'}]}
Year: 2022
DOI: 10.1137/21M1397416
Inverse problems are ubiquitous because they formalize the integration of data with mathematical models. In many scientific applications the forward model is expensive to evaluate, and adjoint computations are difficult to employ; in this setting derivative-free methods which involve a small number of forward model evaluations are an attractive proposition. Ensemble Kalman-based interacting particle systems (and variants such as consensus-based and unscented Kalman approaches) have proven empirically successful in this context, but suffer from the fact that they cannot be systematically refined to return the true solution, except in the setting of linear forward models [A. Garbuno-Inigo et al., SIAM J. Appl. Dyn. Syst., 19 (2020), pp. 412-441]. In this paper, we propose a new derivative-free approach to Bayesian inversion, which may be employed for posterior sampling or for maximum a posteriori estimation, and may be systematically refined. The method relies on a fast/slow system of stochastic differential equations for the local approximation of the gradient of the log-likelihood appearing in a Langevin diffusion. Furthermore, the method may be preconditioned by use of information from ensemble Kalman-based methods (and variants), providing a methodology which leverages the documented advantages of those methods, while also being provably refinable. We define the methodology, highlighting its flexibility and many variants, provide a theoretical analysis of the proposed approach, and demonstrate its efficacy by means of numerical experiments.https://authors.library.caltech.eduhttps://authors.library.caltech.edu/records/dne8g-tx986Multiscale modeling of materials: Computing, data science, uncertainty and goal-oriented optimization
https://resolver.caltech.edu/CaltechAUTHORS:20220121-968309000
Authors: {'items': [{'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Sun-Xingsheng', 'name': {'family': 'Sun', 'given': 'Xingsheng'}, 'orcid': '0000-0003-1527-789X'}, {'id': 'Zhou-Hao', 'name': {'family': 'Zhou', 'given': 'Hao'}, 'orcid': '0000-0002-6011-6422'}, {'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Ortiz-M', 'name': {'family': 'Ortiz', 'given': 'Michael'}, 'orcid': '0000-0001-5877-4824'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2022
DOI: 10.1016/j.mechmat.2021.104156
The recent decades have seen various attempts at accelerating the process of developing materials targeted towards specific applications. The performance required for a particular application leads to the choice of a particular material system whose properties are optimized by manipulating its underlying microstructure through processing. The specific configuration of the structure is then designed by characterizing the material in detail, and using this characterization along with physical principles in system level simulations and optimization. These have been advanced by multiscale modeling of materials, high-throughput experimentation, materials databases, topology optimization and other ideas. Still, developing materials for extreme applications involving large deformation, high strain rates and high temperatures remains a challenge. This article reviews a number of recent methods that advance the goal of designing materials targeted towards specific applications.
https://authors.library.caltech.edu/records/cjvbg-kbb29
Consensus-based sampling
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210142693
Authors: {'items': [{'id': 'Carrillo-José-Antonio', 'name': {'family': 'Carrillo', 'given': 'J. A.'}, 'orcid': '0000-0001-8819-4660'}, {'id': 'Hoffmann-Franca', 'name': {'family': 'Hoffmann', 'given': 'F.'}, 'orcid': '0000-0002-1182-5521'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'A. M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Vaes-Urbain', 'name': {'family': 'Vaes', 'given': 'U.'}, 'orcid': '0000-0002-7629-7184'}]}
Year: 2022
DOI: 10.1111/sapm.12470
We propose a novel method for sampling and optimization tasks based on a stochastic interacting particle system. We explain how this method can be used for the following two goals: (i) generating approximate samples from a given target distribution and (ii) optimizing a given objective function. The approach is derivative-free and affine invariant, and is therefore well-suited for solving inverse problems defined by complex forward models: (i) allows generation of samples from the Bayesian posterior and (ii) allows determination of the maximum a posteriori estimator. We investigate the properties of the proposed family of methods in terms of various parameter choices, both analytically and by means of numerical simulations. The analysis and numerical simulation establish that the method has potential for general purpose optimization tasks over Euclidean space; contraction properties of the algorithm are established under suitable conditions, and computational experiments demonstrate wide basins of attraction for various specific problems. The analysis and experiments also demonstrate the potential for the sampling methodology in regimes in which the target distribution is unimodal and close to Gaussian; indeed we prove that the method recovers a Laplace approximation to the measure in certain parametric regimes and provide numerical evidence that this Laplace approximation attracts a large set of initial conditions in a number of examples.
https://authors.library.caltech.edu/records/c3k3x-c1365
Convergence Rates for Learning Linear Operators from Noisy Data
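The consensus-based sampling abstract above describes a derivative-free interacting particle system. A minimal sketch of one such step, relaxing the ensemble toward a weighted mean and reinjecting Gaussian noise with the weighted covariance, might look as follows; the update rule and parameter names here are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def cbs_step(X, V, beta=1.0, alpha=0.5, rng=None):
    """One consensus-based-sampling-style step (illustrative sketch).
    X: (J, d) ensemble; V: objective (negative log target);
    weights are proportional to exp(-beta * V)."""
    rng = rng or np.random.default_rng(0)
    w = np.exp(-beta * np.array([V(x) for x in X]))
    w = w / w.sum()
    m = w @ X                                        # weighted ensemble mean
    Xc = X - m
    C = Xc.T @ (w[:, None] * Xc)                     # weighted covariance
    noise = rng.multivariate_normal(np.zeros(X.shape[1]), C, size=X.shape[0])
    return m + alpha * Xc + np.sqrt(1.0 - alpha**2) * noise
```

Iterating this step contracts the ensemble toward low-V regions while the injected noise keeps it spread for sampling; note the step uses only evaluations of V, no gradients.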
https://resolver.caltech.edu/CaltechAUTHORS:20220524-180322099
Authors: {'items': [{'id': 'de-Hoop-Maarten-V', 'name': {'family': 'de Hoop', 'given': 'Maarten V.'}, 'orcid': '0000-0002-6333-0379'}, {'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola B.'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Nelsen-Nicholas-H', 'name': {'family': 'Nelsen', 'given': 'Nicholas H.'}, 'orcid': '0000-0002-8328-1199'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2022
DOI: 10.48550/arXiv.2108.12515
We study the Bayesian inverse problem of learning a linear operator on a Hilbert space from its noisy pointwise evaluations on random input data. Our framework assumes that this target operator is self-adjoint and diagonal in a basis shared with the Gaussian prior and noise covariance operators arising from the imposed statistical model and is able to handle target operators that are compact, bounded, or even unbounded. We establish posterior contraction rates with respect to a family of Bochner norms as the number of data tends to infinity and derive related lower bounds on the estimation error. In the large data limit, we also provide asymptotic convergence rates of suitably defined excess risk and generalization gap functionals associated with the posterior mean point estimator. In doing so, we connect the posterior consistency results to nonparametric learning theory. Furthermore, these convergence rates highlight and quantify the difficulty of learning unbounded linear operators in comparison with the learning of bounded or compact ones. Numerical experiments confirm the theory and demonstrate that similar conclusions may be expected in more general problem settings.
https://authors.library.caltech.edu/records/eweb1-3wn20
Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods
https://resolver.caltech.edu/CaltechAUTHORS:20210412-121307581
Authors: {'items': [{'id': 'Dunbar-Oliver-R-A', 'name': {'family': 'Dunbar', 'given': 'Oliver R. A.'}, 'orcid': '0000-0001-7374-0382'}, {'id': 'Duncan-Andrew-B', 'name': {'family': 'Duncan', 'given': 'Andrew B.'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Wolfram-Marie-Therese', 'name': {'family': 'Wolfram', 'given': 'Marie-Therese'}, 'orcid': '0000-0003-1133-8253'}]}
Year: 2022
DOI: 10.1137/21M1410853
The increasing availability of data presents an opportunity to calibrate unknown parameters which appear in complex models of phenomena in the biomedical, physical, and social sciences. However, model complexity often leads to parameter-to-data maps which are expensive to evaluate and are only available through noisy approximations. This paper is concerned with the use of interacting particle systems for the solution of the resulting inverse problems for parameters. Of particular interest is the case where the available forward model evaluations are subject to rapid fluctuations, in parameter space, superimposed on the smoothly varying large-scale parametric structure of interest. A motivating example from climate science is presented, and ensemble Kalman methods (which do not use the derivative of the parameter-to-data map) are shown, empirically, to perform well. Multiscale analysis is then used to analyze the behavior of interacting particle system algorithms when rapid fluctuations, which we refer to as noise, pollute the large-scale parametric dependence of the parameter-to-data map. Ensemble Kalman methods and Langevin-based methods (the latter use the derivative of the parameter-to-data map) are compared in this light. The ensemble Kalman methods are shown to behave favorably in the presence of noise in the parameter-to-data map, whereas Langevin methods are adversely affected. On the other hand, Langevin methods have the correct equilibrium distribution in the setting of noise-free forward models, while ensemble Kalman methods only provide an uncontrolled approximation, except in the linear case. Therefore a new class of algorithms, ensemble Gaussian process samplers, which combine the benefits of both ensemble Kalman and Langevin methods, are introduced and shown to perform favorably.
https://authors.library.caltech.edu/records/cx7ps-01463
Iterated Kalman methodology for inverse problems
https://resolver.caltech.edu/CaltechAUTHORS:20210719-210149563
Authors: {'items': [{'id': 'Huang-Zhengyu-Daniel', 'name': {'family': 'Huang', 'given': 'Daniel Zhengyu'}}, {'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2022
DOI: 10.1016/j.jcp.2022.111262
This paper is focused on the optimization approach to the solution of inverse problems. We introduce a stochastic dynamical system in which the parameter-to-data map is embedded, with the goal of employing techniques from nonlinear Kalman filtering to estimate the parameter given the data. The extended Kalman filter (which we refer to as ExKI in the context of inverse problems) can be effective for some inverse problems approached this way, but is impractical when the forward map is not readily differentiable and is given as a black box, and also for high dimensional parameter spaces because of the need to propagate large covariance matrices. Application of ensemble Kalman filters, for example use of the ensemble Kalman inversion (EKI) algorithm, has emerged as a useful tool which overcomes both of these issues: it is derivative free and works with a low-rank covariance approximation formed from the ensemble. In this paper, we work with the ExKI, EKI, and a variant on EKI which we term unscented Kalman inversion (UKI).
The paper contains two main contributions. Firstly, we identify a novel stochastic dynamical system in which the parameter-to-data map is embedded. We present theory in the linear case to show exponential convergence of the mean of the filtering distribution to the solution of a regularized least squares problem. This is in contrast to previous work in which the EKI has been employed where the dynamical system used leads to algebraic convergence to an unregularized problem. Secondly, we show that the application of the UKI to this novel stochastic dynamical system yields improved inversion results, in comparison with the application of EKI to the same novel stochastic dynamical system.
The numerical experiments include proof-of-concept linear examples and various applied nonlinear inverse problems: learning of permeability parameters in subsurface flow; learning the damage field from structure deformation; learning the Navier-Stokes initial condition from solution data at positive times; learning subgrid-scale parameters in a general circulation model (GCM) from time-averaged statistics.
https://authors.library.caltech.edu/records/nj6wn-jwq16
Ensemble-Based Experimental Design for Targeting Data Acquisition to Inform Climate Models
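The EKI algorithm referenced in the iterated-Kalman abstract above admits a standard update; the following is a generic perturbed-observation sketch of one EKI step (a textbook form, not the paper's novel stochastic dynamical system):

```python
import numpy as np

def eki_update(U, G, y, Gamma, rng):
    """One generic ensemble Kalman inversion step with perturbed observations.
    U: (J, d) parameter ensemble; G: forward map; y: data; Gamma: noise cov.
    Derivative-free: only forward evaluations of G are used."""
    W = np.array([G(u) for u in U])                  # forward evaluations, (J, k)
    dU = U - U.mean(axis=0)
    dW = W - W.mean(axis=0)
    Cuw = dU.T @ dW / U.shape[0]                     # parameter-data cross-covariance
    Cww = dW.T @ dW / U.shape[0]                     # data-data covariance
    Y = y + rng.multivariate_normal(np.zeros(y.size), Gamma, size=U.shape[0])
    K = Cuw @ np.linalg.inv(Cww + Gamma)             # Kalman-type gain
    return U + (Y - W) @ K.T
```

The covariances are formed from the ensemble itself, which is what makes the update low-rank and derivative-free, as the abstract notes.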
https://resolver.caltech.edu/CaltechAUTHORS:20220926-576391900.2
Authors: {'items': [{'id': 'Dunbar-Oliver-R-A', 'name': {'family': 'Dunbar', 'given': 'Oliver R. A.'}, 'orcid': '0000-0001-7374-0382'}, {'id': 'Howland-Michael-F', 'name': {'family': 'Howland', 'given': 'Michael F.'}, 'orcid': '0000-0002-2878-3874'}, {'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}}]}
Year: 2022
DOI: 10.1029/2022ms002997
Data required to calibrate uncertain general circulation model (GCM) parameterizations are often only available in limited regions or time periods, for example, observational data from field campaigns, or data generated in local high-resolution simulations. This raises the question of where and when to acquire additional data to be maximally informative about parameterizations in a GCM. Here we construct a new ensemble-based parallel algorithm to automatically target data acquisition to regions and times that maximize the uncertainty reduction, or information gain, about GCM parameters. The algorithm uses a Bayesian framework that exploits a quantified distribution of GCM parameters as a measure of uncertainty. This distribution is informed by time-averaged climate statistics restricted to local regions and times. The algorithm is embedded in the recently developed calibrate-emulate-sample framework, which performs efficient model calibration and uncertainty quantification with only O(10²) model evaluations, compared with O(10⁵) evaluations typically needed for traditional approaches to Bayesian calibration. We demonstrate the algorithm with an idealized GCM, with which we generate surrogates of local data. In this perfect-model setting, we calibrate parameters and quantify uncertainties in a quasi-equilibrium convection scheme in the GCM. We consider targeted data that are (a) localized in space for statistically stationary simulations, and (b) localized in space and time for seasonally varying simulations. In these proof-of-concept applications, the calculated information gain reflects the reduction in parametric uncertainty obtained from Bayesian inference when harnessing a targeted sample of data. The largest information gain typically, but not always, results from regions near the intertropical convergence zone.
https://authors.library.caltech.edu/records/sdv6s-rr183
Ensemble Kalman inversion for sparse learning of dynamical systems from time-averaged data
https://resolver.caltech.edu/CaltechAUTHORS:20221013-45138000.1
Authors: {'items': [{'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Wu-Jin-Long', 'name': {'family': 'Wu', 'given': 'Jin-Long'}}]}
Year: 2022
DOI: 10.1016/j.jcp.2022.111559
Enforcing sparse structure within learning has led to significant advances in the field of data-driven discovery of dynamical systems. However, such methods require access not only to timeseries of the state of the dynamical system, but also to the time derivative. In many applications, the data are available only in the form of time-averages such as moments and autocorrelation functions. We propose a sparse learning methodology to discover the vector fields defining a (possibly stochastic or partial) differential equation, using only time-averaged statistics. Such a formulation of sparse learning naturally leads to a nonlinear inverse problem to which we apply the methodology of ensemble Kalman inversion (EKI). EKI is chosen because it may be formulated in terms of the iterative solution of quadratic optimization problems; sparsity is then easily imposed. We then apply the EKI-based sparse learning methodology to various examples governed by stochastic differential equations (a noisy Lorenz 63 system), ordinary differential equations (Lorenz 96 system and coalescence equations), and a partial differential equation (the Kuramoto-Sivashinsky equation). The results demonstrate that time-averaged statistics can be used for data-driven discovery of differential equations using sparse EKI. The proposed sparse learning methodology extends the scope of data-driven discovery of differential equations to previously challenging applications and data-acquisition scenarios.
https://authors.library.caltech.edu/records/qafpz-pzq58
Second Order Ensemble Langevin Method for Sampling and Inverse Problems
https://resolver.caltech.edu/CaltechAUTHORS:20221221-222944367
Authors: {'items': [{'id': 'Liu-Ziming', 'name': {'family': 'Liu', 'given': 'Ziming'}}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Wang-Yixuan', 'name': {'family': 'Wang', 'given': 'Yixuan'}, 'orcid': '0000-0001-7305-5422'}]}
Year: 2022
DOI: 10.48550/arXiv.2208.04506
We propose a sampling method based on an ensemble approximation of second order Langevin dynamics. The log target density is appended with a quadratic term in an auxiliary momentum variable and damped-driven Hamiltonian dynamics introduced; the resulting stochastic differential equation leaves the Gibbs measure invariant, with marginal on the position coordinates given by the target. A preconditioner based on covariance under the law of the dynamics does not change this invariance property, and is introduced to accelerate convergence to the Gibbs measure. The resulting mean-field dynamics may be approximated by an ensemble method; this results in a gradient-free and affine-invariant stochastic dynamical system. Numerical results demonstrate its potential as the basis for a numerical sampler in Bayesian inverse problems.
https://authors.library.caltech.edu/records/e011a-65k64
Learning Markovian Homogenized Models in Viscoelasticity
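The second-order Langevin abstract above builds on damped-driven Hamiltonian dynamics. A plain Euler-Maruyama sketch of that underlying SDE, written with an explicit gradient (the paper's ensemble method is gradient-free and adds covariance preconditioning), is:

```python
import numpy as np

def second_order_langevin_step(q, p, grad_log_target, dt=1e-2, gamma=1.0, rng=None):
    """Euler-Maruyama step for damped-driven Hamiltonian (second-order
    Langevin) dynamics; the Gibbs invariant measure has the target as its
    position marginal. Sketch of the underlying SDE only."""
    rng = rng or np.random.default_rng(0)
    q_new = q + dt * p                               # position driven by momentum
    p_new = (p + dt * grad_log_target(q)             # Hamiltonian force
             - gamma * dt * p                        # damping
             + np.sqrt(2.0 * gamma * dt) * rng.standard_normal(p.shape))
    return q_new, p_new
```

Discarding the transient and keeping the position samples gives draws whose statistics approximate the target; the damping rate `gamma` here is a generic choice, not a value from the paper.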
https://resolver.caltech.edu/CaltechAUTHORS:20230613-155502989
Authors: {'items': [{'id': 'Bhattacharya-K', 'name': {'family': 'Bhattacharya', 'given': 'Kaushik'}, 'orcid': '0000-0003-2908-5469'}, {'id': 'Liu-Burigede', 'name': {'family': 'Liu', 'given': 'Burigede'}, 'orcid': '0000-0002-6518-3368'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Trautner-Margaret', 'name': {'family': 'Trautner', 'given': 'Margaret'}, 'orcid': '0000-0001-9937-8393'}]}
Year: 2023
DOI: 10.1137/22M1499200
Fully resolving dynamics of materials with rapidly varying features involves expensive fine-scale computations which need to be conducted on macroscopic scales. The theory of homogenization provides an approach for deriving effective macroscopic equations which eliminates the small scales by exploiting scale separation. An accurate homogenized model avoids the computationally expensive task of numerically solving the underlying balance laws at a fine scale, thereby rendering a numerical solution of the balance laws more computationally tractable. In complex settings, homogenization only defines the constitutive model implicitly, and machine learning can be used to learn the constitutive model explicitly from localized fine-scale simulations. In the case of one-dimensional viscoelasticity, the linearity of the model allows for a complete analysis. We establish that the homogenized constitutive model may be approximated by a recurrent neural network that captures the memory. The memory is encapsulated in the evolution of an appropriate finite set of hidden variables, which are discovered through the learning process and dependent on the history of the strain. Simulations are presented which validate the theory. Guidance for the learning of more complex models, such as arise in plasticity, using similar techniques, is given.
https://authors.library.caltech.edu/records/tsw3z-9yb69
Convergence Rates for Learning Linear Operators from Noisy Data
https://resolver.caltech.edu/CaltechAUTHORS:20230613-730765600.19
Authors: {'items': [{'id': 'de-Hoop-Maarten-V', 'name': {'family': 'de Hoop', 'given': 'Maarten V.'}, 'orcid': '0000-0002-6333-0379'}, {'id': 'Kovachki-Nikola-B', 'name': {'family': 'Kovachki', 'given': 'Nikola B.'}, 'orcid': '0000-0002-3650-2972'}, {'id': 'Nelsen-Nicholas-H', 'name': {'family': 'Nelsen', 'given': 'Nicholas H.'}, 'orcid': '0000-0002-8328-1199'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2023
DOI: 10.1137/21m1442942
This paper studies the learning of linear operators between infinite-dimensional Hilbert spaces. The training data comprises pairs of random input vectors in a Hilbert space and their noisy images under an unknown self-adjoint linear operator. Assuming that the operator is diagonalizable in a known basis, this work solves the equivalent inverse problem of estimating the operator's eigenvalues given the data. Adopting a Bayesian approach, the theoretical analysis establishes posterior contraction rates in the infinite data limit with Gaussian priors that are not directly linked to the forward map of the inverse problem. The main results also include learning-theoretic generalization error guarantees for a wide range of distribution shifts. These convergence rates quantify the effects of data smoothness and true eigenvalue decay or growth, for compact or unbounded operators, respectively, on sample complexity. Numerical evidence supports the theory in diagonal and nondiagonal settings.
https://authors.library.caltech.edu/records/m684g-6g048
A simple modeling framework for prediction in the human glucose–insulin system
https://authors.library.caltech.edu/records/9ezja-9a679
Authors: {'items': [{'id': 'Sirlanci-Melike', 'name': {'family': 'Sirlanci', 'given': 'Melike'}, 'orcid': '0000-0002-4749-4752'}, {'id': 'Levine-Matthew-E', 'name': {'family': 'Levine', 'given': 'Matthew E.'}, 'orcid': '0000-0002-5627-3169'}, {'id': 'Low-Wang-Cecilia-C', 'name': {'family': 'Low Wang', 'given': 'Cecilia C.'}, 'orcid': '0000-0001-8557-5417'}, {'id': 'Albers-David-J', 'name': {'family': 'Albers', 'given': 'David J.'}, 'orcid': '0000-0002-5369-526X'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew M.'}, 'orcid': '0000-0001-9091-7266'}]}
Year: 2023
DOI: 10.1063/5.0146808
Forecasting blood glucose (BG) levels with routinely collected data is useful for glycemic management. BG dynamics are nonlinear, complex, and nonstationary, which can be represented by nonlinear models. However, the sparsity of routinely collected data creates parameter identifiability issues when high-fidelity complex models are used, thereby resulting in inaccurate forecasts. One can use models with reduced physiological fidelity for robust and accurate parameter estimation and forecasting with sparse data. For this purpose, we approximate the nonlinear dynamics of BG regulation by a linear stochastic differential equation: we develop a linear stochastic model, which can be specialized to different settings: type 2 diabetes mellitus (T2DM) and intensive care unit (ICU), with different choices of appropriate model functions. The model includes deterministic terms quantifying glucose removal from the bloodstream through the glycemic regulation system and representing the effect of nutrition and externally delivered insulin. The stochastic term encapsulates the BG oscillations. The model output is in the form of an expected value accompanied by a band around this value. The model parameters are estimated patient-specifically, leading to personalized models. The forecasts consist of values for BG mean and variation, quantifying possible high and low BG levels. Such predictions have potential use for glycemic management as part of control systems. We present experimental results on parameter estimation and forecasting in T2DM and ICU settings. We compare the model's predictive capability with two different nonlinear models built for T2DM and ICU contexts to have a sense of the level of prediction achieved by this model.
https://authors.library.caltech.edu/records/9ezja-9a679
Harnessing AI and computing to advance climate modelling and prediction
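The linear stochastic glucose model described above (a deterministic removal term plus a stochastic term for BG oscillations) can be caricatured by a mean-reverting, Ornstein-Uhlenbeck-type SDE; all parameter values below are hypothetical placeholders, not the paper's fitted, patient-specific values, and the nutrition/insulin input terms are omitted:

```python
import numpy as np

def simulate_bg(mu=100.0, tau=60.0, sigma=2.0, g0=180.0, n_min=240, dt=1.0, seed=0):
    """Mean-reverting caricature of a linear stochastic BG model:
    dG = -(G - mu)/tau dt + sigma dW, in mg/dL and minutes.
    Returns a sample path plus the stationary standard deviation,
    which plays the role of the forecast band around the mean."""
    rng = np.random.default_rng(seed)
    n = int(n_min / dt)
    G = np.empty(n + 1)
    G[0] = g0
    for k in range(n):
        G[k + 1] = G[k] - (G[k] - mu) / tau * dt + sigma * np.sqrt(dt) * rng.normal()
    band = sigma * np.sqrt(tau / 2.0)   # stationary std of the OU process
    return G, band
```

The forecast mean relaxes from `g0` toward `mu` with time constant `tau`, and the returned `band` gives the expected-value-plus-band output format the abstract describes.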
https://authors.library.caltech.edu/records/z0c0r-2jt68
Authors: {'items': [{'id': 'Schneider-T', 'name': {'family': 'Schneider', 'given': 'Tapio'}, 'orcid': '0000-0001-5687-2287'}, {'id': 'Behera-Swadhin', 'name': {'family': 'Behera', 'given': 'Swadhin'}, 'orcid': '0000-0001-8692-2388'}, {'id': 'Boccaletti-Giulio', 'name': {'family': 'Boccaletti', 'given': 'Giulio'}, 'orcid': '0009-0008-1072-7672'}, {'id': 'Deser-Clara', 'name': {'family': 'Deser', 'given': 'Clara'}, 'orcid': '0000-0002-5517-9103'}, {'id': 'Emanuel-Kerry', 'name': {'family': 'Emanuel', 'given': 'Kerry'}, 'orcid': '0000-0002-2066-2082'}, {'id': 'Ferrari-Raffaele', 'name': {'family': 'Ferrari', 'given': 'Raffaele'}, 'orcid': '0000-0003-1895-4294'}, {'id': 'Leung-L-Ruby', 'name': {'family': 'Leung', 'given': 'L. Ruby'}, 'orcid': '0000-0002-3221-9467'}, {'id': 'Lin-Ning', 'name': {'family': 'Lin', 'given': 'Ning'}, 'orcid': '0000-0002-5571-1606'}, {'id': 'Müller-Thomas', 'name': {'family': 'Müller', 'given': 'Thomas'}, 'orcid': '0000-0003-1225-1483'}, {'id': 'Navarra-Antonio', 'name': {'family': 'Navarra', 'given': 'Antonio'}}, {'id': 'Ndiaye-Ousmane', 'name': {'family': 'Ndiaye', 'given': 'Ousmane'}, 'orcid': '0000-0002-5048-4731'}, {'id': 'Stuart-A-M', 'name': {'family': 'Stuart', 'given': 'Andrew'}, 'orcid': '0000-0001-9091-7266'}, {'id': 'Tribbia-Joseph', 'name': {'family': 'Tribbia', 'given': 'Joseph'}, 'orcid': '0000-0003-1639-9688'}, {'id': 'Yamagata-Toshio', 'name': {'family': 'Yamagata', 'given': 'Toshio'}, 'orcid': '0000-0003-1267-2149'}]}
Year: 2023
DOI: 10.1038/s41558-023-01769-3
There are contrasting views on how to produce the accurate predictions that are needed to guide climate change adaptation. Here, we argue for harnessing artificial intelligence, building on domain-specific knowledge and generating ensembles of moderately high-resolution (10–50 km) climate simulations as anchors for detailed hazard models.
https://authors.library.caltech.edu/records/z0c0r-2jt68