Abstract: Enforcing sparse structure within learning has led to significant advances in the field of data-driven discovery of dynamical systems. However, such methods require access not only to time series of the state of the dynamical system, but also to the time derivative. In many applications, the data are available only in the form of time-averages such as moments and autocorrelation functions. We propose a sparse learning methodology to discover the vector fields defining a (possibly stochastic or partial) differential equation, using only time-averaged statistics. Such a formulation of sparse learning naturally leads to a nonlinear inverse problem to which we apply the methodology of ensemble Kalman inversion (EKI). EKI is chosen because it may be formulated in terms of the iterative solution of quadratic optimization problems; sparsity is then easily imposed. We then apply the EKI-based sparse learning methodology to various examples governed by stochastic differential equations (a noisy Lorenz 63 system), ordinary differential equations (Lorenz 96 system and coalescence equations), and a partial differential equation (the Kuramoto-Sivashinsky equation). The results demonstrate that time-averaged statistics can be used for data-driven discovery of differential equations using sparse EKI. The proposed sparse learning methodology extends the scope of data-driven discovery of differential equations to previously challenging applications and data-acquisition scenarios.
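As a rough illustration of the approach described above, the following sketch performs one ensemble Kalman inversion update against time-averaged statistics and then soft-thresholds the coefficients to impose sparsity. The function names, the thresholding rule, and all hyperparameters are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def sparse_eki_step(theta, G, y, gamma, thresh=0.1, rng=None):
    """One EKI update on time-averaged statistics, then soft thresholding.

    theta : (J, p) ensemble of candidate coefficient vectors
    G     : forward map sending a (p,) coefficient vector to (d,) statistics
    y     : (d,) observed time-averaged statistics
    gamma : (d, d) observational noise covariance
    """
    rng = rng or np.random.default_rng(0)
    J, d = theta.shape[0], y.size
    g = np.array([G(t) for t in theta])        # (J, d) model statistics
    dth, dg = theta - theta.mean(0), g - g.mean(0)
    C_tg = dth.T @ dg / J                      # parameter-data cross-covariance
    C_gg = dg.T @ dg / J                       # data covariance
    K = C_tg @ np.linalg.inv(C_gg + gamma)     # Kalman gain (p, d)
    y_pert = y + rng.multivariate_normal(np.zeros(d), gamma, size=J)
    theta = theta + (y_pert - g) @ K.T
    # promote sparsity: soft-threshold small coefficients toward zero
    return np.sign(theta) * np.maximum(np.abs(theta) - thresh, 0.0)
```

Iterating this step drives the ensemble toward coefficient vectors whose predicted statistics match the data, while the thresholding zeroes negligible terms.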

Publication: Journal of Computational Physics Vol.: 470 ISSN: 0021-9991

ID: CaltechAUTHORS:20221013-45138000.1


Abstract: Data required to calibrate uncertain general circulation model (GCM) parameterizations are often only available in limited regions or time periods, for example, observational data from field campaigns, or data generated in local high-resolution simulations. This raises the question of where and when to acquire additional data to be maximally informative about parameterizations in a GCM. Here we construct a new ensemble-based parallel algorithm to automatically target data acquisition to regions and times that maximize the uncertainty reduction, or information gain, about GCM parameters. The algorithm uses a Bayesian framework that exploits a quantified distribution of GCM parameters as a measure of uncertainty. This distribution is informed by time-averaged climate statistics restricted to local regions and times. The algorithm is embedded in the recently developed calibrate-emulate-sample framework, which performs efficient model calibration and uncertainty quantification with only O(10²) model evaluations, compared with O(10⁵) evaluations typically needed for traditional approaches to Bayesian calibration. We demonstrate the algorithm with an idealized GCM, with which we generate surrogates of local data. In this perfect-model setting, we calibrate parameters and quantify uncertainties in a quasi-equilibrium convection scheme in the GCM. We consider targeted data that are (a) localized in space for statistically stationary simulations, and (b) localized in space and time for seasonally varying simulations. In these proof-of-concept applications, the calculated information gain reflects the reduction in parametric uncertainty obtained from Bayesian inference when harnessing a targeted sample of data. The largest information gain typically, but not always, results from regions near the intertropical convergence zone.
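One generic way to score a candidate data batch, in the spirit of the information gain described above, is the KL divergence between Gaussian approximations of the parameter distribution before and after assimilating the batch. The Gaussian/KL choices below are assumptions for illustration, not the paper's exact definition.

```python
import numpy as np

def gaussian_information_gain(m_prior, C_prior, m_post, C_post):
    """KL(posterior || prior) for Gaussian parameter distributions, used
    here as an illustrative information-gain score for a data batch."""
    k = m_prior.size
    P = np.linalg.inv(C_prior)
    d = m_post - m_prior
    _, logdet_prior = np.linalg.slogdet(C_prior)
    _, logdet_post = np.linalg.slogdet(C_post)
    return 0.5 * (np.trace(P @ C_post) + d @ P @ d - k
                  + logdet_prior - logdet_post)
```

A batch that leaves the distribution unchanged scores zero; a batch that shrinks the posterior covariance scores positively, so the algorithm would target the region or season with the largest score.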

Publication: Journal of Advances in Modeling Earth Systems Vol.: 14 No.: 9 ISSN: 1942-2466

ID: CaltechAUTHORS:20220926-576391900.2


Abstract: This paper is focused on the optimization approach to the solution of inverse problems. We introduce a stochastic dynamical system in which the parameter-to-data map is embedded, with the goal of employing techniques from nonlinear Kalman filtering to estimate the parameter given the data. The extended Kalman filter (which we refer to as ExKI in the context of inverse problems) can be effective for some inverse problems approached this way, but is impractical when the forward map is not readily differentiable and is given as a black box, and also for high dimensional parameter spaces because of the need to propagate large covariance matrices. Application of ensemble Kalman filters, for example use of the ensemble Kalman inversion (EKI) algorithm, has emerged as a useful tool which overcomes both of these issues: it is derivative free and works with a low-rank covariance approximation formed from the ensemble. In this paper, we work with the ExKI, EKI, and a variant on EKI which we term unscented Kalman inversion (UKI). The paper contains two main contributions. Firstly, we identify a novel stochastic dynamical system in which the parameter-to-data map is embedded. We present theory in the linear case to show exponential convergence of the mean of the filtering distribution to the solution of a regularized least squares problem. This is in contrast to previous work in which the EKI has been employed where the dynamical system used leads to algebraic convergence to an unregularized problem. Secondly, we show that the application of the UKI to this novel stochastic dynamical system yields improved inversion results, in comparison with the application of EKI to the same novel stochastic dynamical system. 
The numerical experiments include proof-of-concept linear examples and various applied nonlinear inverse problems: learning of permeability parameters in subsurface flow; learning the damage field from structure deformation; learning the Navier-Stokes initial condition from solution data at positive times; learning subgrid-scale parameters in a general circulation model (GCM) from time-averaged statistics.
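In the linear case, the regularized least-squares problem to which the filtering mean converges exponentially can be written schematically as follows, assuming a linear forward map A, Gaussian noise covariance Γ, and prior mean and covariance m₀, C₀ (generic notation; the paper's precise formulation may differ):

```latex
\theta^{\star} \;=\; \arg\min_{\theta}\;
  \tfrac{1}{2}\,\bigl\|\Gamma^{-1/2}\bigl(y - A\theta\bigr)\bigr\|^{2}
  \;+\; \tfrac{1}{2}\,\bigl\|C_{0}^{-1/2}\bigl(\theta - m_{0}\bigr)\bigr\|^{2}
```

The second term is the regularization absent from the earlier EKI dynamical system, which converges only algebraically to the unregularized first term.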

Publication: Journal of Computational Physics Vol.: 463 ISSN: 0021-9991

ID: CaltechAUTHORS:20210719-210149563


Abstract: The increasing availability of data presents an opportunity to calibrate unknown parameters which appear in complex models of phenomena in the biomedical, physical, and social sciences. However, model complexity often leads to parameter-to-data maps which are expensive to evaluate and are only available through noisy approximations. This paper is concerned with the use of interacting particle systems for the solution of the resulting inverse problems for parameters. Of particular interest is the case where the available forward model evaluations are subject to rapid fluctuations, in parameter space, superimposed on the smoothly varying large-scale parametric structure of interest. A motivating example from climate science is presented, and ensemble Kalman methods (which do not use the derivative of the parameter-to-data map) are shown, empirically, to perform well. Multiscale analysis is then used to analyze the behavior of interacting particle system algorithms when rapid fluctuations, which we refer to as noise, pollute the large-scale parametric dependence of the parameter-to-data map. Ensemble Kalman methods and Langevin-based methods (the latter use the derivative of the parameter-to-data map) are compared in this light. The ensemble Kalman methods are shown to behave favorably in the presence of noise in the parameter-to-data map, whereas Langevin methods are adversely affected. On the other hand, Langevin methods have the correct equilibrium distribution in the setting of noise-free forward models, while ensemble Kalman methods only provide an uncontrolled approximation, except in the linear case. Therefore a new class of algorithms, ensemble Gaussian process samplers, which combine the benefits of both ensemble Kalman and Langevin methods, is introduced and shown to perform favorably.

Publication: SIAM Journal on Applied Dynamical Systems Vol.: 21 No.: 2 ISSN: 1536-0040

ID: CaltechAUTHORS:20210412-121307581


Abstract: We propose a novel method for sampling and optimization tasks based on a stochastic interacting particle system. We explain how this method can be used for the following two goals: (i) generating approximate samples from a given target distribution and (ii) optimizing a given objective function. The approach is derivative-free and affine invariant, and is therefore well-suited for solving inverse problems defined by complex forward models: (i) allows generation of samples from the Bayesian posterior and (ii) allows determination of the maximum a posteriori estimator. We investigate the properties of the proposed family of methods in terms of various parameter choices, both analytically and by means of numerical simulations. The analysis and numerical simulation establish that the method has potential for general purpose optimization tasks over Euclidean space; contraction properties of the algorithm are established under suitable conditions, and computational experiments demonstrate wide basins of attraction for various specific problems. The analysis and experiments also demonstrate the potential for the sampling methodology in regimes in which the target distribution is unimodal and close to Gaussian; indeed we prove that the method recovers a Laplace approximation to the measure in certain parametric regimes and provide numerical evidence that this Laplace approximation attracts a large set of initial conditions in a number of examples.
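For a concrete flavor of derivative-free interacting-particle optimization, here is one step of a generic consensus-based scheme. It is in the same spirit as, but not identical to, the method proposed in the paper, and all parameter values are illustrative.

```python
import numpy as np

def cbo_step(x, f, beta=30.0, lam=1.0, sigma=0.5, dt=0.1, rng=None):
    """One step of consensus-based optimization (generic illustration).

    x : (J, d) particle positions; f : objective taking a (d,) vector.
    Particles drift toward an exponentially weighted consensus point and
    diffuse proportionally to their distance from it (derivative-free).
    """
    rng = rng or np.random.default_rng(0)
    fx = np.array([f(xi) for xi in x])
    w = np.exp(-beta * (fx - fx.min()))      # stabilized Gibbs weights
    m = w @ x / w.sum()                      # weighted consensus point
    noise = rng.normal(size=x.shape)
    # drift toward the consensus point, diffuse relative to distance from it
    return x - lam * dt * (x - m) + sigma * np.sqrt(dt) * (x - m) * noise
```

Because the noise is scaled by the distance to the consensus point, the ensemble contracts onto a single candidate optimizer, mirroring the contraction properties established in the paper under suitable conditions.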

Publication: Studies in Applied Mathematics Vol.: 148 No.: 3 ISSN: 0022-2526

ID: CaltechAUTHORS:20210719-210142693


Abstract: The recent decades have seen various attempts at accelerating the process of developing materials targeted towards specific applications. The performance required for a particular application leads to the choice of a particular material system whose properties are optimized by manipulating its underlying microstructure through processing. The specific configuration of the structure is then designed by characterizing the material in detail, and using this characterization along with physical principles in system-level simulations and optimization. These efforts have been advanced by multiscale modeling of materials, high-throughput experimentation, materials databases, topology optimization and other ideas. Still, developing materials for extreme applications involving large deformation, high strain rates and high temperatures remains a challenge. This article reviews a number of recent methods that advance the goal of designing materials targeted towards specific applications.

Publication: Mechanics of Materials Vol.: 165 ISSN: 0167-6636

ID: CaltechAUTHORS:20220121-968309000


Abstract: Inverse problems are ubiquitous because they formalize the integration of data with mathematical models. In many scientific applications the forward model is expensive to evaluate, and adjoint computations are difficult to employ; in this setting derivative-free methods which involve a small number of forward model evaluations are an attractive proposition. Ensemble Kalman-based interacting particle systems (and variants such as consensus-based and unscented Kalman approaches) have proven empirically successful in this context, but suffer from the fact that they cannot be systematically refined to return the true solution, except in the setting of linear forward models [A. Garbuno-Inigo et al., SIAM J. Appl. Dyn. Syst., 19 (2020), pp. 412-441]. In this paper, we propose a new derivative-free approach to Bayesian inversion, which may be employed for posterior sampling or for maximum a posteriori estimation, and may be systematically refined. The method relies on a fast/slow system of stochastic differential equations for the local approximation of the gradient of the log-likelihood appearing in a Langevin diffusion. Furthermore the method may be preconditioned by use of information from ensemble Kalman-based methods (and variants), providing a methodology which leverages the documented advantages of those methods, while also being provably refinable. We define the methodology, highlighting its flexibility and many variants, provide a theoretical analysis of the proposed approach, and demonstrate its efficacy by means of numerical experiments.

Publication: SIAM Journal on Applied Dynamical Systems Vol.: 21 No.: 1 ISSN: 1536-0040

ID: CaltechAUTHORS:20210719-210152979


Abstract: The macroscopic properties of materials that we observe and exploit in engineering applications result from complex interactions between physics at multiple length and time scales: electronic, atomistic, defects, domains, etc. Multiscale modeling seeks to understand these interactions by exploiting the inherent hierarchy in which the behavior at a coarser scale regulates and averages the behavior at a finer scale. This requires the repeated solution of computationally expensive finer-scale models, and often a priori knowledge of those aspects of the finer-scale behavior that affect the coarser scale (order parameters, state variables, descriptors, etc.). We address this challenge in a two-scale setting where we learn the fine-scale behavior from off-line calculations and then use the learnt behavior directly in coarse-scale calculations. The approach builds on the recent success of deep neural networks by combining their approximation power in high dimensions with ideas from model reduction. It results in a neural network approximation that has high fidelity, is computationally inexpensive, is independent of the need for a priori knowledge, and can be used directly in the coarse-scale calculations. We demonstrate the approach on problems involving the impact of magnesium, a promising light-weight structural and protective material.

Publication: Journal of the Mechanics and Physics of Solids Vol.: 158 ISSN: 0022-5096

ID: CaltechAUTHORS:20210225-132721680


Abstract: Graph Laplacians computed from weighted adjacency matrices are widely used to identify geometric structure in data, and clusters in particular; their spectral properties play a central role in a number of unsupervised and semi-supervised learning algorithms. When suitably scaled, graph Laplacians approach limiting continuum operators in the large data limit. Studying these limiting operators, therefore, sheds light on learning algorithms. This paper is devoted to the study of a parameterized family of divergence form elliptic operators that arise as the large data limit of graph Laplacians. The link between a three-parameter family of graph Laplacians and a three-parameter family of differential operators is explained. The spectral properties of these differential operators are analyzed in the situation where the data comprises two nearly separated clusters, in a sense which is made precise. In particular, we investigate how the spectral gap depends on the three parameters entering the graph Laplacian, and on a parameter measuring the size of the perturbation from the perfectly clustered case. Numerical results are presented which exemplify the analysis and which extend it in the following ways: the computations study situations in which there are two nearly separated clusters, but which violate the assumptions used in our theory; situations in which more than two clusters are present, also going beyond our theory; and situations which demonstrate the relevance of our studies of differential operators for the understanding of finite data problems via the graph Laplacian. The findings provide insight into parameter choices made in learning algorithms which are based on weighted adjacency matrices; they also provide the basis for analysis of the consistency of various unsupervised and semi-supervised learning algorithms, in the large data limit.
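A minimal construction of one symmetric slice of such a family of graph Laplacians, assuming Gaussian edge weights on a point cloud; the single-exponent parameterization below is a simplification of the three-parameter family studied in the paper.

```python
import numpy as np

def graph_laplacian(X, eps=0.3, p=0.0):
    """Graph Laplacian D^{-p} (D - W) D^{-p} from Gaussian weights.

    X : (n, d) point cloud. p = 0 gives the unnormalized Laplacian,
    p = 1/2 the symmetric normalized one (an illustrative slice only).
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * eps**2))        # Gaussian weighted adjacency
    np.fill_diagonal(W, 0.0)
    D = W.sum(1)                          # vertex degrees
    Dp = np.diag(D ** (-p))
    return Dp @ (np.diag(D) - W) @ Dp
```

For two nearly separated clusters, the second eigenvalue is close to zero while the third remains bounded away from zero, so the size of the spectral gap signals the cluster structure.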

Publication: Applied and Computational Harmonic Analysis Vol.: 56 ISSN: 1063-5203

ID: CaltechAUTHORS:20200331-075759863


Abstract: We introduce a simple, rigorous, and unified framework for solving nonlinear partial differential equations (PDEs), and for solving inverse problems (IPs) involving the identification of parameters in PDEs, using the framework of Gaussian processes. The proposed approach: (1) provides a natural generalization of collocation kernel methods to nonlinear PDEs and IPs; (2) has guaranteed convergence for a very general class of PDEs, and comes equipped with a path to compute error bounds for specific PDE approximations; (3) inherits the state-of-the-art computational complexity of linear solvers for dense kernel matrices. The main idea of our method is to approximate the solution of a given PDE as the maximum a posteriori (MAP) estimator of a Gaussian process conditioned on solving the PDE at a finite number of collocation points. Although this optimization problem is infinite-dimensional, it can be reduced to a finite-dimensional one by introducing additional variables corresponding to the values of the derivatives of the solution at collocation points; this generalizes the representer theorem arising in Gaussian process regression. The reduced optimization problem has the form of a quadratic objective function subject to nonlinear constraints; it is solved with a variant of the Gauss–Newton method. The resulting algorithm (a) can be interpreted as solving successive linearizations of the nonlinear PDE, and (b) in practice is found to converge in a small number of iterations (2 to 10), for a wide range of PDEs. Most traditional approaches to IPs interleave parameter updates with numerical solution of the PDE; our algorithm solves for both parameter and PDE solution simultaneously. Experiments on nonlinear elliptic PDEs, Burgers' equation, a regularized Eikonal equation, and an IP for permeability identification in Darcy flow illustrate the efficacy and scope of our framework.
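In the linear special case the method reduces to a single kernel solve, which can be sketched as follows for the Poisson problem u'' = f on (0, 1) with zero boundary values; the squared-exponential kernel, lengthscale, number of collocation points, and jitter are illustrative choices.

```python
import numpy as np

ELL = 0.1  # kernel lengthscale (illustrative choice)

def k(x, y):                      # squared-exponential kernel
    r = x[:, None] - y[None, :]
    return np.exp(-r**2 / (2 * ELL**2))

def Lk(x, y):                     # d^2/dy^2 k (equals d^2/dx^2 k by symmetry)
    r = x[:, None] - y[None, :]
    return (r**2 / ELL**4 - 1 / ELL**2) * np.exp(-r**2 / (2 * ELL**2))

def LLk(x, y):                    # d^2/dx^2 d^2/dy^2 k
    r = x[:, None] - y[None, :]
    return (3 / ELL**4 - 6 * r**2 / ELL**6 + r**4 / ELL**8) \
        * np.exp(-r**2 / (2 * ELL**2))

def solve_poisson(f, n=25):
    """GP MAP estimate conditioned on u'' = f at n collocation points and
    on u(0) = u(1) = 0 (linear PDE, so conditioning is a single solve)."""
    xc = np.linspace(0.0, 1.0, n)
    xb = np.array([0.0, 1.0])
    # covariance of the observed vector [u(xb), u''(xc)]
    K = np.block([[k(xb, xb),  Lk(xb, xc)],
                  [Lk(xc, xb), LLk(xc, xc)]])
    rhs = np.concatenate([np.zeros(2), f(xc)])
    alpha = np.linalg.solve(K + 1e-8 * np.eye(n + 2), rhs)
    # posterior mean at new points via the cross-covariances
    return lambda x: np.hstack([k(x, xb), Lk(x, xc)]) @ alpha
```

For nonlinear PDEs the constraints become nonlinear and, as described above, the paper's algorithm instead solves successive linearizations of this type with a Gauss-Newton variant.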

Publication: Journal of Computational Physics Vol.: 447 ISSN: 0021-9991

ID: CaltechAUTHORS:20210719-210146136


Abstract: We study the problem of drift estimation for two-scale continuous time series. We set ourselves in the framework of overdamped Langevin equations, for which a single-scale surrogate homogenized equation exists. In this setting, estimating the drift coefficient of the homogenized equation requires pre-processing of the data, often in the form of subsampling; this is because the two-scale equation and the homogenized single-scale equation are incompatible at small scales, generating mutually singular measures on the path space. We avoid subsampling and work instead with filtered data, found by application of an appropriate kernel function, and compute maximum likelihood estimators based on the filtered process. We show that the estimators we propose are asymptotically unbiased and demonstrate numerically the advantages of our method with respect to subsampling. Finally, we show how our filtered data methodology can be combined with Bayesian techniques and provide a full uncertainty quantification of the inference procedure.
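For orientation, the standard single-scale drift estimator that the filtered-data estimators generalize can be sketched on simulated Ornstein-Uhlenbeck data; no filtering or subsampling is performed here, and the model and parameter values are illustrative.

```python
import numpy as np

def drift_mle(x, dt):
    """MLE of alpha in dX = -alpha X dt + sigma dW from a discrete path
    (the classical single-scale estimator; not the paper's filtered one)."""
    num = -np.sum(x[:-1] * np.diff(x))     # approximates -int X dX
    den = np.sum(x[:-1] ** 2) * dt         # approximates  int X^2 dt
    return num / den

# simulate an Ornstein-Uhlenbeck path by Euler-Maruyama
rng = np.random.default_rng(6)
alpha, sigma, dt, n = 2.0, 1.0, 0.01, 50_000
x = np.zeros(n)
for i in range(1, n):
    x[i] = x[i-1] - alpha * x[i-1] * dt + sigma * np.sqrt(dt) * rng.normal()
```

Applied directly to two-scale data this estimator is biased, which is exactly why the paper replaces the raw path by a kernel-filtered process before forming the likelihood.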

Publication: Foundations of Computational Mathematics ISSN: 1615-3375

ID: CaltechAUTHORS:20201109-141017891


Abstract: Graph-based semi-supervised regression (SSR) involves estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices; it can be formulated as a Bayesian inverse problem. This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered vertices. We present a Bayesian formulation of SSR in which the weighted graph defines a Gaussian prior, using a graph Laplacian, and the labeled data defines a likelihood. We analyze the rate of contraction of the posterior measure around the ground truth in terms of parameters that quantify the small label error and inherent clustering in the graph. We obtain bounds on the rates of contraction and illustrate their sharpness through numerical experiments. The analysis also gives insight into the choice of hyperparameters that enter the definition of the prior.

Publication: Inverse Problems Vol.: 37 No.: 10 ISSN: 0266-5611

ID: CaltechAUTHORS:20201109-141014452


Abstract: Well known to the machine learning community, the random feature model is a parametric approximation to kernel interpolation or regression methods. It is typically used to approximate functions mapping a finite-dimensional input space to the real line. In this paper, we instead propose a methodology for use of the random feature model as a data-driven surrogate for operators that map an input Banach space to an output Banach space. Although the methodology is quite general, we consider operators defined by partial differential equations (PDEs); here, the inputs and outputs are themselves functions, with the input parameters being functions required to specify the problem, such as initial data or coefficients, and the outputs being solutions of the problem. Upon discretization, the model inherits several desirable attributes from this infinite-dimensional viewpoint, including mesh-invariant approximation error with respect to the true PDE solution map and the capability to be trained at one mesh resolution and then deployed at different mesh resolutions. We view the random feature model as a nonintrusive data-driven emulator, provide a mathematical framework for its interpretation, and demonstrate its ability to efficiently and accurately approximate the nonlinear parameter-to-solution maps of two prototypical PDEs arising in physical science and engineering applications: the viscous Burgers' equation and a variable coefficient elliptic equation.
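A minimal sketch of a random feature surrogate between discretized function spaces: inputs and outputs are functions sampled on grids, the features are random cosines, and the output coefficients are fit by ridge-regularized least squares. The feature map, regularization, and dimensions are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def fit_rf_operator(U, V, m=300, scale=1.0, reg=1e-6, rng=None):
    """Random-feature surrogate for a map between discretized functions.

    U : (N, p) input functions sampled on a p-point grid
    V : (N, q) corresponding output functions on a q-point grid
    Returns a callable approximating the input-to-output map.
    """
    rng = rng or np.random.default_rng(0)
    W = rng.normal(scale=scale, size=(m, U.shape[1]))   # random frequencies
    b = rng.uniform(0, 2 * np.pi, size=m)               # random phases
    def features(X):
        return np.sqrt(2.0 / m) * np.cos(X @ W.T + b)
    Phi = features(U)                                   # (N, m) feature matrix
    # ridge-regularized least squares for the output coefficients
    C = np.linalg.solve(Phi.T @ Phi + reg * np.eye(m), Phi.T @ V)
    return lambda X: features(X) @ C
```

Because the learned object acts on function values rather than on a fixed mesh index, refining or coarsening the grids (with consistent sampling) leaves the construction unchanged, which is the discretization-invariance property emphasized above.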

Publication: SIAM Journal on Scientific Computing Vol.: 43 No.: 5 ISSN: 1064-8275

ID: CaltechAUTHORS:20200527-073449881


Abstract: Parameters in climate models are usually calibrated manually, exploiting only small subsets of the available data. This precludes both optimal calibration and quantification of uncertainties. Traditional Bayesian calibration methods that allow uncertainty quantification are too expensive for climate models; they are also not robust in the presence of internal climate variability. For example, Markov chain Monte Carlo (MCMC) methods typically require O(10⁵) model runs and are sensitive to internal variability noise, rendering them infeasible for climate models. Here we demonstrate an approach to model calibration and uncertainty quantification that requires only O(10²) model runs and can accommodate internal climate variability. The approach consists of three stages: (a) a calibration stage uses variants of ensemble Kalman inversion to calibrate a model by minimizing mismatches between model and data statistics; (b) an emulation stage emulates the parameter-to-data map with Gaussian processes (GP), using the model runs in the calibration stage for training; (c) a sampling stage approximates the Bayesian posterior distributions by sampling the GP emulator with MCMC. We demonstrate the feasibility and computational efficiency of this calibrate-emulate-sample (CES) approach in a perfect-model setting. Using an idealized general circulation model, we estimate parameters in a simple convection scheme from synthetic data generated with the model. The CES approach generates probability distributions of the parameters that are good approximations of the Bayesian posteriors, at a fraction of the computational cost usually required to obtain them. Sampling from this approximate posterior allows the generation of climate predictions with quantified parametric uncertainties.

Publication: Journal of Advances in Modeling Earth Systems Vol.: 13 No.: 9 ISSN: 1942-2466

ID: CaltechAUTHORS:20210113-143919927


Abstract: Data-driven prediction is becoming increasingly widespread as the volume of data available grows and as algorithmic development matches this growth. The nature of the predictions made and the manner in which they should be interpreted depend crucially on the extent to which the variables chosen for prediction are Markovian or approximately Markovian. Multiscale systems provide a framework in which this issue can be analyzed. In this work kernel analog forecasting methods are studied from the perspective of data generated by multiscale dynamical systems. The problems chosen exhibit a variety of different Markovian closures, using both averaging and homogenization; furthermore, settings where scale separation is not present and the predicted variables are non-Markovian are also considered. The studies provide guidance for the interpretation of data-driven prediction methods when used in practice.

Publication: Multiscale Modeling and Simulation Vol.: 19 No.: 2 ISSN: 1540-3459

ID: CaltechAUTHORS:20201109-140959408


Abstract: Gaussian process regression has proven very powerful in statistics, machine learning and inverse problems. A crucial aspect of the success of this methodology, in a wide range of applications to complex and real-world problems, is hierarchical modeling and learning of hyperparameters. The purpose of this paper is to study two paradigms of learning hierarchical parameters: one is from the probabilistic Bayesian perspective, in particular, the empirical Bayes approach that has been largely used in Bayesian statistics; the other is from the deterministic and approximation theoretic view, and in particular the kernel flow algorithm that was proposed recently in the machine learning literature. Analysis of their consistency in the large data limit, as well as explicit identification of their implicit bias in parameter learning, are established in this paper for a Matérn-like model on the torus. A particular technical challenge we overcome is the learning of the regularity parameter in the Matérn-like field, for which consistency results have been very scarce in the spatial statistics literature. Moreover, we conduct extensive numerical experiments beyond the Matérn-like model, comparing the two algorithms further. These experiments demonstrate learning of other hierarchical parameters, such as amplitude and lengthscale; they also illustrate the setting of model misspecification in which the kernel flow approach could show superior performance to the more traditional empirical Bayes approach.

Publication: Mathematics of Computation Vol.: 90 ISSN: 0025-5718

ID: CaltechAUTHORS:20201109-141002843


Abstract: Many parameter estimation problems arising in applications can be cast in the framework of Bayesian inversion. This allows not only for an estimate of the parameters, but also for the quantification of uncertainties in the estimates. Often in such problems the parameter-to-data map is very expensive to evaluate, and computing derivatives of the map, or derivative-adjoints, may not be feasible. Additionally, in many applications only noisy evaluations of the map may be available. We propose an approach to Bayesian inversion in such settings that builds on the derivative-free optimization capabilities of ensemble Kalman inversion methods. The overarching approach is to first use ensemble Kalman sampling (EKS) to calibrate the unknown parameters to fit the data; second, to use the output of the EKS to emulate the parameter-to-data map; third, to sample from an approximate Bayesian posterior distribution in which the parameter-to-data map is replaced by its emulator. This results in a principled approach to approximate Bayesian inference that requires only a small number of evaluations of the (possibly noisy approximation of the) parameter-to-data map. It does not require derivatives of this map, but instead leverages the documented power of ensemble Kalman methods. Furthermore, the EKS has the desirable property that it evolves the parameter ensemble towards the regions in which the bulk of the parameter posterior mass is located, thereby locating them well for the emulation phase of the methodology. In essence, the EKS methodology provides a cheap solution to the design problem of where to place points in parameter space to efficiently train an emulator of the parameter-to-data map for the purposes of Bayesian inversion.

Publication: Journal of Computational Physics Vol.: 424 ISSN: 0021-9991

ID: CaltechAUTHORS:20200402-140348174


Abstract: Gradient descent-based optimization methods underpin the parameter training of neural networks, and hence comprise a significant component in the impressive test results found in a number of applications. Introducing stochasticity is key to their success in practical problems, and there is some understanding of the role of stochastic gradient descent in this context. Momentum modifications of gradient descent such as Polyak's Heavy Ball method (HB) and Nesterov's method of accelerated gradients (NAG), are also widely adopted. In this work our focus is on understanding the role of momentum in the training of neural networks, concentrating on the common situation in which the momentum contribution is fixed at each step of the algorithm. To expose the ideas simply we work in the deterministic setting. Our approach is to derive continuous time approximations of the discrete algorithms; these continuous time approximations provide insights into the mechanisms at play within the discrete algorithms. We prove three such approximations. Firstly we show that standard implementations of fixed momentum methods approximate a time-rescaled gradient descent flow, asymptotically as the learning rate shrinks to zero; this result does not distinguish momentum methods from pure gradient descent, in the limit of vanishing learning rate. We then proceed to prove two results aimed at understanding the observed practical advantages of fixed momentum methods over gradient descent, when implemented in the non-asymptotic regime with fixed small, but non-zero, learning rate. We achieve this by proving approximations to continuous time limits in which the small but fixed learning rate appears as a parameter; this is known as the method of modified equations in the numerical analysis literature, recently rediscovered as the high resolution ODE approximation in the machine learning context. 
In our second result we show that the momentum method is approximated by a continuous time gradient flow, with an additional momentum-dependent second order time-derivative correction, proportional to the learning rate; this may be used to explain the stabilizing effect of momentum algorithms in their transient phase. Furthermore in a third result we show that the momentum methods admit an exponentially attractive invariant manifold on which the dynamics reduces, approximately, to a gradient flow with respect to a modified loss function, equal to the original loss function plus a small perturbation proportional to the learning rate; this small correction provides convexification of the loss function and encodes additional robustness present in momentum methods, beyond the transient phase.
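The fixed-momentum (Heavy Ball) iteration discussed above can be sketched as follows; per the first result, as the learning rate shrinks to zero these iterates track a gradient descent flow rescaled in time by 1/(1 - beta). Parameter values are illustrative.

```python
import numpy as np

def heavy_ball(grad, x0, lr=0.01, beta=0.9, steps=500):
    """Gradient descent with a fixed momentum coefficient (Polyak's HB)."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * grad(x)   # momentum accumulates past gradients
        x = x + v
    return x
```

On a convex quadratic this reaches the same minimizer as plain gradient descent, consistent with the first result, while the transient oscillations it exhibits are what the second, modified-equation result explains.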

Publication: Journal of Machine Learning Research Vol.: 22 No.: 17 ISSN: 1533-7928

ID: CaltechAUTHORS:20210503-091850360


Abstract: Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular the extensions, to this graph setting, of the probit algorithm, level set and kriging methods are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a, possibly fractional, exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through Γ-convergence, using the recently introduced TL^p metric. The small labelling noise limits of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.

Publication: Applied and Computational Harmonic Analysis Vol.: 49 No.: 2 ISSN: 1063-5203

ID: CaltechAUTHORS:20190404-103712251

]]>

Abstract: A central theme in classical algorithms for the reconstruction of discontinuous functions from observational data is perimeter regularization via the use of total variation. On the other hand, sparse or noisy data often demand a probabilistic approach to the reconstruction of images, to enable uncertainty quantification; the Bayesian approach to inversion, which itself introduces a form of regularization, is a natural framework in which to carry this out. In this paper the link between Bayesian inversion methods and perimeter regularization is explored. Two links are studied: (i) the maximum a posteriori objective function of a suitably chosen Bayesian phase-field approach is shown to be closely related to a least squares plus perimeter regularization objective; (ii) sample paths of a suitably chosen Bayesian level set formulation are shown to possess a finite perimeter and to have the ability to learn about the true perimeter.

Publication: SIAM Journal on Scientific Computing Vol.: 42 No.: 4 ISSN: 1064-8275

ID: CaltechAUTHORS:20170612-125032088

]]>

Abstract: Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.
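
As a toy illustration of the kind of objective described above, the sketch below substitutes a quadratic (least-squares) fidelity for the probit likelihood and the simple regulariser L + τI for a general rational function of the Laplacian; the two-cluster data, edge weights, and parameters are all invented for the example:

```python
import numpy as np

# Two well-separated clusters in R^2; one labelled point per cluster.
X = np.array([[0., 0.], [0.2, 0.], [0., 0.2], [0.1, 0.1],
              [5., 5.], [5.2, 5.], [5., 5.2], [5.1, 5.1]])
labels = {0: 1.0, 4: -1.0}              # node index -> binary label
n = len(X)

# Gaussian edge weights and unnormalised graph Laplacian L = D - W.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 2.0)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(1)) - W

# Minimise u^T (L + tau I) u + (1/gamma^2) sum_labelled (u_j - y_j)^2,
# which has the closed-form solution of a linear system.
tau, gamma = 0.1, 0.1
A = L + tau * np.eye(n)
b = np.zeros(n)
for j, y in labels.items():
    A[j, j] += 1.0 / gamma**2
    b[j] += y / gamma**2
u = np.linalg.solve(A, b)
pred = np.sign(u)                        # propagated labels
```

The quadratic form involving only unlabelled data and the fidelity term involving only labelled data are visible directly in the assembled system.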

Publication: Journal of Machine Learning Research Vol.: 21 ISSN: 1533-7928

ID: CaltechAUTHORS:20190722-075729960

]]>

Abstract: We present a sequential data assimilation algorithm based on the ensemble Kalman inversion to estimate the near‐surface shear‐wave velocity profile and damping; this is applicable when heterogeneous data and a priori information that can be represented in forms of (physical) equality and inequality constraints in the inverse problem are available. Although noninvasive methods, such as surface‐wave testing, are efficient and cost‐effective methods for inferring a V_S profile, one should acknowledge that site characterization using inverse analyses can yield erroneous results associated with the lack of inverse problem uniqueness. One viable solution to alleviate the ill-posedness of the inverse problem is to enrich the prior knowledge and/or the data space with complementary observations. In the case of noninvasive methods, the pertinent data are the dispersion curve of surface waves, typically resolved by means of active source methods at high frequencies and passive methods at low frequencies. To improve the suitability of the inverse problem, horizontal‐to‐vertical spectral ratio data are commonly used jointly with the dispersion data in the inversion. In this article, we show that the joint inversion of dispersion and strong‐motion downhole array data can also reduce the margins of uncertainty in the V_S profile estimation. This is because acceleration time series recorded at downhole arrays include both body and surface waves and therefore can enrich the observational data space in the inverse problem setting. We also show how the proposed algorithm can be modified to systematically incorporate physical constraints that further enhance its suitability. We use both synthetic and real data to examine the performance of the proposed framework in estimation of the V_S profile and damping at the Garner Valley downhole array and compare them against the V_S estimations in previous studies.

Publication: Bulletin of the Seismological Society of America Vol.: 110 No.: 3 ISSN: 0037-1106

ID: CaltechAUTHORS:20200506-121245893

]]>

Abstract: Many naturally occurring models in the sciences are well approximated by simplified models using multiscale techniques. In such settings it is natural to ask about the relationship between inverse problems defined by the original problem and by the multiscale approximation. We develop an approach to this problem and exemplify it in the context of optical tomographic imaging. Optical tomographic imaging is a technique for inferring the properties of biological tissue via measurements of the incoming and outgoing light intensity; it may be used as a medical imaging methodology. Mathematically, light propagation is modeled by the radiative transfer equation (RTE), and optical tomography amounts to reconstructing the scattering and the absorption coefficients in the RTE from boundary measurements. We study this problem in the Bayesian framework, focussing on the strong scattering regime. In this regime the forward RTE is close to the diffusion equation (DE). We study the RTE in the asymptotic regime where the forward problem approaches the DE and prove convergence of the inverse RTE to the inverse DE in both nonlinear and linear settings. Convergence is proved by studying the distance between the two posterior distributions using the Hellinger metric and using the Kullback-Leibler divergence.

Publication: Multiscale Modeling and Simulation Vol.: 18 No.: 2 ISSN: 1540-3459

ID: CaltechAUTHORS:20190722-155900728

]]>

Abstract: Ensemble Kalman inversion is a parallelizable methodology for solving inverse or parameter estimation problems. Although it is based on ideas from Kalman filtering, it may be viewed as a derivative-free optimization method. In its most basic form it regularizes ill-posed inverse problems through the subspace property: the solution found is in the linear span of the initial ensemble employed. In this work we demonstrate how further regularization can be imposed, incorporating prior information about the underlying unknown. In particular we study how to impose Tikhonov-like Sobolev penalties. As well as introducing this modified ensemble Kalman inversion methodology, we also study its continuous-time limit, proving ensemble collapse; in the language of multi-agent optimization this may be viewed as reaching consensus. We also conduct a suite of numerical experiments to highlight the benefits of Tikhonov regularization in the ensemble inversion context.
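
A minimal sketch of imposing a Tikhonov-like penalty within ensemble Kalman inversion on a linear toy problem. One common device, used below, is to extend the observation with a zero "measurement" of the unknown itself; the forward map, data, ensemble size, and Tikhonov weight are all illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear inverse problem y = G u + noise.
G = np.array([[1., 0.], [0., 1.], [1., 1.]])
u_true = np.array([1.0, -0.5])
y = G @ u_true + 0.01 * np.array([1., -1., 0.5])   # fixed small 'noise'
Gamma = 0.01 * np.eye(3)                           # noise covariance
lam = 0.1                                          # Tikhonov weight

def forward_ext(U):
    """Extended forward map: observe both G u and sqrt(lam) u."""
    return np.hstack([U @ G.T, np.sqrt(lam) * U])

y_ext = np.concatenate([y, np.zeros(2)])
Gamma_ext = np.block([[Gamma, np.zeros((3, 2))],
                      [np.zeros((2, 3)), np.eye(2)]])

J = 50
U = rng.normal(size=(J, 2))                        # initial ensemble

def data_misfit(U):
    return np.linalg.norm(forward_ext(U).mean(0) - y_ext)

m0, s0 = data_misfit(U), U.std(0).max()
for _ in range(30):                                # basic EKI iteration
    Gs = forward_ext(U)
    um, gm = U.mean(0), Gs.mean(0)
    Cug = (U - um).T @ (Gs - gm) / J
    Cgg = (Gs - gm).T @ (Gs - gm) / J
    U = U + (y_ext - Gs) @ (Cug @ np.linalg.inv(Cgg + Gamma_ext)).T
m1, s1 = data_misfit(U), U.std(0).max()
```

The shrinking ensemble spread illustrates the ensemble-collapse (consensus) behaviour proved in the continuous-time limit.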

Publication: SIAM Journal on Numerical Analysis Vol.: 58 No.: 2 ISSN: 0036-1429

ID: CaltechAUTHORS:20190719-130631059

]]>

Abstract: Discrete optimal transportation problems arise in various contexts in engineering, the sciences, and the social sciences. Often the underlying cost criterion is unknown, or only partly known, and the observed optimal solutions are corrupted by noise. In this paper we propose a systematic approach to infer unknown costs from noisy observations of optimal transportation plans. The algorithm requires only the ability to solve the forward optimal transport problem, which is a linear program, and to generate random numbers. It has a Bayesian interpretation and may also be viewed as a form of stochastic optimization. We illustrate the developed methodologies using the example of international migration flows. Reported migration flow data captures (noisily) the number of individuals moving from one country to another in a given period of time. It can be interpreted as a noisy observation of an optimal transportation map, with costs related to the geographical position of countries. We use a graph-based formulation of the problem, with countries at the nodes of graphs and nonzero weighted adjacencies only on edges between countries which share a border. We use the proposed algorithm to estimate the weights, which represent cost of transition, and to quantify uncertainty in these weights.

Publication: SIAM Journal on Applied Mathematics Vol.: 80 No.: 1 ISSN: 0036-1399

ID: CaltechAUTHORS:20190722-082837777

]]>

Abstract: Solving inverse problems without the use of derivatives or adjoints of the forward model is highly desirable in many applications arising in science and engineering. In this paper we propose a new version of such a methodology, a framework for its analysis, and numerical evidence of the practicality of the method proposed. Our starting point is an ensemble of overdamped Langevin diffusions which interact through a single preconditioner computed as the empirical ensemble covariance. We demonstrate that the nonlinear Fokker--Planck equation arising from the mean-field limit of the associated stochastic differential equation (SDE) has a novel gradient flow structure, built on the Wasserstein metric and the covariance matrix of the noisy flow. Using this structure, we investigate large time properties of the Fokker--Planck equation, showing that its invariant measure coincides with that of a single Langevin diffusion, and demonstrating exponential convergence to the invariant measure in a number of settings. We introduce a new noisy variant on ensemble Kalman inversion (EKI) algorithms found from the original SDE by replacing exact gradients with ensemble differences; this defines the ensemble Kalman sampler (EKS). Numerical results are presented which demonstrate its efficacy as a derivative-free approximate sampler for the Bayesian posterior arising from inverse problems.
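
A one-dimensional sketch of the interacting-particle dynamics behind the ensemble Kalman sampler, assuming a linear forward map, Gaussian prior, and illustrative step size and ensemble size; finite-ensemble corrections are ignored. For this toy problem the posterior is N(1/2, 1/2):

```python
import numpy as np

rng = np.random.default_rng(1)

# 1D linear inverse problem: y = u + noise, prior N(0,1), noise N(0,1).
y, Gamma, sigma0 = 1.0, 1.0, 1.0
G = lambda u: u

J, dt, steps = 100, 0.02, 500
u = rng.normal(5.0, 1.0, size=J)     # start the ensemble far from the posterior

for _ in range(steps):
    g = G(u)
    um, gm = u.mean(), g.mean()
    C = ((u - um) ** 2).mean()       # empirical ensemble covariance (scalar)
    # Derivative-free drift: ensemble differences replace exact gradients.
    drift = -np.array([np.mean((g - gm) * (g[j] - y) / Gamma * (u - um))
                       for j in range(J)])
    drift -= C * u / sigma0**2       # prior contribution, preconditioned by C
    u = u + dt * drift + np.sqrt(2.0 * C * dt) * rng.normal(size=J)
```

The empirical covariance acts as the single shared preconditioner, and the noise scale sqrt(2C) is what distinguishes the sampler from the optimizing (EKI) dynamics.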

Publication: SIAM Journal on Applied Dynamical Systems Vol.: 19 No.: 1 ISSN: 1536-0040

ID: CaltechAUTHORS:20190722-103410192

]]>

Abstract: Probabilistic integration of a continuous dynamical system is a way of systematically introducing discretisation error, at scales no larger than errors introduced by standard numerical discretisation, in order to enable thorough exploration of possible responses of the system to inputs. It is thus a potentially useful approach in a number of applications such as forward uncertainty quantification, inverse problems, and data assimilation. We extend the convergence analysis of probabilistic integrators for deterministic ordinary differential equations, as proposed by Conrad et al. (Stat Comput 27(4):1065–1082, 2017. https://doi.org/10.1007/s11222-016-9671-0), to establish mean-square convergence in the uniform norm on discrete- or continuous-time solutions under relaxed regularity assumptions on the driving vector fields and their induced flows. Specifically, we show that randomised high-order integrators for globally Lipschitz flows and randomised Euler integrators for dissipative vector fields with polynomially bounded local Lipschitz constants all have the same mean-square convergence rate as their deterministic counterparts, provided that the variance of the integration noise is not of higher order than the corresponding deterministic integrator. These and similar results are proven for probabilistic integrators where the random perturbations may be state-dependent, non-Gaussian, or non-centred random variables.
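
The idea of a randomised integrator can be sketched as follows: each Euler step is perturbed by noise whose variance is of the same order as the squared local truncation error, so the deterministic convergence rate is retained. The test problem and noise scale below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def randomised_euler(f, x0, T, h, noise_scale=1.0):
    """Euler steps x_{k+1} = x_k + h f(x_k) + xi_k with additive
    perturbations xi_k ~ N(0, noise_scale^2 h^3); the noise variance
    matches the order of the squared local truncation error."""
    x, n = x0, int(round(T / h))
    for _ in range(n):
        x = x + h * f(x) + noise_scale * h**1.5 * rng.normal()
    return x

# Linear test problem x' = -x on [0, 1]; exact solution exp(-1) at T = 1.
h = 1e-3
x_T = randomised_euler(lambda x: -x, 1.0, 1.0, h)
err = abs(x_T - np.exp(-1.0))
```

The random perturbations here are additive and Gaussian for simplicity; as the abstract notes, the analysis covers state-dependent, non-Gaussian, and non-centred perturbations as well.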

Publication: Statistics and Computing Vol.: 29 No.: 6 ISSN: 0960-3174

ID: CaltechAUTHORS:20170612-123841285

]]>

Abstract: The Random Walk Metropolis (RWM) algorithm is a Metropolis–Hastings Markov Chain Monte Carlo algorithm designed to sample from a given target distribution π^N with Lebesgue density on R^N. Like any other Metropolis–Hastings algorithm, RWM constructs a Markov chain by randomly proposing a new position (the “proposal move”), which is then accepted or rejected according to a rule which makes the chain reversible with respect to π^N. When the dimension N is large, a key question is to determine the optimal scaling with N of the proposal variance: if the proposal variance is too large, the algorithm will reject the proposed moves too often; if it is too small, the algorithm will explore the state space too slowly. Determining the optimal scaling of the proposal variance gives a measure of the cost of the algorithm as well. One approach to tackle this issue, which we adopt here, is to derive diffusion limits for the algorithm. Such an approach has been proposed in the seminal papers (Ann. Appl. Probab. 7 (1) (1997) 110–120; J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 (1) (1998) 255–268). In particular, in (Ann. Appl. Probab. 7 (1) (1997) 110–120) the authors derive a diffusion limit for the RWM algorithm under the two following assumptions: (i) the algorithm is started in stationarity; (ii) the target measure π^N is in product form. The present paper considers the situation of practical interest in which both assumptions (i) and (ii) are removed. That is (a) we study the case (which occurs in practice) in which the algorithm is started out of stationarity and (b) we consider target measures which are in non-product form. Roughly speaking, we consider target measures that admit a density with respect to Gaussian; such measures arise in Bayesian nonparametric statistics and in the study of conditioned diffusions. We prove that, out of stationarity, the optimal scaling for the proposal variance is O(N^(−1)), as it is in stationarity. 
In this optimal scaling, a diffusion limit is obtained and the cost of reaching and exploring the invariant measure scales as O(N). Notice that the optimal scaling in and out of stationarity need not be the same in general, and indeed they differ e.g. in the case of the MALA algorithm (Stoch. Partial Differ. Equ. Anal. Comput. 6 (3) (2018) 446–499). More importantly, our diffusion limit is given by a stochastic PDE, coupled to a scalar ordinary differential equation; such an ODE gives a measure of how far from stationarity the process is and can therefore be taken as an indicator of convergence. In this sense, this paper contributes understanding to the long-standing problem of monitoring convergence of MCMC algorithms.
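
The O(N^(-1)) proposal-variance scaling can be checked empirically. The sketch below runs RWM on a product Gaussian target, started in stationarity for simplicity (unlike the out-of-stationarity regime analysed in the paper), and records the acceptance rate; dimension, chain length, and the classical choice ℓ = 2.38 are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def rwm_acceptance(N, ell, steps=5000):
    """Random walk Metropolis on a standard Gaussian target in R^N,
    with proposal variance ell^2 / N; returns the acceptance rate."""
    x = rng.normal(size=N)               # start in stationarity
    logp = lambda z: -0.5 * z @ z
    accepted = 0
    for _ in range(steps):
        prop = x + (ell / np.sqrt(N)) * rng.normal(size=N)
        if np.log(rng.uniform()) < logp(prop) - logp(x):
            x, accepted = prop, accepted + 1
    return accepted / steps

acc = rwm_acceptance(N=100, ell=2.38)
```

With this scaling the acceptance rate stays bounded away from 0 and 1 as N grows, consistent with the diffusion-limit picture (the stationary limit gives the well-known 0.234).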

Publication: Annales de l'Institut Henri Poincaré, Probabilités et Statistiques Vol.: 55 No.: 3 ISSN: 0246-0203

ID: CaltechAUTHORS:20161221-115035181

]]>

Abstract: The standard probabilistic perspective on machine learning gives rise to empirical risk-minimization tasks that are frequently solved by stochastic gradient descent (SGD) and variants thereof. We present a formulation of these tasks as classical inverse or filtering problems and, furthermore, we propose an efficient, gradient-free algorithm for finding a solution to these problems using ensemble Kalman inversion (EKI). The method is inherently parallelizable and is applicable to problems with non-differentiable loss functions, for which back-propagation is not possible. Applications of our approach include offline and online supervised learning with deep neural networks, as well as graph-based semi-supervised learning. The essence of the EKI procedure is an ensemble-based approximate gradient descent in which derivatives are replaced by differences from within the ensemble. We suggest several modifications to the basic method, derived from empirically successful heuristics developed in the context of SGD. Numerical results demonstrate wide applicability and robustness of the proposed algorithm.

Publication: Inverse Problems Vol.: 35 No.: 9 ISSN: 0266-5611

ID: CaltechAUTHORS:20190404-111033209

]]>

Abstract: Ensemble Kalman methods constitute an increasingly important tool in both state and parameter estimation problems. Their popularity stems from the derivative-free nature of the methodology which may be readily applied when computer code is available for the underlying state-space dynamics (for state estimation) or for the parameter-to-observable map (for parameter estimation). There are many applications in which it is desirable to enforce prior information in the form of equality or inequality constraints on the state or parameter. This paper establishes a general framework for doing so, describing a widely applicable methodology, a theory which justifies the methodology, and a set of numerical experiments exemplifying it.

Publication: Inverse Problems Vol.: 35 No.: 9 ISSN: 0266-5611

ID: CaltechAUTHORS:20190722-155445728

]]>

Abstract: Data assimilation refers to the methodology of combining dynamical models and observed data with the objective of improving state estimation. Most data assimilation algorithms are viewed as approximations of the Bayesian posterior (filtering distribution) on the signal given the observations. Some of these approximations are controlled, such as particle filters which may be refined to produce the true filtering distribution in the large particle number limit, and some are uncontrolled, such as ensemble Kalman filter methods which do not recover the true filtering distribution in the large ensemble limit. Other data assimilation algorithms, such as cycled 3DVAR methods, may be thought of as controlled estimators of the state, in the small observational noise scenario, but are also uncontrolled in general in relation to the true filtering distribution. For particle filters and ensemble Kalman filters it is of practical importance to understand how and why data assimilation methods can be effective when used with a fixed small number of particles, since for many large-scale applications it is not practical to deploy algorithms close to the large particle limit asymptotic. In this paper, the authors address this question for particle filters and, in particular, study their accuracy (in the small noise limit) and ergodicity (for noisy signal and observation) without appealing to the large particle number limit. The authors first overview the accuracy and minorization properties for the true filtering distribution, working in the setting of conditional Gaussianity for the dynamics-observation model. They then show that these properties are inherited by optimal particle filters for any fixed number of particles, and use the minorization to establish ergodicity of the filters. For completeness, they also prove large particle number consistency results for the optimal particle filters, by writing the update equations for the underlying distributions as recursions. 
In addition to looking at the optimal particle filter with standard resampling, they derive all the above results for (what they term) the Gaussianized optimal particle filter and show that the theoretical properties are favorable for this method, when compared to the standard optimal particle filter.
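
For orientation, the sketch below implements a plain bootstrap particle filter with multinomial resampling, not the optimal or Gaussianized filters studied in the paper, on an invented linear Gaussian signal-observation model:

```python
import numpy as np

rng = np.random.default_rng(5)

# Linear Gaussian model (illustrative):
# x_{k+1} = 0.9 x_k + N(0, 0.5^2),  y_k = x_k + N(0, 0.5^2).
T, P = 50, 500                          # time steps, particles
sig_x, sig_y = 0.5, 0.5

x, xs, ys = 0.0, [], []
for _ in range(T):                      # simulate signal and observations
    x = 0.9 * x + sig_x * rng.normal()
    xs.append(x)
    ys.append(x + sig_y * rng.normal())

# Bootstrap particle filter: propagate, weight by likelihood, resample.
parts = rng.normal(size=P)
est = []
for y in ys:
    parts = 0.9 * parts + sig_x * rng.normal(size=P)
    logw = -0.5 * ((y - parts) / sig_y) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    est.append(w @ parts)               # filtering mean estimate
    parts = rng.choice(parts, size=P, p=w)   # multinomial resampling
rmse = np.sqrt(np.mean((np.array(est) - np.array(xs)) ** 2))
```

The optimal filter differs in proposing particles from the one-step conditional distribution given the new observation, which is what the accuracy and ergodicity results above exploit.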

Publication: Chinese Annals of Mathematics, Series B Vol.: 40 No.: 5 ISSN: 0252-9599

ID: CaltechAUTHORS:20161221-161911353

]]>

Abstract: In this paper we develop a framework for parameter estimation in macroscopic pedestrian models using individual trajectories (microscopic data). We consider a unidirectional flow of pedestrians in a corridor and assume that the velocity decreases with the average density according to the fundamental diagram. Our model is formed from a coupling between a density dependent stochastic differential equation and a nonlinear partial differential equation for the density, and is hence of McKean--Vlasov type. We discuss identifiability of the parameters appearing in the fundamental diagram from trajectories of individuals, and we introduce optimization and Bayesian methods to perform the identification. We analyze the performance of the developed methodologies in various situations, such as for different in- and outflow conditions, for varying numbers of individual trajectories, and for differing channel geometries.

Publication: SIAM Journal on Applied Mathematics Vol.: 79 No.: 4 ISSN: 0036-1399

ID: CaltechAUTHORS:20190719-112058516

]]>

Abstract: Semi-supervised learning uses underlying relationships in data with a scarcity of ground-truth labels. In this paper, we introduce an uncertainty quantification (UQ) method for graph-based semi-supervised multi-class classification problems. We not only predict the class label for each data point, but also provide a confidence score for the prediction. We adopt a Bayesian approach and propose a graphical multi-class probit model together with an effective Gibbs sampling procedure. Furthermore, we propose a confidence measure for each data point that correlates with the classification performance. We use the empirical properties of the proposed confidence measure to guide the design of a human-in-the-loop system. The uncertainty quantification algorithm and the human-in-the-loop system are successfully applied to classification problems in image processing and ego-motion analysis of body-worn videos.

Publication: Electronic Imaging Vol.: 2019 ISSN: 2470-1173

ID: CaltechAUTHORS:20190723-085611528

]]>

Abstract: We introduce data assimilation as a computational method that uses machine learning to combine data with human knowledge in the form of mechanistic models in order to forecast future states, to impute missing data from the past by smoothing, and to infer measurable and unmeasurable quantities that represent clinically and scientifically important phenotypes. We demonstrate the advantages it affords in the context of type 2 diabetes by showing how data assimilation can be used to forecast future glucose values, to impute previously missing glucose values, and to infer type 2 diabetes phenotypes. At the heart of data assimilation is the mechanistic model, here an endocrine model. Such models can vary in complexity, contain testable hypotheses about important mechanics that govern the system (eg, nutrition’s effect on glucose), and, as such, constrain the model space, allowing for accurate estimation using very little data.

Publication: Journal of the American Medical Informatics Association Vol.: 25 No.: 10 ISSN: 1067-5027

ID: CaltechAUTHORS:20181023-111929468

]]>

Abstract: Recent research has shown the potential utility of deep Gaussian processes. These deep structures are probability distributions, designed through hierarchical construction, which are conditionally Gaussian. In this paper, the current published body of work is placed in a common framework and, through recursion, several classes of deep Gaussian processes are defined. The resulting samples generated from a deep Gaussian process have a Markovian structure with respect to the depth parameter, and the effective depth of the resulting process is interpreted in terms of the ergodicity, or non-ergodicity, of the resulting Markov chain. For the classes of deep Gaussian processes introduced, we provide results concerning their ergodicity and hence their effective depth. We also demonstrate how these processes may be used for inference; in particular we show how a Metropolis-within-Gibbs construction across the levels of the hierarchy can be used to derive sampling tools which are robust to the level of resolution used to represent the functions on a computer. For illustration, we consider the effect of ergodicity in some simple numerical examples.

Publication: Journal of Machine Learning Research Vol.: 19 No.: 54 ISSN: 1533-7928

ID: CaltechAUTHORS:20181108-140320751

]]>

Abstract: The Metropolis-Adjusted Langevin Algorithm (MALA) is a Markov Chain Monte Carlo method which creates a Markov chain reversible with respect to a given target distribution, π^N, with Lebesgue density on R^N; it can hence be used to approximately sample the target distribution. When the dimension N is large a key question is to determine the computational cost of the algorithm as a function of N. The measure of efficiency that we consider in this paper is the expected squared jumping distance (ESJD), introduced in Roberts et al. (Ann Appl Probab 7(1):110–120, 1997). To determine how the cost of the algorithm (in terms of ESJD) increases with dimension N, we adopt the widely used approach of deriving a diffusion limit for the Markov chain produced by the MALA algorithm. We study this problem for a class of target measures which is not in product form and we address the situation of practical relevance in which the algorithm is started out of stationarity. We thereby significantly extend previous works which consider either measures of product form, when the Markov chain is started out of stationarity, or non-product measures (defined via a density with respect to a Gaussian), when the Markov chain is started in stationarity. In order to work in this non-stationary and non-product setting, significant new analysis is required. In particular, our diffusion limit comprises a stochastic PDE coupled to a scalar ordinary differential equation which gives a measure of how far from stationarity the process is. The family of non-product target measures that we consider in this paper are found from discretization of a measure on an infinite dimensional Hilbert space; the discretised measure is defined by its density with respect to a Gaussian random field. The results of this paper demonstrate that, in the non-stationary regime, the cost of the algorithm is of O(N^(1/2)) in contrast to the stationary regime, where it is of O(N^(1/3)).
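
A one-dimensional sketch of the MALA update: an Euler discretisation of the Langevin SDE supplies the proposal, and a Metropolis correction restores reversibility with respect to the target. The step size, chain length, and standard Gaussian target are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def mala(logp, grad_logp, x0, h, steps):
    """Metropolis-Adjusted Langevin: propose via a Langevin Euler step,
    then accept/reject to make the chain reversible w.r.t. exp(logp)."""
    x, out = x0, []
    for _ in range(steps):
        mu_x = x + 0.5 * h * grad_logp(x)
        prop = mu_x + np.sqrt(h) * rng.normal()
        mu_p = prop + 0.5 * h * grad_logp(prop)
        # log q(x | prop) - log q(prop | x) for the Gaussian proposal
        logq = (-(x - mu_p) ** 2 + (prop - mu_x) ** 2) / (2 * h)
        if np.log(rng.uniform()) < logp(prop) - logp(x) + logq:
            x = prop
        out.append(x)
    return np.array(out)

# Standard Gaussian target (unnormalised log-density and its gradient).
chain = mala(lambda z: -0.5 * z * z, lambda z: -z, 0.0, h=0.5, steps=20000)
```

The dimension-dependence of the appropriate step size h is exactly what the diffusion-limit analysis above quantifies.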

Publication: Stochastics and Partial Differential Equations: Analysis and Computations Vol.: 6 No.: 3 ISSN: 2194-0401

ID: CaltechAUTHORS:20161220-175620681

]]>

Abstract: The use of ensemble methods to solve inverse problems is attractive because it is a derivative-free methodology which is also well-adapted to parallelization. In its basic iterative form the method produces an ensemble of solutions which lie in the linear span of the initial ensemble. Choice of the parameterization of the unknown field is thus a key component of the success of the method. We demonstrate how both geometric ideas and hierarchical ideas can be used to design effective parameterizations for a number of applied inverse problems arising in electrical impedance tomography, groundwater flow and source inversion. In particular we show how geometric ideas, including the level set method, can be used to reconstruct piecewise continuous fields, and we show how hierarchical methods can be used to learn key parameters in continuous fields, such as length-scales, resulting in improved reconstructions. Geometric and hierarchical ideas are combined in the level set method to find piecewise constant reconstructions with interfaces of unknown topology.

Publication: Inverse Problems Vol.: 34 No.: 5 ISSN: 0266-5611

ID: CaltechAUTHORS:20180413-092058450

]]>

Abstract: Classification of high dimensional data finds wide-ranging applications. In many of these applications equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based on the graph formulation of semisupervised learning. We provide a unified framework which brings together a variety of methods that have been introduced in different communities within the mathematical sciences. We study probit classification [C. K. Williams and C. E. Rasmussen, “Gaussian Processes for Regression,” in Advances in Neural Information Processing Systems 8, MIT Press, 1996, pp. 514--520] in the graph-based setting, generalize the level-set method for Bayesian inverse problems [M. A. Iglesias, Y. Lu, and A. M. Stuart, Interfaces Free Bound., 18 (2016), pp. 181--217] to the classification setting, and generalize the Ginzburg--Landau optimization-based classifier [A. L. Bertozzi and A. Flenner, Multiscale Model. Simul., 10 (2012), pp. 1090--1118], [Y. Van Gennip and A. L. Bertozzi, Adv. Differential Equations, 17 (2012), pp. 1115--1180] to a Bayesian setting. We also show that the probit and level-set approaches are natural relaxations of the harmonic function approach introduced in [X. Zhu et al., “Semi-supervised Learning Using Gaussian Fields and Harmonic Functions,” in ICML, Vol. 3, 2003, pp. 912--919]. We introduce efficient numerical methods, suited to large datasets, for both MCMC-based sampling and gradient-based MAP estimation. 
Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semisupervised learning algorithms.

Publication: SIAM/ASA Journal on Uncertainty Quantification Vol.: 6 No.: 2 ISSN: 2166-2525

ID: CaltechAUTHORS:20170712-141757416

]]>

Abstract: We consider stochastic semi-linear evolution equations which are driven by additive, spatially correlated, Wiener noise, and in particular consider problems of heat equation (analytic semigroup) and damped-driven wave equations (bounded semigroup) type. We discretize these equations by means of a spectral Galerkin projection, and we study the approximation of the probability distribution of the trajectories: test functions are regular, but depend on the values of the process on the interval [0, T]. We introduce a new approach in the context of quantitative weak error analysis for discretization of SPDEs. The weak error is formulated using a deterministic function (Itô map) of the stochastic convolution found when the nonlinear term is dropped. The regularity properties of the Itô map are exploited, and in particular second-order Taylor expansions employed, to transfer the error from spectral approximation of the stochastic convolution into the weak error of interest. We prove that the weak rate of convergence is twice the strong rate of convergence in two situations. First, we assume that the covariance operator commutes with the generator of the semigroup: the first order term in the weak error expansion cancels out thanks to an independence property. Second, we remove the commuting assumption, and extend the previous result, thanks to the analysis of a new error term depending on a commutator.

Publication: Journal of Computational Mathematics Vol.: 36 No.: 2 ISSN: 0254-9409

ID: CaltechAUTHORS:20161221-110122611

]]>

Abstract: We study the use of Gaussian process emulators to approximate the parameter-to-observation map or the negative log-likelihood in Bayesian inverse problems. We prove error bounds on the Hellinger distance between the true posterior distribution and various approximations based on the Gaussian process emulator. Our analysis includes approximations based on the mean of the predictive process, as well as approximations based on the full Gaussian process emulator. Our results show that the Hellinger distance between the true posterior and its approximations can be bounded by moments of the error in the emulator. Numerical results confirm our theoretical findings.
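
A minimal sketch of the emulation idea: a Gaussian process with an RBF kernel is fitted to a few evaluations of an invented forward map, and its predictive mean then stands in for the map (for example inside an MCMC loop). The kernel, lengthscale, and jitter are illustrative choices:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel matrix between point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

# Emulate a scalar parameter-to-observable map G(u) = sin(u)
# from a handful of (expensive) forward-model evaluations.
u_train = np.linspace(0.0, 3.0, 10)
g_train = np.sin(u_train)

K = rbf(u_train, u_train) + 1e-6 * np.eye(10)   # jitter for stability
alpha = np.linalg.solve(K, g_train)

def gp_mean(u):
    """Predictive mean of the emulator at new points u."""
    return rbf(np.atleast_1d(u), u_train) @ alpha

err = abs(gp_mean(1.37)[0] - np.sin(1.37))
```

The error bounds discussed above control how such emulator error propagates, in Hellinger distance, to the approximate posterior.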

Publication: Mathematics of Computation Vol.: 87 No.: 310 ISSN: 0025-5718

ID: CaltechAUTHORS:20161221-104520265

]]>

Abstract: In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.

Publication: Inverse Problems Vol.: 34 No.: 2 ISSN: 0266-5611

ID: CaltechAUTHORS:20170801-155824887

]]>

Abstract: We present an analysis of ensemble Kalman inversion, based on the continuous time limit of the algorithm. The analysis of the dynamical behaviour of the ensemble allows us to establish well-posedness and convergence results for a fixed ensemble size. We will build on recent results on the convergence in the noise-free case and generalise them to the case of noisy observational data, in particular the influence of the noise on the convergence will be investigated, both theoretically and numerically. We focus on linear inverse problems where a very complete theoretical analysis is possible.

Publication: Applicable Analysis: An International Journal Vol.: 97 No.: 1 ISSN: 0003-6811

ID: CaltechAUTHORS:20180102-081954370

]]>

Abstract: Climate projections continue to be marred by large uncertainties, which originate in processes that need to be parameterized, such as clouds, convection, and ecosystems. But rapid progress is now within reach. New computational tools and methods from data assimilation and machine learning make it possible to integrate global observations and local high-resolution simulations in an Earth system model (ESM) that systematically learns from both and quantifies uncertainties. Here we propose a blueprint for such an ESM. We outline how parameterization schemes can learn from global observations and targeted high-resolution simulations, for example, of clouds and convection, through matching low-order statistics between ESMs, observations, and high-resolution simulations. We illustrate learning algorithms for ESMs with a simple dynamical system that shares characteristics of the climate system; and we discuss the opportunities the proposed framework presents and the challenges that remain to realize it.

Publication: Geophysical Research Letters Vol.: 44 No.: 24 ISSN: 0094-8276

ID: CaltechAUTHORS:20171201-113659166

]]>

Abstract: This paper concerns the approximation of probability measures on R^d with respect to the Kullback-Leibler divergence. Given an admissible target measure, we show the existence of the best approximation, with respect to this divergence, from certain sets of Gaussian measures and Gaussian mixtures. The asymptotic behavior of such best approximations is then studied in the small parameter limit where the measure concentrates; this asymptotic behavior is characterized using Γ-convergence. The theory developed is then applied to understand the frequentist consistency of Bayesian inverse problems in finite dimensions. For a fixed realization of additive observational noise, we show the asymptotic normality of the posterior measure in the small noise limit. Taking into account the randomness of the noise, we prove a Bernstein-Von Mises type result for the posterior measure.

Publication: SIAM/ASA Journal on Uncertainty Quantification Vol.: 5 No.: 1 ISSN: 2166-2525

ID: CaltechAUTHORS:20161221-163341129

]]>

Abstract: The level set approach has proven widely successful in the study of inverse problems for interfaces, since its systematic development in the 1990s. Recently it has been employed in the context of Bayesian inversion, allowing for the quantification of uncertainty within the reconstruction of interfaces. However, the Bayesian approach is very sensitive to the length and amplitude scales in the prior probabilistic model. This paper demonstrates how the scale-sensitivity can be circumvented by means of a hierarchical approach, using a single scalar parameter. Together with careful consideration of the development of algorithms which encode probability measure equivalences as the hierarchical parameter is varied, this leads to well-defined Gibbs-based MCMC methods found by alternating Metropolis–Hastings updates of the level set function and the hierarchical parameter. These methods demonstrably outperform non-hierarchical Bayesian level set methods.

Publication: Statistics and Computing Vol.: 27 No.: 6 ISSN: 0960-3174

ID: CaltechAUTHORS:20161109-074003000

]]>

Abstract: We study Gaussian approximations to the distribution of a diffusion. The approximations are easy to compute: they are defined by two simple ordinary differential equations for the mean and the covariance. Time correlations can also be computed via solution of a linear stochastic differential equation. We show, using the Kullback–Leibler divergence, that the approximations are accurate in the small noise regime. An analogous discrete time setting is also studied. The results provide both theoretical support for the use of Gaussian processes in the approximation of diffusions, and methodological guidance in the construction of Gaussian approximations in applications.
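The two ordinary differential equations for the mean and the covariance mentioned in this abstract can be illustrated in a scalar setting. The sketch below is a reconstruction under stated assumptions, not the paper's code; all names, the drift, and the parameters are chosen here for illustration:

```python
import numpy as np

def gaussian_approx_ode(f, df, sigma2, m0, c0, dt, n_steps):
    """Euler integration of the mean/covariance ODEs giving a Gaussian
    approximation N(m, C) to the scalar diffusion dx = f(x) dt + sqrt(sigma2) dW:
        dm/dt = f(m),   dC/dt = 2 f'(m) C + sigma2.
    (Linearization-based sketch; the paper treats the general setting.)"""
    m, c = m0, c0
    for _ in range(n_steps):
        m += dt * f(m)
        c += dt * (2.0 * df(m) * c + sigma2)
    return m, c
```

For the Ornstein-Uhlenbeck drift f(x) = -x with sigma2 = 2, the approximation relaxes to the exact stationary law N(0, 1), which gives a quick sanity check of the sketch.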

Publication: Communications in Mathematical Sciences Vol.: 15 No.: 7 ISSN: 1539-6746

ID: CaltechAUTHORS:20161220-181911579

]]>

Abstract: The basic idea of importance sampling is to use independent samples from a proposal measure in order to approximate expectations with respect to a target measure. It is key to understand how many samples are required in order to guarantee accurate approximations. Intuitively, some notion of distance between the target and the proposal should determine the computational cost of the method. A major challenge is to quantify this distance in terms of parameters or statistics that are pertinent for the practitioner. The subject has attracted substantial interest from within a variety of communities. The objective of this paper is to overview and unify the resulting literature by creating an overarching framework. A general theory is presented, with a focus on the use of importance sampling in Bayesian inverse problems and filtering.
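The basic mechanism this abstract describes, approximating target expectations with weighted proposal samples, can be sketched as self-normalized importance sampling. This is a generic textbook sketch, not the paper's framework; the effective sample size is one common diagnostic of the target-proposal distance the paper studies:

```python
import numpy as np

def snis_expectation(f, target_logpdf, proposal_logpdf, proposal_sampler, n, rng):
    """Self-normalized importance sampling estimate of E_target[f],
    using n independent samples from the proposal."""
    x = proposal_sampler(n, rng)                    # draws from the proposal
    logw = target_logpdf(x) - proposal_logpdf(x)    # unnormalized log-weights
    logw -= logw.max()                              # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()                                    # normalized weights
    ess = 1.0 / np.sum(w**2)                        # effective sample size diagnostic
    return np.sum(w * f(x)), ess
```

Only log-density ratios up to a constant are needed, which is why the method is convenient for Bayesian inverse problems where normalizing constants are unavailable.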

Publication: Statistical Science Vol.: 32 No.: 3 ISSN: 0883-4237

ID: CaltechAUTHORS:20161221-114242057

]]>

Abstract: This paper is concerned with transition paths within the framework of the overdamped Langevin dynamics model of chemical reactions. We aim to give an efficient description of typical transition paths in the small temperature regime. We adopt a variational point of view and seek the best Gaussian approximation, with respect to Kullback--Leibler divergence, of the non-Gaussian distribution of the diffusion process. We interpret the mean of this Gaussian approximation as the “most likely path,” and the covariance operator as a means to capture the typical fluctuations around this most likely path. We give an explicit expression for the Kullback--Leibler divergence in terms of the mean and the covariance operator for a natural class of Gaussian approximations and show the existence of minimizers for the variational problem. Then the low temperature limit is studied via Γ-convergence of the associated variational problem. The limiting functional consists of two parts: The first part depends only on the mean and coincides with the Γ-limit of the rescaled Freidlin--Wentzell rate functional. The second part depends on both the mean and the covariance operator and is minimized if the dynamics are given by a time-inhomogeneous Ornstein--Uhlenbeck process found by linearization of the Langevin dynamics around the Freidlin--Wentzell minimizer.

Publication: SIAM Journal on Mathematical Analysis Vol.: 49 No.: 4 ISSN: 0036-1410

ID: CaltechAUTHORS:20170921-105427126

]]>

Abstract: In this paper, we present a formal quantification of uncertainty induced by numerical solutions of ordinary and partial differential equation models. Numerical solutions of differential equations contain inherent uncertainties due to the finite-dimensional approximation of an unknown and implicitly defined function. When statistically analysing models based on differential equations describing physical, or other naturally occurring, phenomena, it can be important to explicitly account for the uncertainty introduced by the numerical method. Doing so enables objective determination of this source of uncertainty, relative to other uncertainties, such as those caused by data contaminated with noise or model error induced by missing physical or inadequate descriptors. As ever larger scale mathematical models are being used in the sciences, often sacrificing complete resolution of the differential equation on the grids used, formally accounting for the uncertainty in the numerical method is becoming increasingly important. This paper provides the formal means to incorporate this uncertainty in a statistical model and its subsequent analysis. We show that a wide variety of existing solvers can be randomised, inducing a probability measure over the solutions of such differential equations. These measures exhibit contraction to a Dirac measure around the true unknown solution, where the rates of convergence are consistent with the underlying deterministic numerical method. Furthermore, we employ the method of modified equations to demonstrate enhanced rates of convergence to stochastic perturbations of the original deterministic problem. Ordinary differential equations and elliptic partial differential equations are used to illustrate the approach to quantify uncertainty in both the statistical analysis of the forward and inverse problems.

Publication: Statistics and Computing Vol.: 27 No.: 4 ISSN: 0960-3174

ID: CaltechAUTHORS:20170609-133754387

]]>

Abstract: The ensemble Kalman filter (EnKF) is a widely used methodology for state estimation in partial, noisily observed dynamical systems, and for parameter estimation in inverse problems. Despite its widespread use in the geophysical sciences, and its gradual adoption in many other areas of application, analysis of the method is in its infancy. Furthermore, much of the existing analysis deals with the large ensemble limit, far from the regime in which the method is typically used. The goal of this paper is to analyze the method when applied to inverse problems with fixed ensemble size. A continuous-time limit is derived and the long-time behavior of the resulting dynamical system is studied. Most of the rigorous analysis is confined to the linear forward problem, where we demonstrate that the continuous time limit of the EnKF corresponds to a set of gradient flows for the data misfit in each ensemble member, coupled through a common pre-conditioner which is the empirical covariance matrix of the ensemble. Numerical results demonstrate that the conclusions of the analysis extend beyond the linear inverse problem setting. Numerical experiments are also given which demonstrate the benefits of various extensions of the basic methodology.
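The discrete-time update underlying the method analyzed here can be sketched as one step of ensemble Kalman inversion for a parameter-estimation problem. This is an illustrative sketch under my own assumptions (perturbed-observation variant, small dense problem), not the paper's implementation; all names and parameters are chosen here:

```python
import numpy as np

def eki_step(U, G, y, gamma, rng):
    """One ensemble Kalman inversion step for the inverse problem y = G(u) + noise.
    U: (J, d) parameter ensemble; G: forward map R^d -> R^k;
    y: data of length k; gamma: (k, k) observational noise covariance."""
    J, k = U.shape[0], len(y)
    GU = np.array([G(u) for u in U])               # forward evaluations, shape (J, k)
    Um, Gm = U.mean(axis=0), GU.mean(axis=0)
    Cug = (U - Um).T @ (GU - Gm) / J               # parameter-output cross-covariance
    Cgg = (GU - Gm).T @ (GU - Gm) / J              # output covariance
    K = Cug @ np.linalg.inv(Cgg + gamma)           # Kalman-type gain
    Y = y + rng.multivariate_normal(np.zeros(k), gamma, size=J)  # perturbed data
    return U + (Y - GU) @ K.T                      # each member moves toward its data
```

In the linear case G(u) = Au, iterating this map drives the ensemble mean toward the least-squares solution, consistent with the gradient-flow interpretation derived in the paper.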

Publication: SIAM Journal on Numerical Analysis Vol.: 55 No.: 3 ISSN: 0036-1429

ID: CaltechAUTHORS:20161221-112013537

]]>

Abstract: We are interested in computing the expectation of a functional of a PDE solution under a Bayesian posterior distribution. Using Bayes's rule, we reduce the problem to estimating the ratio of two related prior expectations. For a model elliptic problem, we provide a full convergence and complexity analysis of the ratio estimator in the case where Monte Carlo, quasi-Monte Carlo, or multilevel Monte Carlo methods are used as estimators for the two prior expectations. We show that the computational complexity of the ratio estimator to achieve a given accuracy is the same as the corresponding complexity of the individual estimators for the numerator and the denominator. We also include numerical simulations, in the context of the model elliptic problem, which demonstrate the effectiveness of the approach.
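The reduction via Bayes's rule described here, writing a posterior expectation as a ratio of two prior expectations, can be sketched with plain Monte Carlo for both the numerator and the denominator. This is a minimal scalar sketch with illustrative names, not the paper's elliptic-PDE setting:

```python
import numpy as np

def posterior_expectation_ratio(f, neg_log_likelihood, prior_sampler, n, rng):
    """Estimate E_posterior[f] via Bayes's rule as a ratio of prior expectations:
        E_post[f] = E_prior[f(u) exp(-Phi(u))] / E_prior[exp(-Phi(u))],
    using the same n prior samples for numerator and denominator."""
    u = prior_sampler(n, rng)
    L = np.exp(-neg_log_likelihood(u))   # likelihood evaluated at prior samples
    return np.mean(f(u) * L) / np.mean(L)
```

Reusing the same samples in both estimators is what makes the complexity of the ratio match that of the individual estimators, as the abstract notes for the Monte Carlo, quasi-Monte Carlo, and multilevel variants.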

Publication: SIAM/ASA Journal on Uncertainty Quantification Vol.: 5 No.: 1 ISSN: 2166-2525

ID: CaltechAUTHORS:20161221-105527857

]]>

Abstract: Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank–Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.

Publication: Journal of Computational Physics Vol.: 335 ISSN: 0021-9991

ID: CaltechAUTHORS:20161220-181119556

]]>

Abstract: Filtering is concerned with the sequential estimation of the state, and uncertainties, of a Markovian system, given noisy observations. It is particularly difficult to achieve accurate filtering in complex dynamical systems, such as those arising in turbulence, in which effective low-dimensional representation of the desired probability distribution is challenging. Nonetheless recent advances have shown considerable success in filtering based on certain carefully chosen simplifications of the underlying system, which allow closed form filters. This leads to filtering algorithms with significant, but judiciously chosen, model error. The purpose of this article is to analyze the effectiveness of these simplified filters, and to suggest modifications of them which lead to improved filtering in certain time-scale regimes. We employ a Markov switching process for the true signal underlying the data, rather than working with a fully resolved DNS PDE model. Such Markov switching models have been demonstrated to provide an excellent surrogate test-bed for the turbulent bursting phenomena which make filtering of complex physical models, such as those arising in atmospheric sciences, so challenging.

Publication: Communications in Mathematical Sciences Vol.: 15 No.: 2 ISSN: 1539-6746

ID: CaltechAUTHORS:20161221-112623998

]]>

Abstract: Ill-posed inverse problems are ubiquitous in applications. Understanding of algorithms for their solution has been greatly enhanced by a deep understanding of the linear inverse problem. In the applied communities ensemble-based filtering methods have recently been used to solve inverse problems by introducing an artificial dynamical system. This opens up the possibility of using a range of other filtering methods, such as 3DVAR and Kalman based methods, to solve inverse problems, again by introducing an artificial dynamical system. The aim of this paper is to analyze such methods in the context of the linear inverse problem. Statistical linear inverse problems are studied in the sense that the observational noise is assumed to be derived via realization of a Gaussian random variable. We investigate the asymptotic behavior of filter based methods for these inverse problems. Rigorous convergence rates are established for 3DVAR and for the Kalman filters, including minimax rates in some instances. Blowup of 3DVAR and a variant of its basic form is also presented, and optimality of the Kalman filter is discussed. These analyses reveal a close connection between (iterated) regularization schemes in deterministic inverse problems and filter based methods in data assimilation. Numerical experiments are presented to illustrate the theory.

Publication: Communications in Mathematical Sciences Vol.: 15 No.: 7 ISSN: 1539-6746

ID: CaltechAUTHORS:20161221-113147238

]]>

Abstract: We provide a rigorous Bayesian formulation of the EIT problem in an infinite dimensional setting, leading to well-posedness in the Hellinger metric with respect to the data. We focus particularly on the reconstruction of binary fields where the interface between different media is the primary unknown. We consider three different prior models: log-Gaussian, star-shaped and level set. Numerical simulations based on the implementation of MCMC are performed, illustrating the advantages and disadvantages of each type of prior in the reconstruction, in the case where the true conductivity is a binary field, and exhibiting the properties of the resulting posterior distribution.

Publication: Inverse Problems and Imaging Vol.: 10 No.: 4 ISSN: 1930-8345

ID: CaltechAUTHORS:20170113-072909521

]]>

Abstract: We study the inverse problem of estimating a field u^a from data comprising a finite set of nonlinear functionals of u^a , subject to additive noise; we denote this observed data by y. Our interest is in the reconstruction of piecewise continuous fields u^a in which the discontinuity set is described by a finite number of geometric parameters a. Natural applications include groundwater flow and electrical impedance tomography. We take a Bayesian approach, placing a prior distribution on u^a and determining the conditional distribution on u^a given the data y. It is then natural to study maximum a posteriori (MAP) estimators. Recently (Dashti et al 2013 Inverse Problems 29 095017) it has been shown that MAP estimators can be characterised as minimisers of a generalised Onsager–Machlup functional, in the case where the prior measure is a Gaussian random field. We extend this theory to a more general class of prior distributions which allows for piecewise continuous fields. Specifically, the prior field is assumed to be piecewise Gaussian with random interfaces between the different Gaussians defined by a finite number of parameters. We also make connections with recent work on MAP estimators for linear problems and possibly non-Gaussian priors (Helin and Burger 2015 Inverse Problems 31 085009) which employs the notion of Fomin derivative. In showing applicability of our theory we focus on the groundwater flow and EIT models, though the theory holds more generally. Numerical experiments are implemented for the groundwater flow model, demonstrating the feasibility of determining MAP estimators for these piecewise continuous models, but also that the geometric formulation can lead to multiple nearby (local) MAP estimators. We relate these MAP estimators to the behaviour of output from MCMC samples of the posterior, obtained using a state-of-the-art function space Metropolis–Hastings method.

Publication: Inverse Problems Vol.: 32 No.: 10 ISSN: 0266-5611

ID: CaltechAUTHORS:20170612-142444027

]]>

Abstract: In the context of filtering chaotic dynamical systems it is well-known that partial observations, if sufficiently informative, can be used to control the inherent uncertainty due to chaos. The purpose of this paper is to investigate, both theoretically and numerically, conditions on the observations of chaotic systems under which they can be accurately filtered. In particular, we highlight the advantage of adaptive observation operators over fixed ones. The Lorenz ’96 model is used to exemplify our findings. We consider discrete-time and continuous-time observations in our theoretical developments. We prove that, for fixed observation operator, the 3DVAR filter can recover the system state within a neighbourhood determined by the size of the observational noise. It is required that a sufficiently large proportion of the state vector is observed, and an explicit form for such sufficient fixed observation operator is given. Numerical experiments, where the data is incorporated by use of the 3DVAR and extended Kalman filters, suggest that less informative fixed operators than given by our theory can still lead to accurate signal reconstruction. Adaptive observation operators are then studied numerically; we show that, for carefully chosen adaptive observation operators, the proportion of the state vector that needs to be observed is drastically smaller than with a fixed observation operator. Indeed, we show that the number of state coordinates that need to be observed may even be significantly smaller than the total number of positive Lyapunov exponents of the underlying system.
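The 3DVAR filter studied in this abstract has a simple forecast-then-nudge structure that can be sketched for a generic model map. This is a generic sketch with a fixed gain and names chosen here, not the paper's Lorenz '96 experiments:

```python
import numpy as np

def threedvar_filter(Psi, H, K, v0, observations):
    """Run a 3DVAR filter with fixed gain K: forecast with the model map Psi,
    then nudge the forecast toward each incoming observation y = H v + noise."""
    v = v0.copy()
    trajectory = [v.copy()]
    for y in observations:
        forecast = Psi(v)                        # model forecast step
        v = forecast + K @ (y - H @ forecast)    # observation (analysis) step
        trajectory.append(v.copy())
    return np.array(trajectory)
```

When the composite map (I - KH)∘Ψ is contracting on the error, the filter state is driven into a neighbourhood of the signal whose size is set by the observational noise, which is the flavour of result the abstract proves for sufficiently informative observation operators.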

Publication: Physica D: Nonlinear Phenomena Vol.: 325 ISSN: 0167-2789

ID: CaltechAUTHORS:20160715-152128338

]]>

Abstract: The process of extracting information from data has a long history (see, for example, [1]) stretching back over centuries. Because of the proliferation of data over the last few decades, and projections for its continued proliferation over coming decades, the term Data Science has emerged to describe the substantial current intellectual effort around research with the same overall goal, namely that of extracting information. The type of data currently available in all sorts of application domains is often massive in size, very heterogeneous and far from being collected under designed or controlled experimental conditions. Nonetheless, it contains information, often substantial information, and data science requires new interdisciplinary approaches to make maximal use of this information. Data alone is typically not that informative and (machine) learning from data needs conceptual frameworks. Mathematics and statistics are crucial for providing such conceptual frameworks. The frameworks enhance the understanding of fundamental phenomena, highlight limitations and provide a formalism for properly founded data analysis, information extraction and quantification of uncertainty, as well as for the analysis and development of algorithms that carry out these key tasks. In this personal commentary on data science and its relations to mathematics and statistics, we highlight three important aspects of the emerging field: Models, High-Dimensionality and Heterogeneity, and then conclude with a brief discussion of where the field is now and implications for the mathematical sciences.

Publication: EMS Newsletter Vol.: 100 ISSN: 1027-488X

ID: CaltechAUTHORS:20161111-103206810

]]>

Abstract: We describe a new MCMC method optimized for the sampling of probability measures on Hilbert space which have a density with respect to a Gaussian; such measures arise in the Bayesian approach to inverse problems, and in conditioned diffusions. Our algorithm is based on two key design principles: (i) algorithms which are well defined in infinite dimensions result in methods which do not suffer from the curse of dimensionality when they are applied to approximations of the infinite dimensional target measure on R^N; (ii) nonreversible algorithms can have better mixing properties compared to their reversible counterparts. The method we introduce is based on the hybrid Monte Carlo algorithm, tailored to incorporate these two design principles. The main result of this paper states that the new algorithm, appropriately rescaled, converges weakly to a second order Langevin diffusion on Hilbert space; as a consequence the algorithm explores the approximate target measures on R^N in a number of steps which is independent of N. We also present the underlying theory for the limiting nonreversible diffusion on Hilbert space, including characterization of the invariant measure, and we describe numerical simulations demonstrating that the proposed method has favourable mixing properties as an MCMC algorithm.

Publication: Bernoulli Vol.: 22 No.: 1 ISSN: 1350-7265

ID: CaltechAUTHORS:20160715-161420502

]]>

Abstract: We introduce a level set based approach to Bayesian geometric inverse problems. In these problems the interface between different domains is the key unknown, and is realized as the level set of a function. This function itself becomes the object of the inference. Whilst the level set methodology has been widely used for the solution of geometric inverse problems, the Bayesian formulation that we develop here contains two significant advances: firstly it leads to a well-posed inverse problem in which the posterior distribution is Lipschitz with respect to the observed data; and secondly it leads to computationally expedient algorithms in which the level set itself is updated implicitly via the MCMC methodology applied to the level set function; no explicit velocity field is required for the level set interface. Applications are numerous and include medical imaging, modelling of subsurface formations and the inverse source problem; our theory is illustrated with computational results involving the last two applications.

Publication: Interfaces and Free Boundaries Vol.: 18 No.: 2 ISSN: 1463-9971

ID: CaltechAUTHORS:20161221-114630868

]]>

Abstract: The filtering distribution is a time-evolving probability distribution on the state of a dynamical system given noisy observations. We study the large-time asymptotics of this probability distribution for discrete-time, randomly initialized signals that evolve according to a deterministic map Ψ. The observations are assumed to comprise a low-dimensional projection of the signal, given by an operator P, subject to additive noise. We address the question of whether these observations contain sufficient information to accurately reconstruct the signal. In a general framework, we establish conditions on Ψ and P under which the filtering distributions concentrate around the signal in the small-noise, long-time asymptotic regime. Linear systems, the Lorenz '63 and '96 models, and the Navier--Stokes equation on a two-dimensional torus are within the scope of the theory. Our main findings come as a by-product of computable bounds, of independent interest, for suboptimal filters based on new variants of the 3DVAR filtering algorithm.

Publication: SIAM/ASA Journal on Uncertainty Quantification Vol.: 3 No.: 1 ISSN: 2166-2525

ID: CaltechAUTHORS:20160715-165131732

]]>

Abstract: In this article, we consider a Bayesian inverse problem associated to elliptic partial differential equations in two and three dimensions. This class of inverse problems is important in applications such as hydrology, but the complexity of the link function between unknown field and measurements can make it difficult to draw inference from the associated posterior. We prove that for this inverse problem a basic sequential Monte Carlo (SMC) method has a Monte Carlo rate of convergence with constants which are independent of the dimension of the discretization of the problem; indeed convergence of the SMC method is established in a function space setting. We also develop an enhancement of the SMC methods for inverse problems which were introduced in Kantas et al. (SIAM/ASA J Uncertain Quantif 2:464–489, 2014); the enhancement is designed to deal with the additional complexity of this elliptic inverse problem. The efficacy of the methodology and its desirable theoretical properties, are demonstrated for numerical examples in both two and three dimensions.

Publication: Statistics and Computing Vol.: 25 No.: 4 ISSN: 1573-1375

ID: CaltechAUTHORS:20160715-172126693

]]>

Abstract: Lateral diffusion of molecules on surfaces plays a very important role in various biological processes, including lipid transport across the cell membrane, synaptic transmission, and other phenomena such as exo- and endocytosis, signal transduction, chemotaxis, and cell growth. In many cases, the surfaces can possess spatial inhomogeneities and/or be rapidly changing shape. Using a generalization of the model for a thermally excited Helfrich elastic membrane, we consider the problem of lateral diffusion on quasi-planar surfaces, possessing both spatial and temporal fluctuations. Using results from homogenization theory, we show that, under the assumption of scale separation between the characteristic length and timescales of the membrane fluctuations and the characteristic scale of the diffusing particle, the lateral diffusion process can be well approximated by a Brownian motion on the plane with constant diffusion tensor D that depends in a highly nonlinear way on the detailed properties of the surface. The effective diffusion tensor will depend on the relative scales of the spatial and temporal fluctuations, and for different scaling regimes, we prove the existence of a macroscopic limit in each case.

Publication: Journal of Nonlinear Science Vol.: 25 No.: 2 ISSN: 0938-8974

ID: CaltechAUTHORS:20160715-172927058

]]>

Abstract: In this paper we study algorithms to find a Gaussian approximation to a target measure defined on a Hilbert space of functions; the target measure itself is defined via its density with respect to a reference Gaussian measure. We employ the Kullback--Leibler divergence as a distance and find the best Gaussian approximation by minimizing this distance. It then follows that the approximate Gaussian must be equivalent to the Gaussian reference measure, defining a natural function space setting for the underlying calculus of variations problem. We introduce a computational algorithm which is well-adapted to the required minimization, seeking to find the mean as a function, and parameterizing the covariance in two different ways: through low rank perturbations of the reference covariance and through Schrödinger potential perturbations of the inverse reference covariance. Two applications are shown: to a nonlinear inverse problem in elliptic PDEs and to a conditioned diffusion process. These Gaussian approximations also serve to provide a preconditioned proposal distribution for improved preconditioned Crank--Nicolson Markov chain Monte Carlo sampling of the target distribution. This approach is not only well-adapted to the high dimensional setting, but also behaves well with respect to small observational noise (resp., small temperatures) in the inverse problem (resp., conditioned diffusion).
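The objective minimized in this line of work has a closed form in finite dimensions, which is useful for checking implementations. The sketch below evaluates the Kullback-Leibler divergence between two nondegenerate Gaussians on R^d; it is a standard finite-dimensional formula, not the paper's function-space algorithm:

```python
import numpy as np

def kl_gaussians(m1, C1, m2, C2):
    """Closed-form KL divergence KL(N(m1, C1) || N(m2, C2)) on R^d,
    for symmetric positive definite covariances C1, C2."""
    d = len(m1)
    C2inv = np.linalg.inv(C2)
    dm = m2 - m1
    return 0.5 * (np.trace(C2inv @ C1) + dm @ C2inv @ dm - d
                  + np.log(np.linalg.det(C2) / np.linalg.det(C1)))
```

In the infinite-dimensional setting the analogous quantity is finite only when the two measures are equivalent, which is exactly the constraint on the approximating Gaussian highlighted in the abstract.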

Publication: SIAM Journal on Scientific Computing Vol.: 37 No.: 6 ISSN: 1064-8275

ID: CaltechAUTHORS:20160715-163821138

]]>

Abstract: The seamless integration of large data sets into sophisticated computational models provides one of the central challenges for the mathematical sciences in the 21st century. When the computational model is based on dynamical systems, and the data set is time ordered, the process of combining models and data is called data assimilation. The assimilation of data into computational models serves a wide spectrum of purposes ranging from model calibration and model comparison, all the way to the validation of novel model design principles.

Publication: SIAM News Vol.: 48 No.: 8 & 9 ISSN: 1557-9573

ID: CaltechAUTHORS:20161111-104214792

]]>

Abstract: In a variety of applications it is important to extract information from a probability measure μ on an infinite dimensional space. Examples include the Bayesian approach to inverse problems and (possibly conditioned) continuous time Markov processes. It may then be of interest to find a measure ν, from within a simple class of measures, which approximates μ. This problem is studied in the case where the Kullback--Leibler divergence is employed to measure the quality of the approximation. A calculus of variations viewpoint is adopted, and the particular case where ν is chosen from the set of Gaussian measures is studied in detail. Basic existence and uniqueness theorems are established, together with properties of minimizing sequences. Furthermore, parameterization of the class of Gaussians through the mean and inverse covariance is introduced, the need for regularization is explained, and a regularized minimization is studied in detail. The calculus of variations framework resulting from this work provides the appropriate underpinning for computational algorithms.

Publication: SIAM Journal on Mathematical Analysis Vol.: 47 No.: 6 ISSN: 0036-1410

ID: CaltechAUTHORS:20160715-170335769

]]>

Abstract: We study the problem of sampling high and infinite dimensional target measures arising in applications such as conditioned diffusions and inverse problems. We focus on those that arise from approximating measures on Hilbert spaces defined via a density with respect to a Gaussian reference measure. We consider the Metropolis–Hastings algorithm that adds an accept–reject mechanism to a Markov chain proposal in order to make the chain reversible with respect to the target measure. We focus on cases where the proposal is either a Gaussian random walk (RWM) with covariance equal to that of the reference measure or an Ornstein–Uhlenbeck proposal (pCN) for which the reference measure is invariant. Previous results in terms of scaling and diffusion limits suggested that the pCN has a convergence rate that is independent of the dimension while the RWM method has undesirable dimension-dependent behaviour. We confirm this claim by exhibiting a dimension-independent Wasserstein spectral gap for the pCN algorithm for a large class of target measures. In our setting this Wasserstein spectral gap implies an L^2-spectral gap. We use both spectral gaps to show that the ergodic average satisfies a strong law of large numbers, the central limit theorem and nonasymptotic bounds on the mean square error, all dimension independent. In contrast, we show that the spectral gap of the RWM algorithm applied to the reference measures degenerates as the dimension tends to infinity.
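
The structure of the pCN proposal described above can be sketched in a few lines. This is a minimal finite-dimensional illustration, not the paper's code; the names (`pcn_step`, `phi`, `sample_ref`) are illustrative, with `phi` the negative log-density of the target with respect to the Gaussian reference measure.

```python
import numpy as np

def pcn_step(u, phi, sample_ref, beta=0.2, rng=None):
    """One pCN step for a target with density proportional to exp(-phi(u))
    with respect to a Gaussian reference measure mu0."""
    rng = rng or np.random.default_rng()
    xi = sample_ref()                              # draw from the reference measure
    v = np.sqrt(1.0 - beta**2) * u + beta * xi     # OU proposal: preserves mu0
    # The acceptance ratio involves only phi, not Gaussian densities,
    # which is why the construction survives the limit of infinite dimension.
    if np.log(rng.uniform()) < phi(u) - phi(v):
        return v, True
    return u, False
```

Because the acceptance probability depends only on `phi`, refining the discretization of `u` leaves the acceptance behaviour unchanged, in contrast with the RWM proposal.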

Publication: Annals of Applied Probability Vol.: 24 No.: 6 ISSN: 1050-5164

ID: CaltechAUTHORS:20160719-144104557

]]>

Abstract: In this paper, we consider the inverse problem of determining the permeability of the subsurface from hydraulic head measurements, within the framework of a steady Darcy model of groundwater flow. We study geometrically defined prior permeability fields, which admit layered, fault and channel structures, in order to mimic realistic subsurface features; within each layer we adopt either a constant or continuous function representation of the permeability. This prior model leads to a parameter identification problem for a finite number of unknown parameters determining the geometry, together with either a finite number of permeability values (in the constant case) or a finite number of fields (in the continuous function case). We adopt a Bayesian framework showing the existence and well-posedness of the posterior distribution. We also introduce novel Markov chain Monte Carlo (MCMC) methods, which exploit the different character of the geometric and permeability parameters, and build on recent advances in function space MCMC. These algorithms provide rigorous estimates of the permeability, as well as the uncertainty associated with it, and only require forward model evaluations. No adjoint solvers are required and hence the methodology is applicable to black-box forward models. We then use these methods to explore the posterior and to illustrate the methodology with numerical experiments.

Publication: Inverse Problems Vol.: 30 No.: 11 ISSN: 0266-5611

ID: CaltechAUTHORS:20160719-114834043

]]>

Abstract: The ensemble Kalman filter (EnKF) is a method for combining a dynamical model with data in a sequential fashion. Despite its widespread use, there has been little analysis of its theoretical properties. Many of the algorithmic innovations associated with the filter, which are required to make a usable algorithm in practice, are derived in an ad hoc fashion. The aim of this paper is to initiate the development of a systematic analysis of the EnKF, in particular to do so for small ensemble size. The perspective is to view the method as a state estimator, and not as an algorithm which approximates the true filtering distribution. The perturbed observation version of the algorithm is studied, without and with variance inflation. Without variance inflation well-posedness of the filter is established; with variance inflation accuracy of the filter, with respect to the true signal underlying the data, is established. The algorithm is considered in discrete time, and also for a continuous time limit arising when observations are frequent and subject to large noise. The underlying dynamical model, and assumptions about it, are sufficiently general to include the Lorenz '63 and '96 models, together with the incompressible Navier–Stokes equation on a two-dimensional torus. The analysis is limited to the case of complete observation of the signal with additive white noise. Numerical results are presented for the Navier–Stokes equation on a two-dimensional torus for both complete and partial observations of the signal with additive white noise.
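
The perturbed-observation analysis step studied in the abstract can be sketched in finite dimensions as follows; the function name, the multiplicative form of the inflation, and the dense linear algebra are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def enkf_analysis(E, y, H, R, inflation=1.0, rng=None):
    """Perturbed-observation EnKF analysis step with optional variance inflation.

    E: (N, d) forecast ensemble; y: (k,) observation; H: (k, d) linear
    observation operator; R: (k, k) observational noise covariance."""
    rng = rng or np.random.default_rng()
    N = E.shape[0]
    m = E.mean(axis=0)
    A = inflation * (E - m)                            # (inflated) anomalies
    C = A.T @ A / (N - 1)                              # sample covariance
    K = np.linalg.solve(H @ C @ H.T + R, H @ C).T      # Kalman gain, shape (d, k)
    Y = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)  # perturbed obs
    return (m + A) + (Y - (m + A) @ H.T) @ K.T
```

Setting `inflation > 1` enlarges the forecast covariance, the stabilizing mechanism whose role the paper analyzes.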

Publication: Nonlinearity Vol.: 27 No.: 10 ISSN: 0951-7715

ID: CaltechAUTHORS:20160719-143029648

]]>

Abstract: Many inverse problems arising in applications come from continuum models where the unknown parameter is a field. In practice the unknown field is discretized, resulting in a problem in ℝ^N, with an understanding that refining the discretization, that is, increasing N, will often be desirable. In the context of Bayesian inversion this situation suggests the importance of two issues: (i) defining hyperparameters in such a way that they are interpretable in the continuum limit N → ∞ and so that their values may be compared between different discretization levels; and (ii) understanding the efficiency of algorithms for probing the posterior distribution as a function of large N. Here we address these two issues in the context of linear inverse problems subject to additive Gaussian noise within a hierarchical modeling framework based on a Gaussian prior for the unknown field and an inverse-gamma prior for a hyperparameter, namely the amplitude of the prior variance. The structure of the model is such that the Gibbs sampler can be easily implemented for probing the posterior distribution. Subscribing to the dogma that one should think infinite-dimensionally before implementing in finite dimensions, we present function space intuition and provide rigorous theory showing that as N increases, the component of the Gibbs sampler for sampling the amplitude of the prior variance becomes increasingly slower. We discuss a reparametrization of the prior variance that is robust with respect to the increase in dimension; we give numerical experiments which exhibit that our reparametrization prevents the slowing down. Our intuition on the behavior of the prior hyperparameter, with and without reparametrization, is sufficiently general to include a broad class of nonlinear inverse problems as well as other families of hyperpriors.

Publication: SIAM/ASA Journal on Uncertainty Quantification Vol.: 2 No.: 1 ISSN: 2166-2525

ID: CaltechAUTHORS:20160719-141859444

]]>

Abstract: We consider a class of linear ill-posed inverse problems arising from inversion of a compact operator with singular values which decay exponentially to zero. We adopt a Bayesian approach, assuming a Gaussian prior on the unknown function. The observational noise is assumed to be Gaussian; as a consequence the prior is conjugate to the likelihood so that the posterior distribution is also Gaussian. We study Bayesian posterior consistency in the small observational noise limit. We assume that the forward operator and the prior and noise covariance operators commute with one another. We show how, for given smoothness assumptions on the truth, the scale parameter of the prior, which is a constant multiplier of the prior covariance operator, can be adjusted to optimize the rate of posterior contraction to the truth, and we explicitly compute the logarithmic rate.

Publication: Journal of Inverse and Ill-posed Problems Vol.: 22 No.: 3 ISSN: 1569-3945

ID: CaltechAUTHORS:20160719-151308932

]]>

Abstract: The Bayesian approach to inverse problems is of paramount importance in quantifying uncertainty about the input to, and the state of, a system of interest given noisy observations. Herein we consider the forward problem of the forced 2D Navier-Stokes equation. The inverse problem is to make inference concerning the forcing, and possibly the initial condition, given noisy observations of the velocity field. We place a prior on the forcing which is in the form of a spatially-correlated and temporally-white Gaussian process, and formulate the inverse problem for the posterior distribution. Given appropriate spatial regularity conditions, we show that the solution is a continuous function of the forcing. Hence, for appropriately chosen spatial regularity in the prior, the posterior distribution on the forcing is absolutely continuous with respect to the prior and is hence well-defined. Furthermore, it may then be shown that the posterior distribution is a continuous function of the data. We complement these theoretical results with numerical simulations showing the feasibility of computing the posterior distribution, and illustrating its properties.

Publication: Stochastic Partial Differential Equations: Analysis and Computations Vol.: 2 No.: 2 ISSN: 2194-041X

ID: CaltechAUTHORS:20160719-150657732

]]>

Abstract: Consider a probability measure on a Hilbert space defined via its density with respect to a Gaussian. The purpose of this paper is to demonstrate that an appropriately defined Markov chain, which is reversible with respect to the measure in question, exhibits a diffusion limit to a noisy gradient flow, also reversible with respect to the same measure. The Markov chain is defined by applying a Metropolis–Hastings accept–reject mechanism (Tierney, Ann Appl Probab 8:1–9, 1998) to an Ornstein–Uhlenbeck (OU) proposal which is itself reversible with respect to the underlying Gaussian measure. The resulting noisy gradient flow is a stochastic partial differential equation driven by a Wiener process with spatial correlation given by the underlying Gaussian structure. There are two primary motivations for this work. The first concerns insight into Markov chain Monte Carlo (MCMC) methods for sampling of measures on a Hilbert space defined via a density with respect to a Gaussian measure. These measures must be approximated on finite dimensional spaces of dimension N in order to be sampled. A conclusion of the work herein is that MCMC methods based on prior-reversible OU proposals will explore the target measure in O(1) steps with respect to dimension N. This is to be contrasted with standard MCMC methods based on the random walk or Langevin proposals which require O(N) and O(N^(1/3)) steps respectively (Mattingly et al., Ann Appl Prob 2011; Pillai et al., Ann Appl Prob 22:2320–2356, 2012). The second motivation relates to optimization. There are many applications where it is of interest to find global or local minima of a functional defined on an infinite dimensional Hilbert space. Gradient flow or steepest descent is a natural approach to this problem, but in its basic form requires computation of a gradient which, in some applications, may be an expensive or complex task.
This paper shows that a stochastic gradient descent described by a stochastic partial differential equation can emerge from certain carefully specified Markov chains. This idea is well-known in the finite state (Kirkpatrick et al., Science 220:671–680, 1983; Cerny, J Optim Theory Appl 45:41–51, 1985) or finite dimensional context (Geman, IEEE Trans Geosci Remote Sens 1:269–276, 1985; Geman, SIAM J Control Optim 24:1031, 1986; Chiang, SIAM J Control Optim 25:737–753, 1987; J Funct Anal 83:333–347, 1989). The novelty of the work in this paper is that the emergence of the noisy gradient flow is developed on an infinite dimensional Hilbert space. In the context of global optimization, when the noise level is also adjusted as part of the algorithm, methods of the type studied here go by the name of simulated annealing; see the review (Bertsimas and Tsitsiklis, Stat Sci 8:10–15, 1993) for further references. Although we do not consider adjusting the noise level as part of the algorithm, the noise strength is a tuneable parameter in our construction and the methods developed here could potentially be used to study simulated annealing in a Hilbert space setting. The transferable idea behind this work is that conceiving of algorithms directly in the infinite dimensional setting leads to methods which are robust to finite dimensional approximation. We emphasize that discretizing, and then applying standard finite dimensional techniques in ℝ^N, to either sample or optimize, can lead to algorithms which degenerate as the dimension N increases.

Publication: Stochastic Partial Differential Equations: Analysis and Computations Vol.: 2 No.: 2 ISSN: 2194-041X

ID: CaltechAUTHORS:20160719-145056000

]]>

Abstract: The problem of effectively combining data with a mathematical model constitutes a major challenge in applied mathematics. It is particularly challenging for high-dimensional dynamical systems where data is received sequentially in time and the objective is to estimate the system state in an on-line fashion; this situation arises, for example, in weather forecasting. The sequential particle filter is then impractical and ad hoc filters, which employ some form of Gaussian approximation, are widely used. Prototypical of these ad hoc filters is the 3DVAR method. The goal of this paper is to analyze the 3DVAR method, using the Lorenz '63 model to exemplify the key ideas. The situation where the data is partial and noisy is studied, and both discrete time and continuous time data streams are considered. The theory demonstrates how the widely used technique of variance inflation acts to stabilize the filter, and hence leads to asymptotic accuracy.
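
A sketch of one discrete-time 3DVAR cycle on the Lorenz '63 model, assuming a fixed gain matrix `K`; the crude forward-Euler integrator and all names are illustrative only, not the paper's setup. Enlarging `K` (trusting the model less relative to the data) plays the role of the variance inflation discussed above.

```python
import numpy as np

def lorenz63(u, dt=0.001, steps=100, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Crude forward-Euler integration of Lorenz '63 between observation times."""
    x, y, z = u
    for _ in range(steps):
        x, y, z = (x + dt * sigma * (y - x),
                   y + dt * (x * (rho - z) - y),
                   z + dt * (x * y - beta * z))
    return np.array([x, y, z])

def threedvar_cycle(m, y_obs, H, K):
    """One 3DVAR cycle: forecast with the model, then a fixed-gain nudge toward
    the data; m is the current state estimate, H the observation operator."""
    forecast = lorenz63(m)
    return forecast + K @ (y_obs - H @ forecast)
```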

Publication: Discrete and Continuous Dynamical Systems A Vol.: 34 No.: 3 ISSN: 1553-5231

ID: CaltechAUTHORS:20160719-152032603

]]>

Abstract: Quantifying uncertainty in the solution of inverse problems is an exciting area of research in the mathematical sciences, one that raises significant challenges at the interfaces between analysis, computation, probability, and statistics. The reach in terms of applicability is enormous, with diverse problems arising in the physical, biological, and social sciences, such as weather prediction, epidemiology, and traffic flow. Loosely speaking, inverse problems confront mathematical models with data so that we can deduce the inputs needed to run the models; knowledge of these inputs can then be used to make predictions, and even to devise control strategies based on the predictions. Both the models and the data are typically uncertain, as are the resulting deductions and predictions; as a consequence, any decisions or control strategies based on the predictions will be greatly improved if the uncertainty is made quantitative.

Publication: SIAM News ISSN: 1557-9573

ID: CaltechAUTHORS:20161111-105218524

]]>

Abstract: We investigate the properties of the hybrid Monte Carlo algorithm (HMC) in high dimensions. HMC develops a Markov chain reversible with respect to a given target distribution Π using separable Hamiltonian dynamics with potential −log Π. The additional momentum variables are chosen at random from the Boltzmann distribution, and the continuous-time Hamiltonian dynamics are then discretised using the leapfrog scheme. The induced bias is removed via a Metropolis–Hastings accept/reject rule. In the simplified scenario of independent, identically distributed components, we prove that, to obtain an O(1) acceptance probability as the dimension d of the state space tends to ∞, the leapfrog step size h should be scaled as h = l × d^(−1/4). Therefore, in high dimensions, HMC requires O(d^(1/4)) steps to traverse the state space. We also identify analytically the asymptotically optimal acceptance probability, which turns out to be 0.651 (to three decimal places). This value optimally balances the cost of generating a proposal, which decreases as l increases (because fewer steps are required to reach the desired final integration time), against the cost related to the average number of proposals required to obtain acceptance, which increases as l increases.
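
The mechanics described above (Boltzmann momentum refreshment, leapfrog integration, Metropolis–Hastings correction) can be sketched as follows; names are illustrative, and the step size `h` would be scaled as h = l × d^(−1/4) to keep the acceptance probability O(1) in high dimension.

```python
import numpy as np

def hmc_step(q, logpi, grad_logpi, h, n_leapfrog, rng=None):
    """One HMC step: draw momentum from N(0, I), integrate the Hamiltonian
    dynamics with the leapfrog scheme, accept/reject to remove the bias."""
    rng = rng or np.random.default_rng()
    p = rng.standard_normal(q.shape)                       # Boltzmann momentum
    q_new, p_new = q.copy(), p + 0.5 * h * grad_logpi(q)   # initial half step
    for i in range(n_leapfrog):
        q_new = q_new + h * p_new
        # full momentum steps inside; a closing half step at the end
        p_new = p_new + (h if i < n_leapfrog - 1 else 0.5 * h) * grad_logpi(q_new)
    # H(q, p) = -log pi(q) + |p|^2 / 2; accept with probability exp(-dH)
    log_alpha = (logpi(q_new) - 0.5 * p_new @ p_new) - (logpi(q) - 0.5 * p @ p)
    if np.log(rng.uniform()) < log_alpha:
        return q_new, True
    return q, False
```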

Publication: Bernoulli Vol.: 19 No.: 5A ISSN: 1350-7265

ID: CaltechAUTHORS:20160726-155502558

]]>

Abstract: We consider a Bayesian nonparametric approach to a family of linear inverse problems in a separable Hilbert space setting with Gaussian noise. We assume Gaussian priors, which are conjugate to the model, and present a method of identifying the posterior using its precision operator. Working with the unbounded precision operator enables us to use partial differential equations (PDE) methodology to obtain rates of contraction of the posterior distribution to a Dirac measure centered on the true solution. Our methods assume a relatively weak relation between the prior covariance, noise covariance and forward operator, allowing for a wide range of applications.
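
In finite dimensions, identifying the posterior through its precision reduces to the familiar conjugate-Gaussian formulas; the following sketch (hypothetical names, dense linear algebra) illustrates the idea for the model y = K u + η with η ~ N(0, Σ) and prior u ~ N(m0, C0). The paper's contribution is to work with the unbounded precision operator directly in the Hilbert space setting, which this toy version does not capture.

```python
import numpy as np

def gaussian_posterior(K, Sigma, C0, m0, y):
    """Posterior mean and precision for y = K u + eta, eta ~ N(0, Sigma),
    with Gaussian prior u ~ N(m0, C0): the prior is conjugate to the model."""
    prec = np.linalg.inv(C0) + K.T @ np.linalg.solve(Sigma, K)  # posterior precision
    rhs = np.linalg.solve(C0, m0) + K.T @ np.linalg.solve(Sigma, y)
    mean = np.linalg.solve(prec, rhs)                           # posterior mean
    return mean, prec
```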

Publication: Stochastic Processes and their Applications Vol.: 123 No.: 10 ISSN: 0304-4149

ID: CaltechAUTHORS:20160727-163931558

]]>

Abstract: We consider the inverse problem of estimating an unknown function u from noisy measurements y of a known, possibly nonlinear, map G applied to u. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field μ0. We work under a natural set of conditions on the likelihood which implies the existence of a well-posed posterior measure, μ^y. Under these conditions, we show that the maximum a posteriori (MAP) estimator is well defined as the minimizer of an Onsager–Machlup functional defined on the Cameron–Martin space of the prior; thus, we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency for the MAP estimator. We also prove a similar result for the case where the observation of G(u) can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier–Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics.

Publication: Inverse Problems Vol.: 29 No.: 9 ISSN: 0266-5611

ID: CaltechAUTHORS:20160727-154930631

]]>

Abstract: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.

Publication: Statistical Science Vol.: 28 No.: 3 ISSN: 0883-4237

ID: CaltechAUTHORS:20160727-155941152

]]>

Abstract: The Bayesian approach to inverse problems, in which the posterior probability distribution on an unknown field is sampled for the purposes of computing posterior expectations of quantities of interest, is starting to become computationally feasible for partial differential equation (PDE) inverse problems. Balancing the sources of error arising from finite-dimensional approximation of the unknown field, the PDE forward solution map and the sampling of the probability space under the posterior distribution is essential for the design of efficient computational Bayesian methods for PDE inverse problems. We study Bayesian inversion for a model elliptic PDE with an unknown diffusion coefficient. We provide complexity analyses of several Markov chain Monte Carlo (MCMC) methods for the efficient numerical evaluation of expectations under the Bayesian posterior distribution, given data δ. Particular attention is given to bounds on the overall work required to achieve a prescribed error level ε. Specifically, we first bound the computational complexity of 'plain' MCMC, based on combining MCMC sampling with linear complexity multi-level solvers for elliptic PDE. Our (new) work versus accuracy bounds show that the complexity of this approach can be quite prohibitive. Two strategies for reducing the computational complexity are then proposed and analyzed: first, a sparse, parametric and deterministic generalized polynomial chaos (gpc) 'surrogate' representation of the forward response map of the PDE over the entire parameter space, and, second, a novel multi-level Markov chain Monte Carlo strategy which utilizes sampling from a multi-level discretization of the posterior and the forward PDE.
In particular, we provide sufficient conditions on the regularity of the unknown coefficients of the PDE and on the approximation methods used, in order for the accelerations of MCMC resulting from these strategies to lead to complexity reductions over 'plain' MCMC algorithms for the Bayesian inversion of PDEs.

Publication: Inverse Problems Vol.: 29 No.: 8 ISSN: 0266-5611

ID: CaltechAUTHORS:20160727-163339156

]]>

Abstract: The 3DVAR filter is prototypical of methods used to combine observed data with a dynamical system, online, in order to improve estimation of the state of the system. Such methods are used for high dimensional data assimilation problems, such as those arising in weather forecasting. To gain understanding of filters in applications such as these, it is hence of interest to study their behaviour when applied to infinite dimensional dynamical systems. This motivates the study of the problem of accuracy and stability of 3DVAR filters for the Navier–Stokes equation. We work in the limit of high frequency observations and derive continuous time filters. This leads to a stochastic partial differential equation (SPDE) for state estimation, in the form of a damped-driven Navier–Stokes equation, with mean-reversion to the signal, and spatially-correlated time-white noise. Both forward and pullback accuracy and stability results are proved for this SPDE, showing in particular that when enough low Fourier modes are observed, and when the model uncertainty is larger than the data uncertainty in these modes (variance inflation), then the filter can lock on to a small neighbourhood of the true signal, recovering from order one initial error, if the error in the observed modes is small. Numerical examples are given to illustrate the theory.

Publication: Nonlinearity Vol.: 26 No.: 8 ISSN: 0951-7715

ID: CaltechAUTHORS:20160726-141854847

]]>

Abstract: The Bayesian framework is the standard approach for data assimilation in reservoir modeling. This framework involves characterizing the posterior distribution of geological parameters in terms of a given prior distribution and data from the reservoir dynamics, together with a forward model connecting the space of geological parameters to the data space. Since the posterior distribution quantifies the uncertainty in the geologic parameters of the reservoir, the characterization of the posterior is fundamental for the optimal management of reservoirs. Unfortunately, due to the large-scale highly nonlinear properties of standard reservoir models, characterizing the posterior is computationally prohibitive. Instead, more affordable ad hoc techniques, based on Gaussian approximations, are often used for characterizing the posterior distribution. Evaluating the performance of those Gaussian approximations is typically conducted by assessing their ability at reproducing the truth within the confidence interval provided by the ad hoc technique under consideration. This has the disadvantage of mixing up the approximation properties of the history matching algorithm employed with the information content of the particular observations used, making it hard to evaluate the effect of the ad hoc approximations alone. In this paper, we avoid this disadvantage by comparing the ad hoc techniques with a fully resolved state-of-the-art probing of the Bayesian posterior distribution. The ad hoc techniques whose performance we assess are based on (1) linearization around the maximum a posteriori estimate, (2) randomized maximum likelihood, and (3) ensemble Kalman filter-type methods. 
In order to fully resolve the posterior distribution, we implement a state-of-the-art Markov chain Monte Carlo (MCMC) method that scales well with respect to the dimension of the parameter space, enabling us to study realistic forward models, in two space dimensions, at a high level of grid refinement. Our implementation of the MCMC method provides the gold standard against which the aforementioned Gaussian approximations are assessed. We present numerical synthetic experiments where we quantify the capability of each of the ad hoc Gaussian approximations in reproducing the mean and the variance of the posterior distribution (characterized via MCMC) associated with a data assimilation problem. Both single-phase and two-phase (oil–water) reservoir models are considered so that fundamental differences in the resulting forward operators are highlighted. The main objective of our controlled experiments was to exhibit the substantial discrepancies of the approximation properties of standard ad hoc Gaussian approximations. Numerical investigations of the type we present here will lead to a greater understanding of the cost-efficient, but ad hoc, Bayesian techniques used for data assimilation in petroleum reservoirs and hence ultimately to improved techniques with more accurate uncertainty quantification.

Publication: Computational Geosciences Vol.: 17 No.: 5 ISSN: 1420-0597

ID: CaltechAUTHORS:20160727-153428298

]]>

Abstract: The ensemble Kalman filter (EnKF) was introduced by Evensen in 1994 (Evensen 1994 J. Geophys. Res. 99 10143–62) as a novel method for data assimilation: state estimation for noisily observed time-dependent problems. Since that time it has had enormous impact in many application domains because of its robustness and ease of implementation, and numerical evidence of its accuracy. In this paper we propose the application of an iterative ensemble Kalman method for the solution of a wide class of inverse problems. In this context we show that the estimate of the unknown function that we obtain with the ensemble Kalman method lies in a subspace A spanned by the initial ensemble. Hence the resulting error may be bounded above by the error found from the best approximation in this subspace. We provide numerical experiments which compare the error incurred by the ensemble Kalman method for inverse problems with the error of the best approximation in A, and with variants on traditional least-squares approaches, restricted to the subspace A. In so doing we demonstrate that the ensemble Kalman method for inverse problems provides a derivative-free optimization method with comparable accuracy to that achieved by traditional least-squares approaches. Furthermore, we also demonstrate that the accuracy is of the same order of magnitude as that achieved by the best approximation. Three examples are used to demonstrate these assertions: inversion of a compact linear operator; inversion of piezometric head to determine hydraulic conductivity in a Darcy model of groundwater flow; and inversion of Eulerian velocity measurements at positive times to determine the initial condition in an incompressible fluid.
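
A minimal sketch of one iteration of the perturbed-observation ensemble Kalman method for the inverse problem y = G(u) + noise; the names are illustrative, not the paper's code. The update uses only forward-map evaluations (no derivatives of G), and each member remains in the linear span of the initial ensemble, the subspace the abstract calls A.

```python
import numpy as np

def eki_update(U, G, y, Gamma, rng=None):
    """One ensemble Kalman iteration for y = G(u) + noise.

    U: (J, d) ensemble of parameter estimates; G: forward map (d,) -> (k,);
    Gamma: (k, k) observational noise covariance."""
    rng = rng or np.random.default_rng()
    J = U.shape[0]
    GU = np.array([G(u) for u in U])              # forward-map evaluations, (J, k)
    du = U - U.mean(axis=0)
    dg = GU - GU.mean(axis=0)
    Cug = du.T @ dg / (J - 1)                     # cross-covariance of u and G(u)
    Cgg = dg.T @ dg / (J - 1)                     # covariance of G(u)
    Y = y + rng.multivariate_normal(np.zeros(len(y)), Gamma, size=J)  # perturbed data
    return U + (Cug @ np.linalg.solve(Cgg + Gamma, (Y - GU).T)).T
```

Since the increment is a linear combination of the anomalies `du`, the derivative-free character and the subspace property follow directly from the update formula.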

Publication: Inverse Problems Vol.: 29 No.: 4 ISSN: 0266-5611

ID: CaltechAUTHORS:20160727-180147046

]]>

Abstract: Data assimilation methodologies are designed to incorporate noisy observations of a physical system into an underlying model in order to infer the properties of the state of the system. Filters refer to a class of data assimilation algorithms designed to update the estimation of the state in an on-line fashion, as data is acquired sequentially. For linear problems subject to Gaussian noise, filtering can be performed exactly using the Kalman filter. For nonlinear systems filtering can be approximated in a systematic way by particle filters. However in high dimensions these particle filtering methods can break down. Hence, for the large nonlinear systems arising in applications such as oceanography and weather forecasting, various ad hoc filters are used, mostly based on making Gaussian approximations. The purpose of this work is to study the accuracy and stability properties of these ad hoc filters. We work in the context of the 2D incompressible Navier–Stokes equation, although the ideas readily generalize to a range of dissipative partial differential equations (PDEs). By working in this infinite dimensional setting we provide an analysis which is useful for the understanding of high dimensional filtering, and is robust to mesh-refinement. We describe theoretical results showing that, in the small observational noise limit, the filters can be tuned to perform accurately in tracking the signal itself (filter accuracy), provided the system is observed in a sufficiently large low dimensional space; roughly speaking this space should be large enough to contain the unstable modes of the linearized dynamics. The tuning corresponds to what is known as variance inflation in the applied literature. Numerical results are given which illustrate the theory. The positive results herein concerning filter stability complement recent numerical studies which demonstrate that the ad hoc filters can perform poorly in reproducing statistical variation about the true signal.

Publication: Physica D: Nonlinear Phenomena Vol.: 245 No.: 1 ISSN: 0167-2789

ID: CaltechAUTHORS:20160727-180601953

]]>

Abstract: We study a Bayesian approach to nonparametric estimation of the periodic drift function of a one-dimensional diffusion from continuous-time data. Rewriting the likelihood in terms of local time of the process, and specifying a Gaussian prior with precision operator of differential form, we show that the posterior is also Gaussian with the precision operator also of differential form. The resulting expressions are explicit and lead to algorithms which are readily implementable. Using new functional limit theorems for the local time of diffusions on the circle, we bound the rate at which the posterior contracts around the true drift function.

Publication: Stochastic Processes and their Applications Vol.: 123 No.: 2 ISSN: 0304-4149

ID: CaltechAUTHORS:20160727-175235615

]]>

Abstract: The Metropolis-adjusted Langevin (MALA) algorithm is a sampling algorithm which makes local moves by incorporating information about the gradient of the logarithm of the target density. In this paper we study the efficiency of MALA on a natural class of target measures supported on an infinite dimensional Hilbert space. These natural measures have density with respect to a Gaussian random field measure and arise in many applications such as Bayesian nonparametric statistics and the theory of conditioned diffusions. We prove that, started in stationarity, a suitably interpolated and scaled version of the Markov chain corresponding to MALA converges to an infinite dimensional diffusion process. Our results imply that, in stationarity, the MALA algorithm applied to an N-dimensional approximation of the target will take O(N^(1/3)) steps to explore the invariant measure, comparing favorably with the Random Walk Metropolis which was recently shown to require O(N) steps when applied to the same class of problems. As a by-product of the diffusion limit, it also follows that the MALA algorithm is optimized at an average acceptance probability of 0.574. Previous results were proved only for targets which are products of one-dimensional distributions, or for variants of this situation, limiting their applicability. The correlation in our target means that the rescaled MALA algorithm converges weakly to an infinite dimensional Hilbert space valued diffusion, and the limit cannot be described through analysis of scalar diffusions. The limit theorem is proved by showing that a drift-martingale decomposition of the Markov chain, suitably scaled, closely resembles a weak Euler–Maruyama discretization of the putative limit. An invariance principle is proved for the martingale, and a continuous mapping argument is used to complete the proof.
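The finite-dimensional MALA algorithm studied above can be sketched as follows; this is the standard algorithm on a simple Gaussian target, not the Hilbert-space construction of the paper, and the step size and dimension are illustrative choices.

```python
import numpy as np

# Sketch of MALA: an Euler step of the Langevin SDE serves as the
# proposal, and a Metropolis accept/reject step corrects the
# discretization so the target is preserved exactly.
rng = np.random.default_rng(1)

def log_pi(x):              # standard Gaussian target (illustrative)
    return -0.5 * x @ x

def grad_log_pi(x):
    return -x

def mala(x0, h, n_steps):
    x = x0.copy()
    accepts = 0
    for _ in range(n_steps):
        mean_x = x + 0.5 * h * grad_log_pi(x)
        prop = mean_x + np.sqrt(h) * rng.standard_normal(x.size)
        mean_p = prop + 0.5 * h * grad_log_pi(prop)
        # log q(x | prop) - log q(prop | x), the proposal asymmetry
        log_q_ratio = (-np.sum((x - mean_p) ** 2)
                       + np.sum((prop - mean_x) ** 2)) / (2 * h)
        log_alpha = log_pi(prop) - log_pi(x) + log_q_ratio
        if np.log(rng.uniform()) < log_alpha:
            x, accepts = prop, accepts + 1
    return x, accepts / n_steps

x, acc = mala(np.zeros(10), h=0.5, n_steps=2000)
```

In the O(N^(1/3)) scaling regime described in the abstract, `h` would shrink like N^(-1/3) with dimension, with the average acceptance probability tuned toward 0.574.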

Publication: Annals of Applied Probability Vol.: 22 No.: 6 ISSN: 1050-5164

ID: CaltechAUTHORS:20160728-150141693

]]>

Abstract: Data assimilation leads naturally to a Bayesian formulation in which the posterior probability distribution of the system state, given all the observations on a time window of interest, plays a central conceptual role. The aim of this paper is to use this Bayesian posterior probability distribution as a gold standard against which to evaluate various commonly used data assimilation algorithms. A key aspect of geophysical data assimilation is the high dimensionality and limited predictability of the computational model. This paper examines the two-dimensional Navier–Stokes equations in a periodic geometry, which has these features and yet is tractable for explicit and accurate computation of the posterior distribution by state-of-the-art statistical sampling techniques. The commonly used algorithms that are evaluated, as quantified by the relative error in reproducing moments of the posterior, are four-dimensional variational data assimilation (4DVAR) and a variety of sequential filtering approximations based on three-dimensional variational data assimilation (3DVAR) and on extended and ensemble Kalman filters. The primary conclusions are that, under the assumption of a well-defined posterior probability distribution, (i) with appropriate parameter choices, approximate filters can perform well in reproducing the mean of the desired probability distribution, (ii) they do not perform as well in reproducing the covariance, and (iii) the error is compounded by the need to modify the covariance, in order to induce stability. Thus, filters can be a useful tool in predicting mean behavior but should be viewed with caution as predictors of uncertainty. These conclusions are intrinsic to the algorithms when assumptions underlying them are not valid and will not change if the model complexity is increased.
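One of the sequential filters compared above, the perturbed-observation ensemble Kalman analysis step, can be sketched on a generic state vector; the model, dimensions, and noise levels below are illustrative assumptions, not the Navier–Stokes configuration of the paper.

```python
import numpy as np

# Sketch of a perturbed-observation ensemble Kalman (EnKF) analysis step.
rng = np.random.default_rng(2)

def enkf_analysis(ensemble, y, H, gamma2):
    """ensemble: (N, d) forecast members; y: observation; H: obs operator."""
    N, d = ensemble.shape
    xbar = ensemble.mean(axis=0)
    A = ensemble - xbar                          # anomalies
    C = A.T @ A / (N - 1)                        # sample covariance
    S = H @ C @ H.T + gamma2 * np.eye(H.shape[0])
    K = C @ H.T @ np.linalg.inv(S)               # ensemble Kalman gain
    # each member assimilates an independently perturbed observation
    perturbed = y + np.sqrt(gamma2) * rng.standard_normal((N, H.shape[0]))
    return ensemble + (perturbed - ensemble @ H.T) @ K.T

d, N = 5, 200
H = np.eye(d)[:2]                                # observe two components
truth = np.arange(1.0, d + 1.0)
y = H @ truth
prior = truth + rng.standard_normal((N, d))      # spread-out forecast ensemble
post = enkf_analysis(prior, y, H, gamma2=0.01)
```

The analysis ensemble mean is pulled strongly toward the accurate observations in the observed components, while the sample covariance (the quantity the paper finds such filters reproduce poorly) is only a low-rank Monte Carlo estimate.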

Publication: Monthly Weather Review Vol.: 140 ISSN: 0027-0644

ID: CaltechAUTHORS:20160728-150615482

]]>

Abstract: We consider estimation of scalar functions that determine the dynamics of diffusion processes. It has been recently shown that nonparametric maximum likelihood estimation is ill-posed in this context. We adopt a probabilistic approach to regularize the problem by the adoption of a prior distribution for the unknown functional. A Gaussian prior measure is chosen in the function space by specifying its precision operator as an appropriate differential operator. We establish that a Bayesian–Gaussian conjugate analysis for the drift of one-dimensional nonlinear diffusions is feasible using high-frequency data, by expressing the loglikelihood as a quadratic function of the drift, with sufficient statistics given by the local time process and the end points of the observed path. Computationally efficient posterior inference is carried out using a finite element method. We embed this technology in partially observed situations and adopt a data augmentation approach whereby we iteratively generate missing data paths and draws from the unknown functional. Our methodology is applied to estimate the drift of models used in molecular dynamics and financial econometrics using high- and low-frequency observations. We discuss extensions to other partially observed schemes and connections to other types of nonparametric inference.

Publication: Biometrika Vol.: 99 No.: 3 ISSN: 1464-3510

ID: CaltechAUTHORS:20160728-152807075

]]>

Abstract: Diffusion limits of MCMC methods in high dimensions provide a useful theoretical tool for studying computational complexity. In particular, they lead directly to precise estimates of the number of steps required to explore the target measure, in stationarity, as a function of the dimension of the state space. However, to date such results have mainly been proved for target measures with a product structure, severely limiting their applicability. The purpose of this paper is to study diffusion limits for a class of naturally occurring high-dimensional measures found from the approximation of measures on a Hilbert space which are absolutely continuous with respect to a Gaussian reference measure. The diffusion limit of a random walk Metropolis algorithm to an infinite-dimensional Hilbert space valued SDE (or SPDE) is proved, facilitating understanding of the computational complexity of the algorithm.

Publication: Annals of Applied Probability Vol.: 22 No.: 3 ISSN: 1050-5164

ID: CaltechAUTHORS:20160728-154635836

]]>

Abstract: We consider the inverse problem of estimating a function u from noisy, possibly nonlinear, observations. We adopt a Bayesian approach to the problem. This approach has a long history for inversion, dating back to 1970, and has, over the last decade, gained importance as a practical tool. However most of the existing theory has been developed for Gaussian prior measures. Recently Lassas, Saksman and Siltanen (Inv. Prob. Imag. 2009) showed how to construct Besov prior measures, based on wavelet expansions with random coefficients, and used these prior measures to study linear inverse problems. In this paper we build on this development of Besov priors to include the case of nonlinear measurements. In doing so a key technical tool, established here, is a Fernique-like theorem for Besov measures. This theorem enables us to identify appropriate conditions on the forward solution operator which, when matched to properties of the prior Besov measure, imply the well-definedness and well-posedness of the posterior measure. We then consider the application of these results to the inverse problem of finding the diffusion coefficient of an elliptic partial differential equation, given noisy measurements of its solution.

Publication: Inverse Problems and Imaging Vol.: 6 No.: 2 ISSN: 1930-8345

ID: CaltechAUTHORS:20160728-153255881

]]>

Abstract: We present a parametric deterministic formulation of Bayesian inverse problems with an input parameter from infinite-dimensional, separable Banach spaces. In this formulation, the forward problems are parametric, deterministic elliptic partial differential equations, and the inverse problem is to determine the unknown, parametric deterministic coefficients from noisy observations comprising linear functionals of the solution. We prove a generalized polynomial chaos representation of the posterior density with respect to the prior measure, given noisy observational data. We analyze the sparsity of the posterior density in terms of the summability of the input data's coefficient sequence. The first step in this process is to estimate the fluctuations in the prior. We exhibit sufficient conditions on the prior model in order for approximations of the posterior density to converge at a given algebraic rate, in terms of the number N of unknowns appearing in the parametric representation of the prior measure. Similar sparsity and approximation results are also exhibited for the solution and covariance of the elliptic partial differential equation under the posterior. These results then form the basis for efficient uncertainty quantification, in the presence of data with noise.

Publication: Inverse Problems Vol.: 28 No.: 4 ISSN: 0266-5611

ID: CaltechAUTHORS:20160728-155039916

]]>

Abstract: Chemical reactions can be modeled via diffusion processes conditioned to make a transition between specified molecular configurations representing the state of the system before and after the chemical reaction. In particular the model of Brownian dynamics—gradient flow subject to additive noise—is frequently used. If the chemical reaction is specified to take place on a given time interval, then the most likely path taken by the system is a minimizer of the Onsager–Machlup functional. The Γ-limit of this functional is determined explicitly in the case where the temperature is small and the transition time scales as the inverse temperature.

Publication: Journal of Statistical Physics Vol.: 146 No.: 5 ISSN: 0022-4715

ID: CaltechAUTHORS:20160728-155454420

]]>

Abstract: The variational approach to data assimilation is a widely used methodology for both online prediction and for reanalysis. In either of these scenarios, it can be important to assess uncertainties in the assimilated state. Ideally, it is desirable to have complete information concerning the Bayesian posterior distribution for unknown state given data. We show that complete computational probing of this posterior distribution is now within reach in the offline situation. We introduce a Markov chain Monte Carlo (MCMC) method which enables us to directly sample from the Bayesian posterior distribution on the unknown functions of interest given observations. Since we are aware that these methods are currently too computationally expensive to consider using in an online filtering scenario, we frame this in the context of offline reanalysis. Using a simple random walk-type MCMC method, we are able to characterize the posterior distribution using only evaluations of the forward model of the problem, and of the model and data mismatch. No adjoint model is required for the method we use; however, more sophisticated MCMC methods are available which exploit derivative information. For simplicity of exposition, we consider the problem of assimilating data, either Eulerian or Lagrangian, into a low Reynolds number flow in a two-dimensional periodic geometry. We will show that in many cases it is possible to recover the initial condition and model error (which we describe as unknown forcing to the model) from data, and that with increasing amounts of informative data, the uncertainty in our estimations reduces.

Publication: International Journal for Numerical Methods in Fluids Vol.: 68 No.: 4 ISSN: 0271-2091

ID: CaltechAUTHORS:20160728-160352524

]]>

Abstract: We consider the inverse problem of determining the permeability from the pressure in a Darcy model of flow in a porous medium. Mathematically the problem is to find the diffusion coefficient for a linear uniformly elliptic partial differential equation in divergence form, in a bounded domain in dimension d ≤ 3, from measurements of the solution in the interior. We adopt a Bayesian approach to the problem. We place a prior random field measure on the log permeability, specified through the Karhunen–Loève expansion of its draws. We consider Gaussian measures constructed this way, and study the regularity of functions drawn from them. We also study the Lipschitz properties of the observation operator mapping the log permeability to the observations. Combining these regularity and continuity estimates, we show that the posterior measure is well defined on a suitable Banach space. Furthermore the posterior measure is shown to be Lipschitz with respect to the data in the Hellinger metric, giving rise to a form of well posedness of the inverse problem. Determining the posterior measure, given the data, solves the problem of uncertainty quantification for this inverse problem. In practice the posterior measure must be approximated in a finite dimensional space. We quantify the errors incurred by employing a truncated Karhunen–Loève expansion to represent this measure. In particular we study weak convergence of a general class of locally Lipschitz functions of the log permeability, and apply this general theory to estimate errors in the posterior mean of the pressure and the pressure covariance, under refinement of the finite-dimensional Karhunen–Loève truncation.

Publication: SIAM Journal on Numerical Analysis Vol.: 49 No.: 6 ISSN: 0036-1429

ID: CaltechAUTHORS:20160728-160946614

]]>

Abstract: The Hybrid Monte Carlo (HMC) algorithm provides a framework for sampling from complex, high-dimensional target distributions. In contrast with standard Markov chain Monte Carlo (MCMC) algorithms, it generates nonlocal, nonsymmetric moves in the state space, alleviating random walk type behaviour for the simulated trajectories. However, similarly to algorithms based on random walk or Langevin proposals, the number of steps required to explore the target distribution typically grows with the dimension of the state space. We define a generalized HMC algorithm which overcomes this problem for target measures arising as finite-dimensional approximations of measures π which have density with respect to a Gaussian measure on an infinite-dimensional Hilbert space. The key idea is to construct an MCMC method which is well defined on the Hilbert space itself. We successively address the following issues in the infinite-dimensional setting of a Hilbert space: (i) construction of a probability measure Π in an enlarged phase space having the target π as a marginal, together with a Hamiltonian flow that preserves Π; (ii) development of a suitable geometric numerical integrator for the Hamiltonian flow; and (iii) derivation of an accept/reject rule to ensure preservation of Π when using the above numerical integrator instead of the actual Hamiltonian flow. Experiments are reported that compare the new algorithm with standard HMC and with a version of the Langevin MCMC method defined on a Hilbert space.
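The standard finite-dimensional HMC algorithm that the paper generalizes can be sketched as follows, with a leapfrog integrator for the Hamiltonian flow and the accept/reject rule based on the integrator's energy error; the target, step size, and trajectory length are illustrative choices.

```python
import numpy as np

# Sketch of standard HMC: momentum refreshment, leapfrog integration of
# Hamiltonian dynamics, and a Metropolis correction using the energy error.
rng = np.random.default_rng(3)

def neg_log_pi(x):          # potential energy: standard Gaussian target
    return 0.5 * x @ x

def grad_neg_log_pi(x):
    return x

def hmc_step(x, eps, n_leap):
    p = rng.standard_normal(x.size)              # resample momentum
    x_new, p_new = x.copy(), p.copy()
    # leapfrog: half kick, alternating drifts and kicks, half kick
    p_new -= 0.5 * eps * grad_neg_log_pi(x_new)
    for _ in range(n_leap - 1):
        x_new += eps * p_new
        p_new -= eps * grad_neg_log_pi(x_new)
    x_new += eps * p_new
    p_new -= 0.5 * eps * grad_neg_log_pi(x_new)
    # accept/reject to compensate for the numerical energy error
    dH = (neg_log_pi(x_new) + 0.5 * p_new @ p_new
          - neg_log_pi(x) - 0.5 * p @ p)
    return (x_new, True) if np.log(rng.uniform()) < -dH else (x, False)

x = np.zeros(20)
n_acc = 0
for _ in range(500):
    x, acc = hmc_step(x, eps=0.1, n_leap=10)
    n_acc += acc
```

The paper's contribution is to redesign each of these three ingredients (the extended measure, the geometric integrator, and the accept/reject rule) so that the whole construction remains well defined on a Hilbert space.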

Publication: Stochastic Processes and their Applications Vol.: 121 No.: 10 ISSN: 0304-4149

ID: CaltechAUTHORS:20160804-150437807

]]>

Abstract: Filtering is a widely used methodology for the incorporation of observed data into time-evolving systems. It provides an online approach to state estimation inverse problems when data are acquired sequentially. The Kalman filter plays a central role in many applications because it is exact for linear systems subject to Gaussian noise, and because it forms the basis for many approximate filters which are used in high-dimensional systems. The aim of this paper is to study the effect of model error on the Kalman filter, in the context of linear wave propagation problems. A consistency result is proved when no model error is present, showing recovery of the true signal in the large data limit. This result, however, is not robust: it is also proved that arbitrarily small model error can lead to inconsistent recovery of the signal in the large data limit. If the model error is in the form of a constant shift to the velocity, the filtering and smoothing distributions only recover a partial Fourier expansion, a phenomenon related to aliasing. On the other hand, for a class of wave velocity model errors which are time dependent, it is possible to recover the filtering distribution exactly, but not the smoothing distribution. Numerical results are presented which corroborate the theory, and also propose a computational approach which overcomes the inconsistency in the presence of model error, by relaxing the model.
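The exact Kalman recursion whose robustness to model error is analyzed above can be sketched on a small linear system; the dynamics and noise covariances below are illustrative, not the wave-propagation model of the paper.

```python
import numpy as np

# Sketch of the Kalman filter predict/update cycle for the linear model
# x_{k+1} = M x_k + xi_k,  y_k = H x_k + eta_k.
rng = np.random.default_rng(4)

def kalman_step(m, C, y, M, H, Sigma, Gamma):
    m_hat = M @ m                            # forecast mean
    C_hat = M @ C @ M.T + Sigma              # forecast covariance
    S = H @ C_hat @ H.T + Gamma
    K = C_hat @ H.T @ np.linalg.inv(S)       # Kalman gain
    m_new = m_hat + K @ (y - H @ m_hat)      # analysis mean
    C_new = (np.eye(len(m)) - K @ H) @ C_hat # analysis covariance
    return m_new, C_new

d = 3
M = 0.95 * np.eye(d)
H = np.eye(d)
Sigma, Gamma = 0.01 * np.eye(d), 0.05 * np.eye(d)
x = np.ones(d)
m, C = np.zeros(d), np.eye(d)
for _ in range(100):
    x = M @ x + rng.multivariate_normal(np.zeros(d), Sigma)
    y = H @ x + rng.multivariate_normal(np.zeros(d), Gamma)
    m, C = kalman_step(m, C, y, M, H, Sigma, Gamma)
```

Here `M` matches the truth exactly, the consistent case; the paper's negative results concern what happens when the `M` used in the filter differs from the dynamics generating the data, however slightly.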

Publication: Inverse Problems Vol.: 27 No.: 9 ISSN: 0266-5611

ID: CaltechAUTHORS:20160801-175538072

]]>

Abstract: We provide an explicit rigorous derivation of a diffusion limit—a stochastic differential equation (SDE) with additive noise—from a deterministic skew-product flow. This flow is assumed to exhibit time-scale separation and has the form of a slowly evolving system driven by a fast chaotic flow. Under mild assumptions on the fast flow, we prove convergence to a SDE as the time-scale separation grows. In contrast to existing work, we do not require the flow to have good mixing properties. As a consequence, our results incorporate a large class of fast flows, including the classical Lorenz equations.

Publication: Nonlinearity Vol.: 24 No.: 4 ISSN: 0951-7715

ID: CaltechAUTHORS:20160804-164518594

]]>

Abstract: A series of recent articles introduced a method to construct stochastic partial differential equations (SPDEs) which are invariant with respect to the distribution of a given conditioned diffusion. These works are restricted to the case of elliptic diffusions where the drift has a gradient structure and the resulting SPDE is of second-order parabolic type. The present article extends this methodology to allow the construction of SPDEs which are invariant with respect to the distribution of a class of hypoelliptic diffusion processes, subject to a bridge conditioning, leading to SPDEs which are of fourth-order parabolic type. This allows the treatment of more realistic physical models, for example, one can use the resulting SPDE to study transitions between meta-stable states in mechanical systems with friction and noise. In this situation the restriction of the drift being a gradient can also be lifted.

Publication: Annals of Applied Probability Vol.: 21 No.: 2 ISSN: 1050-5164

ID: CaltechAUTHORS:20160804-162713014

]]>

Abstract: It is possible to implement importance sampling, and particle filter algorithms, where the importance sampling weight is random. Such random-weight algorithms have been shown to be efficient for inference for a class of diffusion models, as they enable inference without any (time discretization) approximation of the underlying diffusion model. One difficulty of implementing such random-weight algorithms is the requirement to have weights that are positive with probability 1. We show how Wald's identity for martingales can be used to ensure positive weights. We apply this idea to analysis of diffusion models from high frequency data. For a class of diffusion models we show how to implement a particle filter, which uses all the information in the data, but whose computational cost is independent of the frequency of the data. We use the Wald identity to implement a random-weight particle filter for these models which avoids time discretization error.
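For context, the baseline that the random-weight construction modifies is the standard bootstrap particle filter, sketched below on an Euler-discretized scalar diffusion. The paper's method would replace the Gaussian weight computation with a positive unbiased estimator of the exact (undiscretized) likelihood; everything here, including the model and parameter values, is an illustrative assumption.

```python
import numpy as np

# Bootstrap particle filter sketch for dX = -X dt + dW (Euler-discretized),
# observed at each step with additive Gaussian noise.
rng = np.random.default_rng(5)

def particle_filter(ys, n_particles, dt=0.1, obs_var=0.25):
    x = rng.standard_normal(n_particles)
    means = []
    for y in ys:
        # propagate particles one step of the discretized dynamics
        x = x - x * dt + np.sqrt(dt) * rng.standard_normal(n_particles)
        # weight by the observation likelihood and normalize
        logw = -0.5 * (y - x) ** 2 / obs_var
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))               # filtering mean estimate
        # multinomial resampling
        x = x[rng.choice(n_particles, n_particles, p=w)]
    return np.array(means)

true_x, ys = 0.0, []
for _ in range(50):
    true_x = true_x - true_x * 0.1 + np.sqrt(0.1) * rng.standard_normal()
    ys.append(true_x + 0.5 * rng.standard_normal())
est = particle_filter(np.array(ys), n_particles=1000)
```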

Publication: Journal of the Royal Statistical Society: Series B Vol.: 72 No.: 4 ISSN: 1369-7412

ID: CaltechAUTHORS:20170612-075052519

]]>

Abstract: In the zero temperature limit, it is well known that in systems evolving via Brownian dynamics, the most likely transition path between reactant and product may be found as a minimizer of the Freidlin–Wentzell action functional. An analog for finite temperature transitions is given by the Onsager–Machlup functional. The purpose of this work is to investigate properties of Onsager–Machlup minimizers. We study transition paths for thermally activated molecules governed by the Langevin equation in the overdamped limit of Brownian dynamics. Using gradient descent in pathspace, we minimize the Onsager–Machlup functional for a range of model problems in one and two dimensions and then for some simple atomic models including Lennard-Jones seven-atom and 38-atom clusters, as well as for a model of vacancy diffusion in a planar crystal. Our results demonstrate interesting effects, which can occur at nonzero temperature, showing transition paths that could not be predicted on the basis of the zero temperature limit. However the results also demonstrate unphysical features associated with such Onsager–Machlup minimizers. As there is a growing literature that addresses transition path sampling by related techniques, these insights add a potentially useful perspective into the interpretation of this body of work.
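The gradient descent in pathspace described above can be sketched in one dimension for the double well V(x) = (x² - 1)²/4. One common convention for the overdamped Onsager–Machlup functional is used, S[x] = ∫ [ ½|ẋ + V′(x)|² + (T/2)V″(x) ] dt with temperature T; the discretization, time horizon, and step size are illustrative assumptions.

```python
import numpy as np

# Gradient descent in pathspace on a discretized Onsager-Machlup functional
# for overdamped Brownian dynamics in the double well V(x) = (x^2 - 1)^2 / 4,
# with a transition from -1 to +1 enforced through fixed endpoints.
Vp  = lambda x: x**3 - x        # V'
Vpp = lambda x: 3 * x**2 - 1    # V''
T_total, N, temp = 10.0, 60, 0.1
dt = T_total / N

def action(interior):
    x = np.concatenate(([-1.0], interior, [1.0]))   # endpoints clamped
    d = np.diff(x) / dt + Vp(x[:-1])
    return dt * np.sum(0.5 * d**2 + 0.5 * temp * Vpp(x[:-1]))

def action_grad(interior):
    x = np.concatenate(([-1.0], interior, [1.0]))
    d = np.diff(x) / dt + Vp(x[:-1])
    # derivative of the discrete action w.r.t. the interior points x_1..x_{N-1}
    return dt * (d[1:] * (-1.0 / dt + Vpp(x[1:-1]))
                 + 0.5 * temp * 6.0 * x[1:-1]        # d/dx of (temp/2) V''
                 + d[:-1] / dt)

path = np.linspace(-1.0, 1.0, N + 1)[1:-1]          # initial guess
s0 = action(path)
for _ in range(3000):                                # plain gradient descent
    path -= 0.02 * action_grad(path)
s1 = action(path)
```

At `temp = 0` the second term vanishes and the minimizer reduces to the Freidlin–Wentzell instanton; the `(T/2)V''` term is what produces the genuinely finite-temperature (and sometimes unphysical) effects discussed in the abstract.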

Publication: Journal of Chemical Physics Vol.: 132 No.: 18 ISSN: 0021-9606

ID: CaltechAUTHORS:20161108-161530502

]]>

Abstract: Numerical approximation of the long time behavior of a stochastic differential equation (SDE) is considered. Error estimates for time-averaging estimators are obtained and then used to show that the stationary behavior of the numerical method converges to that of the SDE. The error analysis is based on using an associated Poisson equation for the underlying SDE. The main advantages of this approach are its simplicity and universality. It works equally well for a range of explicit and implicit schemes, including those with simple simulation of random variables, and for hypoelliptic SDEs. To simplify the exposition, we consider only the case where the state space of the SDE is a torus, and we study only smooth test functions. However, we anticipate that the approach can be applied more widely. An analogy between our approach and Stein’s method is indicated. Some practical implications of the results are discussed.
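The time-averaging estimator analyzed above can be sketched with an Euler–Maruyama discretization of an Ornstein–Uhlenbeck process, whose invariant measure is known exactly; the process, test function, and step size are illustrative choices.

```python
import numpy as np

# Time-averaging estimator along an Euler-Maruyama trajectory of
# dX = -X dt + sqrt(2) dW, whose invariant measure is N(0, 1).
# The long-run average of phi approximates E[phi] under the invariant
# measure, up to O(dt) bias and O(1/sqrt(T)) sampling error.
rng = np.random.default_rng(6)

def time_average(phi, dt, n_steps, burn_in=1000):
    x, total = 0.0, 0.0
    for k in range(n_steps + burn_in):
        x = x - x * dt + np.sqrt(2 * dt) * rng.standard_normal()
        if k >= burn_in:
            total += phi(x)
    return total / n_steps

est = time_average(lambda x: x**2, dt=0.01, n_steps=200_000)
# est should be close to E[X^2] = 1 under the invariant measure
```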

Publication: SIAM Journal on Numerical Analysis Vol.: 48 No.: 2 ISSN: 0036-1429

ID: CaltechAUTHORS:20160804-165401982

]]>

Abstract: The subject of inverse problems in differential equations is of enormous practical importance, and has also generated substantial mathematical and computational innovation. Typically some form of regularization is required to ameliorate ill-posed behaviour. In this article we review the Bayesian approach to regularization, developing a function space viewpoint on the subject. This approach allows for a full characterization of all possible solutions, and their relative probabilities, whilst simultaneously forcing significant modelling issues to be addressed in a clear and precise fashion. Although expensive to implement, this approach is starting to lie within the range of the available computational resources in many application areas. It also allows for the quantification of uncertainty and risk, something which is increasingly demanded by these applications. Furthermore, the approach is conceptually important for the understanding of simpler, computationally expedient approaches to inverse problems. We demonstrate that, when formulated in a Bayesian fashion, a wide range of inverse problems share a common mathematical framework, and we highlight a theory of well-posedness which stems from this. The well-posedness theory provides the basis for a number of stability and approximation results which we describe. We also review a range of algorithmic approaches which are used when adopting the Bayesian approach to inverse problems. These include MCMC methods, filtering and the variational approach.

Publication: Acta Numerica Vol.: 19 ISSN: 0962-4929

ID: CaltechAUTHORS:20161111-112136150

]]>

Abstract: Inverse problems are often ill posed, with solutions that depend sensitively on data. In any numerical approach to the solution of such problems, regularization of some form is needed to counteract the resulting instability. This paper is based on an approach to regularization, employing a Bayesian formulation of the problem, which leads to a notion of well posedness for inverse problems, at the level of probability measures. The stability which results from this well posedness may be used as the basis for quantifying the approximation, in finite dimensional spaces, of inverse problems for functions. This paper contains a theory which utilizes this stability property to estimate the distance between the true and approximate posterior distributions, in the Hellinger metric, in terms of error estimates for approximation of the underlying forward problem. This is potentially useful as it allows for the transfer of estimates from the numerical analysis of forward problems into estimates for the solution of the related inverse problem. It is noteworthy that, when the prior is a Gaussian random field model, controlling differences in the Hellinger metric leads to control on the differences between expected values of polynomially bounded functions and operators, including the mean and covariance operator. The ideas are applied to some non-Gaussian inverse problems where the goal is determination of the initial condition for the Stokes or Navier–Stokes equation from Lagrangian and Eulerian observations, respectively.

Publication: SIAM Journal on Numerical Analysis Vol.: 48 No.: 1 ISSN: 0036-1429

ID: CaltechAUTHORS:20160804-170531840

]]>

Abstract: In this paper we establish a mathematical framework for a range of inverse problems for functions, given a finite set of noisy observations. The problems are hence underdetermined and are often ill-posed. We study these problems from the viewpoint of Bayesian statistics, with the resulting posterior probability measure being defined on a space of functions. We develop an abstract framework for such problems which facilitates application of an infinite-dimensional version of Bayes theorem, leads to a well-posedness result for the posterior measure (continuity in a suitable probability metric with respect to changes in data), and also leads to a theory for the existence of maximum a posteriori (MAP) estimators for such Bayesian inverse problems on function space. A central idea underlying these results is that continuity properties and bounds on the forward model guide the choice of the prior measure for the inverse problem, leading to the desired results on well-posedness and MAP estimators; the PDE analysis and probability theory required are thus clearly delineated, allowing a straightforward derivation of results. We show that the abstract theory applies to some concrete applications of interest by studying problems arising from data assimilation in fluid mechanics. The objective is to make inference about the underlying velocity field, on the basis of either Eulerian or Lagrangian observations. We study problems without model error, in which case the inference is on the initial condition, and problems with model error in which case the inference is on the initial condition and on the driving noise process or, equivalently, on the entire time-dependent velocity field. In order to undertake a relatively uncluttered mathematical analysis we consider the two-dimensional Navier–Stokes equation on a torus. The case of Eulerian observations—direct observations of the velocity field itself—is then a model for weather forecasting.
The case of Lagrangian observations—observations of passive tracers advected by the flow—is then a model for data arising in oceanography. The methodology which we describe herein may be applied to many other inverse problems in which it is of interest to find, given observations, an infinite-dimensional object, such as the initial condition for a PDE. A similar approach might be adopted, for example, to determine an appropriate mathematical setting for the inverse problem of determining an unknown tensor arising in a constitutive law for a PDE, given observations of the solution. The paper is structured so that the abstract theory can be read independently of the particular problems in fluid mechanics which are subsequently studied by application of the theory.

Publication: Inverse Problems Vol.: 25 No.: 11 ISSN: 0266-5611

ID: CaltechAUTHORS:20160805-151904215

]]>

Abstract: In applications such as molecular dynamics it is of interest to fit Smoluchowski and Langevin equations to data. Practitioners often achieve this by a variety of seemingly ad hoc procedures such as fitting to the empirical measure generated by the data and fitting to properties of autocorrelation functions. Statisticians, on the other hand, often use estimation procedures, which fit diffusion processes to data by applying the maximum likelihood principle to the path-space density of the desired model equations, and through knowledge of the properties of quadratic variation. In this paper we show that the procedures used by practitioners and statisticians to fit drift functions are, in fact, closely related and can be thought of as two alternative ways to regularize the (singular) likelihood function for the drift. We also present the results of numerical experiments which probe the relative efficacy of the two approaches to model identification and compare them with other methods such as the minimum distance estimator.
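The likelihood-based drift fitting discussed above has a closed form in the simplest parametric case, which can be sketched as follows; the model dX = -θX dt + dW and all numerical values are illustrative assumptions.

```python
import numpy as np

# Girsanov maximum likelihood for the drift parameter of dX = -theta*X dt + dW
# from a discretely sampled path. Maximizing the log-likelihood
#   l(theta) = -theta * int X dX - (theta^2 / 2) * int X^2 dt
# gives the closed form  theta_hat = -sum(X_k dX_k) / (sum(X_k^2) * dt).
rng = np.random.default_rng(7)

theta_true, dt, n = 2.0, 0.005, 100_000
x = np.empty(n)
x[0] = 0.0
for k in range(n - 1):
    x[k + 1] = (x[k] - theta_true * x[k] * dt
                + np.sqrt(dt) * rng.standard_normal())

dx = np.diff(x)
theta_hat = -np.sum(x[:-1] * dx) / (np.sum(x[:-1] ** 2) * dt)
```

This quadratic-in-drift structure of the log-likelihood is what connects the statisticians' estimators to the practitioners' moment-matching procedures discussed in the abstract.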

Publication: Multiscale Modeling and Simulation Vol.: 8 No.: 1 ISSN: 1540-3459

ID: CaltechAUTHORS:20160805-152349945

]]>

Abstract: We study the problem of parameter estimation using maximum likelihood for fast/slow systems of stochastic differential equations. Our aim is to shed light on the problem of model/data mismatch at small scales. We consider two classes of fast/slow problems for which a closed coarse-grained equation for the slow variables can be rigorously derived, which we refer to as averaging and homogenization problems. We ask whether, given data from the slow variable in the fast/slow system, we can correctly estimate parameters in the drift of the coarse-grained equation for the slow variable, using maximum likelihood. We show that, whereas the maximum likelihood estimator is asymptotically unbiased for the averaging problem, for the homogenization problem maximum likelihood fails unless we subsample the data at an appropriate rate. An explicit formula for the asymptotic error in the log-likelihood function is presented. Our theory is applied to two simple examples from molecular dynamics.

Publication: Stochastic Processes and their Applications Vol.: 119 No.: 10 ISSN: 0304-4149

ID: CaltechAUTHORS:20160805-153633492

]]>

Abstract: We investigate local MCMC algorithms, namely the random-walk Metropolis and the Langevin algorithms, and identify the optimal choice of the local step-size as a function of the dimension n of the state space, asymptotically as n→∞. We consider target distributions defined as a change of measure from a product law. Such structures arise, for instance, in inverse problems or Bayesian contexts when a product prior is combined with the likelihood. We state analytical results on the asymptotic behavior of the algorithms under general conditions on the change of measure. Our theory is motivated by applications on conditioned diffusion processes and inverse problems related to the 2D Navier–Stokes equation.
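The dimension-dependent step-size tuning studied above can be illustrated on the simplest product target N(0, I_n): with proposal scale proportional to n^(-1/2), the random-walk Metropolis acceptance rate settles near the well-known optimum of roughly 0.234. The constant 2.38 and the Gaussian target are the standard textbook choices, used here only for illustration.

```python
import numpy as np

# Random-walk Metropolis on N(0, I_n) with the classical scaling
# sigma = 2.38 / sqrt(n); the empirical acceptance rate should approach
# about 0.234 as n grows.
rng = np.random.default_rng(8)

def rwm_acceptance(n, n_steps=5000):
    x = rng.standard_normal(n)              # start in stationarity
    sigma = 2.38 / np.sqrt(n)
    accepts = 0
    for _ in range(n_steps):
        prop = x + sigma * rng.standard_normal(n)
        # log acceptance ratio for the standard Gaussian target
        if np.log(rng.uniform()) < 0.5 * (x @ x - prop @ prop):
            x, accepts = prop, accepts + 1
    return accepts / n_steps

acc = rwm_acceptance(n=100)
```

The paper's contribution is to establish this kind of scaling theory beyond product targets, for changes of measure from a product law.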

Publication: Annals of Applied Probability Vol.: 19 No.: 3 ISSN: 1050-5164

ID: CaltechAUTHORS:20160805-153017689

]]>

Abstract: In this paper we study the problem of the numerical calculation (by Monte Carlo methods) of the effective diffusivity for a particle moving in a periodic divergence-free velocity field, in the limit of vanishing molecular diffusion. In this limit traditional numerical methods typically fail, since they do not accurately represent the geometry of the underlying deterministic dynamics. We propose a stochastic splitting method that takes into account the volume-preserving property of the equations of motion in the absence of noise, and when inertial effects can be neglected. An extension of the method is then proposed for the cases where the noise has a non-trivial time-correlation structure and when inertial effects cannot be neglected. The method of modified equations is used to explain failings of Euler-based methods. The new stochastic geometric integrators are shown to outperform standard Euler-based integrators. Various asymptotic limits of physical interest are investigated by means of numerical experiments, using the new integrators.

Publication: Journal of Computational Physics Vol.: 228 No.: 4 ISSN: 0021-9991

ID: CaltechAUTHORS:20160805-155749338

]]>

Abstract: Hypoelliptic diffusion processes can be used to model a variety of phenomena in applications ranging from molecular dynamics to audio signal analysis. We study parameter estimation for such processes in situations where we observe some components of the solution at discrete times. Since exact likelihoods for the transition densities are typically not known, approximations are used that are expected to work well in the limit of small intersample times Δt and large total observation times N Δt. Hypoellipticity together with partial observation leads to ill conditioning requiring a judicious combination of approximate likelihoods for the various parameters to be estimated. We combine these in a deterministic scan Gibbs sampler alternating between missing data in the unobserved solution components, and parameters. Numerical experiments illustrate asymptotic consistency of the method when applied to simulated data. The paper concludes with an application of the Gibbs sampler to molecular dynamics data.

Publication: Journal of the Royal Statistical Society: Series B (Statistical Methodology) Vol.: 71 No.: 1 ISSN: 1467-9868

ID: CaltechAUTHORS:20161108-165631300

]]>

Abstract: Hypoelliptic diffusion processes can be used to model a variety of phenomena in applications ranging from molecular dynamics to audio signal analysis. We study parameter estimation for such processes in situations where we observe some components of the solution at discrete times. Since exact likelihoods for the transition densities are typically not known, approximations are used that are expected to work well in the limit of small intersample times Δt and large total observation times N Δt. Hypoellipticity together with partial observation leads to ill conditioning requiring a judicious combination of approximate likelihoods for the various parameters to be estimated. We combine these in a deterministic scan Gibbs sampler alternating between missing data in the unobserved solution components, and parameters. Numerical experiments illustrate asymptotic consistency of the method when applied to simulated data. The paper concludes with an application of the Gibbs sampler to molecular dynamics data.

Publication: Journal of the Royal Statistical Society: Series B (Statistical Methodology) Vol.: 71 No.: 1 ISSN: 1467-9868

ID: CaltechAUTHORS:20160805-155341773

]]>

Abstract: We present and study a Langevin MCMC approach for sampling nonlinear diffusion bridges. The method is based on recent theory concerning stochastic partial differential equations (SPDEs) reversible with respect to the target bridge, derived by applying the Langevin idea on the bridge pathspace. In the process, a Random-Walk Metropolis algorithm and an Independence Sampler are also obtained. The novel algorithmic idea of the paper is that proposed moves for the MCMC algorithm are determined by discretising the SPDEs in the time direction using an implicit scheme, parametrised by θ ∈ [0,1]. We show that the resulting infinite-dimensional MCMC sampler is well-defined only if θ = 1/2, when the MCMC proposals have the correct quadratic variation. Previous Langevin-based MCMC methods used explicit schemes, corresponding to θ = 0. The significance of the choice θ = 1/2 is inherited by the finite-dimensional approximation of the algorithm used in practice. We present numerical results illustrating the phenomenon and the theory that explains it. Diffusion bridges (with additive noise) are representative of the family of laws defined as a change of measure from Gaussian distributions on arbitrary separable Hilbert spaces; the analysis in this paper can be readily extended to target laws from this family and an example from signal processing illustrates this fact.
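
The role of the implicitness parameter θ can already be seen in one dimension. For the scalar Langevin dynamics dx = -x dt + √2 dW, whose invariant law is N(0,1), the θ-implicit step x'(1 + θδ) = x(1 - (1-θ)δ) + √(2δ) ξ has an exactly computable invariant variance. The sketch below (our finite-dimensional construction, not the paper's pathspace setting) shows that θ = 1/2 reproduces the target variance for every step size δ, while the explicit choice θ = 0 inflates it.

```python
def theta_step_coeffs(theta, delta):
    """One step of the theta-implicit scheme for dx = -x dt + sqrt(2) dW:
       x' (1 + theta*delta) = x (1 - (1 - theta)*delta) + sqrt(2*delta) * xi,
       with xi standard normal.  Returns (a, b) such that x' = a*x + b*xi."""
    a = (1.0 - (1.0 - theta) * delta) / (1.0 + theta * delta)
    b = (2.0 * delta) ** 0.5 / (1.0 + theta * delta)
    return a, b

def invariant_variance(theta, delta):
    """Fixed point of the variance map v -> a^2 v + b^2, i.e. the variance
       actually preserved by the discrete scheme (target value is 1)."""
    a, b = theta_step_coeffs(theta, delta)
    return b * b / (1.0 - a * a)
```

A short calculation confirms what the code checks numerically: for θ = 1/2 the numerator 2δ/(1+δ/2)² and the denominator 1-a² coincide exactly, so the unit variance (and hence the quadratic variation structure) survives discretisation; for θ = 0 the invariant variance is 1/(1-δ/2) > 1.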

Publication: Stochastics and Dynamics Vol.: 8 No.: 3 ISSN: 1793-6799

ID: CaltechAUTHORS:20160805-165106874

]]>

Abstract: The bulk of this paper contains a concise mathematical overview of the subject of data assimilation, highlighting three primary ideas: (i) the standard optimization approaches of 3DVAR, 4DVAR and weak constraint 4DVAR are described and their interrelations explained; (ii) statistical analogues of these approaches are then introduced, leading to filtering (generalizing 3DVAR) and a form of smoothing (generalizing 4DVAR and weak constraint 4DVAR) and the optimization methods are shown to be maximum a posteriori estimators for the probability distributions implied by these statistical approaches; and (iii) by taking a general dynamical systems perspective on the subject it is shown that the incorporation of Lagrangian data can be handled by a straightforward extension of the preceding concepts. We argue that the smoothing approach to data assimilation, based on statistical analogues of 4DVAR and weak constraint 4DVAR, provides the optimal solution to the assimilation of space–time distributed data into a model. The optimal solution obtained is a probability distribution on the relevant class of functions (initial conditions or time-dependent solutions). The approach is a useful one in the first instance because it clarifies the notion of what is the optimal solution, thereby providing a benchmark against which existing approaches can be evaluated. In the longer term it also provides the potential for new methods to create ensembles of solutions to the model, incorporating the available data in an optimal fashion. Two examples are given illustrating this approach to data assimilation, both in the context of Lagrangian data, one based on statistical 4DVAR and the other on weak constraint statistical 4DVAR. The former is compared with the ensemble Kalman filter, which is thereby shown to be inaccurate in a variety of scenarios.

Publication: International Journal for Numerical Methods in Fluids Vol.: 56 No.: 8 ISSN: 0271-2091

ID: CaltechAUTHORS:20160805-165730529

]]>

Abstract: Lagrangian data arise from instruments that are carried by the flow in a fluid field. Assimilation of such data into ocean models presents a challenge due to the potential complexity of Lagrangian trajectories in relatively simple flow fields. We adopt a Bayesian perspective on this problem and thereby take account of the fully non-linear features of the underlying model. In the perfect model scenario, the posterior distribution for the initial state of the system contains all the information that can be extracted from a given realization of observations and the model dynamics. We work in the smoothing context in which the posterior on the initial conditions is determined by future observations. This posterior distribution gives the optimal ensemble to be used in data assimilation. The issue then is sampling this distribution. We develop, implement, and test sampling methods, based on Markov-chain Monte Carlo (MCMC), which are particularly well suited to the low-dimensional, but highly non-linear, nature of Lagrangian data. We compare these methods to the well-established ensemble Kalman filter (EnKF) approach. It is seen that the MCMC based methods correctly sample the desired posterior distribution whereas the EnKF may fail due to infrequent observations or non-linear structures in the underlying flow.

Publication: Tellus A Vol.: 60 No.: 2 ISSN: 0280-6495

ID: CaltechAUTHORS:20161108-173409695

]]>

Abstract: The understanding of adaptive algorithms for stochastic differential equations (SDEs) is an open area, where many issues related to both convergence and stability (long-time behaviour) of algorithms are unresolved. This paper considers a very simple adaptive algorithm, based on controlling only the drift component of a time step. Both convergence and stability are studied. The primary issue in the convergence analysis is that the adaptive method does not necessarily drive the time steps to zero with the user-input tolerance. This possibility must be quantified and shown to have low probability. The primary issue in the stability analysis is ergodicity. It is assumed that the noise is nondegenerate, so that the diffusion process is elliptic, and the drift is assumed to satisfy a coercivity condition. The SDE is then geometrically ergodic (averages converge to statistical equilibrium exponentially quickly). If the drift is not linearly bounded, then explicit fixed time step approximations, such as the Euler–Maruyama scheme, may fail to be ergodic. In this work, it is shown that the simple adaptive time-stepping strategy cures this problem. In addition to proving ergodicity, an exponential moment bound is also proved, generalizing a result known to hold for the SDE itself.
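
A toy version of drift-controlled adaptivity (our simplification, not the paper's precise algorithm) can be sketched for dX = -X³ dt + dW, a drift that is not linearly bounded and for which fixed-step Euler-Maruyama can fail to be ergodic: the step is shrunk so that the drift increment per step never exceeds a tolerance.

```python
import math
import random

def adaptive_em(f, x0, h_max, tol, n_steps, seed=2):
    """Explicit Euler-Maruyama with a drift-controlled step:
       h = min(h_max, tol / (1 + |f(x)|)), so |h * f(x)| <= tol always.
       (A toy drift-control rule, illustrating the idea in the abstract.)"""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        h = min(h_max, tol / (1.0 + abs(f(x))))
        x = x + h * f(x) + math.sqrt(h) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

# dX = -X^3 dt + dW from a large initial condition: the adaptive trajectory
# relaxes toward the origin instead of blowing up.
path = adaptive_em(lambda x: -x ** 3, x0=10.0, h_max=0.1, tol=0.1,
                   n_steps=5000)
```

Because each drift move is capped at `tol`, the scheme cannot overshoot the origin from a large state, which is the mechanism by which adaptivity restores stability here.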

Publication: IMA Journal of Numerical Analysis Vol.: 27 No.: 3 ISSN: 0272-4979

ID: CaltechAUTHORS:20170612-132345468

]]>

Abstract: The viewpoint taken in this paper is that data assimilation is fundamentally a statistical problem and that this problem should be cast in a Bayesian framework. In the absence of model error, the correct solution to the data assimilation problem is to find the posterior distribution implied by this Bayesian setting. Methods for dealing with data assimilation should then be judged by their ability to probe this distribution. In this paper we propose a range of techniques for probing the posterior distribution, based around the Langevin equation; and we compare these new techniques with existing methods. When the underlying dynamics is deterministic, the posterior distribution is on the space of initial conditions leading to a sampling problem over this space. When the underlying dynamics is stochastic the posterior distribution is on the space of continuous time paths. By writing down a density, and conditioning on observations, it is possible to define a range of Markov Chain Monte Carlo (MCMC) methods which sample from the desired posterior distribution, and thereby solve the data assimilation problem. The basic building-blocks for the MCMC methods that we concentrate on in this paper are Langevin equations which are ergodic and whose invariant measures give the desired distribution; in the case of path space sampling these are stochastic partial differential equations (SPDEs). Two examples are given to show how data assimilation can be formulated in a Bayesian fashion. The first is weather prediction, and the second is Lagrangian data assimilation for oceanic velocity fields. Furthermore the relationship between the Bayesian approach outlined here and the commonly used Kalman filter based techniques, prevalent in practice, is discussed. Two simple pedagogical examples are studied to illustrate the application of Bayesian sampling to data assimilation concretely. Finally, a range of open mathematical and computational issues, arising from the Bayesian approach, are outlined.

Publication: Physica D Vol.: 230 No.: 1-2 ISSN: 0167-2789

ID: CaltechAUTHORS:20170609-130839195

]]>

Abstract: We study the problem of parameter estimation for time-series possessing two, widely separated, characteristic time scales. The aim is to understand situations where it is desirable to fit a homogenized single-scale model to such multiscale data. We demonstrate, numerically and analytically, that if the data is sampled too finely then the parameter fit will fail, in that the correct parameters in the homogenized model are not identified. We also show, numerically and analytically, that if the data is subsampled at an appropriate rate then it is possible to estimate the coefficients of the homogenized model correctly. The ideas are studied in the context of thermally activated motion in a two-scale potential. However the ideas may be expected to transfer to other situations where it is desirable to fit an averaged or homogenized equation to multiscale data.

Publication: Journal of Statistical Physics Vol.: 127 No.: 4 ISSN: 0022-4715

ID: CaltechAUTHORS:20170613-124705885

]]>

Abstract: In many applications, it is important to be able to sample paths of SDEs conditional on observations of various kinds. This paper studies SPDEs which solve such sampling problems. The SPDE may be viewed as an infinite-dimensional analogue of the Langevin equation used in finite-dimensional sampling. In this paper, conditioned nonlinear SDEs, leading to nonlinear SPDEs for the sampling, are studied. In addition, a class of preconditioned SPDEs is studied, found by applying a Green’s operator to the SPDE in such a way that the invariant measure remains unchanged; such infinite dimensional evolution equations are important for the development of practical algorithms for sampling infinite dimensional problems. The resulting SPDEs provide several significant challenges in the theory of SPDEs. The two primary ones are the presence of nonlinear boundary conditions, involving first order derivatives, and a loss of the smoothing property in the case of the preconditioned SPDEs. These challenges are overcome and a theory of existence, uniqueness and ergodicity is developed in sufficient generality to subsume the sampling problems of interest to us. The Gaussian theory developed in Part I of this paper considers Gaussian SDEs, leading to linear Gaussian SPDEs for sampling. This Gaussian theory is used as the basis for deriving nonlinear SPDEs which effect the desired sampling in the nonlinear case, via a change of measure.

Publication: Annals of Applied Probability Vol.: 17 No.: 5/6 ISSN: 1050-5164

ID: CaltechAUTHORS:20170613-080132206

]]>

Abstract: We study the problem of homogenization for inertial particles moving in a time-dependent random velocity field and subject to molecular diffusion. We show that, under appropriate assumptions on the velocity field, the large-scale, long-time behavior of the inertial particles is governed by an effective diffusion equation for the position variable alone. This is achieved by the use of a formal multiple scales expansion in the scale parameter. The expansion relies on the hypoellipticity of the underlying diffusion. An expression for the diffusivity tensor is found and various of its properties are studied. The results of the formal multiscale analysis are justified rigorously by the use of the martingale central limit theorem. Our theoretical findings are supported by numerical investigations where we study the parametric dependence of the effective diffusivity on the various non-dimensional parameters of the problem.

Publication: Communications in Mathematical Sciences Vol.: 5 No.: 3 ISSN: 1539-6746

ID: CaltechAUTHORS:20161108-174342361

]]>

Abstract: We explore situations in which certain stochastic and high-dimensional deterministic systems behave effectively as low-dimensional dynamical systems. We define and study moment maps, maps on spaces of low-order moments of evolving distributions, as a means of understanding equation-free multiscale algorithms for these systems. The moment map itself is deterministic and attempts to capture the implied probability distribution of the dynamics. By choosing situations where the low-dimensional dynamics can be understood a priori, we evaluate the moment map. Despite requiring the evolution of an ensemble to define the map, this can be an efficient numerical tool, as the map opens up the possibility of bifurcation analyses and other high level tasks being performed on the system. We demonstrate how nonlinearity arises in these maps and how this results in the stabilization of metastable states. Examples are shown for a hierarchy of models, ranging from simple stochastic differential equations to molecular dynamics simulations of a particle in contact with a heat bath.
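
A moment map of the kind described in this abstract can be sketched in one dimension (our toy construction; the paper treats far richer systems): draw an ensemble consistent with given low-order moments, evolve each member of the ensemble for a short time, and read the moments back off. For dX = -X dt + dW the exact map is mean ↦ mean·e^(-τ) and var ↦ var·e^(-2τ) + (1-e^(-2τ))/2, which the ensemble version should reproduce up to sampling and discretisation error.

```python
import math
import random

def moment_map(mean, var, tau=0.5, h=0.01, n_ens=20_000, seed=5):
    """Toy moment map for dX = -X dt + dW: lift (mean, var) to an ensemble,
       evolve each member for time tau with Euler-Maruyama, and restrict
       back to the first two moments.  The map itself is deterministic up
       to ensemble sampling error."""
    rng = random.Random(seed)
    xs = [mean + math.sqrt(var) * rng.gauss(0.0, 1.0) for _ in range(n_ens)]
    for _ in range(int(round(tau / h))):
        xs = [x - x * h + math.sqrt(h) * rng.gauss(0.0, 1.0) for x in xs]
    m = sum(xs) / n_ens
    v = sum((x - m) ** 2 for x in xs) / (n_ens - 1)
    return m, v

m1, v1 = moment_map(2.0, 1.0)
```

Once wrapped as a deterministic map on (mean, var), standard tools such as fixed-point or bifurcation analysis can be applied to it, which is the point of the equation-free viewpoint.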

Publication: SIAM Journal on Applied Dynamical Systems Vol.: 5 No.: 3 ISSN: 1536-0040

ID: CaltechAUTHORS:20170612-131025773

]]>

Abstract: In this paper we present a rigorous asymptotic analysis for stochastic systems with two fast relaxation times. The mathematical model analyzed in this paper consists of a Langevin equation for the particle motion with time-dependent force constructed through an infinite dimensional Gaussian noise process. We study the limit as the particle relaxation time as well as the correlation time of the noise tend to zero, and we obtain the limiting equations under appropriate assumptions on the Gaussian noise. We show that the limiting equation depends on the relative magnitude of the two fast time scales of the system. In particular, we prove that in the case where the two relaxation times converge to zero at the same rate there is a drift correction, in addition to the limiting Itô integral, which is not of Stratonovich type. If, on the other hand, the colored noise is smooth on the scale of particle relaxation, then the drift correction is the standard Stratonovich correction. If the noise is rough on this scale, then there is no drift correction. Strong (i.e., pathwise) techniques are used for the proof of the convergence theorems.

Publication: Multiscale Modeling and Simulation Vol.: 4 No.: 1 ISSN: 1540-3459

ID: CaltechAUTHORS:20170612-130002475

]]>

Abstract: We study the problem of homogenization for inertial particles moving in a periodic velocity field, and subject to molecular diffusion. We show that, under appropriate assumptions on the velocity field, the large-scale, long-time behavior of the inertial particles is governed by an effective diffusion equation for the position variable alone. To achieve this we use a formal multiple scale expansion in the scale parameter. This expansion relies on the hypoellipticity of the underlying diffusion. An expression for the diffusivity tensor is found and various of its properties studied. In particular, an expansion in terms of the non-dimensional particle relaxation time τ (the Stokes number) is shown to coincide with the known result for passive (non-inertial) tracers in the singular limit τ→0. This requires the solution of a singular perturbation problem, achieved by means of a formal multiple scales expansion in τ. Incompressible and potential fields are studied, as well as fields which are neither, and theoretical findings are supported by numerical simulations.

Publication: Physica D Vol.: 204 No.: 3-4 ISSN: 0167-2789

ID: CaltechAUTHORS:20170609-143518545

]]>

Abstract: In many applications it is important to be able to sample paths of SDEs conditional on observations of various kinds. This paper studies SPDEs which solve such sampling problems. The SPDE may be viewed as an infinite dimensional analogue of the Langevin SDE used in finite dimensional sampling. Here the theory is developed for conditioned Gaussian processes for which the resulting SPDE is linear. Applications include the Kalman-Bucy filter/smoother. A companion paper studies the nonlinear case, building on the linear analysis provided here.

Publication: Communications in Mathematical Sciences Vol.: 3 No.: 4 ISSN: 1539-6746

ID: CaltechAUTHORS:20170612-141658808

]]>

Abstract: We study a class of “particle in a heat bath” models, which are a generalization of the well-known Kac–Zwanzig class of models, but where the coupling between the distinguished particle and the n heat bath particles is through nonlinear springs. The heat bath particles have random initial data drawn from an equilibrium Gibbs density. The primary objective is to approximate the forces exerted by the heat bath—which we do not want to resolve—by a stochastic process. By means of the central limit theorem for Gaussian processes, and heuristics based on linear response theory, we demonstrate conditions under which it is natural to expect that the trajectories of the distinguished particle can be weakly approximated, as n→∞, by the solution of a Markovian SDE. The quality of this approximation is verified by numerical calculations with parameters chosen according to the linear response theory. Alternatively, the parameters of the effective equation can be chosen using time series analysis. This is done and agreement with linear response theory is shown to be good.

Publication: Physica D Vol.: 199 No.: 3-4 ISSN: 0167-2789

ID: CaltechAUTHORS:20170609-143006730

]]>

Abstract: We introduce a stochastic PDE based approach to sampling paths of SDEs, conditional on observations. The SPDEs are derived by generalising the Langevin MCMC method to infinite dimensions. Various applications are described, including sampling paths subject to two end-point conditions (bridges) and nonlinear filter/smoothers.

Publication: Communications in Mathematical Sciences Vol.: 2 No.: 4 ISSN: 1539-6746

ID: CaltechAUTHORS:20170612-144819276

]]>

Abstract: In many applications, the primary objective of numerical simulation of time-evolving systems is the prediction of coarse-grained, or macroscopic, quantities. The purpose of this review is twofold: first, to describe a number of simple model systems where the coarse-grained or macroscopic behaviour of a system can be explicitly determined from the full, or microscopic, description; and second, to overview some of the emerging algorithmic approaches that have been introduced to extract effective, lower-dimensional, macroscopic dynamics. The model problems we describe may be either stochastic or deterministic in both their microscopic and macroscopic behaviour, leading to four possibilities in the transition from microscopic to macroscopic descriptions. Model problems are given which illustrate all four situations, and mathematical tools for their study are introduced. These model problems are useful in the evaluation of algorithms. We use specific instances of the model problems to illustrate these algorithms. As the subject of algorithm development and analysis is, in many cases, in its infancy, the primary purpose here is to attempt to unify some of the emerging ideas so that individuals new to the field have a structured access to the literature. Furthermore, by discussing the algorithms in the context of the model problems, a platform for understanding existing algorithms and developing new ones is built.

Publication: Nonlinearity Vol.: 17 No.: 6 ISSN: 0951-7715

ID: CaltechAUTHORS:20170609-153344149

]]>

Abstract: We consider the dynamics of systems in the presence of inertia and colored multiplicative noise. We study the limit where the particle relaxation time and the correlation time of the noise both tend to zero. We show that the limiting equation for the particle position depends on the magnitude of the particle relaxation time relative to the noise correlation time. In particular, the limiting equation should be interpreted either in the Itô or Stratonovich sense, with a crossover occurring when the two fast-time scales are of comparable magnitude. At the crossover the limiting stochastic differential equation is neither of Itô nor of Stratonovich type. This means that, after adiabatic elimination, the governing equations have different drift fields, leading to different physical behavior depending on the relative magnitude of the two fast-time scales. Our findings are supported by numerical simulations.

Publication: Physical Review E Vol.: 70 No.: 3 ISSN: 1539-3755

ID: CaltechAUTHORS:20170612-132002885

]]>

Abstract: In this paper we present a rigorous analysis of a scaling limit related to the motion of an inertial particle in a Gaussian random field. The mathematical model comprises Stokes's law for the particle motion and an infinite dimensional Ornstein-Uhlenbeck process for the fluid velocity field. The scaling limit studied leads to a white noise limit for the fluid velocity, which balances particle inertia and the friction term. Strong convergence methods are used to justify the limiting equations. The rigorously derived limiting equations are of physical interest for the concrete problem under investigation and facilitate the study of two-point motions in the white noise limit. Furthermore, the methodology developed may also prove useful in the study of various other asymptotic problems for stochastic differential equations in infinite dimensions.

Publication: Multiscale Modeling and Simulation Vol.: 1 No.: 4 ISSN: 1540-3459

ID: CaltechAUTHORS:20170609-165633126

]]>

Abstract: The purpose of this work is to shed light on an algorithm designed to extract effective macroscopic models from detailed microscopic simulations. The particular algorithm we study is a recently developed transfer operator approach due to Schütte et al. [20]. The investigations involve the formulation, and subsequent numerical study, of a class of model problems. The model problems are ordinary differential equations constructed to have the property that, when projected onto a low-dimensional subspace, the dynamics is approximately that of a stochastic differential equation exhibiting a finite-state-space Markov chain structure. The numerical studies show that the transfer operator approach can accurately extract finite-state Markov chain behavior embedded within high-dimensional ordinary differential equations. In so doing the studies lend considerable weight to existing applications of the algorithm to the complex systems arising in applications such as molecular dynamics. The algorithm is predicated on the assumption of Markovian input data; further numerical studies probe the role of memory effects. Although preliminary, these studies of memory indicate interesting avenues for further development of the transfer operator methodology.

Publication: Communications on Pure and Applied Mathematics Vol.: 56 No.: 2 ISSN: 0010-3640

ID: CaltechAUTHORS:20170609-144239769

]]>

Abstract: Positive results are proved here about the ability of numerical simulations to reproduce the exponential mean-square stability of stochastic differential equations (SDEs). The first set of results applies under finite-time convergence conditions on the numerical method. Under these conditions, the exponential mean-square stability of the SDE and that of the method (for sufficiently small step sizes) are shown to be equivalent, and the corresponding second-moment Lyapunov exponent bounds can be taken to be arbitrarily close. The required finite-time convergence conditions hold for the class of stochastic theta methods on globally Lipschitz problems. It is then shown that exponential mean-square stability for non-globally Lipschitz SDEs is not inherited, in general, by numerical methods. However, for a class of SDEs that satisfy a one-sided Lipschitz condition, positive results are obtained for two implicit methods. These results highlight the fact that for long-time simulation on nonlinear SDEs, the choice of numerical method can be crucial.
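
For the stochastic theta methods on the linear test equation dX = λX dt + μX dW, the one-step second-moment amplification factor is exactly computable, which makes the stability statement in this abstract easy to check numerically. The sketch below (our worked example; parameter values are illustrative) exhibits a step size at which the SDE is mean-square stable (2λ + μ² < 0), the drift-implicit method (θ = 1) inherits that stability, and explicit Euler-Maruyama (θ = 0) does not.

```python
def ms_amplification(theta, h, lam, mu):
    """Second-moment amplification factor R of the stochastic theta method
       X_{n+1} = X_n + h*((1-theta)*lam*X_n + theta*lam*X_{n+1}) + mu*X_n*dW
       applied to dX = lam*X dt + mu*X dW, so E[X_{n+1}^2] = R * E[X_n^2].
       The method is mean-square stable iff R < 1."""
    a = (1.0 + (1.0 - theta) * h * lam) / (1.0 - theta * h * lam)
    b2 = (mu * mu * h) / (1.0 - theta * h * lam) ** 2
    return a * a + b2

lam, mu, h = -5.0, 1.0, 0.5          # SDE mean-square stable: 2*lam + mu^2 < 0
R_explicit = ms_amplification(0.0, h, lam, mu)   # Euler-Maruyama: R = 2.75
R_implicit = ms_amplification(1.0, h, lam, mu)   # implicit drift: R = 1.5/12.25
```

This is the sense in which, for long-time simulation, the choice of method is crucial: at this step size the explicit scheme amplifies the second moment at every step while the implicit scheme contracts it.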

Publication: LMS Journal of Computation and Mathematics Vol.: 6 ISSN: 1461-1570

ID: CaltechAUTHORS:20170612-133950435

]]>

Abstract: We study the long-time behaviour of large systems of ordinary differential equations with random data. Our main focus is a Hamiltonian system which describes a distinguished particle attached to a large collection of heat bath particles by springs. In the limit where the size of the heat bath tends to infinity, the trajectory of the distinguished particle can be weakly approximated, on finite time intervals, by a Langevin stochastic differential equation. We examine the long-term behaviour of these trajectories, both analytically and numerically. We find ergodic behaviour manifest in both the long-time empirical measures and in the resulting auto-correlation functions.

Publication: Stochastics and Dynamics Vol.: 02 No.: 04 ISSN: 1793-6799

ID: CaltechAUTHORS:20170612-072429575

]]>

Abstract: The preferential concentration of inertial particles in a turbulent velocity field occurs when the particle and fluid time constants are commensurate. We propose a straightforward mathematical model for this phenomenon and use the model to study various scaling limits of interest and to study numerically the effect of interparticle collisions. The model comprises Stokes’ law for the particle motions, and a Gaussian random field for the velocity. The primary advantages of the model are its amenability to mathematical analysis in various interesting scaling limits and the speed at which numerical simulations can be performed. The scaling limits corroborate experimental evidence about the lack of preferential concentration for large and small Stokes numbers, make new predictions about the possibility of preferential concentration at large times, and lead to stochastic differential equations governing this phenomenon. The effect of collisions is found to be negligible for the most part, although in some cases they have an interesting antidiffusive effect.

Publication: Physics of Fluids Vol.: 14 No.: 12 ISSN: 1070-6631

ID: CaltechAUTHORS:20170609-152746245

]]>

Abstract: The ergodic properties of SDEs, and various time discretizations for SDEs, are studied. The ergodicity of SDEs is established by using techniques from the theory of Markov chains on general state spaces, such as that expounded by Meyn–Tweedie. Application of these Markov chain results leads to straightforward proofs of geometric ergodicity for a variety of SDEs, including problems with degenerate noise and for problems with locally Lipschitz vector fields. Applications where this theory can be usefully applied include damped-driven Hamiltonian problems (the Langevin equation), the Lorenz equation with degenerate noise and gradient systems. The same Markov chain theory is then used to study time-discrete approximations of these SDEs. The two primary ingredients for ergodicity are a minorization condition and a Lyapunov condition. It is shown that the minorization condition is robust under approximation. For globally Lipschitz vector fields this is also true of the Lyapunov condition. However in the locally Lipschitz case the Lyapunov condition fails for explicit methods such as Euler–Maruyama; for pathwise approximations it is, in general, only inherited by specially constructed implicit discretizations. Examples of such discretization based on backward Euler methods are given, and approximation of the Langevin equation studied in some detail.
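
The failure of explicit methods for locally Lipschitz drifts, and its cure by implicit discretisation, can be seen on the scalar example dX = -X³ dt + dW (a minimal sketch in the spirit of this abstract; parameters are ours). From a moderately large initial condition, explicit Euler-Maruyama leaves every bounded set within a few steps, while a backward-Euler-type drift-implicit step, which requires solving a scalar cubic, remains bounded along the same noise realisation.

```python
import math
import random

def backward_euler_step(x, h, dw):
    """Drift-implicit Euler step for dX = -X^3 dt + dW: solve
       y + h*y**3 = x + dw for y by Newton's method.  The cubic is strictly
       increasing in y, so it has a unique real root and the denominator
       below never vanishes."""
    c = x + dw
    y = c
    for _ in range(100):
        y -= (y + h * y ** 3 - c) / (1.0 + 3.0 * h * y * y)
    return y

rng = random.Random(3)
h, x_exp, x_imp = 0.1, 10.0, 10.0
blew_up = False
for _ in range(1000):
    dw = math.sqrt(h) * rng.gauss(0.0, 1.0)
    if not blew_up:
        x_exp = x_exp - h * x_exp ** 3 + dw      # explicit Euler-Maruyama
        blew_up = abs(x_exp) > 1e6               # stop once it has escaped
    x_imp = backward_euler_step(x_imp, h, dw)    # implicit variant
```

The explicit iterate roughly cubes in magnitude at each step once it is large, while the implicit solve always returns a root of the monotone cubic, which is what restores the Lyapunov condition.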

Publication: Stochastic Processes and their Applications Vol.: 101 No.: 2 ISSN: 0304-4149

ID: CaltechAUTHORS:20170609-125050787

]]>

Abstract: Traditional finite-time convergence theory for numerical methods applied to stochastic differential equations (SDEs) requires a global Lipschitz assumption on the drift and diffusion coefficients. In practice, many important SDE models satisfy only a local Lipschitz property and, since Brownian paths can make arbitrarily large excursions, the global Lipschitz-based theory is not directly relevant. In this work we prove strong convergence results under less restrictive conditions. First, we give a convergence result for Euler-Maruyama requiring only that the SDE is locally Lipschitz and that the pth moments of the exact and numerical solution are bounded for some p > 2. As an application of this general theory we show that an implicit variant of Euler-Maruyama converges if the diffusion coefficient is globally Lipschitz, but the drift coefficient satisfies only a one-sided Lipschitz condition; this is achieved by showing that the implicit method has bounded moments and may be viewed as an Euler-Maruyama approximation to a perturbed SDE of the same form. Second, we show that the optimal rate of convergence can be recovered if the drift coefficient is also assumed to behave like a polynomial.
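
A strong-convergence experiment of the classical (globally Lipschitz) kind can be sketched on geometric Brownian motion, where Euler-Maruyama can be coupled to the exact solution through the same Brownian increments (our illustrative setup; parameter values are not from the paper). Halving the step size should visibly reduce the mean endpoint error.

```python
import math
import random

def em_strong_error(n_paths=200, n_fine=128, seed=4):
    """Mean strong error at T = 1 of Euler-Maruyama for geometric Brownian
       motion dX = mu*X dt + sigma*X dW, X(0) = 1, whose exact solution is
       X_T = exp((mu - sigma^2/2)*T + sigma*W_T).  The coarse grid uses
       every pair of fine increments summed, so all three solutions share
       one Brownian path.  Returns (error at h = T/64, error at h = T/128)."""
    mu, sigma, T = 0.05, 0.2, 1.0
    rng = random.Random(seed)
    h = T / n_fine
    err_coarse = err_fine = 0.0
    for _ in range(n_paths):
        dws = [math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n_fine)]
        exact = math.exp((mu - 0.5 * sigma ** 2) * T + sigma * sum(dws))
        xf = 1.0
        for dw in dws:                       # fine-grid Euler-Maruyama
            xf += mu * xf * h + sigma * xf * dw
        xc = 1.0
        for k in range(0, n_fine, 2):        # coarse grid: doubled step
            xc += mu * xc * (2 * h) + sigma * xc * (dws[k] + dws[k + 1])
        err_fine += abs(xf - exact)
        err_coarse += abs(xc - exact)
    return err_coarse / n_paths, err_fine / n_paths

err64, err128 = em_strong_error()
```

With multiplicative noise the expected strong order is 1/2, so the two errors should differ by a factor of roughly sqrt(2); the point of the abstract is that conclusions of this type can be extended beyond the globally Lipschitz setting.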

Publication: SIAM Journal on Numerical Analysis Vol.: 40 No.: 3 ISSN: 0036-1429

ID: CaltechAUTHORS:20170613-133526351

]]>

Abstract: We study the dynamical behavior of the discontinuous Galerkin finite element method for initial value problems in ordinary differential equations. We make two different assumptions which guarantee that the continuous problem defines a dissipative dynamical system. We show that, under certain conditions, the discontinuous Galerkin approximation also defines a dissipative dynamical system and we study the approximation properties of the associated discrete dynamical system. We also study the behavior of difference schemes obtained by applying a quadrature formula to the integrals defining the discontinuous Galerkin approximation and construct two kinds of discrete finite element approximations that share the dissipativity properties of the original method.

Publication: Mathematics of Computation Vol.: 71 No.: 239 ISSN: 0025-5718

ID: CaltechAUTHORS:20170609-164822143

]]>

Abstract: The motion of an inertial particle in a Gaussian random field is studied. This is a model for the phenomenon of preferential concentration, whereby inertial particles in a turbulent flow can correlate significantly. Mathematically the motion is described by Newton's second law for a particle on a 2-D torus, with force proportional to the difference between a background fluid velocity and the particle velocity itself. The fluid velocity is defined through a linear stochastic PDE of Ornstein–Uhlenbeck type. The properties of the model are studied in terms of the covariance of the noise which drives the stochastic PDE. Sufficient conditions are found for almost sure existence and uniqueness of particle paths, and for a random dynamical system with a global random attractor. The random attractor is illustrated by means of a numerical experiment, and the relevance of the random attractor for the understanding of particle distributions is highlighted.

Publication: Stochastics and Dynamics Vol.: 02 No.: 02 ISSN: 1793-6799

ID: CaltechAUTHORS:20170612-073927763

]]>

Abstract: Two degenerate SDEs arising in statistical physics are studied. The first is a Langevin equation with state-dependent noise and damping. The second is the equation of motion for a particle obeying Stokes' law in a Gaussian random field; this field is chosen to mimic certain features of turbulence. Both equations are hypo-elliptic and smoothness of probability densities may be established. By developing appropriate Lyapunov functions and by studying the necessary control problems, geometric ergodicity is proved.

Publication: Markov Processes And Related Fields Vol.: 8 No.: 2 ISSN: 1024-2953

ID: CaltechAUTHORS:20170613-125012320

]]>

Abstract: We develop an efficient algorithm for detecting collisions among a large number of particles moving in a velocity field, when the field itself is possibly coupled to the particle motions. We build on ideas from molecular dynamics simulations and, as a byproduct, give a literature survey of methods for hard sphere molecular dynamics. We analyze the complexity of the algorithm in detail and present several experimental results on performance which corroborate the analysis. An optimal algorithm for collision detection has cost scaling at least like the total number of collisions detected. We argue, both theoretically and experimentally, that with the appropriate parameter choice and when the number of collisions grows with the number of particles at least as fast as for billiards, the algorithm we recommend is optimal.

Publication: Journal of Computational Physics Vol.: 172 No.: 2 ISSN: 0021-9991

ID: CaltechAUTHORS:20170612-063817274

]]>

Abstract: Some recent numerical and theoretical studies indicate that it is possible to accurately simulate the macroscopic motion of a particle in a heat bath, comprising coupled oscillators, without accurately resolving the fast frequencies in the heat bath itself. Here we study this issue further by performing numerical experiments on a wide variety of mechanical heat bath models, all generalizations of the Ford–Kac oscillator model. The results indicate that the nature of the particle-bath damping in the macroscopic limit crucially affects the ability of underresolved simulations to correctly predict macroscopic behaviour. In particular, problems for which the damping is local in time pose more severe problems for approximation. The root cause is that local damping typically arises from the degeneration of a memory kernel to a delta singularity in the macroscopic limit. The approximation of such singularities is a more delicate issue than the approximation of smoother memory kernels.

Publication: Journal of Computational Physics Vol.: 169 No.: 1 ISSN: 0021-9991

ID: CaltechAUTHORS:20170609-132504945

]]>

Abstract: Two model problems for stiff oscillatory systems are introduced. Both comprise a linear superposition of N ≫ 1 harmonic oscillators used as a forcing term for a scalar ODE. In the first case the initial conditions are chosen so that the forcing term approximates a delta function as N → ∞ and in the second case so that it approximates white noise. In both cases the fastest natural frequency of the oscillators is

Publication: Foundations of Computational Mathematics Vol.: 1 No.: 1 ISSN: 1615-3375

ID: CaltechAUTHORS:20170613-130016747

]]>

Abstract: Perturbations to Markov chains and Markov processes are considered. The unperturbed problem is assumed to be geometrically ergodic in the sense usually established through the use of Foster–Lyapunov drift conditions. The perturbations are assumed to be uniform, in a weak sense, on bounded time intervals. The long-time behavior of the perturbed chain is studied. Applications are given to numerical approximations of a randomly impulsed ODE, an Itô stochastic differential equation (SDE), and a parabolic stochastic partial differential equation (SPDE) subject to space-time Brownian noise. Existing perturbation theories for geometrically ergodic Markov chains are not readily applicable to these situations since they require very stringent hypotheses on the perturbations.

Publication: SIAM Journal on Numerical Analysis Vol.: 37 No.: 4 ISSN: 0036-1429

ID: CaltechAUTHORS:20170613-080747440

]]>

Abstract: A question of some interest in computational statistical mechanics is whether macroscopic quantities can be accurately computed without detailed resolution of the fastest scales in the problem. To address this question a simple model (due to Ford and Kac) for a distinguished particle immersed in a heat bath is studied. The model yields a Hamiltonian system of dimension 2N+2 for the distinguished particle and the degrees of freedom describing the bath. It is proven that, in the limit of an infinite number of particles in the heat bath (N→∞), the motion of the distinguished particle is governed by a stochastic differential equation (SDE) of dimension 2. Numerical experiments are then conducted on the Hamiltonian system of dimension 2N+2 (N≫1) to investigate whether the motion of the distinguished particle is accurately computed (i.e., whether it is close to the solution of the SDE) when the time step is small relative to the natural time scale of the distinguished particle, but the product of the fastest frequency in the heat bath and the time step is not small—the underresolved regime in which many computations are performed. It is shown that certain methods accurately compute the limiting behavior of the distinguished particle, while others do not. Those that do not are shown to compute a different, incorrect, macroscopic limit.

Publication: Journal of Statistical Physics Vol.: 97 No.: 3/4 ISSN: 0022-4715

ID: CaltechAUTHORS:20170609-161129844

]]>

Abstract: Differential equations subject to random impulses are studied. Randomness is introduced both through the time between impulses, which is distributed exponentially, and through the sign of the impulses, which are fixed in amplitude and orientation. Such models are particular instances of piecewise deterministic Markov processes and they arise naturally in the study of a number of physical phenomena, particularly impacting systems. The underlying deterministic semigroup is assumed to be dissipative and a general theorem which establishes the existence of invariant measures for the randomly forced problem is proved. Further structure is then added to the deterministic semigroup, which enables the proof of ergodic theorems. Characteristic functions are used for the case when the deterministic component forms a damped linear problem and irreducibility measures are employed for the study of a randomly forced damped double-well nonlinear oscillator with a gradient structure.
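A minimal instance of the randomly impulsed systems described here is a scalar linear decay x′ = −λx receiving kicks of fixed amplitude and random sign at exponentially distributed times. The sketch below (illustrative parameters, my construction) simulates this piecewise deterministic Markov process and checks the first two invariant moments, which are computable in closed form for this example.

```python
import numpy as np

# Between impulses: x' = -lam*x (exact decay). At Exp(rate)-distributed
# times: x -> x + a*xi with xi = +/-1 equally likely.
rng = np.random.default_rng(2)
lam, a, rate, T = 1.0, 1.0, 2.0, 5_000.0

t, x = 0.0, 0.0
xs = []
while t < T:
    tau = rng.exponential(1.0 / rate)   # waiting time to the next impulse
    x = x * np.exp(-lam * tau)          # deterministic decay
    x += a * rng.choice([-1.0, 1.0])    # impulse with random sign
    t += tau
    xs.append(x)

xs = np.array(xs)
# At impulse times the invariant moments satisfy m2 = m2*E[exp(-2*lam*tau)] + a^2,
# giving E[x] = 0 by symmetry and E[x^2] = 2 for these parameters.
print(xs.mean(), (xs**2).mean())
```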

Publication: Journal of Differential Equations Vol.: 155 No.: 2 ISSN: 0022-0396

ID: CaltechAUTHORS:20170609-124028394

]]>

Abstract: Suppose that a consistent one-step numerical method of order r is applied to a smooth system of ordinary differential equations. Given any integer m ⩾ 1, the method may be shown to be of order r + m as an approximation to a certain modified equation. If the method and the system have a particular qualitative property then it is important to determine whether the modified equations inherit this property. In this article, a technique is introduced for proving that the modified equations inherit qualitative properties from the method and the underlying system. The technique uses a straightforward contradiction argument applicable to arbitrary one-step methods and does not rely on the detailed structure of associated power series expansions. Hence the conclusions apply, but are not restricted, to the case of Runge-Kutta methods. The new approach unifies and extends results of this type that have been derived by other means: results are presented for integral preservation, reversibility, inheritance of fixed points, Hamiltonian problems and volume preservation. The technique also applies when the system has an integral that the method preserves not exactly, but to order greater than r. Finally, a negative result is obtained by considering a gradient system and gradient numerical method possessing a global property that is not shared by the associated modified equations.
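The modified-equation construction referred to here can be made concrete in the simplest case. The following standard computation (an illustration, not specific to this paper) gives the first correction term for the explicit Euler method applied to y′ = f(y).

```latex
% Explicit Euler, y_{n+1} = y_n + h f(y_n), applied to y' = f(y):
% matching Taylor expansions shows the method is second-order accurate
% as an approximation of the modified equation
\tilde{y}' = f(\tilde{y}) - \frac{h}{2}\, f'(\tilde{y})\, f(\tilde{y}),
% with further corrections of size O(h^2).
```

Expanding the exact flow of the perturbed field over one step of length h and equating terms with the Euler update forces the O(h) perturbation to be −(1/2) f′f, and the construction iterates to any order r + m.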

Publication: IMA Journal of Numerical Analysis Vol.: 19 No.: 2 ISSN: 0272-4979

ID: CaltechAUTHORS:20170609-164025962

]]>

Abstract: We prove convergence results on finite time intervals, as the user-defined tolerance τ → 0, for a class of adaptive timestepping ODE solvers that includes the ode23 routine supplied in MATLAB Version 4.2. In contrast to existing theories, these convergence results hold with error constants that are uniform in the neighbourhood of equilibria; such uniformity is crucial for the derivation of results concerning the numerical approximation of dynamical systems. For linear problems the error estimates are uniform on compact sets of initial data. The analysis relies upon the identification of explicit embedded Runge-Kutta pairs for which all but the leading order terms of the expansion of the local error estimate are O(∥f(u)∥^2).
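The embedded-pair mechanism behind ode23 can be sketched directly. The code below uses the Bogacki–Shampine 3(2) coefficients, on which ode23 is based, with the usual textbook step-acceptance and step-size heuristics rather than the exact ode23 logic; it is an illustration only.

```python
import numpy as np

# Embedded explicit Runge-Kutta 3(2) pair with local error control.
def bs23(f, t0, y0, t_end, tol):
    t, y, h = t0, np.atleast_1d(np.asarray(y0, float)), 0.01
    while t < t_end:
        h = min(h, t_end - t)
        k1 = f(t, y)
        k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
        k3 = f(t + 0.75 * h, y + 0.75 * h * k2)
        y3 = y + h * (2 * k1 + 3 * k2 + 4 * k3) / 9              # 3rd order
        k4 = f(t + h, y3)
        y2 = y + h * (7 * k1 / 24 + k2 / 4 + k3 / 3 + k4 / 8)    # 2nd order
        err = np.linalg.norm(y3 - y2)            # local error estimate
        if err <= tol:                           # accept (error per step)
            t, y = t + h, y3
        # standard step-size heuristic for a 3rd-order advancing formula
        h *= min(5.0, max(0.1, 0.9 * (tol / max(err, 1e-16)) ** (1 / 3)))
    return y

y = bs23(lambda t, y: -y, 0.0, 1.0, 1.0, 1e-6)
print(float(y[0]), np.exp(-1.0))
```

On y′ = −y the accepted steps keep each local error estimate below τ, and the global error at t = 1 is correspondingly small.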

Publication: BIT Numerical Mathematics Vol.: 38 No.: 4 ISSN: 0006-3835

ID: CaltechAUTHORS:20170609-154149149

]]>

Abstract: Waveform relaxation algorithms for partial differential equations (PDEs) are traditionally obtained by discretizing the PDE in space and then splitting the discrete operator using matrix splittings. For the semidiscrete heat equation one can show linear convergence on unbounded time intervals and superlinear convergence on bounded time intervals by this approach. However, the bounds depend in general on the mesh parameter and convergence rates deteriorate as one refines the mesh. Motivated by the original development of waveform relaxation in circuit simulation, where the circuits are split in the physical domain into subcircuits, we split the PDE by using overlapping domain decomposition. We prove linear convergence of the algorithm in the continuous case on an infinite time interval, at a rate depending on the size of the overlap. This result remains valid after discretization in space and the convergence rates are robust with respect to mesh refinement. The algorithm is in the class of waveform relaxation algorithms based on overlapping multisplittings. Our analysis quantifies the empirical observation by Jeltsch and Pohl [SIAM J. Sci. Comput., 16 (1995), pp. 40--49] that the convergence rate of a multisplitting algorithm depends on the overlap. Numerical results are presented which support the convergence theory.
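Waveform relaxation itself is easy to demonstrate on a small coupled system. The sketch below (my construction; matrix, window, and grid are illustrative) applies a Jacobi (diagonal) splitting A = D + R to x′ = Ax: each sweep solves x′ = Dx + Rx_prev over the whole window, with the previous iterate supplying the coupling, and the sup-norm error against the unsplit solution shrinks superlinearly on the bounded interval.

```python
import numpy as np

A = np.array([[-2.0, 1.0], [1.0, -2.0]])
D = np.diag(np.diag(A))
R = A - D
x0 = np.array([1.0, 0.0])
m = 401
ts = np.linspace(0.0, 1.0, m)   # the time window [0, T], T = 1
h = ts[1] - ts[0]

def rk4_path(f):
    # fixed-step RK4 along the window
    xs = np.empty((m, 2))
    xs[0] = x0
    for n in range(m - 1):
        x, t = xs[n], ts[n]
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        xs[n + 1] = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return xs

ref = rk4_path(lambda t, x: A @ x)   # reference: the unsplit system
xk = np.tile(x0, (m, 1))             # initial waveform: constant in time
errs = []
for _ in range(6):                   # waveform relaxation sweeps
    coupling = xk @ R.T              # R x_prev(t) sampled on the grid
    def g(t, x, c=coupling):
        cf = np.array([np.interp(t, ts, c[:, j]) for j in range(2)])
        return D @ x + cf
    xk = rk4_path(g)
    errs.append(float(np.max(np.abs(xk - ref))))
print(errs)
```

The classical bound for this splitting gives error decay like (‖R‖T)^k / k! on a window of length T, consistent with the superlinear convergence on bounded intervals described in the abstract.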

Publication: SIAM Journal on Scientific Computing Vol.: 19 No.: 6 ISSN: 1064-8275

ID: CaltechAUTHORS:20170612-134545090

]]>

Abstract: The effect of using grid adaptation on the numerical solution of model convection-diffusion equations with a conservation form is studied. The grid adaptation technique studied is based on moving a fixed number of mesh points to equidistribute a generalization of the arc-length of the solution. In particular, a parameter-dependent monitor function is introduced which incorporates fixed meshes, approximate arc-length equidistribution, and equidistribution of the absolute value of the solution, in a single framework. Thus the resulting numerical method is a coupled nonlinear system of equations for the mesh spacings and the nodal values. A class of singularly perturbed problems, including Burgers's equation in the limit of small viscosity, is studied. Singular perturbation and bifurcation techniques are used to analyze the solution of the discretized equations, and numerical results are compared with the results from the analysis. Computation of the bifurcation diagram of the system is performed numerically using a continuation method and the results are used to illustrate the theory. It is shown that equidistribution does not remove spurious solutions present on a fixed mesh and that, furthermore, the spurious solutions can be stable for an appropriate moving mesh method.
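The equidistribution principle underlying the monitor function is simple to sketch: choose mesh points so that each cell carries an equal share of the integral of the monitor. The code below (a de Boor-style static redistribution, not the paper's coupled mesh-solution system) equidistributes an arc-length monitor for a steep tanh profile.

```python
import numpy as np

# Equidistribute M(x) = sqrt(1 + u'(x)^2) for u(x) = tanh(50(x - 1/2)):
# mesh points concentrate in the interior layer. Parameters illustrative.
xf = np.linspace(0.0, 1.0, 4001)                 # fine background grid
u = np.tanh(50.0 * (xf - 0.5))
du = np.gradient(u, xf)
monitor = np.sqrt(1.0 + du**2)                   # generalized arc-length

# cumulative integral of the monitor (trapezoid rule), then invert it to
# place N mesh points at equal increments of arc-length
I = np.concatenate(([0.0],
                    np.cumsum(0.5 * (monitor[1:] + monitor[:-1]) * np.diff(xf))))
N = 21
targets = np.linspace(0.0, I[-1], N)
mesh = np.interp(targets, I, xf)
spacing = np.diff(mesh)
print(spacing.min(), spacing.max())
```

The smallest cells sit inside the layer at x = 1/2, the largest in the flat regions, while every cell carries the same amount of arc-length.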

Publication: SIAM Journal on Scientific Computing Vol.: 20 No.: 2 ISSN: 1064-8275

ID: CaltechAUTHORS:20170612-143618623

]]>

Abstract: We show that results concerning the persistence of invariant sets of ordinary differential equations under perturbation may be applied directly to a certain class of partial differential equations. Our framework is particularly well-suited to encompass numerical approximations of these partial differential equations. Specifically, we show that for a class of PDEs with a C^1 inertial form, certain natural numerical approximations possess an inertial form close to that of the underlying PDE in the C^1 norm.

Publication: Journal of Mathematical Analysis and Applications Vol.: 219 No.: 2 ISSN: 0022-247X

ID: CaltechAUTHORS:20170609-123316540

]]>

Abstract: Positive results are obtained about the effect of local error control in numerical simulations of ordinary differential equations. The results are cast in terms of the local error tolerance. Under the assumption that a local error control strategy is successful, it is shown that a continuous interpolant through the numerical solution exists that satisfies the differential equation to within a small, piecewise continuous, residual. The assumption is known to hold for the MATLAB ode23 algorithm [10] when applied to a variety of problems. Using the smallness of the residual, it follows that at any finite time the continuous interpolant converges to the true solution as the error tolerance tends to zero. By studying the perturbed differential equation it is also possible to prove discrete analogs of the long-time dynamical properties of the equation—dissipative, contractive and gradient systems are analysed in this way.

Publication: BIT Numerical Mathematics Vol.: 38 No.: 1 ISSN: 0006-3835

ID: CaltechAUTHORS:20170613-110046810

]]>

Abstract: In this paper the properties of waveform relaxation are studied when applied to the dynamical system generated by an autonomous ordinary differential equation. In particular, the effect of the waveform relaxation on the invariant sets of the flow is analysed. Windowed waveform relaxation is studied, whereby the iterative technique is applied on successive time intervals of length T and a fixed, finite, number of iterations taken on each window. This process does not generate a dynamical system on R+ since two different applications of the waveform algorithm over different time intervals do not, in general, commute. In order to generate a dynamical system it is necessary to consider the time T map generated by the relaxation process. This is done, and C^1-closeness of the resulting map to the time T map of the underlying ordinary differential equation is established. Using this, various results from the theory of dynamical systems are applied, and the results discussed.

Publication: Mathematics of Computation Vol.: 66 No.: 219 ISSN: 0025-5718

ID: CaltechAUTHORS:20170612-145717225

]]>

Abstract: The numerical solution of initial value problems for ordinary differential equations is frequently performed by means of adaptive algorithms with user-input tolerance τ. The time-step is then chosen according to an estimate, based on small time-step heuristics, designed to ensure that an approximation to the local error committed is bounded by τ. A question of natural interest is to determine how the global error behaves with respect to the tolerance τ. This has obvious practical interest and also leads to an interesting problem in mathematical analysis. The primary difficulties arising in the analysis are that: (i) the time-step selection mechanisms used in practice are discontinuous as functions of the specified data; (ii) the small time-step heuristics underlying the control of the local error can break down in some cases. In this paper an analysis is presented which incorporates these two difficulties. For a mathematical model of an error per unit step or error per step adaptive Runge–Kutta algorithm, it may be shown that in a certain probabilistic sense, with respect to a measure on the space of initial data, the small time-step heuristics are valid with probability one, leading to a probabilistic convergence result for the global error as τ → 0. The probabilistic approach is only valid in dimension m > 1; this observation is consistent with recent analysis concerning the existence of spurious steady solutions of software codes which highlights the difference between the cases m = 1 and m > 1. The breakdown of the small time-step heuristics can be circumvented by making minor modifications to the algorithm, leading to a deterministic convergence proof for the global error of such algorithms as τ → 0. An underlying theory is developed and the deterministic and probabilistic convergence results proved as particular applications of this theory.

Publication: Numerical Algorithms Vol.: 14 No.: 1/3 ISSN: 1017-1398

ID: CaltechAUTHORS:20170613-132616648

]]>

Abstract: The viscous Cahn–Hilliard equation may be viewed as a singular limit of the phase-field equations for phase transitions. It contains both the Allen–Cahn and Cahn–Hilliard models of phase separation as particular cases; by specific choices of parameters it may be formulated as a one-parameter (say α) homotopy connecting the Cahn–Hilliard (α = 0) and Allen–Cahn (α = 1) models. The limit α = 0 is singular in the sense that the smoothing property of the analytic semigroup changes from being of the type associated with second order operators to the type associated with fourth order operators. The properties of the gradient dynamical system generated by the viscous Cahn–Hilliard equation are studied as α varies in [0, 1]. Continuity of the phase portraits near equilibria is established independently of α ∈ [0, 1] and, using this, a piecewise, uniform in time, perturbation result is proved for trajectories. Finally, the continuity of the attractor is established and, in one dimension, the existence and continuity of inertial manifolds shown and the flow on the attractor detailed.

Publication: Journal of Differential Equations Vol.: 128 No.: 2 ISSN: 0022-0396

ID: CaltechAUTHORS:20170609-125856997

]]>

Abstract: A class of nonlinear dissipative partial differential equations that possess finite dimensional attractive invariant manifolds is considered. An existence and perturbation theory is developed which unifies the cases of unstable manifolds and inertial manifolds into a single framework. It is shown that certain approximations of these equations, such as those arising from spectral or finite element methods in space, one-step time-discretization, or a combination of both, also have attractive invariant manifolds. Convergence of the approximate manifolds to the true manifolds is established as the approximation is refined. In this part of the paper applications to the behavior of inertial manifolds under approximation are considered. From this analysis deductions about the structure of the attractor and the flow on the attractor under discretization can be made.

Publication: Journal of Differential Equations Vol.: 123 No.: 2 ISSN: 0022-0396

ID: CaltechAUTHORS:20170612-064633625

]]>

Abstract: Although most adaptive software for initial value problems is designed with an accuracy requirement—control of the local error—it is frequently observed that stability is imparted by the adaptation. This relationship between local error control and numerical stability is given a firm theoretical underpinning. The dynamics of numerical methods with local error control are studied for three classes of ordinary differential equations: dissipative, contractive, and gradient systems. Dissipative dynamical systems are characterised by having a bounded absorbing set B which all trajectories eventually enter and remain inside. The exponentially contractive problems studied have a unique, globally exponentially attracting equilibrium point and thus they are also dissipative since the absorbing set B may be chosen to be a ball of arbitrarily small radius around the equilibrium point. The gradient systems studied are those for which the set of equilibria comprises isolated points and all trajectories are bounded so that each trajectory converges to an equilibrium point as t → ∞. If the set of equilibria is bounded then the gradient systems are also dissipative. Conditions under which numerical methods with local error control replicate these large-time dynamical features are described. The results are proved without recourse to asymptotic expansions for the truncation error. Standard embedded Runge–Kutta pairs are analysed together with several nonstandard error control strategies. Both error per step and error per unit step strategies are considered. Certain embedded pairs are identified for which the sequence generated can be viewed as coming from a small perturbation of an algebraically stable scheme, with the size of the perturbation proportional to the tolerance τ. Such embedded pairs are defined to be essentially algebraically stable and explicit essentially stable pairs are identified. 
Conditions on the tolerance τ are identified under which appropriate discrete analogues of the properties of the underlying differential equation may be proved for certain essentially stable embedded pairs. In particular, it is shown that for dissipative problems the discrete dynamical system has an absorbing set B_τ and is hence dissipative. For exponentially contractive problems the radius of B_τ is proved to be proportional to τ. For gradient systems the numerical solution enters and remains in a small ball about one of the equilibria and the radius of the ball is proportional to τ. Thus the local error control mechanisms confer desirable global properties on the numerical solution. It is shown that for error per unit step strategies the conditions on the tolerance τ are independent of initial data while for error per step strategies the conditions are initial-data dependent. Thus error per unit step strategies are considerably more robust.

Publication: SIAM Journal on Numerical Analysis Vol.: 32 No.: 6 ISSN: 0036-1429

ID: CaltechAUTHORS:20170613-084044146

]]>

Abstract: In this note, we consider numerical methods for a class of Hamiltonian systems that preserve the Hamiltonian. We show that the rate of growth of error is at most linear in time when such methods are applied to problems with period uniquely determined by the value of the Hamiltonian. This contrasts with generic numerical schemes, for which the rate of error growth is superlinear. Asymptotically, the rate of error growth for symplectic schemes is also linear. Hence, Hamiltonian-conserving schemes are competitive with symplectic schemes in this respect. The theory is illustrated with a computation performed on Kepler's problem for the interaction of two bodies.
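One simple way to obtain a Hamiltonian-conserving method is to follow each step of a standard scheme with a projection back onto the energy level set. The sketch below (my construction, not the paper's scheme) does this for the pendulum H(q, p) = p²/2 − cos q, whose period does depend on the energy level, and compares the energy drift with plain explicit Euler.

```python
import numpy as np

def H(q, p):
    return 0.5 * p * p - np.cos(q)

def step_euler(q, p, h):
    return q + h * p, p - h * np.sin(q)

def project(q, p, H0):
    # Newton-type correction along grad H, restoring H = H0
    for _ in range(5):
        gq, gp = np.sin(q), p
        lam = (H0 - H(q, p)) / (gq * gq + gp * gp)
        q, p = q + lam * gq, p + lam * gp
    return q, p

h, nsteps = 0.01, 10_000
q0, p0 = 1.0, 0.0
H0 = H(q0, p0)

qe, pe = q0, p0          # plain explicit Euler
qc, pc = q0, p0          # Euler followed by energy projection
for _ in range(nsteps):
    qe, pe = step_euler(qe, pe, h)
    qc, pc = project(*step_euler(qc, pc, h), H0)

drift_euler = abs(H(qe, pe) - H0)
drift_proj = abs(H(qc, pc) - H0)
print(drift_euler, drift_proj)
```

The projected method holds the Hamiltonian to round-off, while explicit Euler pumps energy into the system; by the abstract's result, the projected method's trajectory error then grows at most linearly in time.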

Publication: Zeitschrift für Angewandte Mathematik und Physik Vol.: 46 No.: 3 ISSN: 0044-2275

ID: CaltechAUTHORS:20170613-104839336

]]>

Abstract: A semi-discrete spatial finite difference approximation to the complex Ginzburg-Landau equation with cubic non-linearity is considered. Using the fractional powers of a sectorial operator, discrete versions of the Sobolev spaces H^s, and Gevrey classes of regularity G, are introduced. Discrete versions of some standard Sobolev space norm inequalities are proved.

Publication: Numerical Functional Analysis and Optimization Vol.: 16 No.: 7-8 ISSN: 0163-0563

ID: CaltechAUTHORS:20170613-143108777

]]>

Abstract: The viscous Cahn-Hilliard equation arises as a singular limit of the phase-field model of phase transitions. It contains both the Cahn-Hilliard and Allen-Cahn equations as particular limits. The equation is in gradient form and possesses a compact global attractor A, comprising heteroclinic orbits between equilibria. Two classes of computation are described. First, heteroclinic orbits on the global attractor are computed; by using the viscous Cahn-Hilliard equation to perform a homotopy, these results show that the orbits, and hence the geometry of the attractors, are remarkably insensitive to whether the Allen-Cahn or Cahn-Hilliard equation is studied. Second, initial-value computations are described; these computations emphasize three differing mechanisms by which interfaces in the equation propagate for the case of very small penalization of interfacial energy. Furthermore, convergence to an appropriate free boundary problem is demonstrated numerically.

Publication: Nonlinearity Vol.: 8 No.: 2 ISSN: 0951-7715

ID: CaltechAUTHORS:20170612-135245414

]]>

Abstract: Time dependent solutions of the Cahn-Hilliard equation are studied numerically. In particular heteroclinic orbits, which connect different equilibrium solutions at t = -∞ and t = +∞, are sought. Thus boundary value problems in space-time are computed. This computation requires an investigation of the stability of equilibria, since projections onto the stable and unstable manifolds determine the boundary conditions at t = -∞ and t = +∞. This stability analysis is then followed by solution of the appropriate boundary value problem in space-time. The results obtained cannot be found by standard initial value simulations. By specifying the two steady states at t = ±∞ appropriately it is possible to find orbits reflecting a given degree of coarsening over the time evolution. This gives a clear picture of the dynamic coarsening admissible in the equation. It also provides an understanding of orbits on the global attractor for the equation.

Publication: Physica D Vol.: 78 No.: 3-4 ISSN: 0167-2789

ID: CaltechAUTHORS:20170609-122809369

]]>

Abstract: The numerical approximation of dissipative initial value problems by fixed time-stepping Runge–Kutta methods is considered and the asymptotic features of the numerical and exact solutions are compared. A general class of ordinary differential equations, for which dissipativity is induced through an inner product, is studied throughout. This class arises naturally in many finite dimensional applications (such as the Lorenz equations) and also from the spatial discretization of a variety of partial differential equations arising in applied mathematics. It is shown that the numerical solution defined by an algebraically stable method has an absorbing set and is hence dissipative for any fixed step-size h > 0. The numerical solution is shown to define a dynamical system on the absorbing set if h is sufficiently small and hence a global attractor A_h exists; upper-semicontinuity of A_h at h = 0 is established, which shows that, for h small, every point on the numerical attractor is close to a point on the true global attractor A. Under the additional assumption that the problem is globally Lipschitz, it is shown that if h is sufficiently small any method with positive weights defines a dissipative dynamical system on the whole space and upper semicontinuity of A_h at h = 0 is again established. For gradient systems with globally Lipschitz vector fields it is shown that any Runge–Kutta method preserves the gradient structure for h sufficiently small. For general dissipative gradient systems it is shown that algebraically stable methods preserve the gradient structure within the absorbing set for h sufficiently small. Convergence of the numerical attractor is studied and, for a dissipative gradient system with hyperbolic equilibria, lower semicontinuity at h = 0 is established. Thus, for such a system, A_h converges to A in the Hausdorff metric as h → 0.
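The absorbing-set behaviour of an algebraically stable method at fixed step size can be observed numerically. The sketch below (illustrative parameters; the fixed-point stage solver is my choice) applies the implicit midpoint rule, an algebraically stable Runge–Kutta method, to the Lorenz equations, which fall into the inner-product dissipative class discussed in the abstract, and checks that the trajectory remains in a bounded set.

```python
import numpy as np

sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0

def f(u):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def step_midpoint(u, h):
    # implicit midpoint: v = u + h f((u + v)/2), solved by fixed-point
    # iteration; h*L/2 < 1 near the attractor, so the iteration contracts
    v = u
    for _ in range(20):
        v = u + h * f(0.5 * (u + v))
    return v

h, nsteps, burn = 0.005, 10_000, 1_000
u = np.array([1.0, 1.0, 1.0])
max_norm = 0.0
for n in range(nsteps):
    u = step_midpoint(u, h)
    if n >= burn:
        max_norm = max(max_norm, float(np.linalg.norm(u)))
print(max_norm)
```

After a short transient the numerical trajectory stays inside a ball whose radius is independent of the initial data, consistent with the absorbing-set result for algebraically stable methods.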

Publication: SIAM Journal on Numerical Analysis Vol.: 31 No.: 5 ISSN: 0036-1429

ID: CaltechAUTHORS:20170613-084043889

]]>

Abstract: A reaction-diffusion-convection equation with a nonlocal term is studied; the nonlocal operator acts to conserve the spatial integral of the unknown function as time evolves. The equations are parameterised by µ, and for µ = 1 the equation arises as a similarity solution of the Navier-Stokes equations and the nonlocal term plays the role of pressure. For µ = 0, the equation is a nonlocal reaction-diffusion problem. The aim of the paper is to determine for which values of the parameter µ blow-up occurs and to study its form. In particular, interest is focused on the three cases µ < 1/2, µ > 1/2, and µ → 1. It is observed that, for any 0 ≤ µ ≤ 1/2, nonuniform global blow-up occurs; if 1/2 < µ < 1, then the blow-up is global and uniform, while for µ = 1 (the Navier-Stokes equations) there are exact solutions with initial data of arbitrarily large L_∞, L_2, and H^1 norms that decay to zero. Furthermore, one of these exact solutions is proved to be nonlinearly stable in L_2 for arbitrarily large supremum norm. An understanding of this transition from blow-up behaviour to decay behaviour is achieved by a combination of analysis, asymptotics, and numerical techniques.

Publication: SIAM Journal on Applied Mathematics Vol.: 54 No.: 3 ISSN: 0036-1399

ID: CaltechAUTHORS:20170613-120606856

]]>

Abstract: In the past numerical stability theory for initial value problems in ordinary differential equations has been dominated by the study of problems with simple dynamics; this has been motivated by the need to study error propagation mechanisms in stiff problems, a question modeled effectively by contractive linear or nonlinear problems. While this has resulted in a coherent and self-contained body of knowledge, it has never been entirely clear to what extent this theory is relevant for problems exhibiting more complicated dynamics. Recently there have been a number of studies of numerical stability for wider classes of problems admitting more complicated dynamics. This on-going work is unified and, in particular, striking similarities between this new developing stability theory and the classical linear and nonlinear stability theories are emphasized. The classical theories of A, B and algebraic stability for Runge–Kutta methods are briefly reviewed; the dynamics of solutions within the classes of equations to which these theories apply—linear decay and contractive problems—are studied. Four other categories of equations—gradient, dissipative, conservative and Hamiltonian systems—are considered. Relationships and differences between the possible dynamics in each category, which range from multiple competing equilibria to chaotic solutions, are highlighted. Runge-Kutta schemes that preserve the dynamical structure of the underlying problem are sought, and indications of a strong relationship between the developing stability theory for these new categories and the classical existing stability theory for the older problems are given. Algebraic stability, in particular, is seen to play a central role. 
It should be emphasized that in all cases the class of methods for which a coherent and complete numerical stability theory exists, given a structural assumption on the initial value problem, is often considerably smaller than the class of methods found to be effective in practice. Nonetheless it is arguable that it is valuable to develop such stability theories to provide a firm theoretical framework in which to interpret existing methods and to formulate goals in the construction of new methods. Furthermore, there are indications that the theory of algebraic stability may sometimes be useful in the analysis of error control codes which are not stable in a fixed step implementation; this work is described.
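Since algebraic stability plays a central role in this review, its standard definition (not restated in the abstract) may help for orientation: an s-stage Runge–Kutta method with coefficient matrix A and weights b is algebraically stable if

```latex
b_i \ge 0 \quad (i = 1, \dots, s),
\qquad
M := \operatorname{diag}(b)\,A + A^{\mathsf{T}} \operatorname{diag}(b) - b\,b^{\mathsf{T}} \succeq 0 .
```

For contractive (one-sided Lipschitz) problems these conditions guarantee that two numerical solutions satisfy ‖y_{n+1} − z_{n+1}‖ ≤ ‖y_n − z_n‖, mirroring the contractivity of the exact flow.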

Publication: SIAM Review Vol.: 36 No.: 2 ISSN: 0036-1445

ID: CaltechAUTHORS:20170613-100013806

]]>

Abstract: This article reviews the application of various notions from the theory of dynamical systems to the analysis of numerical approximation of initial value problems over long-time intervals. Standard error estimates comparing individual trajectories are of no direct use in this context since the error constant typically grows like the exponential of the time interval under consideration. Instead of comparing trajectories, the effect of discretization on various sets which are invariant under the evolution of the underlying differential equation is studied. Such invariant sets are crucial in determining long-time dynamics. The particular invariant sets which are studied are equilibrium points, together with their unstable manifolds and local phase portraits, periodic solutions, quasi-periodic solutions and strange attractors. Particular attention is paid to the development of a unified theory and to the development of an existence theory for invariant sets of the underlying differential equation which may be used directly to construct an analogous existence theory (and hence a simple approximation theory) for the numerical method.

Publication: Acta Numerica Vol.: 3 ISSN: 0962-4929

ID: CaltechAUTHORS:20170613-082428693

]]>

Abstract: A class of scalar semilinear parabolic equations possessing absorbing sets, a Lyapunov functional, and a global attractor are considered. The gradient structure of the problem implies that, provided all steady states are isolated, solutions approach a steady state as $t \to \infty $. The dynamical properties of various finite difference and finite element schemes for the equations are analysed. The existence of absorbing sets, bounded independently of the mesh size, is proved for the numerical methods. Discrete Lyapunov functions are constructed to show that, under appropriate conditions on the mesh parameters, numerical orbits approach steady state solutions as discrete time increases. However, it is shown that insufficient spatial resolution can introduce deceptively smooth spurious steady solutions and cause the stability properties of the true steady solutions to be incorrectly represented. Furthermore, it is also shown that the explicit Euler scheme introduces spurious solutions with period 2 in the timestep. As a result, the absorbing set is destroyed and there is initial data leading to blow up of the scheme, however small the mesh parameters are taken. To obtain stabilization to a steady state for this scheme, it is necessary to restrict the timestep in terms of the initial data and the space step. Implicit schemes are constructed for which absorbing sets and Lyapunov functions exist under restrictions on the timestep that are independent of initial data and of the space step; both one-step and multistep (BDF) methods are studied.

Publication: SIAM Journal on Numerical Analysis Vol.: 30 No.: 6 ISSN: 0036-1429

ID: CaltechAUTHORS:20170613-070150162

]]>

Abstract: In order to accomplish the transition from avascular to vascular growth, solid tumours secrete a diffusible substance known as tumour angiogenesis factor (TAF) into the surrounding tissue. Endothelial cells which form the lining of neighbouring blood vessels respond to this chemotactic stimulus in a well-ordered sequence of events consisting, at minimum, of a degradation of their basement membrane, migration, and proliferation. A model mechanism is presented which includes the diffusion of the TAF into the surrounding host tissue and the response of the endothelial cells to the chemotactic stimulus. The model accounts for the main observed events associated with the endothelial cells during the process of angiogenesis (i.e. cell migration and proliferation); the numerical results compare very well with experimental observations. The situation where the tumour (i.e. the source of TAF) is removed and the vessels recede is also considered.

Publication: Mathematical Medicine and Biology Vol.: 10 No.: 3 ISSN: 1477-8599

ID: CaltechAUTHORS:20170612-071258761

]]>

Abstract: A reaction-diffusion equation with a nonlocal term is studied. The nonlocal term acts to conserve the spatial integral of the unknown function as time evolves. Such equations give insight into biological and chemical problems where conservation properties predominate. The aim of the paper is to understand how the conservation property affects the nature of blowup. The equation studied has a trivial steady solution that is proved to be stable. Existence of nontrivial steady solutions is proved, and their instability established numerically. Blowup is proved for sufficiently large initial data by using a comparison principle in Fourier space. The nature of the blowup is investigated by a combination of asymptotic and numerical calculations.
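The abstract does not display the equation itself; a standard example of a reaction-diffusion equation with an integral-conserving nonlocal term (used here only for orientation) is

```latex
u_t = \Delta u + f(u) - \frac{1}{|\Omega|} \int_\Omega f(u)\,dx
\quad \text{in } \Omega,
\qquad
\partial_n u = 0 \ \text{on } \partial\Omega ,
```

for which integrating over Ω and using the Neumann boundary condition gives d/dt ∫_Ω u dx = 0, so the spatial integral of u is conserved as time evolves.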

Publication: SIAM Journal on Applied Mathematics Vol.: 53 No.: 3 ISSN: 0036-1399

ID: CaltechAUTHORS:20170613-080915743

]]>

Abstract: The numerical computation of heteroclinic connections in partial differential equations (PDEs) with a gradient structure, such as those arising in the modeling of phase transitions, is considered. Initially, a scalar reaction diffusion equation is studied; structural assumptions are made on the problem to ensure the existence of absorbing sets and, consequently, a global attractor. As a result of the gradient structure, it is known that, if all equilibria are hyperbolic, the global attractor comprises the set of equilibria and heteroclinic orbits connecting equilibria to one another. Thus it is natural to consider direct approximation of the set of equilibria and the connecting orbits. Results are proved about the Fourier spanning basis for branches of equilibria and also for certain heteroclinic connections; these results exploit the oddness of the nonlinearity. The reaction-diffusion equation is then approximated by a Galerkin spectral discretization to produce a system of ordinary differential equations (ODEs). Analogous results to those holding for the PDE are proved for the ODEs—in particular, the existence and structure of the global attractor and appropriate spanning bases for the equilibria and certain heteroclinic connections, are studied. Heteroclinic connections in the system of ODEs are then computed using a generalization of known methods to cope with the gradient structure. Suitable parameterizations of the attractor are introduced and numerical continuation used to find families of connections on the attractor. Special connections, which are stable in certain Fourier spanning bases, are used as starting points for the computations. The methods used allow the calculation of connecting orbits that are unstable as solutions of the initial value problem, and thus provide a computational tool for understanding the dynamics of dissipative problems in a manner that could not be achieved by use of standard initial value methods. 
Numerical results are given for the Chafee–Infante problem and for the Cahn–Hilliard equation. A one-parameter family of PDEs connecting these two problems is introduced, and it is demonstrated numerically that the global attractor for the Chafee–Infante problem can be continuously deformed into that for the Cahn–Hilliard equation.

Publication: SIAM Journal on Applied Mathematics Vol.: 53 No.: 3 ISSN: 0036-1399

ID: CaltechAUTHORS:20170613-082542564

]]>

Abstract: It has been proved inter alia in part I of the present paper (Iserles et al., 1991) that irreducible multistep methods for ordinary differential equations may possess period-2 solutions as asymptotic states if and only if σ(−1) ≠ 0, where the underlying method is ∑^m_{k=0} ρ_k y_{n+k} = h ∑^m_{k=0} σ_k f(y_{n+k}) and σ(z) := ∑^m_{k=0} σ_k z^k. We provide an alternative proof of that statement and examine in detail properties of methods that obey σ(−1) = 0. By using a variation of the original proof of the first Dahlquist barrier (Henrici, 1962), we establish an attainable upper bound on the order of zero-stable multistep methods with the aforementioned feature. Moreover, we modify the concept of backward differentiation formulae (BDF) to require that σ(−1) = 0. A zero-stability bound on the ensuing methods is produced by extending the method of proof in Hairer & Wanner (1983).
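The criterion σ(−1) ≠ 0 is mechanical to check. The sketch below (a hypothetical helper, with the standard σ-coefficients of three classical methods) evaluates it; by the stated result, explicit Euler and BDF2 can sustain spurious period-2 asymptotic states, while the trapezoidal rule cannot.

```python
def sigma_at_minus_one(sigma):
    """Evaluate sigma(z) = sum_k sigma_k z^k at z = -1."""
    return sum(s * (-1) ** k for k, s in enumerate(sigma))

# sigma-coefficients (sigma_0, ..., sigma_m) of some classical methods
methods = {
    "explicit Euler": [1.0, 0.0],       # y_{n+1} - y_n = h f(y_n)
    "trapezoidal":    [0.5, 0.5],       # y_{n+1} - y_n = (h/2)(f(y_n) + f(y_{n+1}))
    "BDF2":           [0.0, 0.0, 1.0],  # 3/2 y_{n+2} - 2 y_{n+1} + 1/2 y_n = h f(y_{n+2})
}

for name, sigma in methods.items():
    print(f"{name}: sigma(-1) = {sigma_at_minus_one(sigma)}")
```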

Publication: IMA Journal of Numerical Analysis Vol.: 12 No.: 4 ISSN: 0272-4979

ID: CaltechAUTHORS:20170612-105155948

]]>

Abstract: The Cauchy and initial boundary value problems are studied for a linear advection equation with a nonlinear source term. The source term is chosen to have two equilibrium states, one unstable and the other stable as solutions of the underlying characteristic equation. The true solutions exhibit travelling waves which propagate from one equilibrium to another. The speed of propagation is dependent on the rate of decay of the initial data at infinity. A class of monotone explicit finite-difference schemes are proposed and analysed; the schemes are upwind in space for the advection term with some freedom of choice for the evaluation of the nonlinear source term. Convergence of the schemes is demonstrated and the existence of numerical waves, mimicking the travelling waves in the underlying equation, is proved. The convergence of the numerical wave-speeds to the true wave-speeds is also established. The behaviour of the scheme is studied when the monotonicity criteria are violated due to stiff source terms, and oscillations and divergence are shown to occur. The behaviour is contrasted with a split-step scheme where the solution remains monotone and bounded but where incorrect speeds of propagation are observed as the stiffness of the problem increases.
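As a sketch of the kind of scheme analysed (the source term f(u) = u(1 − u), grid, and parameter values are illustrative choices, not taken from the paper), an upwind-in-space explicit scheme satisfying the monotonicity criteria keeps a front between the two equilibria bounded and monotone:

```python
import numpy as np

a, dx, dt = 1.0, 0.1, 0.05            # CFL number a*dt/dx = 0.5
cfl = a * dt / dx

x = np.arange(0.0, 20.0, dx)
u = 1.0 / (1.0 + np.exp(x - 5.0))     # monotone front: stable state 1 -> unstable state 0

for _ in range(100):
    unew = u.copy()
    # upwind difference for the advection term, explicit nonlinear source
    unew[1:] = u[1:] - cfl * (u[1:] - u[:-1]) + dt * u[1:] * (1.0 - u[1:])
    unew[0] = 1.0                      # inflow boundary held at the stable equilibrium
    u = unew
```

With cfl + dt·sup|f′| ≤ 1 the update is monotone, so the numerical solution stays in [0, 1] and nonincreasing in x; violating this bound (a stiff source or too large a timestep) produces the oscillations described above.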

Publication: SIAM Journal on Numerical Analysis Vol.: 29 No.: 5 ISSN: 0036-1429

ID: CaltechAUTHORS:20170613-084043611

]]>

Abstract: The approximation of solutions of reaction-diffusion equations that approach asymptotically stable, hyperbolic equilibria is considered. Near such equilibria trajectories of the equation contract and hence it is possible to seek error estimates that are uniformly valid in time. A technique for the derivation of such estimates is illustrated in the context of an explicit Euler finite-difference scheme.
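A toy instance of this contraction effect (the equation u′ = u − u³ and the Euler step below are illustrative choices, not the paper's setting): the error of explicit Euler against the closed-form solution peaks during the transient and then decays, rather than growing with t.

```python
import numpy as np

h, n_steps, u0 = 0.01, 2000, 0.5      # integrate to T = 20

def exact(t):
    # closed-form solution of u' = u - u^3, u(0) = u0 (a Bernoulli equation);
    # the trajectory approaches the stable hyperbolic equilibrium u = 1
    c = 1.0 / u0**2 - 1.0
    return 1.0 / np.sqrt(1.0 + c * np.exp(-2.0 * t))

u, errors = u0, []
for n in range(1, n_steps + 1):
    u = u + h * (u - u**3)            # explicit Euler step
    errors.append(abs(u - exact(n * h)))
errors = np.array(errors)             # peaks early, then contracts towards 0
```

Because both the exact flow and the discrete map contract onto the same equilibrium, the error bound does not deteriorate as the time interval grows, which is the uniform-in-time estimate the abstract refers to.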

Publication: IMA Journal of Numerical Analysis Vol.: 12 No.: 3 ISSN: 0272-4979

ID: CaltechAUTHORS:20170612-070148270

]]>

Abstract: The asymptotic states of numerical methods for initial value problems are examined. In particular, spurious steady solutions, solutions with period 2 in the timestep, and spurious invariant curves are studied. A numerical method is considered as a dynamical system parameterised by the timestep h. It is shown that the three kinds of spurious solutions can bifurcate from genuine steady solutions of the numerical method (which are inherited from the differential equation) as h is varied. Conditions under which these bifurcations occur are derived for Runge–Kutta schemes, linear multistep methods, and a class of predictor-corrector methods in a PE(CE)^M implementation. The results are used to provide a unifying framework to various scattered results on spurious solutions which already exist in the literature. Furthermore, the implications for choice of numerical scheme are studied. In numerical simulation it is desirable to minimise the effect of spurious solutions. Classes of methods with desirable dynamical properties are described and evaluated.

Publication: SIAM Journal on Numerical Analysis Vol.: 28 No.: 6 ISSN: 0036-1429

ID: CaltechAUTHORS:20170612-164247464

]]>

Abstract: The dynamics of the theta method for arbitrary systems of nonlinear ordinary differential equations are analysed. Two scalar examples are presented to demonstrate the importance of spurious solutions in determining the dynamics of discretisations. A general system of differential equations is then considered. It is shown that the choice θ = ½ does not generate spurious solutions of period 2 in the timestep. Using bifurcation theory, it is shown that for θ ≠ ½ the theta method does generate spurious solutions of period 2. The existence and form of spurious solutions are examined in the limit Δt → 0. The existence of spurious steady solutions in a predictor-corrector method is proved to be equivalent to the existence of spurious period 2 solutions in the Euler method. The theory is applied to several examples from nonlinear parabolic equations. Numerical continuation is used to trace out the spurious solutions as Δt is varied. Timestepping experiments are presented to demonstrate the effect of the spurious solutions on the dynamics and some complementary theoretical results are proved. In particular, the linear stability restriction Δt/Δx^2 ≤ ½ for the Euler method applied to the heat equation is generalised to cope with a nonlinear problem. This naturally introduces a restriction on Δt in terms of the initial data; this restriction is necessary to avoid the effect of spurious periodic solutions.
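A minimal sketch of the period-2 phenomenon, assuming the scalar logistic equation u′ = u(1 − u) and a deliberately large step h = 2.2 (illustrative choices, not the paper's examples): explicit Euler (θ = 0) locks onto a spurious period-2 orbit about the steady state u = 1, whereas θ = ½ does not.

```python
import math

def f(u):
    return u * (1.0 - u)

h = 2.2   # linear stability about u = 1 would require h < 2 for explicit Euler

# theta = 0 (explicit Euler): iterates settle onto a spurious period-2 orbit
u, euler_tail = 0.5, []
for n in range(1200):
    u = u + h * f(u)
    if n >= 1196:
        euler_tail.append(u)          # last four iterates: p, q, p, q

# theta = 1/2: the implicit step u_{n+1} = u_n + (h/2)[f(u_n) + f(u_{n+1})]
# reduces to a quadratic in u_{n+1}; take the root that tends to u_n as h -> 0
u = 0.5
for _ in range(200):
    a, b, c = h / 2.0, 1.0 - h / 2.0, -(u + (h / 2.0) * f(u))
    u = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
theta_half_limit = u                  # converges to the genuine steady state 1
```

The Euler iteration is conjugate to the logistic map with parameter 1 + h, so for h slightly above 2 a stable 2-cycle bifurcates from u = 1, exactly the spurious solution the analysis predicts for θ ≠ ½.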

Publication: SIAM Journal on Scientific and Statistical Computing Vol.: 12 No.: 6 ISSN: 0196-5204

ID: CaltechAUTHORS:20170613-092411747

]]>

Abstract: Unless they are furnished with an adequate blood supply and a means of disposing of their waste products by a mechanism other than diffusion, solid tumours cannot grow beyond a few millimetres in diameter. It is now a well-established fact that, in order to accomplish this neovascularization, solid tumours secrete a diffusible chemical compound known as tumour angiogenesis factor (TAF) into the surrounding tissue. This stimulates nearby blood vessels to migrate towards and finally penetrate the tumour. Once provided with the new supply of nutrient, rapid growth takes place. In this paper, a mathematical model is presented for the diffusion of TAF into the surrounding tissue. The complete process of angiogenesis is made up of a sequence of several distinct events and the model is an attempt to take into account as many of these as possible. In the diffusion equation for the TAF, a decay term is included which models the loss of the chemical in the surrounding tissue itself. A threshold distance for the TAF is incorporated in an attempt to reflect the results from experiments of corneal implants in test animals. By formulating the problem in terms of a free boundary problem, the extent of the diffusion of TAF into the surrounding tissue can be monitored. Finally, by introducing a sink term representing the action of proliferating endothelial cells, the boundary of the TAF is seen to recede, and hence the position and movement of the capillaries can be indirectly followed. The changing concentration gradient observed as the boundary recedes may offer a possible explanation for the initiation of anastomosis. Several functions are considered as possible sink terms and numerical results are presented. The situation where the tumour (i.e. the source of TAF) is removed is also considered.

Publication: Mathematical Medicine and Biology Vol.: 8 No.: 3 ISSN: 1477-8599

ID: CaltechAUTHORS:20170612-104202241

]]>

Abstract: We analyze the following class of nonlinear eigenvalue problems: find (u, µ) ∈ B × ℝ satisfying (1) Du + µH(a·u − 1)f(u) = 0 in Ω ⊆ ℝ^N, (2) u = 0 on ∂Ω. Here H(X) is the Heaviside step-function defined by H(X) = 0 for X ≤ 0 and H(X) = 1 for X > 0. B is some Banach space appropriate to the problem, and D is taken to be a (possibly nonlinear) differential operator with the property that, when µ = 0, equations (1)–(2) have a unique solution u.

Publication: Rocky Mountain Journal of Mathematics Vol.: 21 No.: 2 ISSN: 0035-7596

ID: CaltechAUTHORS:20170613-144801379

]]>

Abstract: Numerical methods for initial-value problems which develop singularities in finite time are analyzed. The objective is to determine simple strategies which produce the correct asymptotic behaviour and give an accurate approximation of the blow-up time. Fixed step methods for scalar ordinary differential equations are studied first and it is shown that there is a natural embedding of the discrete process in a continuous one. This shows clearly how and why the fixed-step strategy fails. A class of time-stepping strategies that correspond to a time-continuous re-scaling of the underlying differential equation is then proposed; this class is analyzed and criteria established to determine suitable choices for the re-scaling. Finally the ideas are applied to a partial differential equation arising from the study of a fluid with temperature-dependent viscosity. The numerical method involves re-formulating the equation as a moving boundary problem for the peak value and applying the ODE time-stepping strategies based on this peak value.
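One concrete instance of such a rescaled strategy (for the scalar model problem u′ = u², u(0) = 1, with exact blow-up time t* = 1; the rule Δt_n = τ/u_n is an illustrative member of the class, not the paper's exact choice):

```python
tau = 1.0e-3          # rescaled step: each step multiplies u by (1 + tau)
u, t = 1.0, 0.0
while u < 1.0e12:     # follow the solution over twelve decades of growth
    dt = tau / u      # the timestep shrinks in proportion to 1/u
    u = u + dt * u * u
    t += dt
blowup_estimate = t   # sum of the steps converges to t* as tau -> 0
```

With a fixed step the discrete solution stays finite at every step, so the blow-up is never reproduced; here the accumulated time approaches 1 + τ, recovering t* with error O(τ).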

Publication: European Journal of Applied Mathematics Vol.: 1 No.: 01 ISSN: 0956-7925

ID: CaltechAUTHORS:20170612-101245620

]]>

Abstract: We analyse discrete approximations of reaction-diffusion-convection equations and show that linearized instability implies the existence of spurious periodic solutions in the fully nonlinear problem. The result is proved by using ideas from bifurcation theory. Using singularity theory we provide a precise local description of the spurious solutions. The results form the basis for an analysis of the range of discretization parameters in which spurious solutions can exist, their magnitude, and their spatial structure. We present a modified equations approach to determine criteria under which spurious periodic solutions exist for arbitrarily small values of the time-step. The theoretical results are applied to a specific example.

Publication: IMA Journal of Numerical Analysis Vol.: 9 No.: 4 ISSN: 0272-4979

ID: CaltechAUTHORS:20170612-083455306

]]>

Abstract: A unified analysis of reaction-diffusion equations and their finite difference representations is presented. The parallel treatment of the two problems shows clearly when and why the finite difference approximations break down. The approach used provides a general framework for the analysis and interpretation of numerical instability in approximations of dissipative nonlinear partial differential equations. Continuous and discrete problems are studied from the perspective of bifurcation theory, and numerical instability is shown to be associated with the bifurcation of periodic orbits in discrete systems. An asymptotic approach, due to Newell (SIAM J. Appl. Math., 33 (1977), 133–160), is used to investigate the instability phenomenon further. In particular, equations are derived that describe the interaction of the dynamics of the partial differential equation with the artefacts of the discretization.

Publication: SIAM Review Vol.: 31 No.: 2 ISSN: 0036-1445

ID: CaltechAUTHORS:20170613-075253765

]]>

Abstract: A model of time-dependent porous-medium combustion is presented. The model is of combustion in a three-dimensional porous medium. The typical situation envisaged is the combustion of a non-deforming porous solid medium through which a gas such as air passes. The model represents conservation of mass and energy for both the gas and solid species, whilst the fluid flow is governed by Darcy's law and the ideal-gas law. This model is highly complex and requires sophisticated computer analysis. Consequently we derive a simplified model as a one-dimensional version of the equations, by a number of asymptotic considerations. Central to the analysis is the concept of the large-activation-energy limit. This limit is shown to have entirely different features from those which arise in conventional flame theory. This fact is a consequence of the two-stage reaction rate governing porous-medium combustion; the stages are first the diffusion of gas components between the gas mainstream and the reaction sites in the solid and secondly the conventional Arrhenius reaction. Thus the overall reaction rate is not proportional to the Arrhenius reaction rate, but is a rational function of it. Because of this two-stage reaction rate, the limit E→∞ has a novel result not encountered in conventional flame theory. A critical switching temperature T_c, determined by A = exp (E/T_c), where A is the pre-exponential factor in the Arrhenius reaction term, arises naturally from the large-activation-energy analysis. For temperatures beneath T_c the reaction rate is negligible whereas for temperatures above T_c the reaction is controlled by the ability of the active gas components to diffuse into or out of the reaction sites in the solid. This rate of active gas-component diffusion has been shown experimentally to be proportional to a power (approximately the square) of the gas temperature. 
Thus, when switched on, the rate-limiting reaction rate grows algebraically with the temperature, in contrast to the explosive exponential growth of the Arrhenius term which governs the switching process.

Publication: Quarterly Journal of Mechanics and Applied Mathematics Vol.: 42 No.: 1 ISSN: 0033-5614

ID: CaltechAUTHORS:20170613-090547094

]]>

Abstract: A constructive method applicable to the solution of a wide class of free boundary problems is presented. A solution-dependent transformation technique is introduced. By considering a singular limit of the transformation, a related problem, to which local bifurcation theory may be applied, is derived. By inverting the (near singular) mapping between the two problems, an expression for solutions of the original problem is obtained. The method is illustrated by the study of a singularly perturbed elliptic equation. Approximate solutions are constructed and the validity of the approximations established by means of the Contraction Mapping Theorem.

Publication: SIAM Journal on Applied Mathematics Vol.: 49 No.: 1 ISSN: 0036-1399

ID: CaltechAUTHORS:20170613-080915333

]]>

Abstract: We describe an instability introduced by the spatial discretization of reaction-diffusion equations. The mechanism is a nonlinear interaction between high and low wave-number modes in the discrete equations. In partial differential equations which exhibit strong temporal growth, a parasitic high-wave-number mode is stimulated, through aliasing, by a physically meaningful low-wave-number mode. We analyse the interaction using phase-plane techniques and present complementary numerical results.

Publication: IMA Journal of Applied Mathematics Vol.: 42 No.: 1 ISSN: 0272-4960

ID: CaltechAUTHORS:20170613-135127282

]]>

Abstract: A reaction-diffusion equation, coupled through variable heat capacity and source term to a temporally evolving ordinary differential equation, is examined. The model is a prototype for the study of combustion processes where the heat capacity of a composite solid medium changes significantly as the reactant within the medium is consumed. Similarity solutions are sought by analysing the invariance of the equations to various stretching groups. The resulting two-point boundary-value problem is singular at the origin and posed on the semi-infinite domain. By employing series expansion techniques we derive a regular problem posed on a finite domain. This problem is amenable to standard numerical solution by means of Newton-Kantorovich iteration. Results of the computations are presented and interpreted in terms of the governing partial differential equation.

Publication: IMA Journal of Applied Mathematics Vol.: 40 No.: 3 ISSN: 0272-4960

ID: CaltechAUTHORS:20170613-090149803

]]>

Abstract: The linear stability properties of the travelling combustion waves found in Part I are examined. The key parameters which determine the stability properties of the waves are found to be the (scaled) driving velocity and the solid specific heat. In particular, the destabilising influence of increasing either of these two parameters is demonstrated. The results indicate that travelling combustion waves whose reaction is turned off because the solid temperature becomes too low are always unstable, whereas travelling waves whose reaction is turned off due to depletion of solid reactant can be stable. Global techniques are employed to prove that, for large enough values of the scaled solid specific heat, combustion cannot be sustained in any form, and all initial conditions lead to extinction.

Publication: SIAM Journal on Applied Mathematics Vol.: 48 No.: 2 ISSN: 0036-1399

ID: CaltechAUTHORS:20170612-164819129

]]>

Abstract: A one-space-dimensional, time-dependent model for travelling combustion waves in a porous medium is analysed. The key variables are the temperature of the solid medium and its density and the temperature of the gaseous phase and its density. The key parameters µ, λ and a are related (respectively) to the driving gas velocity, the specific heat of the combustible solid and the ratio of consumption of oxygen to that of solid. The regions of existence of the different types of combustion waves are found in µ, λ parameter space, with a = 0. The types of combustion wave are classified by the switch mechanism that turns off the combustion, which occurs over a finite, but unknown, interval. Because the model is linear outside the combustion zone, the eigenvalue problem governing the existence of travelling waves may be reformulated as a two-point free boundary problem on a finite domain. Existence and nonexistence theorems are established for this unusual bifurcation problem.

Publication: SIAM Journal on Applied Mathematics Vol.: 48 No.: 1 ISSN: 0036-1399

ID: CaltechAUTHORS:20170612-165322289

]]>

Abstract: A parabolic partial differential equation approximating the evolution of temperature in highly exothermic porous-medium combustion at low driving velocities is examined. The equation is of reaction-diffusion type with a reaction term which is discontinuous as a function of the dependent variable. Firstly the variation of the steady solution set with the scaled heat of the reaction is described and the related time-dependent behaviour analysed. The stability results follow from characterizing the ends of the solution branch and fold points explicitly and deducing global stability results about the whole of the continuous solution branch. The results are used to indicate the parameter regimes and temporal scales on which the small driving velocity approximation ceases to be valid. Secondly the behaviour of the discontinuous partial differential equation is compared with that of a continuous equation which it approximates. This provides justification for the approximation of reaction terms possessing steep gradients by discontinuous functions; the large activation energy limit in porous-medium combustion involves such a process.

Publication: IMA Journal of Applied Mathematics Vol.: 39 No.: 3 ISSN: 0272-4960

ID: CaltechAUTHORS:20170613-084535886

]]>

Abstract: The existence of solutions of a two-point free-boundary problem arising from the theory of travelling combustion waves in a porous medium is examined. The problem comprises a third-order nonlinear ordinary differential equation posed on an unknown interval of finite length; four boundary conditions are given, two at either end of the interval. The equations possess a trivial solution for all values of the bifurcation parameter λ. A shooting technique is employed to prove the existence of a nontrivial solution for 0 < λ < λ_c, and nonexistence theorems are proved for λ ∉ (0, λ_c).

Publication: IMA Journal of Applied Mathematics Vol.: 38 No.: 1 ISSN: 0272-4960

ID: CaltechAUTHORS:20170612-142648576

]]>

Abstract: We present a generalisation of the continuous Gronwall inequality and show its use in bounding solutions of discrete inequalities of a form that arise when analysing the convergence of product integration methods for Volterra integral equations. We then use these ideas to prove convergence of a numerical method which is effective in approximating Volterra integral equations of the second kind with weakly singular kernels.
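For orientation, the classical continuous inequality and the simplest discrete analogue of the type being generalised are (the paper's version additionally handles the weights arising from weakly singular kernels):

```latex
% continuous Gronwall inequality
u(t) \le a + \int_0^t b(s)\,u(s)\,ds \ \ (b \ge 0)
\;\Longrightarrow\;
u(t) \le a \exp\!\Big(\int_0^t b(s)\,ds\Big);

% simplest discrete analogue
u_n \le a + \sum_{k=0}^{n-1} b_k u_k \ \ (b_k \ge 0)
\;\Longrightarrow\;
u_n \le a \prod_{k=0}^{n-1} (1 + b_k) \le a \exp\!\Big(\sum_{k=0}^{n-1} b_k\Big).
```

Bounds of this discrete form are what control the accumulation of quadrature errors in the product integration methods analysed.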

Publication: Proceedings of the Royal Society of Edinburgh: Section A Mathematics Vol.: 106 No.: 3-4 ISSN: 0308-2105

ID: CaltechAUTHORS:20170612-131130001

]]>

Abstract: We consider nonlinear singular Volterra integral equations of the second kind. We generalise the transformation method introduced in Part I of this paper [6] to cope with both the nonlinearity and slightly more general singular kernels. We also consider a particular class of nonlinear equation for which the solution behaviour is known. Using this a priori knowledge, we propose a modification of the transformation technique which results in a numerical method with good asymptotic stability properties. Applying the general theory of Part I of this paper, we prove convergence of this scheme.

Publication: Proceedings of the Royal Society of Edinburgh: Section A Mathematics Vol.: 106 No.: 3-4 ISSN: 0308-2105

ID: CaltechAUTHORS:20170612-135043115

]]>