CaltechAUTHORS: Monograph
https://feeds.library.caltech.edu/people/Burdick-J-W/monograph.rss
A Caltech Library Repository Feed
Wed, 26 Jun 2024 12:51:55 -0700

Simultaneous Model Identification and Task Satisfaction in the Presence of Temporal Logic Constraints
https://resolver.caltech.edu/CaltechCDSTR:2016.002
Year: 2016
Recent proliferation of cyber-physical systems, ranging from autonomous cars to nuclear hazard inspection robots, has exposed several challenging research problems on automated fault detection and recovery. This paper considers how recently developed formal synthesis and model verification techniques may be used to automatically generate information-seeking trajectories for anomaly detection. In particular, we consider the problem of how a robot could select its actions so as to maximally disambiguate between different model hypotheses that govern the environment it operates in or its interaction with other agents whose prime motivation is a priori unknown. The identification problem is posed as selection of the most likely model from a set of candidates, where each candidate is an adversarial Markov decision process (MDP) together with a linear temporal logic (LTL) formula that constrains robot-environment interaction. An adversarial MDP is an MDP in which transitions depend on both a (controlled) robot action and an (uncontrolled) adversary action. States are labeled, thus allowing interpretation of satisfaction of LTL formulae, which have a special form admitting satisfaction decisions in bounded time. An example where a robotic car must discern whether neighboring vehicles are following its trajectory for a surveillance operation is used to demonstrate our approach.

Modeling Motor Responses of Paraplegics under Epidural Spinal Cord Stimulation: Computational Modeling Technical Report
https://resolver.caltech.edu/CaltechAUTHORS:20170417-152552927
Year: 2017
DOI: 10.7907/Z9N58JDV
This technical report describes the computational model in our paper, Modeling Motor Responses of Paraplegic Patients under Epidural Spinal Cord Stimulation, appearing at the 2017 IEEE EMBS Conference on Neural Engineering. This model is the basis of our human spinal cord simulations.
The study utilizes the finite element method [1] to simulate the electrical activity within the spinal cord and nearby tissues; this technique numerically solves a system of partial differential equations over a specified geometry. Simulations were performed via COMSOL Multiphysics®, version 5.1. The following sections detail our modeling procedure, including the spinal cord geometry, equations solved over this domain, material properties, and finite element analysis.

Correlational Dueling Bandits with Application to Clinical Treatment in Large Decision Spaces
https://resolver.caltech.edu/CaltechAUTHORS:20190205-133559444
Year: 2019
DOI: 10.48550/arXiv.1707.02375
We consider sequential decision making under uncertainty, where the goal is to optimize over a large decision space using noisy comparative feedback. This problem can be formulated as a K-armed Dueling Bandits problem where K is the total number of decisions. When K is very large, existing dueling bandits algorithms suffer huge cumulative regret before converging on the optimal arm. This paper studies the dueling bandits problem with a large number of arms that exhibit a low-dimensional correlation structure. Our problem is motivated by a clinical decision making process in a large decision space. We propose an efficient algorithm, CorrDuel, which optimizes the exploration/exploitation tradeoff in this large decision space of clinical treatments. More broadly, our approach can be applied to other sequential decision problems with large and structured decision spaces. We derive regret bounds, and evaluate performance in simulation experiments as well as on a live clinical trial of therapeutic spinal cord stimulation. To our knowledge, this marks the first time an online learning algorithm was applied towards spinal cord injury treatments. Our experimental results show the effectiveness and efficiency of our approach.

Tools and Algorithms for Sampling in Extreme Terrains
https://resolver.caltech.edu/CaltechAUTHORS:20190213-142207704
Year: 2019
DOI: 10.26206/KZK9-WM63
Extreme-terrain robots such as JPL's Axel rover are enabling access to new and exciting science opportunities. The goal of this mini-program was to develop a compact sampling instrument for Axel. Over the summer of 2012, a small group of students designed, built, and tested prototype sampling devices. Nikola Georgiev created a versatile four-degree-of-freedom scoop, which can acquire up to 4 different samples in clean self-sealing containers. Hima Hassenruck-Gudipati studied percussive scooping, and prototyped a percussive scoop that takes advantage of Axel's independent body rotation to acquire samples. Kristen Holtz and Yifei Huang collaborated on a pneumatic sampling system, which uses a puff of air to propel loose grains into flexible tubing, and separates the grains into an interchangeable sample container. Each of these sampling systems has been demonstrated, and each proved useful for different conditions. In turn, the students gained valuable design experience and the opportunity to work alongside a number of experts in various fields.

Multi-dueling Bandits with Dependent Arms
https://resolver.caltech.edu/CaltechAUTHORS:20190410-120658254
Year: 2019
DOI: 10.48550/arXiv.1705.00253
The dueling bandits problem is an online learning framework for learning from pairwise preference feedback, and is particularly well-suited for modeling settings that elicit subjective or implicit human feedback. In this paper, we study the problem of multi-dueling bandits with dependent arms, which extends the original dueling bandits setting by simultaneously dueling multiple arms as well as modeling dependencies between arms. These extensions capture key characteristics found in many real-world applications, and allow for the opportunity to develop significantly more efficient algorithms than were possible in the original setting. We propose the SelfSparring algorithm, which reduces the multi-dueling bandits problem to a conventional bandit setting that can be solved using a stochastic bandit algorithm such as Thompson Sampling, and can naturally model dependencies using a Gaussian process prior. We present a no-regret analysis for the multi-dueling setting, and demonstrate the effectiveness of our algorithm empirically on a wide range of simulation settings.

Convex Model Predictive Control for Vehicular Systems
https://resolver.caltech.edu/CaltechAUTHORS:20190410-120644197
Year: 2019
DOI: 10.48550/arXiv.1410.2792
In this work, we present a method to perform Model Predictive Control (MPC) over systems whose state is an element of SO(n) for n=2,3. This is done without charts or any local linearization, and instead is performed by operating over the orbitope of rotation matrices. This results in a novel MPC scheme without the drawbacks associated with conventional linearization techniques. Instead, second-order cone or semidefinite constraints on state variables are the only requirement beyond those of a QP scheme typical for MPC of linear systems. Of particular emphasis is the application to aeronautical and vehicular systems, wherein the method removes many of the transcendental trigonometric terms associated with these systems' state space equations. Furthermore, the method is shown to be compatible with many existing variants of MPC, including obstacle avoidance via Mixed Integer Linear Programming (MILP).

Bellman Gradient Iteration for Inverse Reinforcement Learning
https://resolver.caltech.edu/CaltechAUTHORS:20190410-120640737
Year: 2019
DOI: 10.48550/arXiv.1707.07767
This paper develops an inverse reinforcement learning algorithm aimed at recovering a reward function from the observed actions of an agent. We introduce a strategy to flexibly handle different types of actions with two approximations of the Bellman Optimality Equation, and a Bellman Gradient Iteration method to compute the gradient of the Q-value with respect to the reward function. These methods allow us to build a differentiable relation between the Q-value and the reward function and learn an approximately optimal reward function with gradient methods. We test the proposed method in two simulated environments by evaluating the accuracy of different approximations and comparing the proposed method with existing solutions. The results show that even with a linear reward function, the proposed method achieves accuracy comparable to that of the state-of-the-art method adopting a non-linear reward function, and the proposed method is more flexible because it is defined on observed actions instead of trajectories.

A Function Approximation Method for Model-based High-Dimensional Inverse Reinforcement Learning
https://resolver.caltech.edu/CaltechAUTHORS:20190410-120630278
Year: 2019
DOI: 10.48550/arXiv.1708.07738
This work handles the inverse reinforcement learning problem in high-dimensional state spaces, which relies on an efficient solution of model-based high-dimensional reinforcement learning problems. To solve the computationally expensive reinforcement learning problems, we propose a function approximation method to ensure that the Bellman Optimality Equation always holds, and then estimate a function based on the observed human actions for inverse reinforcement learning problems. The time complexity of the proposed method is linearly proportional to the cardinality of the action set, thus it can handle high-dimensional, even continuous, state spaces efficiently. We test the proposed method in a simulated environment to show its accuracy, and three clinical tasks to show how it can be used to evaluate a doctor's proficiency.

Online Inverse Reinforcement Learning via Bellman Gradient Iteration
https://resolver.caltech.edu/CaltechAUTHORS:20190410-120637140
Year: 2019
DOI: 10.48550/arXiv.1707.09393
This paper develops an online inverse reinforcement learning algorithm aimed at efficiently recovering a reward function from ongoing observations of an agent's actions. To reduce the computation time and storage space in reward estimation, this work assumes that each observed action implies a change of the Q-value distribution, and relates the change to the reward function via the gradient of the Q-value with respect to the reward function parameter. The gradients are computed with a novel Bellman Gradient Iteration method that allows the reward function to be updated whenever a new observation is available. The method's convergence to a local optimum is proved.
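The idea of propagating a Q-value gradient through a Bellman-style recursion can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small tabular MDP, a reward that is linear in hypothetical state features (`features @ theta`), and a log-sum-exp smoothing of the max operator with sharpness `k`, none of which is specified in the abstract. The function name and all parameter names are illustrative only.

```python
import numpy as np

def soft_q_and_gradient(P, features, theta, gamma=0.9, k=5.0, iters=500):
    """Jointly iterate a smoothed Bellman backup and its gradient.

    P        : (A, S, S) array, P[a, s, t] = Pr(next state t | state s, action a)
    features : (S, D) array of state features (assumed; reward = features @ theta)
    Returns Q with shape (S, A) and dQ/dtheta with shape (S, A, D).
    """
    A, S, _ = P.shape
    D = features.shape[1]
    r = features @ theta                      # per-state reward, shape (S,)
    Q = np.zeros((S, A))
    dQ = np.zeros((S, A, D))
    for _ in range(iters):
        # Smooth stand-in for max_a Q(s, a): V(s) = (1/k) log sum_a exp(k Q(s, a))
        m = Q.max(axis=1, keepdims=True)      # subtract the max for numerical stability
        w = np.exp(k * (Q - m))
        V = m[:, 0] + np.log(w.sum(axis=1)) / k
        w /= w.sum(axis=1, keepdims=True)     # softmax weights = dV/dQ
        dV = np.einsum("sa,sad->sd", w, dQ)   # chain rule: dV/dtheta
        # Bellman backup for the values and for their gradients
        Q = r[:, None] + gamma * np.einsum("ast,t->sa", P, V)
        dQ = features[:, None, :] + gamma * np.einsum("ast,td->sad", P, dV)
    return Q, dQ
```

Because the gradient is updated alongside the value in the same loop, `dQ` is the exact derivative of the returned `Q` for any iteration count, so a gradient-based reward update could reuse it directly whenever a new observation arrives.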
This work tests the proposed method in two simulated environments, and evaluates the algorithm's performance under a linear reward function and a non-linear reward function. The results show that the proposed algorithm requires only limited computation time and storage space, yet achieves increasing accuracy as the number of observations grows. We also present a potential application to robot cleaners at home.

Inverse Reinforcement Learning in Large State Spaces via Function Approximation
https://resolver.caltech.edu/CaltechAUTHORS:20190410-120633713
Year: 2019
DOI: 10.48550/arXiv.1707.09394
This paper introduces a new method for inverse reinforcement learning in large-scale and high-dimensional state spaces. To avoid solving the computationally expensive reinforcement learning problems in reward learning, we propose a function approximation method to ensure that the Bellman Optimality Equation always holds, and then estimate a function to maximize the likelihood of the observed motion. The time complexity of the proposed method is linearly proportional to the cardinality of the action set, thus it can handle large state spaces efficiently. We test the proposed method in a simulated environment, and show that it is more accurate than existing methods and scales significantly better. We also show that the proposed method can extend many existing methods to high-dimensional state spaces. We then apply the method to evaluating the effect of rehabilitative stimulations on patients with spinal cord injuries based on the observed patient motions.

Stochastic Finite State Control of POMDPs with LTL Specifications
https://resolver.caltech.edu/CaltechAUTHORS:20200527-082959993
Year: 2020
DOI: 10.48550/arXiv.2001.07679
Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g. robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such that the probability of satisfying a set of high-level specifications in terms of linear temporal logic (LTL) formulae is maximized. We begin by casting the latter problem into an optimization and use relaxations based on the Poisson equation and McCormick envelopes. Then, we propose a stochastic bounded policy iteration algorithm, leading to a controlled growth in sFSC size and an anytime algorithm, in which the performance of the controller improves with successive iterations and which can be stopped by the user based on time or memory considerations. We illustrate the proposed method by a robot navigation case study.

Space Science Opportunities Augmented by Exploration Telepresence
https://resolver.caltech.edu/CaltechAUTHORS:20200609-101330976
Year: 2020
DOI: 10.7907/w6ma-a113
Since the end of the Apollo missions to the lunar surface in December 1972, humanity has exclusively conducted scientific studies on distant planetary surfaces using teleprogrammed robots. Operations and science return for all of these missions are constrained by two issues related to the great distances between terrestrial scientists and their exploration targets: high communication latencies and limited data bandwidth.
Despite the proven successes of in-situ science being conducted using teleprogrammed robotic assets such as Spirit, Opportunity, and Curiosity rovers on the surface of Mars, future planetary field research may substantially overcome latency and bandwidth constraints by employing a variety of alternative strategies that could involve: 1) placing scientists/astronauts directly on planetary surfaces, as was done in the Apollo era; 2) developing fully autonomous robotic systems capable of conducting in-situ field science research; or 3) teleoperation of robotic assets by humans sufficiently proximal to the exploration targets to drastically reduce latencies and significantly increase bandwidth, thereby achieving effective human telepresence.
This third strategy has been the focus of experts in telerobotics, telepresence, planetary science, and human spaceflight during two workshops held from October 3–7, 2016, and July 7–13, 2017, at the Keck Institute for Space Studies (KISS). Based on findings from these workshops, this document describes the conceptual and practical foundations of low-latency telepresence (LLT), opportunities for using derivative approaches for scientific exploration of planetary surfaces, and circumstances under which employing telepresence would be especially productive for planetary science. An important finding of these workshops is the conclusion that there has been limited study of the advantages of planetary science via LLT. A major recommendation from these workshops is that space agencies such as NASA should substantially increase science return with greater investments in this promising strategy for human conduct at distant exploration sites.

Revolutionizing Access to the Mars Surface
https://resolver.caltech.edu/CaltechAUTHORS:20220222-212156808
Year: 2022
DOI: 10.7907/d1sm-mj77
Under the auspices of the Keck Institute for Space Studies at Caltech, we convened a group of Mars scientists and engineers, representing multiple academic institutions, NASA Centers, and commercial companies in March 2021 to address the challenge of revolutionizing access to the Mars surface. Following a 3-month summer study period where working groups addressed specific programmatic, cultural, and engineering challenges, we then convened a second workshop in September 2021. The report that follows represents the product of the discussions.
We first outline why Mars, and particularly landed access to Mars, is a priority over the next decades of scientific exploration (Section 1). We then review the challenges to achieving increased numbers of missions accessing the Mars surface (Section 2) as well as near-term trends in the space sector that provide means to overcome these challenges (Section 3). We then describe the intertwined elements of the Frequent, Affordable, Bold (FAB) strategy to achieve access (Section 4). Finally, we discuss the means by which a FAB strategy can be implemented, developing similar sets of science mission types and employing a services model that fosters increasing commercial capabilities over a set of the mission types (Section 5).

Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification
https://resolver.caltech.edu/CaltechAUTHORS:20220714-194303859
Year: 2022
DOI: 10.48550/arXiv.2204.09833
The dramatic increase of autonomous systems subject to variable environments has given rise to the pressing need to consider risk in both the synthesis and verification of policies for these systems. This paper aims to address a few problems regarding risk-aware verification and policy synthesis, by first developing a sample-based method to bound the risk measure evaluation of a random variable whose distribution is unknown. These bounds permit us to generate high-confidence verification statements for a large class of robotic systems. Second, we develop a sample-based method to determine solutions to non-convex optimization problems that outperform a large fraction of the decision space of possible solutions. Both sample-based approaches then permit us to rapidly synthesize risk-aware policies that are guaranteed to achieve a minimum level of system performance. To showcase our approach in simulation, we verify a cooperative multi-agent system and develop a risk-aware controller that outperforms the system's baseline controller. We also mention how our approach can be extended to account for any g-entropic risk measure, the subset of coherent risk measures on which we focus.

Adaptive Conformal Prediction for Motion Planning among Dynamic Agents
https://resolver.caltech.edu/CaltechAUTHORS:20221219-234025206
Year: 2022
DOI: 10.48550/arXiv.2212.00278
This paper proposes an algorithm for motion planning among dynamic agents using adaptive conformal prediction. We consider a deterministic control system and use trajectory predictors to predict the dynamic agents' future motion, which is assumed to follow an unknown distribution. We then leverage ideas from adaptive conformal prediction to dynamically quantify prediction uncertainty from an online data stream. In particular, we provide an online algorithm that uses delayed agent observations to obtain uncertainty sets for multistep-ahead predictions with probabilistic coverage. These uncertainty sets are used within a model predictive controller to safely navigate among dynamic agents. While most existing data-driven prediction approaches quantify prediction uncertainty heuristically, we quantify the true prediction uncertainty in a distribution-free, adaptive manner that even captures changes in prediction quality and the agents' motion. We empirically evaluate our algorithm in a simulation case study where a drone avoids a flying frisbee.

Learning Disturbances Online for Risk-Aware Control: Risk-Aware Flight with Less Than One Minute of Data
https://resolver.caltech.edu/CaltechAUTHORS:20221219-234112304
Year: 2022
DOI: 10.48550/arXiv.2212.06253
Recent advances in safety-critical risk-aware control are predicated on a priori knowledge of the disturbances a system might face. This paper proposes a method to efficiently learn these disturbances online, in a risk-aware context. First, we introduce the concept of a Surface-at-Risk, a risk measure for stochastic processes that extends Value-at-Risk, a commonly utilized risk measure in the risk-aware controls community. Second, we model the norm of the state discrepancy between the model and the true system evolution as a scalar-valued stochastic process and determine an upper bound to its Surface-at-Risk via Gaussian Process Regression. Third, we provide theoretical results on the accuracy of our fitted surface subject to mild assumptions that are verifiable with respect to the data sets collected during system operation. Finally, we experimentally verify our procedure by augmenting a drone's controller and highlight performance increases achieved via our risk-aware approach after collecting less than a minute of operating data.

STEP: Stochastic Traversability Evaluation and Planning for Risk-Aware Off-road Navigation; Results from the DARPA Subterranean Challenge
https://resolver.caltech.edu/CaltechAUTHORS:20230316-204056145
Year: 2023
Although autonomy has gained widespread usage in structured and controlled environments, robotic autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, rubble, and other post-disaster sites pose unique and challenging problems for autonomous navigation. Based on our participation in the DARPA Subterranean Challenge, we propose an approach to improve autonomous traversal of robots in subterranean environments that are perceptually degraded and completely unknown through a traversability and planning framework called STEP (Stochastic Traversability Evaluation and Planning). We present 1) rapid uncertainty-aware mapping and traversability evaluation, 2) tail risk assessment using the Conditional Value-at-Risk (CVaR), 3) efficient risk and constraint-aware kinodynamic motion planning using sequential quadratic programming-based (SQP) model predictive control (MPC), 4) fast recovery behaviors to account for unexpected scenarios that may cause failure, and 5) risk-based gait adaptation for quadrupedal robots. We illustrate and validate extensive results from our experiments on wheeled and legged robotic platforms in field studies at the Valentine Cave, CA (cave environment), Kentucky Underground, KY (mine environment), and Louisville Mega Cavern, KY (final competition site for the DARPA Subterranean Challenge with tunnel, urban, and cave environments).

An Active Learning Based Robot Kinematic Calibration Framework Using Gaussian Processes
https://resolver.caltech.edu/CaltechAUTHORS:20230316-204100500
Year: 2023
Future NASA lander missions to icy moons will require completely automated, accurate, and data-efficient calibration methods for the robot manipulator arms that sample icy terrains in the lander's vicinity. To support this need, this paper presents a Gaussian Process (GP) approach to the classical manipulator kinematic calibration process. Instead of identifying a corrected set of Denavit-Hartenberg kinematic parameters, a set of GPs models the residual kinematic error of the arm over the workspace. More importantly, this modeling framework allows a Gaussian Process Upper Confidence Bound (GP-UCB) algorithm to efficiently and adaptively select the calibration's measurement points so as to minimize the number of experiments, and therefore minimize the time needed for recalibration. The method is demonstrated in simulation on a simple 2-DOF arm, a 6-DOF arm whose geometry is a candidate for a future NASA mission, and a 7-DOF Barrett WAM arm.

FRoGGeR: Fast Robust Grasp Generation via the Min-Weight Metric
https://resolver.caltech.edu/CaltechAUTHORS:20230316-204052735
Year: 2023
Many approaches to grasp synthesis optimize analytic quality metrics that measure grasp robustness based on finger placements and local surface geometry. However, generating feasible dexterous grasps by optimizing these metrics is slow, often taking minutes. To address this issue, this paper presents FRoGGeR: a method that quickly generates robust precision grasps using the min-weight metric, a novel, almost-everywhere differentiable approximation of the classical epsilon grasp metric. The min-weight metric is simple and interpretable, provides a reasonable measure of grasp robustness, and admits numerically efficient gradients for smooth optimization. We leverage these properties to rapidly synthesize collision-free robust grasps, typically in less than a second. FRoGGeR can refine the candidate grasps generated by other methods (heuristic, data-driven, etc.) and is compatible with many object representations (SDFs, meshes, etc.). We study FRoGGeR's performance on over 40 objects drawn from the YCB dataset, outperforming a competitive baseline in computation time, feasibility rate of grasp synthesis, and picking success in simulation. We conclude that FRoGGeR is fast: it has a median synthesis time of 0.834s over hundreds of experiments.
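
For context on the classical epsilon grasp metric mentioned above, the following is a minimal sketch of how that metric is commonly evaluated: as the radius of the largest origin-centered ball contained in the convex hull of a grasp's primitive wrenches. It is not the min-weight metric and not FRoGGeR's code. It assumes 3-D wrenches (as in a planar grasp: two force components and one torque) and uses brute-force facet enumeration that is practical only for a handful of points; the function name and tolerance are illustrative.

```python
from itertools import combinations

import numpy as np

def epsilon_metric(wrenches, tol=1e-9):
    """Largest origin-centered ball radius inside the hull of 3-D wrench points.

    Returns 0.0 when the origin is not strictly inside the convex hull
    or when the points do not span a volume.
    """
    W = np.asarray(wrenches, dtype=float)
    eps = np.inf
    for i, j, k in combinations(range(len(W)), 3):
        n = np.cross(W[j] - W[i], W[k] - W[i])
        norm = np.linalg.norm(n)
        if norm < tol:
            continue                      # collinear triple defines no plane
        n = n / norm
        d = n @ W[i]                      # plane equation: n . x = d
        s = W @ n - d                     # signed distance of every point to the plane
        below = np.all(s <= tol)
        above = np.all(s >= -tol)
        if below and above:
            return 0.0                    # all points coplanar: hull has no interior
        if below:                         # supporting plane with outward normal n
            eps = min(eps, d)
        elif above:                       # supporting plane with outward normal -n
            eps = min(eps, -d)
    # A negative facet distance means the origin lies outside the hull.
    return max(0.0, eps) if np.isfinite(eps) else 0.0
```

The minimum over facets is piecewise and non-smooth in the contact locations, which is one way to see why a differentiable approximation, as the abstract describes, is attractive for gradient-based grasp optimization.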