Book Section records
https://feeds.library.caltech.edu/people/Mazumdar-Eric/book_section.rss
A Caltech Library Repository Feed
Format: RSS 2.0 (http://www.rssboard.org/rss-specification)
Generator: python-feedgen
Language: en
Last build date: Fri, 08 Dec 2023 12:26:14 +0000

Understanding the impact of parking on urban mobility via routing games on queue-flow networks
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222216008
Authors: Calderone, Daniel; Mazumdar, Eric; Ratliff, Lillian J.; Sastry, S. Shankar
Year: 2016
DOI: 10.1109/CDC.2016.7799444
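An illustrative aside on this record's key structural claim: the queue-routing game is shown to be a potential game, so simple best-response dynamics converge to a Nash equilibrium. The toy below is a minimal sketch of that idea for a two-route congestion game; all latency coefficients and driver counts are invented for illustration and do not come from the paper.

```python
# Minimal congestion-game sketch (illustrative only, not the paper's model).
# Two routes with linear latencies; best-response dynamics converge to a
# Nash equilibrium because the game admits an exact (Rosenthal) potential.

def latency(route, load, coeffs):
    return coeffs[route] * load

def potential(loads, coeffs):
    # Rosenthal potential: sum over routes of a_r * (1 + 2 + ... + n_r);
    # every profitable unilateral switch strictly decreases this quantity.
    return sum(coeffs[r] * loads[r] * (loads[r] + 1) / 2 for r in range(len(loads)))

def best_response_dynamics(n_drivers=30, coeffs=(1.0, 2.0)):
    routes = [0] * n_drivers          # everyone starts on route 0
    loads = [n_drivers, 0]
    changed = True
    while changed:
        changed = False
        for i in range(n_drivers):
            r, other = routes[i], 1 - routes[i]
            stay = latency(r, loads[r], coeffs)
            move = latency(other, loads[other] + 1, coeffs)
            if move < stay:           # strictly profitable deviation
                loads[r] -= 1
                loads[other] += 1
                routes[i] = other
                changed = True
    return loads

loads = best_response_dynamics()
print(loads)  # [20, 10]: the unique Nash split for these latencies
```

Because the potential decreases with every switch and can only take finitely many values, the loop is guaranteed to terminate at an equilibrium.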
We derive a new routing game model for urban centers that accounts for parking-related traffic alongside all other traffic. In particular, we combine a queuing game model for on-street parking with a classical routing game to create a queue-routing game in which parking traffic selects a parking zone (block-face) in addition to a route through the network. We show that this game is a potential game. We construct practical examples using subsections of the Seattle downtown area to illustrate the usefulness of this modeling paradigm and to examine how parking traffic can impact overall congestion and the route choices of other drivers. By varying the cost of parking in different parking zones, we demonstrate that parking-related traffic can be adjusted to satisfy a particular objective.
https://authors.library.caltech.edu/records/tcken-s8w33

To observe or not to observe: Queuing game framework for urban parking
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215263
Authors: Ratliff, Lillian J.; Dowling, Chase; Mazumdar, Eric; Zhang, Baosen
Year: 2016
DOI: 10.1109/CDC.2016.7799079
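A hedged toy version of this record's observe-versus-not-observe contrast: two parallel queues, where informed drivers join the shorter queue and uninformed drivers pick blindly. All parameters (arrival and service probabilities, queue count) are invented for illustration and are not taken from the paper or its Seattle data.

```python
# Toy parallel-queue sketch of the observe / don't-observe contrast
# (illustrative assumptions: Bernoulli arrivals, geometric service times,
# two queues; none of these parameters come from the paper).
import random

def mean_wait(observe, steps=20_000, seed=1):
    rng = random.Random(seed)
    queues = [0, 0]
    total_ahead = arrivals = 0
    for _ in range(steps):
        for i in range(2):            # each server finishes w.p. 0.5
            if queues[i] > 0 and rng.random() < 0.5:
                queues[i] -= 1
        if rng.random() < 0.8:        # a driver arrives
            if observe:               # sees both queue lengths
                q = 0 if queues[0] <= queues[1] else 1
            else:                     # picks a queue blindly
                q = rng.randrange(2)
            total_ahead += queues[q]  # cars already waiting ahead
            queues[q] += 1
            arrivals += 1
    return total_ahead / arrivals

informed, blind = mean_wait(True), mean_wait(False)
# Observing queue lengths tends to reduce the wait behind other cars.
print(f"informed: {informed:.2f}, blind: {blind:.2f}")
```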
We model parking in urban centers as a set of parallel queues and overlay a game-theoretic structure. We model arriving drivers as utility maximizers and consider two games: one in which it is free to observe the queue length and one in which it is not. Not only do we compare the Nash-induced welfare to the socially optimal welfare, confirming the usual result that Nash is worse for society, we also show that by other performance metrics more commonly used in transportation, such as occupancy and time spent circling, the Nash solution is suboptimal. We find that gains to welfare do not require everyone to observe. Through simulation, we explore a more complex scenario in which drivers decide, based on the queueing game, whether or not to enter a collection of queues over a network. Our simulated models use parameters informed by real-world data collected by the Seattle Department of Transportation.
https://authors.library.caltech.edu/records/5jg1y-yaw95

Gradient-based inverse risk-sensitive reinforcement learning
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215940
Authors: Mazumdar, Eric; Ratliff, Lillian J.; Fiez, Tanner; Sastry, S. Shankar
Year: 2017
DOI: 10.1109/CDC.2017.8264535
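A deliberately tiny sketch of the gradient-based idea behind this record: recover a reward parameter from observed behavior by gradient descent on a loss over demonstrations. This toy uses a plain Boltzmann-rational choice model over two actions, not the paper's risk-sensitive model, and the numbers are invented.

```python
# Minimal gradient-based inverse-RL sketch: recover the reward gap between
# two actions from observed choice frequencies, assuming a Boltzmann-rational
# agent. (A stand-in for the paper's risk-sensitive setup, not the real thing.)
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

observed_p = 0.8        # fraction of demonstrations choosing action A
r = 0.0                 # estimated reward gap r(A) - r(B)
for _ in range(500):
    # gradient of the negative log-likelihood w.r.t. r is sigmoid(r) - observed_p
    r -= 0.5 * (sigmoid(r) - observed_p)

print(round(r, 3))      # converges to log(0.8 / 0.2) ≈ 1.386
```

The fixed point satisfies sigmoid(r) = observed_p, i.e. the learned reward gap exactly rationalizes the observed choice frequency.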
We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive. In particular, we model risk-sensitivity in a reinforcement learning framework by making use of models of human decision-making with origins in behavioral psychology and economics. We propose a gradient-based inverse reinforcement learning algorithm that minimizes a loss function defined on the observed behavior. We demonstrate the performance of the proposed technique on two examples: the first is the canonical Grid World example, and the second is an MDP modeling passengers' decisions regarding ride-sharing. In the latter, we use pricing and travel-time data from a ride-sharing company to construct the transition probabilities and rewards of the MDP.
https://authors.library.caltech.edu/records/vcvee-qm253

On the Analysis of Cyclic Drug Schedules for Cancer Treatment using Switched Dynamical Systems
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215867
Authors: Chapman, Margaret P.; Mazumdar, Eric V.; Langer, Ellen; Sears, Rosalie; Tomlin, Claire J.
Year: 2018
DOI: 10.1109/CDC.2018.8619490
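An illustrative one-dimensional caricature of the switched-system stability argument in this record: if the product of the per-drug growth/kill factors over one cycle is less than one, the live cancer cell count decays exponentially. The factors below are invented, not identified from the paper's cell-line data.

```python
# Toy switched-system sketch of a cyclic two-drug schedule (all growth and
# kill factors are invented, not from the paper's identified models).
# Drug 1 shrinks the live population; drug 2 only slows its growth. The cell
# count tends to zero because the per-cycle factor 0.5 * 1.2 = 0.6 < 1.

def run_schedule(x0=100.0, cycles=20):
    x = x0
    history = [x]
    for _ in range(cycles):
        x *= 0.5    # toxic drug: halves the live cancer cells
        x *= 1.2    # milder drug: population still grows, but slowly
        history.append(x)
    return history

history = run_schedule()
print(history[-1])  # 100 * 0.6**20 ≈ 0.0037, i.e. exponential decay
```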
Motivated by our prior work on a Triple Negative breast cancer cell line, the focus of this paper is controller synthesis for cancer treatment, through the use of drug scheduling and a switched dynamical system model. Here we study a cyclic schedule of d drugs with maximal waiting times between drug inputs, where each drug is applied once per cycle in any order. We suppose that some of the d drugs are highly toxic to normal cells and that these drugs can shrink the live cancer cell population. The remaining drugs are less toxic to normal cells and can only reduce the growth rate of the live cancer cell population. Also, we assume that waiting-time bounds related to toxicity, or to the onset of resistance, are available for each drug. A cancer cell population is said to be stable if the number of live cells tends to zero as time becomes sufficiently large. In the absence of modeling error, we derive conditions for exponential stability. In the presence of modeling error, we prove exponential stability and derive a settling time, under certain mathematical conditions on the error. We conclude the paper with a numerical example that uses models identified from Triple Negative breast cancer cell line data.
https://authors.library.caltech.edu/records/2dbdw-9pw59

Local Nash Equilibria are Isolated, Strict Local Nash Equilibria in 'Almost All' Zero-Sum Continuous Games
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215800
Authors: Mazumdar, Eric; Ratliff, Lillian J.
Year: 2019
DOI: 10.1109/CDC40024.2019.9030203
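A small numerical sketch of the hyperbolicity notion central to this record, using an invented zero-sum game (not an example from the paper): for f(x, y) = x^2 - y^2 + x*y with x minimizing and y maximizing, the origin is a differential Nash equilibrium (f_xx = 2 > 0, f_yy = -2 < 0), and the Jacobian of the gradient-play dynamics there has no eigenvalues on the imaginary axis.

```python
# Hyperbolicity check for the invented zero-sum game f(x, y) = x^2 - y^2 + x*y.
# The gradient-play Jacobian at the critical point (0, 0) is
#   J = [[ f_xx,  f_xy],
#        [-f_xy, -f_yy]] = [[2, 1], [-1, 2]].
import cmath

trace = 2.0 + 2.0                        # tr(J) = 4
det = 2.0 * 2.0 - (1.0 * -1.0)           # det(J) = 5
disc = cmath.sqrt(trace ** 2 - 4 * det)  # sqrt(-4) = 2j
eigs = [(trace + disc) / 2, (trace - disc) / 2]

# Hyperbolic: no eigenvalue of J sits on the imaginary axis.
print(eigs)  # [(2+1j), (2-1j)]: real parts nonzero, so (0, 0) is hyperbolic
```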
We prove that differential Nash equilibria are generic amongst local Nash equilibria in continuous zero-sum games. That is, there exists an open-dense subset of zero-sum games for which local Nash equilibria are nondegenerate differential Nash equilibria. The result extends previous results to the zero-sum setting, where we obtain even stronger results; in particular, we show that local Nash equilibria are generically hyperbolic critical points. We further show that differential Nash equilibria of zero-sum games are structurally stable. These extensions are motivated by the recent renewed interest in zero-sum games within machine learning and optimization. Adversarial learning and generative adversarial network approaches, which are touted as more robust than the alternatives, have zero-sum games at their heart, and many works proceed under the assumption of hyperbolicity of critical points. Our results justify this assumption by showing that 'almost all' zero-sum games admit local Nash equilibria that are hyperbolic.
https://authors.library.caltech.edu/records/0c7my-cex38

Feedback Linearization for Uncertain Systems via Reinforcement Learning
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215650
Authors: Westenbroek, Tyler; Fridovich-Keil, David; Mazumdar, Eric; Arora, Shreyas; Prabhu, Valmik; Sastry, S. Shankar; Tomlin, Claire J.
Year: 2020
DOI: 10.1109/ICRA40945.2020.9197158
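A hedged scalar caricature of learning a linearizing controller from data, in the spirit of this record but far simpler than its policy-optimization machinery: for the invented plant xdot = a*x^2 + u with a unknown, a least-squares fit of theta from (x, u, xdot) samples makes u = -theta*x^2 + v render the input-output map approximately linear (xdot ≈ v). The plant, data, and parameters are all invented.

```python
# Toy sketch of learning a linearizing controller for the scalar plant
#   xdot = a * x**2 + u,  with a unknown (true value 2.0 here; all invented).
# Fitting theta to data makes u = -theta * x**2 + v yield xdot ≈ v.

TRUE_A = 2.0

def plant(x, u):
    return TRUE_A * x * x + u

# Collect (x, u, xdot) samples from the "unknown" plant.
grid = [(x / 10.0, u / 10.0) for x in range(-10, 11) for u in (-5, 0, 5)]
samples = [(x, u, plant(x, u)) for x, u in grid]

# Least-squares fit of theta in  xdot - u = theta * x**2.
num = sum(x * x * (xdot - u) for x, u, xdot in samples)
den = sum(s[0] ** 4 for s in samples)
theta = num / den

v = 0.7                              # desired input to the linearized system
x = 0.5
u = -theta * x * x + v               # learned linearizing controller
print(theta, plant(x, u))            # theta ≈ 2.0; closed loop gives xdot ≈ v
```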
We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant linear under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. However, the calculation of a linearizing controller requires a precise dynamics model for the system. As a result, model-based approaches for learning exact linearizing controllers generally require a simple, highly structured model of the system with easily identifiable parameters. In contrast, the model-free approach presented in this paper is able to approximate the linearizing controller for the plant using general function approximation architectures. Specifically, we formulate a continuous-time optimization problem over the parameters of a learned linearizing controller whose optima are the set of parameters which best linearize the plant. We derive conditions under which the learning problem is (strongly) convex and provide guarantees which ensure the true linearizing controller for the plant is recovered. We then discuss how model-free policy optimization algorithms can be used to solve a discrete-time approximation to the problem using data collected from the real-world plant. The utility of the framework is demonstrated in simulation and on a real-world robotic platform.
https://authors.library.caltech.edu/records/ev5c3-q2v72

Expert Selection in High-Dimensional Markov Decision Processes
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215578
Authors: Rubies-Royo, VicenĂ§; Mazumdar, Eric; Dong, Roy; Tomlin, Claire; Sastry, S. Shankar
Year: 2020
DOI: 10.1109/CDC42340.2020.9303788
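The selection mechanism this record describes, a variant of the classical upper confidence bound algorithm switching between candidate experts, can be sketched with a minimal UCB1 loop. The Bernoulli reward probabilities below are invented stand-ins; the real method evaluates expert policies inside an MDP, which this toy elides.

```python
# Minimal UCB1-style expert selection (invented Bernoulli rewards; the paper
# selects among expert *policies* in a high-dimensional MDP, elided here).
import math
import random

def ucb_select(success_probs, horizon=2000, seed=0):
    rng = random.Random(seed)
    k = len(success_probs)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1                       # try each expert once
        else:
            # optimism: empirical mean plus an exploration bonus
            arm = max(range(k), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb_select([0.9, 0.1, 0.5])
print(counts)  # the 0.9 expert accumulates the vast majority of the pulls
```

The exploration bonus shrinks as an expert is tried more often, so suboptimal experts are pulled only O(log T) times, which is what keeps the overall regret low.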
In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings. Our method takes a set of candidate expert policies and switches between them to rapidly identify the best performing expert using a variant of the classical upper confidence bound algorithm, thus ensuring low regret in the overall performance of the system. This is useful in applications where several expert policies may be available, and one needs to be selected at run-time for the underlying environment.
https://authors.library.caltech.edu/records/vddty-ay603

High Confidence Sets for Trajectories of Stochastic Time-Varying Nonlinear Systems
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215409
Authors: Mazumdar, Eric; Westenbroek, Tyler; Jordan, Michael I.; Sastry, S. Shankar
Year: 2020
DOI: 10.1109/CDC42340.2020.9304491
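A hedged empirical sketch of the kind of bound this record derives: for an exponentially stable scalar system driven by process noise, the trajectory stays, with high probability, inside a tube whose width scales with the stationary standard deviation of the noise. The system, parameters, and the 8-sigma margin below are all invented for illustration.

```python
# Toy check of a high-probability deviation bound for an exponentially stable
# scalar system with process noise: x_{k+1} = a x_k + sigma w_k, |a| < 1.
# (Parameters and the 8-sigma margin are invented, not from the paper.)
import math
import random

a, sigma, steps = 0.9, 0.1, 10_000
stationary_std = sigma / math.sqrt(1 - a * a)   # ≈ 0.229

rng = random.Random(0)
x, worst = 0.0, 0.0
for _ in range(steps):
    x = a * x + sigma * rng.gauss(0.0, 1.0)
    worst = max(worst, abs(x))

# With high probability the deviation never leaves a tube whose width scales
# with the stationary standard deviation, which depends on the decay rate a.
print(worst, 8 * stationary_std)
```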
We analyze stochastic differential equations and their discretizations to derive novel high-probability tracking bounds for exponentially stable time-varying systems that are corrupted by process noise. The bounds have an explicit dependence on the rate of convergence for the unperturbed system and on the dimension of the state space. The magnitude of the stochastic deviations has a simple, intuitive form, and our perturbation bounds also allow us to derive tighter high-probability bounds on the tracking of reference trajectories than the state of the art. The resulting bounds can be used in analyzing many tracking control schemes.
https://authors.library.caltech.edu/records/8pqcg-xjy20

Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning
https://resolver.caltech.edu/CaltechAUTHORS:20210903-222215502
Authors: Westenbroek, Tyler; Mazumdar, Eric; Fridovich-Keil, David; Prabhu, Valmik; Tomlin, Claire J.; Sastry, S. Shankar
Year: 2020
DOI: 10.1109/CDC42340.2020.9304242
The following topics are dealt with: control system synthesis; nonlinear control systems; linear systems; stability; optimisation; feedback; closed loop systems; Lyapunov methods; multi-agent systems; optimal control.
https://authors.library.caltech.edu/records/6amte-r7198