[
    {
        "id": "thesis:16607",
        "collection": "thesis",
        "collection_id": "16607",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08022024-005547280",
        "type": "thesis",
        "title": "Studies on Scaling Throughput in Protein Engineering",
        "author": [
            {
                "family_name": "Schaus",
                "given_name": "Lucas Jean Nicolas",
                "orcid": "0000-0002-6094-7402",
                "clpid": "Schaus-Lucas-Jean-Nicolas"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Mayo",
                "given_name": "Stephen L.",
                "orcid": "0000-0002-9785-5018",
                "clpid": "Mayo-S-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Rees",
                "given_name": "Douglas C.",
                "orcid": "0000-0003-4073-1185",
                "clpid": "Rees-D-C"
            },
            {
                "family_name": "Bjorkman",
                "given_name": "Pamela J.",
                "orcid": "0000-0002-2277-3990",
                "clpid": "Bjorkman-P-J"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Mayo",
                "given_name": "Stephen L.",
                "orcid": "0000-0002-9785-5018",
                "clpid": "Mayo-S-L"
            }
        ],
        "local_group": [
            {
                "literal": "Resnick Sustainability Institute"
            },
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "<p>In this work, we present three studies in protein engineering. While all three protein classes that have been targeted for engineering tasks are very different, the studies share a focus on scaling up throughput in protein engineering.</p>\r\n\r\n<p>The first study concerns machine learning (ML)-based antibody humanization techniques. Achieving a reduction of patient anti-drug antibody responses in clinical trials is the goal of antibody humanization. To measure this, however, one must clear significant scientific, bureaucratic, and financial hurdles, which is very rarely done and never at scale. Most existing ML-based antibody humanization techniques claim that they work without providing any experimental evidence. We developed Mousify as an in silico antibody humanization platform to place existing models into one framework for wet-laboratory validation. We demonstrate that even the best models have a fundamental flaw in that they generate only a single antibody. We use Mousify and Markov chains to show that using ML-based antibody humanization models for library generation is not only feasible but produces both stable and functional variants. Applying the lessons learned from our wet-laboratory experiments, we then developed a variational autoencoder model with properties that we hope will improve the outcomes of antibody humanization experiments.</p>\r\n \r\n<p>In the second study, we outline our plans and initial results to develop a bioelectrocatalytic system for the conversion of N2 to ammonia using nitrogenase. Most of the world\u2019s ammonia is used for agricultural purposes and is produced via the environmentally damaging Haber-Bosch process. Engineering nitrogenase for the bioelectrocatalytic production of ammonia is not trivial and a high throughput is not guaranteed. 
We present preliminary results in how throughput can be increased through diazotrophic pre-selection of nitrogenase variants, as well as a quest to find the ideal starting point for engineering using a combination of ancestral sequence reconstruction and generative protein language models.</p>\r\n\r\n<p>In the third and final study we present a directed evolution campaign to evolve protoglobins for the enantioselective catalytic formation of cis-trifluoromethyl substituted cyclopropanes, the first such reaction in both the chemical and biological world. Not only is the enzyme ApePgb LQ capable of efficiently performing carbene insertions into double-bonds, but it also shows a much more diverse substrate scope than similar enantioselective formations of trans-trifluoromethyl substituted cyclopropanes. After demonstrating that ApePgb LQ reactions can be increased to a 1-mmol scale, we investigated the nature of protoglobin cis-selectivity using various computational methods.</p>",
        "doi": "10.7907/jqng-x012",
        "publication_date": "2025",
        "thesis_type": "phd",
        "thesis_year": "2025"
    },
    {
        "id": "thesis:16533",
        "collection": "thesis",
        "collection_id": "16533",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:07052024-170119371",
        "primary_object_url": {
            "basename": "Thesis.pdf",
            "content": "final",
            "filesize": 43525786,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16533/1/Thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Active Acquisition Methods for Single Cell Genomics",
        "author": [
            {
                "family_name": "Chen",
                "given_name": "Xiaoqiao",
                "orcid": "0000-0003-4685-3466",
                "clpid": "Chen-Xiaoqiao"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Cai",
                "given_name": "Long",
                "orcid": "0000-0002-7154-5361",
                "clpid": "Cai-Long"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Yue",
                "given_name": "Yisong",
                "orcid": "0000-0001-9127-1989",
                "clpid": "Yue-Yisong"
            },
            {
                "family_name": "Bouman",
                "given_name": "Katherine L.",
                "orcid": "0000-0003-0077-4367",
                "clpid": "Bouman-Katherine-L"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>We introduce two novel computational methodologies, ActiveSVM and Active Cell Inference, aimed at reducing the costs and enhancing the efficiency of single-cell mRNA sequencing and spatial transcriptomics, respectively. ActiveSVM employs an active learning approach to identify minimal yet highly informative gene sets for cell-type classification, physiological state identification, and genetic perturbation responses in single-cell datasets. By focusing on misclassified cells through an iterative process, ActiveSVM efficiently scales to analyze over a million cells, demonstrating around 90% accuracy across various datasets, including cell atlas and disease characterization studies.</p>\r\n\r\n<p>Active Cell Inference complements this by utilizing ordered gene sets, developed through ActiveSVM, to streamline spatial genomics measurements. This end-to-end pipeline significantly reduces measurement time and costs by up to 100-fold in scientific and clinical settings. It optimizes the gene probing process by identifying well-classified cells early, allowing for targeted gene application based on cell classification certainty. This method's efficacy is further enhanced by a temporal scaling calibration scheme, improving calibration accuracy throughout its iterative process.</p>\r\n\r\n<p>Both methodologies were rigorously tested on the expansive Human Cell Atlas dataset, using the advanced computational tool, CellxGene-Census, involving over 60 million cells. This integration facilitated the creation of precise gene sets for various human tissues, dramatically improving the efficiency and reliability of these cutting-edge genomic techniques. 
Together, ActiveSVM and Active Cell Inference represent significant advancements in the application of genomics to clinical diagnostics, therapeutic discovery, and genetic screens, promising substantial reductions in the operational complexities and costs associated with next-generation sequencing technologies.</p>",
        "doi": "10.7907/nsn8-nd79",
        "publication_date": "2025",
        "thesis_type": "phd",
        "thesis_year": "2025"
    },
    {
        "id": "thesis:16525",
        "collection": "thesis",
        "collection_id": "16525",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06152024-132652470",
        "primary_object_url": {
            "basename": "Wang_Zitong_2025.pdf",
            "content": "final",
            "filesize": 14747054,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16525/2/Wang_Zitong_2025.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Theoretical and Computational Analysis of Cell Migration in Complex Tissue Environments",
        "author": [
            {
                "family_name": "Wang",
                "given_name": "Zitong (Jerry)",
                "orcid": "0000-0001-8008-7318",
                "clpid": "Wang-Zitong-Jerry"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Cai",
                "given_name": "Long",
                "orcid": "0000-0002-7154-5361",
                "clpid": "Cai-Long"
            },
            {
                "family_name": "Eberhardt",
                "given_name": "Frederick",
                "clpid": "Eberhardt-Frederick"
            },
            {
                "family_name": "Merchant",
                "given_name": "Akil Abid",
                "orcid": "0000-0001-7472-822X",
                "clpid": "Merchant-Akil-Abid"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Cells sense and respond in spatially structured environments, including soils and tissue. My Ph.D. projects centered on developing new theoretical models and computational methods to understand how cells migrate in complex environments.</p> \r\n   \r\n<p>The first project is more theoretical in nature, leveraging information theory to study how the spatial organization of cell signaling pathways is adapted to the cell's natural environment. In tissue and soil, cells must localize to their targets by navigating distributions of extracellular ligands that are spatially discontinuous, consisting of local concentration peaks, because the ligands bind a non-uniform network of ECM fibers. It is unclear how cells navigate patchy environments while not getting trapped in local concentration peaks. To answer this question, we framed navigation as a problem of maximizing mutual information in space and developed a computational algorithm for computing signaling pathway architectures that maximize mutual information in simulated natural environments. We found that for cells in tissues and soils, dynamic localization of membrane receptors dramatically boosts sensing precision and enables cells to navigate to chemical sources 30 times faster, but this receptor localization strategy is relatively inconsequential for cells in purely diffusive environments. Further, we found that anisotropic receptor dynamics previously observed in immune cells and growth cones are nearly optimal as predicted by our model.</p>\r\n\r\n<p>The second project is more computational in nature, leveraging multiplexed tissue imaging to understand T-cell migration in tumor microenvironments. Immunotherapies can halt or slow down cancer progression by activating either endogenous or engineered T-cells to detect and kill cancer cells. T-cells must infiltrate the tumor core for immunotherapies to be effective. 
However, many solid tumors resist T-cell infiltration, challenging the efficacy of current therapies. In collaboration with clinician scientists at Cedars-Sinai Medical Center, we developed an integrated deep learning framework, Morpheus, that takes large-scale spatial omics profiles of patient tumors, and combines a formulation of T-cell infiltration prediction as a self-supervised machine learning problem with a counterfactual optimization strategy to generate minimal tumor perturbations predicted to boost T-cell infiltration. We applied Morpheus to 368 metastatic melanoma and colorectal cancer samples assayed using 40-plex imaging mass cytometry, discovering cohort-dependent, combinatorial perturbations, involving CXCL9, CXCL10, CCL22 and CCL18 for melanoma and CXCR4, PD-1, PD-L1 and CYR61 for colorectal cancer, predicted to support T-cell infiltration across large patient cohorts. Using only raw image data, Morpheus also identified distinct therapeutic strategies for different patient strata such as cancer stage or fatty liver presence. Our work presents a paradigm for counterfactual-based prediction and design of cancer therapeutics using spatial omics data.</p>",
        "doi": "10.7907/mj08-b258",
        "publication_date": "2025",
        "thesis_type": "phd",
        "thesis_year": "2025"
    },
    {
        "id": "thesis:16912",
        "collection": "thesis",
        "collection_id": "16912",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:12092024-223834150",
        "type": "thesis",
        "title": "Quantitative Nucleic Acid Measurements Inform Strategies to Mitigate Viral Outbreaks",
        "author": [
            {
                "family_name": "Viloria Winnett",
                "given_name": "Alexander",
                "orcid": "0000-0002-7338-5605",
                "clpid": "Viloria-Winnett-Alexander"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Ismagilov",
                "given_name": "Rustem F.",
                "orcid": "0000-0002-3680-4399",
                "clpid": "Ismagilov-R-F"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Arboleda",
                "given_name": "Valerie",
                "orcid": "0000-0002-9687-9122",
                "clpid": "Aboleda-V-A"
            },
            {
                "family_name": "Ismagilov",
                "given_name": "Rustem F.",
                "orcid": "0000-0002-3680-4399",
                "clpid": "Ismagilov-R-F"
            }
        ],
        "local_group": [
            {
                "literal": "3MT Competition (Caltech)"
            },
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "Humans have always been and continue to be at risk of infection by pathogens that surround us. However, recent advancements in quantitative nucleic acid technologies have allowed for more detailed study of these pathogens, how they spread among individuals, and how our immune systems respond to infection. In this thesis, I describe the design and execution of the Caltech COVID-19 Study, which used quantitative nucleic acid measurements to investigate the natural history of SARS-CoV-2 infection and inform strategies for diagnostics and vaccine development to reduce viral transmission. The Caltech COVID-19 Study enrolled participants in the Los Angeles area between September 2020 and April 2022 who were at risk of SARS-CoV-2 infection due to recent exposure to a household contact with acute infection. Participants collected paired upper respiratory specimens (saliva, nasal swabs, and throat swabs) daily or twice daily for approximately two weeks. These specimens underwent SARS-CoV-2 viral load quantification to assess transmission risk and determine whether to extend or terminate study enrollment. For participants who initially tested negative for SARS-CoV-2 RNA but later developed sustained infection, we tracked viral load from the very start of infection. These measurements were then used to evaluate the performance of various COVID-19 diagnostic tests. Our findings revealed a significant advantage of high-analytical-sensitivity tests over those with lower sensitivity, as well as the benefit of testing both the throat and nose rather than just the nose. In addition to viral load quantification, we sequenced human mRNA from these specimens to assess gene expression. Analyzing these changes allowed us to study how the mucosal immune system responds to acute viral infection across multiple anatomical sites over time, providing insights that could improve mucosal vaccine design. 
Notably, our data showed that, contrary to current models of localized paracrine interferon signaling, distinct compartments of the upper respiratory mucosa exhibited synchronized interferon stimulation during early infection\u2014even in the absence of detectable local viral replication. Mucosal vaccines capable of triggering this coordinated interferon response, maintaining CD8+ T memory cells to rapidly execute effector functions upon viral exposure, may be key to achieving sterilizing immunity. Findings from quantitative nucleic acid measurements in this thesis inform strategies to more effectively mitigate viral outbreaks.",
        "doi": "10.7907/qe3a-a670",
        "publication_date": "2025",
        "thesis_type": "phd",
        "thesis_year": "2025"
    },
    {
        "id": "thesis:16486",
        "collection": "thesis",
        "collection_id": "16486",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06032024-182223499",
        "primary_object_url": {
            "basename": "Thesis_Draft_final_final.pdf",
            "content": "final",
            "filesize": 21944874,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16486/1/Thesis_Draft_final_final.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Revealing Regulatory Network Organization Through Single-Cell Perturbation Profiling and Maximum Entropy Models",
        "author": [
            {
                "family_name": "Jiang",
                "given_name": "Jialong",
                "orcid": "0000-0001-8560-8397",
                "clpid": "Jiang-Jialong"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            },
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "Gene regulatory networks within cells modulate the expression of the genome in response to signals and changing environmental conditions. Reconstructions of gene regulatory networks can reveal the information processing and control principles used by cells to maintain homeostasis and execute cell-state transitions. In this thesis, we introduce a computational framework, D-SPIN, that generates quantitative models of gene regulatory networks from single-cell mRNA-seq datasets collected across thousands of distinct perturbation conditions. D-SPIN models the cell as a collection of interacting gene-expression programs, and constructs a probabilistic model to infer regulatory interactions between gene-expression programs and external perturbations. Using large Perturb-seq and drug-response datasets, we demonstrate that D-SPIN models reveal the organization of cellular pathways, sub-functions of macromolecular complexes, and the logic of cellular regulation of transcription, translation, metabolism, and protein degradation in response to gene knockdown perturbations. D-SPIN can also be applied to dissect drug response mechanisms in heterogeneous cell populations, elucidating how combinations of immunomodulatory drugs can induce novel cell states through additive recruitment of gene expression programs. D-SPIN provides a computational framework for constructing interpretable models of gene-regulatory networks to reveal principles of cellular information processing and physiological control.",
        "doi": "10.7907/5zta-9818",
        "publication_date": "2024",
        "thesis_type": "phd",
        "thesis_year": "2024"
    },
    {
        "id": "thesis:16459",
        "collection": "thesis",
        "collection_id": "16459",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06012024-054725051",
        "primary_object_url": {
            "basename": "240531_PB_thesis_final.pdf",
            "content": "final",
            "filesize": 44817586,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16459/1/240531_PB_thesis_final.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Modeling and Design of Synthetic Biochemical Circuits for Biological Phenotypes",
        "author": [
            {
                "family_name": "Bhamidipati",
                "given_name": "Pranav Subramanyam",
                "orcid": "0000-0002-6199-6505",
                "clpid": "Bhamidipati-Pranav-Subramanyam"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            },
            {
                "family_name": "Bois",
                "given_name": "Justin S.",
                "orcid": "0000-0001-7137-8746",
                "clpid": "Bois-Justin-S"
            },
            {
                "family_name": "Barr",
                "given_name": "Alan H.",
                "clpid": "Barr-A-H"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Biological behaviors arise from the dynamical interactions of biochemical networks. For example, the various immune responses to damage are manifestations of signaling networks between immune cell types. A central goal in systems and synthetic biology is to elucidate the design principles of these networks, or circuits, both in the sense of dissecting how function arises from structure in the natural context and in the sense of understanding the guidelines for optimal engineering of synthetic biological systems. The study of design principles in both senses is aided by mathematical modeling and simulation, which provide a self-consistent framework for evaluating the theoretical implications of biological hypotheses as well as a testbed for the development of novel circuits for desired biological phenotypes. This thesis pertains to two related challenges in this field, namely the scaling of computational design to larger circuits and the engineering of global phenotypes that emerge nonlinearly from local interactions.</p> \r\n    \r\n<p>The first section of this thesis presents a novel design platform for biological circuits, called CircuiTree, that uses a game-playing paradigm to overcome the combinatorial complexity of <em>de novo</em> circuit design. This platform treats circuit design as a game of circuit assembly and traverses the tree of possible assemblies using Monte Carlo tree search (MCTS). Borrowed from artificial intelligence (AI) agents that have mastered complex games, MCTS is a reinforcement learning (RL)-based search algorithm that efficiently searches for the most effective design strategies and naturally discovers design principles in the form of network motifs, which appear as clusters of solutions in the search tree. 
Finally, when tasked with designing fault-tolerant oscillators with five components, CircuiTree finds a novel design strategy, which we call motif multiplexing, in which multiple sub-oscillators are interleaved so as to render the circuit highly resistant to deletions and knockdowns. This design principle, which may be responsible for the multiple oscillatory loops observed in eukaryotic circadian clocks, opens the possibility of engineering synthetic circuits at a larger scale and suggests that larger biological circuits contain yet-unknown design features that are not simply extensions of smaller circuits.</p>\r\n\r\n<p>The second section describes a novel mechanosensitive property of the SynNotch synthetic chimeric receptor and uses a multicellular modeling framework to show how it can be used to control spatiotemporal patterning <em>in vitro</em>. Modified from the endogenous juxtacrine receptor Notch, SynNotch binds to an arbitrary extracellular ligand and, in response, releases an arbitrary transcription factor, thus acting as a user-defined signal transducer. We show that, in mouse fibroblasts, a simple sender-receiver SynNotch circuit ceases to transduce a membrane-bound GFP signal at high cell densities in 2D culture. Because of this feature, a lawn of cells expressing a signal-relay circuit, which we call the transceiver circuit, can undergo spatially limited activation, where the signal propagates in a wave outward from a GFP-expressing sender cell until, due to cell division, the cell density crosses a threshold value and the signaling system shuts down. Using a multicellular lattice-based model combined with experiments, we demonstrate that perturbations of growth parameters can be used to control the size of activated spots. 
Finally, we achieve spatiotemporal patterns of activation by seeding the growth dish nonuniformly, creating a wave of activation at the millimeter scale that recapitulates the kinematic wave patterning phenomenon observed during vertebrate somitogenesis.</p>\r\n\r\n<p>Together, this body of work represents an advance in the use of computational methods and mathematical modeling to guide the design and control of complex biological phenotypes. Advances in these methods promise to catalyze the development of more advanced cell-based therapies and engineered tissues.</p>",
        "doi": "10.7907/gpc6-hb40",
        "publication_date": "2024-06-14",
        "thesis_type": "phd",
        "thesis_year": "2024"
    },
    {
        "id": "thesis:16431",
        "collection": "thesis",
        "collection_id": "16431",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05282024-221603734",
        "primary_object_url": {
            "basename": "MorganSchwartz_Thesis_20240601.pdf",
            "content": "final",
            "filesize": 21199702,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16431/2/MorganSchwartz_Thesis_20240601.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Accelerating Biological Discovery with Deep Learning and Spatial Optical Barcodes",
        "author": [
            {
                "family_name": "Schwartz",
                "given_name": "Morgan Sarah",
                "orcid": "0000-0001-8131-9125",
                "clpid": "Schwartz-Morgan-Sarah"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Cai",
                "given_name": "Long",
                "orcid": "0000-0002-7154-5361",
                "clpid": "Cai-Long"
            },
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            },
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "Methodological advances in biology have given us a powerful suite of tools for measuring the state of the cell. Among these methods, next-generation sequencing, including single-cell methods, enables comprehensive measurement of gene expression; however, sequencing-based methods often preclude the collection of other visible phenotypic information. In contrast, light microscopy supports many different measurements that can be acquired in sequential rounds of labeling and imaging because light microscopy does not destroy the sample. Furthermore, light microscopy supports live cell imaging, including the use of fluorescent reporters to observe signaling dynamics in real time. In order to fully understand cellular function, multimodal data collection is needed that encompasses live cell response, end-point phenotypes, and finally perturbations to test the components of relevant signaling networks. In this thesis, I present key advances to create a unified experimental platform for interrogating the cell state. This platform uses light microscopy to collect multimodal measurements of cell state while supporting high-throughput perturbation screening. This platform is supported by a suite of deep learning analysis tools to enable quantitative analysis of these high-dimensional datasets. In Chapter 2, I introduce Caliban, our deep learning method for nuclear segmentation and tracking. In Chapter 3, I present a new method of optical barcodes to enable microscopy-based pooled perturbation screens. Finally, in Chapter 4, I describe preliminary work that leverages the previously described cell tracking and barcoding methodologies to explore the interdependencies of signaling pathway dynamics.",
        "doi": "10.7907/55c7-8142",
        "publication_date": "2024",
        "thesis_type": "phd",
        "thesis_year": "2024"
    },
    {
        "id": "thesis:16213",
        "collection": "thesis",
        "collection_id": "16213",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10232023-184021847",
        "primary_object_url": {
            "basename": "saladi-dissertation.pdf",
            "content": "final",
            "filesize": 327264663,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16213/1/saladi-dissertation.pdf",
            "version": "v6.0.0"
        },
        "type": "thesis",
        "title": "Some Computer Studies of Membrane Proteins, Molecular Chaperones, and Color",
        "author": [
            {
                "family_name": "Saladi",
                "given_name": "Shyam Madhukar",
                "orcid": "0000-0001-9701-3059",
                "clpid": "Saladi-Shyam-Madhukar"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Clemons",
                "given_name": "William M.",
                "orcid": "0000-0002-0021-889X",
                "clpid": "Clemons-W-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            },
            {
                "family_name": "Clemons",
                "given_name": "William M.",
                "orcid": "0000-0002-0021-889X",
                "clpid": "Clemons-W-M"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Rees",
                "given_name": "Douglas C.",
                "orcid": "0000-0003-4073-1185",
                "clpid": "Rees-D-C"
            },
            {
                "family_name": "Hoelz",
                "given_name": "Andre",
                "orcid": "0000-0003-0923-3284",
                "clpid": "Hoelz-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "This thesis shares a series of stories on seemingly disparate topics united by my efforts and love of computers. Initially, I discuss how the challenge of membrane protein expression provided an initial impetus for research. I channeled efforts towards developing a predictive (machine-learning) model for heterologous overexpression in E. coli. While we made strides to extend this model to other systems (not discussed here), my time was refocused onto questions of more fundamental biochemical interest: the biogenesis of tail-anchored membrane proteins. I built structural, predictive, and phylogenetic models to better understand how the C-terminal domain of co-chaperone Sgt2 functioned, refined the definition of the wider Sti1 family which includes Sgt2-C, and extended our understanding of those features of tail-anchored proteins that determine successful targeting in Yeast and Human cells. I developed a deep phylogeny of Get3, a chaperone involved in tail-anchored protein biogenesis, and helped specifically place Get3 proteins of photosynthesising organisms into evolutionary context. Along the way, I developed a parallel and compelling theme around data visualization, specifically around the use of colormaps across the life sciences. In particular, I built an application to screen and notify preprint authors when their manuscript had poor colormap usage. This was the first time automated software has been used to help authors improve their work at the preprint stage, an area that has grown significantly since my initial work. Finally, I brought together structural biology and data visualization by making perceptually uniform colormaps available in popular molecular visualization software tools to advocate for more thoughtful color usage in the field.",
        "doi": "10.7907/40cw-kn70",
        "publication_date": "2024",
        "thesis_type": "phd",
        "thesis_year": "2024"
    },
    {
        "id": "thesis:15041",
        "collection": "thesis",
        "collection_id": "15041",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10132022-000100592",
        "primary_object_url": {
            "basename": "bernstein_thesis.pdf",
            "content": "final",
            "filesize": 2113002,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/15041/1/bernstein_thesis.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Optimisation & Generalisation in Networks of Neurons",
        "author": [
            {
                "family_name": "Bernstein",
                "given_name": "Jeremy David",
                "orcid": "0000-0001-9110-7476",
                "clpid": "Bernstein-Jeremy-David"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Yue",
                "given_name": "Yisong",
                "orcid": "0000-0001-9127-1989",
                "clpid": "Yue-Yisong"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Tropp",
                "given_name": "Joel A.",
                "orcid": "0000-0003-1024-1791",
                "clpid": "Tropp-J-A"
            },
            {
                "family_name": "Liu",
                "given_name": "Ming-Yu",
                "orcid": "0000-0002-2951-2398",
                "clpid": "Liu-Ming-Yu"
            },
            {
                "family_name": "Meister",
                "given_name": "Markus",
                "orcid": "0000-0003-2136-6506",
                "clpid": "Meister-M"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Yue",
                "given_name": "Yisong",
                "orcid": "0000-0001-9127-1989",
                "clpid": "Yue-Yisong"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks. The thesis tackles two central questions. Given training data and a network architecture:</p>\r\n\r\n<ol>\r\n<li style=\"text-align:left\"><span style=\"padding-left:10px\">Which weight setting will generalise best to unseen data, and why?</span></li>\r\n<li style=\"text-align:left\"><span style=\"padding-left:10px\">What optimiser should be used to recover this weight setting?</span></li>\r\n</ol>\r\n\r\n<p>On optimisation, an essential feature of neural network training is that the network weights affect the loss function only indirectly through their appearance in the network architecture. This thesis proposes a three-step framework for deriving novel \u201carchitecture aware\u201d optimisation algorithms. The first step\u2014termed <em>functional majorisation</em>\u2014is to majorise a series expansion of the loss function in terms of functional perturbations. The second step is to derive <em>architectural perturbation bounds</em> that relate the size of functional perturbations to the size of weight perturbations. The third step is to substitute these architectural perturbation bounds into the functional majorisation of the loss and to obtain an optimisation algorithm via minimisation. This constitutes an application of the <em>majorise-minimise meta-algorithm</em> to neural networks.</p>\r\n\r\n<p>On generalisation, a promising recent line of work has applied PAC-Bayes theory to derive non-vacuous generalisation guarantees for neural networks. Since these guarantees control the average risk of ensembles of networks, they do not address which individual network should generalise best. To close this gap, the thesis rekindles an old idea from the kernels literature: the <em>Bayes point machine</em>. 
A Bayes point machine is a single classifier that approximates the aggregate prediction of an ensemble of classifiers. Since aggregation reduces the variance of ensemble predictions, Bayes point machines tend to generalise better than other ensemble members. The thesis shows that the space of neural networks consistent with a training set concentrates on a Bayes point machine if both the network width and normalised margin are sent to infinity. This motivates the practice of returning a wide network of large normalised margin.</p>\r\n\r\n<p>Potential applications of these ideas include novel methods for uncertainty quantification, more efficient numerical representations for neural hardware, and optimisers that transfer hyperparameters across learning problems.</p>",
        "doi": "10.7907/1jz8-5t85",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:15008",
        "collection": "thesis",
        "collection_id": "15008",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08252022-153300158",
        "primary_object_url": {
            "basename": "hirokawa_soichi_thesis.pdf",
            "content": "final",
            "filesize": 23936329,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/15008/4/hirokawa_soichi_thesis.pdf",
            "version": "v7.0.0"
        },
        "type": "thesis",
        "title": "Dynamics of Protein-Mediated Polymer Coupling and their Implications in Antibody Production and Emergent Patterning",
        "author": [
            {
                "family_name": "Hirokawa",
                "given_name": "Soichi",
                "orcid": "0000-0001-5584-2676",
                "clpid": "Hirokawa-Soichi"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Schwab",
                "given_name": "Keith C.",
                "orcid": "0000-0001-8216-4815",
                "clpid": "Schwab-K-C"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Hsieh",
                "given_name": "David",
                "orcid": "0000-0002-0812-955X",
                "clpid": "Hsieh-David"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Proteins serve a wide range of functions in and out of the cell, from signaling and gene regulation to transport and structural reinforcement. These functions are usually carried out from interactions with other molecules in the surrounding medium such as other proteins, small molecules, or DNA. One such class of proteins are what I will call polymer-coupling proteins: these proteins intentionally link identical polymers or two regions of the same polymer together so that their coupled interactions critically affect the state of the biological system. A vast array of such proteins exist in nature with roles such as the looping of DNA to physically inhibit the expression of a gene or the formation of the cytoskeleton which provides a cell with its shape. In this thesis, I use <i>in vitro</i> experimental methods to explore two cases of coupling proteins and understand their roles not only in reorganizing their complementary polymers but influencing the final state of their respective systems. </p>\r\n\r\n<p>In Chapter 2, I examine the starting process for the assembly of an antibody-encoding gene in developing immune cells. Motivated by data suggesting that some antibodies are less likely to be made than others, I explore how the early steps of constructing an antibody-encoding gene affect this uneven frequency of assembly. To initiate recombination, the recombination-activating gene (RAG) protein complex simultaneously binds and cuts two well-recognized sequences neighboring two antibody-encoding gene segments in order to allow other proteins to combine these exposed segments together. The sequences to which the RAG protein performs its binding and cutting functions have certain identifiable sequence patterns but can still vary. 
Through a single-molecule experimental method known as tethered particle motion (TPM) I show how changes to the binding site sequence can enhance or diminish the propensity of the RAG protein to bind and cut the DNA and thus explore the consequences of these altered interactions in the unequal selection for certain antibody gene segments over others. </p>\r\n\r\n<p>In Chapter 3, I turn to questions of the emergence of order from self-organization in biological systems. From the molecular to the population scale, biology constantly demonstrates that with an injection of energy, systems can be driven out of equilibrium and allow for the organization of its constituents. A case of such organization in cells is the coupling of microtubules by motor proteins to create and maintain the mitotic spindle, a critical biological architecture for ensuring that each cell obtains a copy of the genome during division. <i>In vitro</i> experiments that exploit similar motor-microtubule interactions have become a convenient way to identify the effects of perturbing a key player such as motor properties or boundary conditions of the system on the spatiotemporal extent of organization. However, in many instances, the dynamics under which such cytoskeletal systems reduce their entropy over the course of creating order have not been carefully examined in experimental systems. Here, I use engineered light-dimerizable motors that can give rise to the formation of a highly connected network that compacts to form a dense, organized structure, and through the use of a noninvasive imaging technique observe how the polymers that make up the network continually reorganize in the bulk during a global contraction of the network.</p>",
        "doi": "10.7907/fpmm-a552",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:14986",
        "collection": "thesis",
        "collection_id": "14986",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:07252022-061122576",
        "primary_object_url": {
            "basename": "Thesis Ronghui Zhu.pdf",
            "content": "final",
            "filesize": 31903908,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14986/1/Thesis Ronghui Zhu.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Multicellular Circuit Design in Mammalian Cells",
        "author": [
            {
                "family_name": "Zhu",
                "given_name": "Ronghui",
                "orcid": "0000-0001-8171-482X",
                "clpid": "Zhu-Ronghui"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Hay",
                "given_name": "Bruce A.",
                "orcid": "0000-0002-5486-0482",
                "clpid": "Hay-B-A"
            },
            {
                "family_name": "Bjorkman",
                "given_name": "Pamela J.",
                "orcid": "0000-0002-2277-3990",
                "clpid": "Bjorkman-P-J"
            },
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Multicellular circuits control the development of multicellular organisms, through programming processes such as cell proliferation, cell differentiation, cell movement, and cell signaling. A fundamental goal of biology is to understand the design principles of these multicellular circuits, and use these principles to design synthetic multicellular systems for therapeutic purposes. Top-down approaches, for example analyzing embryos bearing genetic mutations, have identified key genes in many multicellular circuits, but make it challenging to study these circuits in an isolated context and in a quantitative and systematic manner. An alternative, complementary approach is to engineer or reconstitute multicellular circuits from the bottom up, which allows us to overcome the limitations of the top-down approach and gain quantitative insights into multicellular circuit design. In this thesis, we use this bottom-up approach to explore the design principles of two multicellular circuits. In the first project, we took inspiration from two prevalent features from natural multistable circuits, namely competitive protein-protein interactions and positive autoregulation, to design a synthetic multistable circuit architecture called MultiFate. Both in the model and in the experiment, MultiFate circuits generate multiple cellular states, each stable for weeks, allow control over state-switching and state stability, and can be easily expanded to generate more states. In the second project, we use a gradient reconstitution system to systematically analyze a gradient modulation circuit consisting of BMP4 and its modulators, Chordin, Twsg and BMP-1. We found that the circuit can give rise to diverse gradient modulation capabilities. In particular, the full circuit is sufficient for active ligand shuttling and generation of a non-monotonic displaced gradient. 
These multicellular circuits could provide a foundation for engineering synthetic multicellular systems in mammalian cells.</p>",
        "doi": "10.7907/p0fn-qa56",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:16081",
        "collection": "thesis",
        "collection_id": "16081",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06042023-195408313",
        "primary_object_url": {
            "basename": "galvezmerchan_angel_2023_thesis.pdf",
            "content": "final",
            "filesize": 15283976,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/16081/1/galvezmerchan_angel_2023_thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Studies of mRNA Expression and Degradation",
        "author": [
            {
                "family_name": "G\u00e1lvez Merch\u00e1n",
                "given_name": "\u00c1ngel",
                "orcid": "0000-0001-7420-8697",
                "clpid": "G\u00e1lvez-Merch\u00e1n-\u00c1ngel"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Voorhees",
                "given_name": "Rebecca M.",
                "orcid": "0000-0003-1640-2293",
                "clpid": "Voorhees-R-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Aravin",
                "given_name": "Alexei A.",
                "orcid": "0000-0002-6956-8257",
                "clpid": "Aravin-A-A"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Voorhees",
                "given_name": "Rebecca M.",
                "orcid": "0000-0003-1640-2293",
                "clpid": "Voorhees-R-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Part 1: Protein degradation coupled to Nonsense-mediated mRNA decay</p>\r\n\r\n<p>Translation of mRNAs containing premature termination codons (PTCs) results in truncated protein products with deleterious effects. Nonsense-mediated decay (NMD) is a surveillance pathway responsible for detecting PTC containing transcripts. While the molecular mechanisms governing mRNA degradation have been extensively studied, the fate of the nascent protein product remains largely uncharacterized. In part 1 of this thesis, we use a fluorescent reporter system in mammalian cells to reveal a selective degradation pathway specifically targeting the protein product of an NMD mRNA. We show that this process is post-translational, and dependent on the ubiquitin proteasome system. To systematically uncover factors involved in NMD-linked protein quality control, we conducted genome-wide flow cytometry-based screens. Our screens recovered known NMD factors, but suggested protein degradation did not depend on the canonical ribosome-quality control (RQC) pathway. A subsequent arrayed screen demonstrated that protein and mRNA branches of NMD rely on a shared recognition event. Our results establish the existence of a targeted pathway for nascent protein degradation from PTC containing mRNAs, and provides a reference for the field to identify and characterize required factors.</p>\r\n\r\n<p>Part 2: The Commons Cell Atlas</p>\r\n\r\n<p>Current cell atlas projects aim to curate representative datasets, cell-types, and marker genes for tissues across an organism. Despite their ubiquity, atlas projects rely on duplicated and manual effort to curate marker genes and annotate cell-types. Importantly, the lack of data-compatible tools and a fixed representation of the atlas make their reanalysis near-impossible. To overcome these challenges, we present a collection of data, algorithms, and tools to automate cataloging and analyzing cell-types across all tissues in an organism. 
We leveraged this work to build a Human Commons Cell Atlas comprising 2.9 million cells across 27 tissues that can be easily updated and that is structured to facilitate custom analyses. To showcase the flexibility of the atlas, we demonstrate that it can be used for isoform analyses. In particular, we study cell-type specificity of isoforms of OAS1, which has recently been shown to offer SARS-CoV-2 protection in certain individuals that display higher expression of the p46 isoform. Using our Commons Cell Atlas, we localize the OAS1 p44b isoform to the testis, and find that it is specific to germ line cells. By virtue of enabling customized analyses via a modular and dynamic atlas structure, the Commons Cell Atlas should be useful for exploratory analyses that are intractable within the rigid framework of current gene-centric static atlases.</p>",
        "doi": "10.7907/esxk-ch24",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:15123",
        "collection": "thesis",
        "collection_id": "15123",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:03172023-050019811",
        "primary_object_url": {
            "basename": "Guru_PhD_thesis_v2.pdf",
            "content": "final",
            "filesize": 16253785,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/15123/1/Guru_PhD_thesis_v2.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Engineering Artificial Systems with Natural Intelligence",
        "author": [
            {
                "family_name": "Raghavan",
                "given_name": "Guruprasad",
                "orcid": "0000-0002-1970-9963",
                "clpid": "Raghavan-Guruprasad"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "orcid": "0000-0002-5899-7523",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Rutishauser",
                "given_name": "Ueli",
                "orcid": "0000-0002-9207-7069",
                "clpid": "Rutishauser-U"
            },
            {
                "family_name": "Lois",
                "given_name": "Carlos",
                "orcid": "0000-0002-7305-2317",
                "clpid": "Lois-Carlos"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Although Deep neural networks achieve human-like performance on a variety of perceptual and decision-making tasks, they perform poorly when confronted with changing tasks or goals, and broadly fail to match the flexibility and robustness of human intelligence. Additionally, artificial neural networks rely heavily on human-designed, hand-programmed architectures for their remarkable performance. In this thesis, I work towards achieving two goals: (i) development of a set of mathematical frameworks inspired by facets of natural intelligence, to endow artificial networks with flexibility and robustness, two key traits of natural intelligence; and (ii) inspired by the development of the biological vision system, I propose an algorithm that can \u2018grow\u2019 a functional, layered neural network from a single initial cell, with the aim of enabling autonomous development of artificial networks akin to living neural networks.</p>\r\n\r\n<p>For the first goal of endowing networks with flexibility and robustness, I propose a mathematical framework to enable continuous training of neural networks on a range of objectives by constructing path connected sets of networks, resulting in the discovery of a series of networks with equivalent functional performance on a given machine learning task. In this framework, I view the weight space of a neural network as a curved Riemannian manifold and move a network along a functionally invariant path in weight space while searching for networks that satisfy secondary objectives. A path-sampling algorithm trains computer vision and natural language processing networks with millions of weight parameters to learn a series of classification tasks without performance loss while accommodating secondary objectives including network sparsification, incremental task learning, and increased adversarial robustness. 
Broadly, for achieving this goal, I conceptualize a neural network as a mathematical object that can be iteratively transformed into distinct configurations by the path-sampling algorithm to define a sub-manifold of networks that can be harnessed to achieve user goals.</p>\r\n\r\n<p>For the second goal of \u2018growing\u2019 artificial neural networks in a manner similar to living neural networks, I develop an approach inspired by the mechanisms employed by the early visual system to wire the retina to the lateral geniculate nucleus (LGN), days before animals open their eyes. I find that the key ingredients for robust self-organization are (a) an emergent spontaneous spatiotemporal activity wave in the first layer and (b) a local learning rule in the second layer that \u2018learns\u2019 the underlying activity pattern in the first layer. As the bio-inspired developmental rule is adaptable to a wide range of input-layer geometries and robust to malfunctioning units in the first layer, it can be used to successfully grow and self-organize pooling architectures of different pool-sizes and shapes. The algorithm provides a primitive procedure for constructing layered neural networks through growth and self-organization. Finally, I also demonstrate that networks grown from a single unit perform as well as hand-crafted networks on a wide variety of static (MNIST recognition) and dynamic (gesture-recognition) tasks. Broadly, the work in the second section of this thesis shows that biologically inspired developmental algorithms can be applied to autonomously grow functional \u2018brains\u2019 in-silico.</p>",
        "doi": "10.7907/374f-1202",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:15132",
        "collection": "thesis",
        "collection_id": "15132",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:04132023-015900885",
        "primary_object_url": {
            "basename": "Ma_Yitong_2023.pdf",
            "content": "final",
            "filesize": 17632710,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/15132/1/Ma_Yitong_2023.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Multicellular Synthetic Biology in Mammalian Systems",
        "author": [
            {
                "family_name": "Ma",
                "given_name": "Yitong",
                "orcid": "0000-0003-4446-7326",
                "clpid": "Ma-Yitong"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Guttman",
                "given_name": "Mitchell",
                "orcid": "0000-0003-4748-9352",
                "clpid": "Guttman-M"
            },
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>In multicellular organisms, different types of cells use intercellular signals to communicate and regulate population dynamics, and further coordinate complex behaviors. This presents a rarely tapped into potential for mammalian synthetic biology, which was largely restricted to engineering a single cell type in the past to mimic and use similar multicellular designs to achieve more functionalities. However, with current synthetic biology tools and designs, there are several major challenges to achieve a multicellular circuit. Challenges include precise and tunable control over cell type switching, having an orthogonal cell-cell communication signal, and robust control of cell populations.</p>\r\n\r\n<p>To address these challenges, this thesis presents a system for tunable regulating of gene expression with DNA methylation, an auxin-based module for mammalian cell-cell communication, and a robust circuit for population control in mammalian cells. I further applied these work to engineering immune cells to show the potential of multicellular circuits in immunotherapies. Together, these works demonstrated the possibility of constructing multicellular circuits in mammalian systems, and that multicellular circuit can further extend the scope of synthetic biology to achieve more complex functions.</p>",
        "doi": "10.7907/w0q1-7s17",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:16105",
        "collection": "thesis",
        "collection_id": "16105",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06112023-211027828",
        "type": "thesis",
        "title": "Stem Cell-Derived Embryo Models in Mouse and Human to Illuminate the \u201cBlack Box\u201d of Pre- to Post-Implantation Development",
        "author": [
            {
                "family_name": "Jorgensen",
                "given_name": "Victoria Lynn",
                "orcid": "0000-0002-4205-6198",
                "clpid": "Jorgensen-Victoria-Lynn"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Zernicka-Goetz",
                "given_name": "Magdalena",
                "orcid": "0000-0002-7004-2471",
                "clpid": "Zernicka-Goetz-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Guttman",
                "given_name": "Mitchell",
                "orcid": "0000-0003-4748-9352",
                "clpid": "Guttman-M"
            },
            {
                "family_name": "Hay",
                "given_name": "Bruce A.",
                "orcid": "0000-0002-5486-0482",
                "clpid": "Hay-B-A"
            },
            {
                "family_name": "Parker",
                "given_name": "Joseph",
                "orcid": "0000-0001-9598-2454",
                "clpid": "Parker-Joseph"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Zernicka-Goetz",
                "given_name": "Magdalena",
                "orcid": "0000-0002-7004-2471",
                "clpid": "Zernicka-Goetz-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Mammalian development is a complex and highly regulated process by which a single cell, the totipotent zygote, gives rise to all lineages of the future organism.  While incredible advancements have been made to study and understand the earliest events of our life, many questions are still unanswered. Moreover, the most precarious stage of development, implantation, remains a \u201cblack box\u201d to researchers due to inaccessibility of the embryo within the uterus of the mother. In the last decade, however, the emergence of stem cell derived embryos represents an exciting alternative avenue to study these dynamic stages.</p>\r\n\r\n<p>During my PhD, I worked to establish two pre-implantation stem cell models, one in human and one in mouse, to better understand the earliest days of mammalian development. These models replicate the blastocyst stage of development; at this point in time the embryo is ready to implant into the uterus and contains all embryonic and extra-embryonic tissues needed to form the future organism: the epiblast, the hypoblast, and the trophectoderm. Beginning with my human model, I demonstrate the ability of a single cell type, expanded potential stem cells (EPSCs), to give rise to structures that replicate the natural blastocyst in size, morphology, and initiation of lineage segregation. Furthermore, these human blastocyst-like structures can undergo the very beginning of post-implantation remodeling by forming an epiblast rosette and initiating lumenogenesis.  Nevertheless, single cell RNA-seq (scRNA-seq) analysis reveals that lineages are not fully committed in this model, perhaps explaining why development is limited in these structures up to about Day 7/8. 
In the context of my mouse model, I combine not one but three distinct cell types to generate blastocyst-like structures: 1) wildtype embryonic stem cells (ESCs) to form the epiblast, 2) trophoblast stem cells (TSCs) to form the trophectoderm, and 3) Gata4-inducible ESCs to form the primitive endoderm. Again, these structures mimic the natural mouse blastocyst in morphology and lineage segregation and demonstrate the ability to transition to post-implantation stages. Development of the three blastocyst lineages was further confirmed via global scRNA-seq analysis comparing our Gata4i-Blastoids to natural embryos; importantly, however, this analysis also showed that differentiation of the mural trophectoderm, the tissue responsible for uterine invasion, is lacking in our stem cell model, which likely explains the inability of these blastoids to implant <i>in vivo</i>.</p>\r\n\r\n<p>Altogether, this dissertation explains key aspects of pre- to post-implantation development and highlights the incredible power of stem cell-derived embryos to self-organize into structures that closely mimic the natural embryo.</p>",
        "doi": "10.7907/t1fe-3915",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:15235",
        "collection": "thesis",
        "collection_id": "15235",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05302023-215054202",
        "type": "thesis",
        "title": "Diversity in Notch Ligand-Receptor Signaling Interactions",
        "author": [
            {
                "family_name": "Kuintzle",
                "given_name": "Rachael Christine",
                "orcid": "0000-0002-1035-4983",
                "clpid": "Kuintzle-Rachael-Christine"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Bronner",
                "given_name": "Marianne E.",
                "orcid": "0000-0003-4274-1862",
                "clpid": "Bronner-M-E"
            },
            {
                "family_name": "Hay",
                "given_name": "Bruce A.",
                "orcid": "0000-0002-5486-0482",
                "clpid": "Hay-B-A"
            },
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "The ability to understand and predict signaling between different cell types is a major challenge in biology. The Notch pathway enables direct signaling through membrane-bound ligands and receptors, and is used in diverse contexts. While its canonical molecular signaling mechanism is well characterized, its many-to-many interacting pathway components, the complexity of their expression patterns, and the presence of same-cell (cis) as well as inter-cellular (trans) receptor-ligand interactions, have made it difficult to predict how a given cell will signal to others. Here, we use a cell-based approach, with Chinese hamster ovary (CHO-K1) cells and C2C12 mouse myoblasts, to systematically characterize trans-activation, cis-inhibition, and cis-activation efficiencies for the essential receptors (Notch1 and Notch2) and activating ligands (Dll1, Dll4, Jag1, and Jag2), in the presence of Lunatic Fringe (Lfng) or the enzymatically dead Lfng D289E mutant. All ligands trans-activate Notch1 and Notch2, except for Jag1, which competitively inhibits Notch1 signaling, and whose Notch1 binding strength is potentiated by Lfng. For Notch1, cis-activation is generally weaker than trans-activation, but for Notch2, cis-activation by Delta ligands is much stronger than trans-activation, and Notch2 cis-activation by Jag1 is similar in strength to trans-activation. Cis-inhibition is associated with weak cis-activation, as Dll1 and Dll4 do not cis-inhibit Notch2. Lfng expression potentiates trans-activation of both Notch1 and Notch2 by the Delta ligands and weakens trans-activation of both receptors by the Jagged ligands. The map of receptor-ligand-Fringe interaction outcomes revealed here should help guide rational perturbation and control of the Notch pathway.",
        "doi": "10.7907/w8gj-jb92",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:15247",
        "collection": "thesis",
        "collection_id": "15247",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05312023-213322223",
        "primary_object_url": {
            "basename": "moses_lambda_2023.pdf",
            "content": "final",
            "filesize": 48841902,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/15247/1/moses_lambda_2023.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Computation Foundations of Spatial Transcriptomics",
        "author": [
            {
                "family_name": "Moses",
                "given_name": "Lambda",
                "orcid": "0000-0002-7092-9427",
                "clpid": "Moses-Lambda"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Wold",
                "given_name": "Barbara J.",
                "orcid": "0000-0003-3235-8130",
                "clpid": "Wold-B-J"
            },
            {
                "family_name": "Pimentel",
                "given_name": "Harold",
                "orcid": "0000-0001-8556-2499",
                "clpid": "Pimentel-Harold"
            },
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Single-cell and spatial transcriptomics have come of age in the past few years; datasets and data analysis software packages have proliferated. With the increasing sizes of datasets, proliferating new data collection technologies, and mainstreaming of high-throughput technologies, the software can be improved for better speed and memory efficiency, standardized and consistent user interface for multiple technologies, and in documentation to onboard new users. First, I collected a database of spatial transcriptomics literature and analyzed the data on trends and sociology in this field. Based on the database and data analyses, I wrote a comprehensive book both qualitatively and quantitatively documenting the history of the field since the 1960s and reviewing more recent developments, which informed the software and methods I later developed. Then, to address the challenges with the pre-processing large datasets, we developed \\texttt{kallisto} \\texttt{bustools}  for fast and modular pseudoalignment of sequencing reads to the transcriptome in single-cell RNA-seq (scRNA-seq), giving consistent results with the established and much more computationally demanding alignment method Cell Ranger. Briefly summarized are my attempt to map dissociated cells in scRNA-seq to a spatial gene expression reference and to build a image processing pipeline for image based spatial transcriptomics data analysis. Finally, to address the challenges in downstream analyses of spatial -omics data, I first wrote the new \\texttt{SpatialFeatureExperiment} (SFE) data structure to represent and operate on geometries in spatial transcriptomics data and to organize results from spatial analyses. Based on SFE, I wrote Voyager, which brings decades of research in geospatial data analysis to spatial transcriptomics, to better utilize the opportunities from spatial information to gain novel biological insights. 
To reduce the learning curve for users, Voyager conforms to SCE styles and conventions and provides a comprehensive documentation website and a consistent user interface to many geospatial methods.</p>",
        "doi": "10.7907/rt24-pq60",
        "publication_date": "2023",
        "thesis_type": "phd",
        "thesis_year": "2023"
    },
    {
        "id": "thesis:14339",
        "collection": "thesis",
        "collection_id": "14339",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08242021-212959886",
        "primary_object_url": {
            "basename": "thesis_vahe_galstyan_final.pdf",
            "content": "final",
            "filesize": 45962171,
            "license": "cc_by",
            "mime_type": "application/pdf",
            "url": "/14339/1/thesis_vahe_galstyan_final.pdf",
            "version": "v6.0.0"
        },
        "type": "thesis",
        "title": "Studies in Physical Biology: Exploring Allosteric Regulation, Enzymatic Error Correction, and Cytoskeletal Self-Organization Using Theory and Modeling",
        "author": [
            {
                "family_name": "Galstyan",
                "given_name": "Vahe",
                "orcid": "0000-0001-7073-9175",
                "clpid": "Galstyan-Vahe"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            },
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "orcid": "0000-0002-5899-7523",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "<p>Physical biology offers powerful tools for quantitatively dissecting the various aspects of cellular life that one cannot attribute to inanimate matter. Signature examples of living matter include adaptation, self-organization, and division. In this thesis, we explore different interconnected facets of these processes using statistical mechanics, nonequilibrium thermodynamics, and biophysical modeling.</p>\r\n\r\n<p>One of the key mechanisms underlying physiological and evolutionary adaptation is allosteric regulation. It allows cells to dynamically respond to changes in the state of the environment often expressed through altered levels of different environmental cues. The first thread of our work is dedicated to exploring the combinatorial diversity of responses available to allosteric proteins that are subject to multi-ligand regulation. We demonstrate that proteins characterized through the Monod-Wyman-Changeux model of allostery and operating at thermodynamic equilibrium are capable of eliciting a wide range of response behaviors which include the kinds known from the field of digital circuits (e.g., NAND logic response), as well as more sophisticated computations such as ratiometric sensing. </p>\r\n\r\n<p>Despite the fact that biomolecules at thermodynamic equilibrium are able to orchestrate a variety of fascinating behaviors, the cell is ultimately 'alive' because it constantly metabolizes nutrients and generates energy to drive functions that cannot be sustained in the absence of energy consumption. One prominent example of such a function is nonequilibrium error correction present in high-fidelity processes such as protein synthesis, DNA replication, or pathogen recognition. We begin the second thread of our work by providing a conceptual understanding of the prevailing mechanism used in explaining this high-fidelity behavior, namely that of kinetic proofreading. 
Specifically, we develop an allostery-based mechanochemical model of a kinetic proofreader where chemical driving is replaced with a mechanical engine with tunable knobs which allow modulating the amount of dissipation in a transparent way. We demonstrate how varying levels of error correction can be attained at different regimes of dissipation and offer intuitive interpretations for the conditions required for efficient biological proofreading.</p>\r\n\r\n<p>We then extend the notion of error correction to equilibrium enzymes not endowed with structural features typically required for proofreading. We show that, under physiological conditions, purely diffusing enzymes can take advantage of the existing nonequilibrium organization of their substrates in space and enhance the fidelity of catalysis. Our proposed mechanism called spatial proofreading offers a novel perspective on spatial structures and compartmentalization in cells as a route to specificity.</p>\r\n\r\n<p>In the last thread of the thesis, we make a transition from molecular-scale studies to the mesoscopic scale, and explore the principles of self-organization in nonequilibrium structures formed in reconstituted microtubule-motor mixtures. In particular, we develop a theoretical framework that predicts the spatial distribution of kinesin motors in radially symmetric microtubule asters formed under various conditions using optogenetic control. The model manages to accurately recapitulate the experimentally measured motor profiles through effective parameters that are specific for each kind of kinesin motor used. 
Our theoretical work of rigorously assessing the motor distribution therefore offers an avenue for understanding the link between the microscopic motor properties (e.g., processivity or binding affinity) and the large-scale structures they create.</p>\r\n\r\n<p>In all, the thesis encompasses a series of case studies with shared themes of allostery and nonequilibrium, highlighting the capacity of living matter to perform remarkable tasks inaccessible to nonliving materials.</p>",
        "doi": "10.7907/1fzr-1240",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14938",
        "collection": "thesis",
        "collection_id": "14938",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06022022-201232129",
        "primary_object_url": {
            "basename": "Final Thesis version Eduardo da Veiga Beltrame.pdf",
            "content": "final",
            "filesize": 17501518,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14938/1/Final Thesis version Eduardo da Veiga Beltrame.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Stories in Single Cell RNA Sequencing",
        "author": [
            {
                "family_name": "da Veiga Beltrame",
                "given_name": "Eduardo",
                "orcid": "0000-0002-1529-9207",
                "clpid": "da-Veiga-Beltrame-Eduardo"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            },
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>This thesis describes the projects I have worked on since starting the Caltech bioengineering program in fall 2017. The general theme of my projects is that they are all about single cell RNA sequencing (scRNA-seq), spanning the experimental and computational realms.</p> \r\n\r\n<p>Chapter 1 is an introduction explaining the essential concepts and is meant to be readable by a wide audience. For the other chapters, each one describes a separate project in a succinct manner, including links to the related preprint, published paper or code repositories at the start of each chapter.</p>\r\n\r\n<p>Chapter 2 describes the scVI generative model for scRNA-seq data and the scvi-tools framework, which forms the basis of many of my computational projects.</p> \r\n\r\n<p>Chapter 3 describes an open source 3D printable syringe pump system that was developed envisioning facilitating many kinds of experiments, in particular droplet based scRNA-seq.</p> \r\n\r\n<p>Chapter 4 describes a new way of fabricating hydrogel beads with unique DNA barcodes that are used for scRNA-seq experiments.</p> \r\n\r\n<p>Chapter 5 describes a database listing most published scRNA-seq studies that I helped create, and provides a useful overview of the state of the field.</p> \r\n\r\n<p>Chapter 6 describes the kallisto bus workflow, which is used for pre-processing scRNA-seq data, going from FASTQ file to gene count matrix in a very efficient manner.</p> \r\n\r\n<p>Chapter 7 describes a new way of using scVI to quantify the trade- off in the quality of scRNA-seq of a given dataset when surveying more cells or sequencing more reads per cell.</p> \r\n\r\n<p>Chapter 8 describes tools developed for the WormBase users to leverage scRNA-seq data on <i>C. 
elegans</i>, and which can be deployed with any other scRNA-seq dataset.</p> \r\n\r\n<p>Chapter 9 describes a remarkably successful offshoot of the development of these tools: a simple scVI-based analysis and visualization strategy for finding candidate marker genes using <i>C. elegans</i> scRNA-seq data, which was experimentally validated by members of the Sternberg lab.</p>",
        "doi": "10.7907/4kgh-8420",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14399",
        "collection": "thesis",
        "collection_id": "14399",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10172021-215439860",
        "primary_object_url": {
            "basename": "Dobreva_Tatyana_2021_v7.pdf",
            "content": "final",
            "filesize": 10892340,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14399/1/Dobreva_Tatyana_2021_v7.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Engineering Tools to Probe and Manipulate the Immune System at Single-Cell Resolution",
        "author": [
            {
                "family_name": "Dobreva",
                "given_name": "Tatyana",
                "orcid": "0000-0002-2625-8873",
                "clpid": "Dobreva-Tatyana"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Gradinaru",
                "given_name": "Viviana",
                "orcid": "0000-0001-5868-348X",
                "clpid": "Gradinaru-V"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Gao",
                "given_name": "Wei",
                "orcid": "0000-0002-8503-4562",
                "clpid": "Gao-Wei"
            },
            {
                "family_name": "Gradinaru",
                "given_name": "Viviana",
                "orcid": "0000-0001-5868-348X",
                "clpid": "Gradinaru-V"
            },
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>My thesis focuses on developing experimental and computational tools to probe and manipulate cellular transcriptomes in the context of human health and disease. Chapter 1 and 2 focus on published work where we leverage single-cell RNA sequencing (scRNA-seq) to understand human immune variability, characterize cell-type specific biases of multiple viral variants within an animal, and assess temporal immune response in the brain to delivery of genetic cargo via an adeno-associated virus (AAV). Chapter 3 and 4 present progress I have made on tools for exporting RNA extracellularly and engineering of a transcription factor for modulating macrophage state.</p>\r\n\r\n<p>For probing cellular transcriptome states, we have developed a platform using multiplexed single-cell sequencing and out-of-clinic capillary blood extraction to understand temporal and inter-individual variability of gene expression within immune cell types. Our platform enables simplified, cost-effective profiling of the human immune system across subjects and time at single-cell resolution. To demonstrate the power of our platform, we performed a three day time-of-day study of four healthy individuals, generating gene expression data for 24,087 cells across 22 samples. We detected genes with cell type-specific time-of-day expression and identified robust genes and pathways particular to each individual, all of which could have been missed if analyzed with bulk RNA-sequencing. Also, using scRNA-seq, we have developed a method to screen and characterize cellular tropism of multiple AAV variants. Additionally, I have looked at AAV-mediated transcriptomic changes in animals injected with AAV-PHP.eB three days and twenty-five days post-injection. 
I have found that there is an upregulation of genes involved in p53 signaling in endothelial cells three days post-injection.</p>\r\n\r\n<p>In the context of manipulating cellular transcriptomic states, I demonstrate that a fusion between RNA targeting enzyme, dCas13, and capsid-forming neuronal protein, Arc, is able to form a capsid-like structure capable of encapsulating RNA. I also present methods and preliminary data for tuning macrophage states through mutations in transcription factor EB (TFEB) using scRNA-seq as a readout.</p>",
        "doi": "10.7907/n3rs-ft69",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14409",
        "collection": "thesis",
        "collection_id": "14409",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10282021-191743624",
        "type": "thesis",
        "title": "Quantitative Sequencing and its Application to Studies of the Human Small-Intestine Microbiota",
        "author": [
            {
                "family_name": "Barlow",
                "given_name": "Jacob T.",
                "orcid": "0000-0002-1842-4835",
                "clpid": "Barlow-Jacob-T"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Ismagilov",
                "given_name": "Rustem F.",
                "orcid": "0000-0002-3680-4399",
                "clpid": "Ismagilov-R-F"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Mazmanian",
                "given_name": "Sarkis K.",
                "orcid": "0000-0003-2713-1513",
                "clpid": "Mazmanian-S-K"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Cai",
                "given_name": "Long",
                "orcid": "0000-0002-7154-5361",
                "clpid": "Cai-Long"
            },
            {
                "family_name": "Ismagilov",
                "given_name": "Rustem F.",
                "orcid": "0000-0002-3680-4399",
                "clpid": "Ismagilov-R-F"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Our understanding of the interplay between microbial species and the hosts they live on and in is continually expanding. New insights have focused not only on microorganisms that drive specific disease states but also on those that help maintain human health. As research drives towards mechanistic understanding of host-microbe relationships, new quantitative tools are needed to help interrogate these complex interactions. Chapter I of this thesis discusses formulation of a method for rapid detection of antibiotic resistance in <i>Neisseria gonorrhoeae</i>. Our approach identified RNA signatures from transcriptional profiling of <i>Neisseria gonorrhoeae</i> after 10-minute antibiotic exposure. Utilization of these RNA markers allowed for rapid identification of susceptibility or resistance to the antibiotic ciprofloxacin. Chapter II shifts focus to the development of a quantitative sequencing technique for the measurement of absolute taxon abundances in complex microbial communities. Combining the precision of digital PCR with the high-throughput nature of 16S rRNA gene amplicon sequencing allowed for simultaneous quantitative profiling of all bacterial taxa in host-associated microbial communities. We extensively characterized our quantitative sequencing methodology in the presence of high host nucleic acid levels and low microbial loads to understand the limits of quantification and detection in complex sample types. Last, Chapter III applies the quantitative sequencing technology from Chapter II to investigate the microbial community of the human small intestine, specifically the duodenum. Data from the duodenum of 250 individuals revealed a wide range of total microbial loads and a distinct subset of microbes, termed disruptor taxa, that were associated with small intestinal bacterial overgrowth (SIBO) and GI symptom severity.</p>",
        "doi": "10.7907/ca28-fk21",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14424",
        "collection": "thesis",
        "collection_id": "14424",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:11102021-210013472",
        "type": "thesis",
        "title": "Compilation and Inference with Chemical Reaction Networks",
        "author": [
            {
                "family_name": "Poole",
                "given_name": "William",
                "orcid": "0000-0002-2958-6776",
                "clpid": "Poole-William"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "orcid": "0000-0002-5899-7523",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "orcid": "0000-0002-5899-7523",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The successful advancement and deployment of technologies in the field of synthetic biology will require sophisticated computational infrastructure coupled with new theoretical ideas in order to more effectively engineer and reverse engineer biochemical networks. This thesis argues that the field of machine learning can inform the development of these underlying principles and techniques. First, software for compiling diverse chemical reaction network models of biological circuits from simple specifications is described. Second, three chemical reaction network implementations of a powerful machine learning model called a Boltzmann machine are analyzed and compared. Third, the class of detailed balanced chemical reaction networks is proven to be capable of probabilistic inference and, when coupled to a driven chemical system, autonomous learning. Finally, the use of machine learning to interpret and understand biological systems is explored in an experimental case study modeling <i>E. coli</i> cell extract metabolism.</p>",
        "doi": "10.7907/x3qc-je74",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14327",
        "collection": "thesis",
        "collection_id": "14327",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08182021-053622635",
        "primary_object_url": {
            "basename": "beeler_suzannah_thesis_2021.pdf",
            "content": "final",
            "filesize": 8849776,
            "license": "cc_by",
            "mime_type": "application/pdf",
            "url": "/14327/1/beeler_suzannah_thesis_2021.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Deciphering Regulation in Escherichia coli: From Genes to Genomes",
        "author": [
            {
                "family_name": "Beeler",
                "given_name": "Suzannah Michelle",
                "orcid": "0000-0002-1930-4827",
                "clpid": "Beeler-Suzannah-Michelle"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Goentoro",
                "given_name": "Lea A.",
                "orcid": "0000-0002-3904-0195",
                "clpid": "Goentoro-L-A"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium <i>Escherichia coli</i>, for \u2248 65% of promoters we remain ignorant of their regulation. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively-parallel reporter assays with mass spectrometry to produce a base pair resolution dissection of more than 100 <i>E. coli</i> promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.</p>",
        "doi": "10.7907/p3rg-m937",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14338",
        "collection": "thesis",
        "collection_id": "14338",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08242021-212609828",
        "primary_object_url": {
            "basename": "thesis.pdf",
            "content": "final",
            "filesize": 36577433,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14338/1/thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Physical Biology of Cellular Information Processing",
        "author": [
            {
                "family_name": "Razo-Mejia",
                "given_name": "Manuel",
                "orcid": "0000-0002-9510-0527",
                "clpid": "Razo-Mejia-Manuel"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Newman",
                "given_name": "Dianne K.",
                "orcid": "0000-0003-1647-1918",
                "clpid": "Newman-D-K"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Goentoro",
                "given_name": "Lea A.",
                "orcid": "0000-0002-3904-0195",
                "clpid": "Goentoro-L-A"
            },
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "<p>The state of matter that we define as <em>life</em> is different from anything else we have encountered so far in the universe. Living systems not only perpetuate their existence out of equilibrium against the will of the second law of thermodynamics, but they do so while keeping up with an ever-changing environment. A key part of this capacity to adapt to environmental changes is the ability of organisms to gather information from their surroundings to put together an adequate response to the challenges presented to them. This thesis presents an effort to understand, from first principles, this fundamental feature of information gathering that all life on earth shares. We dig into the physics behind one of the most pervasive mechanisms through which living systems sense and respond to the environment\u2013the ability to turn <em>on</em> and <em>off</em> genes. In doing so, we hope to uncover general principles of how organisms deal with the problem of collecting information about the world that surrounds them.</p>\r\n\r\n<p>In Chapter 1, we develop the theoretical and conceptual tools to navigate the rest of the thesis. I introduce the idea of gene regulation, as well as different theoretical models of this pervasive biological phenomenon. We also delve into the realm of information theory and learn how the plastic concept of information can be mathematically defined and quantified.</p>\r\n\r\n<p>The second stop in our exploration (Chapter 2) asks the following question: can we understand, from first principles, how it is that proteins allow cells to regulate their genes on-demand upon sensing environmental cues? For this, we explore the physics behind transcriptional control due to allosteric transcription factors. 
Using simple quasi-equilibrium models of the two processes involved in this type of regulation\u2014the regulation of the gene by the binding and unbinding of the transcription factor, and the regulation of the activity of the transcription factor itself by the binding and unbinding of an effector molecule\u2014we are able to predict the input-output function of a simple genetic circuit, and compare such predictions with experimental determinations of the mean response of a population of bacterial cells.</p>\r\n\r\n<p>We then expand on these insights to ask questions about the inescapable cell-to-cell variability that isogenic cells encounter. For this, we have to leave behind the pure thermodynamic framework and work in the language of chemical kinetics. This allows us to make predictions beyond the mean input-output gene expression response of cells by reconstructing full gene expression distributions. With these probabilistic input-output functions, in Chapter 3 we formalize the question of the <em>amount of information</em> that cells can gather from the environment. For this, we turn to information-theoretic concepts of maximal mutual information (otherwise known as channel capacity) between the state of the environment and the gene expression response from bacterial cells. Finally, we compare our predictions of the maximum amount of information\u2014measured in bits\u2014that cells can gather with single-cell inferences of this quantity.</p>",
        "doi": "10.7907/kpc2-b345",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14551",
        "collection": "thesis",
        "collection_id": "14551",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:04162022-233242577",
        "primary_object_url": {
            "basename": "tang_weiyi_2022_thesis.pdf",
            "content": "final",
            "filesize": 28039388,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14551/1/tang_weiyi_2022_thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Retroviral Lineage Analysis of the Vagal Neural Crest Reveals Multipotency Towards the Cardiac and Enteric Fates",
        "author": [
            {
                "family_name": "Tang",
                "given_name": "Weiyi",
                "orcid": "0000-0002-1279-1001",
                "clpid": "Tang-Weiyi"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Bronner",
                "given_name": "Marianne E.",
                "orcid": "0000-0003-4274-1862",
                "clpid": "Bronner-M-E"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Stathopoulos",
                "given_name": "Angelike",
                "orcid": "0000-0001-6597-2036",
                "clpid": "Stathopoulos-A"
            },
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Bronner",
                "given_name": "Marianne E.",
                "orcid": "0000-0003-4274-1862",
                "clpid": "Bronner-M-E"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The neural crest is a migratory stem cell population that gives rise to the craniofacial skeleton, heart septa, pigment cells, and peripheral nervous system.  Defects in neural crest development can lead to a broad range of congenital diseases, e.g., persistent truncus arteriosus, characterized by a mixture of oxygenated and deoxygenated blood, is related to the absence of the neural crest-derived outflow tract septum. Thus, a thorough understanding of neural crest migration, differentiation, and cell fate can shed light on the diagnosis and treatment of many congenital defects. A long-standing question is whether neural crest cells are composed of multipotent cells capable of giving rise to a wide range of cell types, or a mixture of fate-determined cells migrating to their destinations. Avian embryos resemble humans during neural crest development, but are more accessible to experimental manipulations than mammalian models, making them an ideal model to study the neural crest. Despite the abundance of information obtained from elegant experiments through interspecies grafting, the avian model lacks a direct tool to determine whether these cells are multipotent <i>in vivo</i>.</p>\r\n\r\n<p>Here, we present a new clonal analysis tool that takes advantage of Replication Incompetent Avian retroviruses (RIAs). We validate the method <i>in vitro</i> and present the potential application in the chick embryo to test the multipotency of the trunk neural crest. Next, we perform RIA-mediated lineage tracing at a population level and uncover cardiomyocytes as a previously unknown cardiac neural crest derivative in both chicken and mouse. Furthermore, we utilize RIA-mediated clonal analysis to identify individual premigratory vagal neural crest cells as multipotent stem cells that form cell types in both the heart and the gut. 
We then confirm the results with a single-cell photoconversion assay, which further shows that migrating neural crest cells are also multipotent. Time-lapse imaging shows that stochastic post-mitotic migration is a cellular mechanism underlying multipotency. Finally, molecular perturbation experiments show that CXCR4 and RET are essential guidance cues for migratory neural crest cells to enter the heart and the gut, respectively. Together, these results demonstrate the utility of using RIA viruses to tackle questions regarding the lineage, developmental potential, and migratory pathways followed by neural crest cells in avian embryos.</p>",
        "doi": "10.7907/qakz-vm04",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14435",
        "collection": "thesis",
        "collection_id": "14435",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:11282021-042001335",
        "primary_object_url": {
            "basename": "Rachel_Caltech_PhD_Thesis_V2-6.pdf",
            "content": "final",
            "filesize": 93199721,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14435/11/Rachel_Caltech_PhD_Thesis_V2-6.pdf",
            "version": "v6.0.0"
        },
        "type": "thesis",
        "title": "Experimental and Theoretical Studies of Non-Equilibrium Systems: Motor-Microtubule Assemblies and the Human-Earth System",
        "author": [
            {
                "family_name": "Banks",
                "given_name": "Rachel A.",
                "orcid": "0000-0003-2028-2925",
                "clpid": "Banks-Rachel-A"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            },
            {
                "family_name": "Bois",
                "given_name": "Justin S.",
                "orcid": "0000-0001-7137-8746",
                "clpid": "Bois-Justin-S"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "<p>Systems out of equilibrium are pervasive around us. In fact, being out of equilibrium is a key property of life, as described by Erwin Schrodinger in his series of essays \"What is life?\". Through the consumption of energy, i.e. food, living organisms achieve ordered states that would be very unlikely to occur at equilibrium, such as the mitotic spindle during cell division, swarms of bacteria, or flocks of starlings. The Earth system is another example of a non-equilibrium system. The state of the Earth has been evolving for billions of years, often under the influence of life. Today, humanity is a dominant influence forcing the Earth system to new states. Understanding these non-equilibrium systems has posed many challenges; in this thesis, we work towards quantitatively dissecting and gaining an intuition for the functioning of both a molecular-scale and a planetary-scale non-equilibrium system. </p>\r\n\r\n<p>Underlying many cellular functions such as cell division and transportation of organelles is the cytoskeleton composed of motor proteins and their constituent filaments. Key components are kinesin motors, which consume chemical energy to walk along and reorganize microtubules. Collections of these motors and microtubules are able to form organized structures. Understanding how these structures are formed has remained an open question. In Chapter 2, we develop a system of kinesin motors and microtubules wherein motor activity is controlled by light, thereby gaining spatiotemporal control over the formation of motor-microtubule assemblies. We demonstrate the creation of a variety of structures of different sizes and geometry, and measure how length and time scales of these assemblies depend on the activated region. </p>\r\n\r\n<p>A remaining question was how the microscopic details of the interaction between motors and microtubules affect the dynamics and steady-state structures formed. 
In Chapter 3, with our scheme for light control in hand, we extended the system to a variety of motor proteins that have different speeds, processivities (how many steps they take before unbinding from the microtubule), directionalities (which end of the microtubule they walk towards), and forces they are able to exert. We found that the size of steady-state structures, distribution of motors within assemblies, and rate of contraction of networks depend on motor properties. Further, we demonstrate that various structures can be formed by combining different motors. This work begins to build a connection between the detailed microscopic interactions of cytoskeletal components and the larger scale structures they form. </p>\r\n\r\n<p>Chapter 4 begins our work on understanding the state of the human-Earth system. A major hurdle to quantitatively understanding this system is the difficulty of finding and parsing the relevant data, which is often within long, complicated reports. In order to facilitate access to this data, we created the Human Impacts Database, which houses a collection of more than 300 carefully curated values related to human impacts on the Earth. In this chapter, we describe the format of the database as well as demonstrate how it can be harnessed to gain a more holistic perspective on humanity's influence on the Earth.</p>\r\n\r\n<p>Having this data is only a starting point towards deciphering the ways that humans are altering the state of the Earth, though. In Chapter 5, we combine these quantitative measurements with simple order-of-magnitude estimates to gain an intuition for the magnitude of several of the values. In this way, we show that many of the ways humanity is affecting the Earth can be tied back to how much land, water, and power we use. 
We further contextualize the magnitude of human influence by comparing human activities to natural analogs, finding that humans currently rival natural processes in influencing the state of the Earth system.</p>",
        "doi": "10.7907/5ee6-j454",
        "publication_date": "2022-06-10",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14496",
        "collection": "thesis",
        "collection_id": "14496",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:02132022-064810187",
        "primary_object_url": {
            "basename": "David_Brown_Thesis_V4.pdf",
            "content": "final",
            "filesize": 10109719,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14496/1/David_Brown_Thesis_V4.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Principles of Massively Parallel Sequencing for Engineering and Characterizing Gene Delivery",
        "author": [
            {
                "family_name": "Brown",
                "given_name": "David",
                "orcid": "0000-0002-9757-1744",
                "clpid": "Brown-David"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Gradinaru",
                "given_name": "Viviana",
                "orcid": "0000-0001-5868-348X",
                "clpid": "Gradinaru-V"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Yue",
                "given_name": "Yisong",
                "orcid": "0000-0001-9127-1989",
                "clpid": "Yue-Yisong"
            },
            {
                "family_name": "Arnold",
                "given_name": "Frances Hamilton",
                "orcid": "0000-0002-4027-364X",
                "clpid": "Arnold-F-H"
            },
            {
                "family_name": "Gradinaru",
                "given_name": "Viviana",
                "orcid": "0000-0001-5868-348X",
                "clpid": "Gradinaru-V"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The advent of massively parallel sequencing and synthesis technologies has ushered in a new paradigm of biology, where high throughput screening of billions of nucleic acid molecules and production of libraries of millions of genetic mutants are now routine in labs and clinics. During my Ph.D., I worked to develop data analysis and experimental methods that take advantage of the scale of this data, while making the minimal assumptions necessary for deriving value from their application. My Ph.D. work began with the development of software and principles for analyzing deep mutational scanning data of libraries of engineered AAV capsids. By looking at not only the top variant in a round of directed evolution, but instead a broad distribution of the variants and their phenotypes, we were able to identify AAV variants with enhanced ability to transduce specific cells in the brain after intravenous injection. I then shifted to better understand the phenotypic profile of these engineered variants. To that end, I turned to single-cell RNA sequencing to seek to identify, with high resolution, the delivery profile of these variants in all cell types present in the cortex of a mouse brain. I began by developing infrastructure and tools for dealing with the data analysis demands of these experiments. Then, by delivering an engineered variant to the animal, I was able to use the single-cell RNA sequencing profile, coupled with a sequencing readout of the delivered genetic cargo present in each cell type, to define the variant\u2019s tropism across the full spectrum of cell types in a single step. To increase the throughput of this experimental paradigm, I then worked to develop a multiplexing strategy for delivering up to 7 engineered variants in a single animal, and obtain the same high resolution readout for each variant in a single experiment. 
Finally, to take a step towards translation to human diagnostics, I leveraged the tools I built for scaling single-cell RNA sequencing studies and worked to develop a protocol for obtaining single-cell immune profiles of low volumes of self-collected blood. This study enabled repeat sampling in a short period of time, and revealed an incredible richness in individual variability and time-of-day dependence of human immune gene expression. Together, my Ph.D. work provides strategies for employing massively parallel sequencing and synthesis for new biological applications, and builds towards a future paradigm where personalized, high-resolution sequencing might be coupled with modular, customized gene therapy delivery.</p>",
        "doi": "10.7907/yqjm-6609",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14617",
        "collection": "thesis",
        "collection_id": "14617",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05252022-172145394",
        "type": "thesis",
        "title": "Mechanical Approach to Active Matter: Reverse Osmotic Effect and Motility-Induced Phase Separation",
        "author": [
            {
                "family_name": "Row",
                "given_name": "Hyeongjoo",
                "orcid": "0000-0003-3623-512X",
                "clpid": "Row-Hyeongjoo"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Brady",
                "given_name": "John F.",
                "orcid": "0000-0001-5817-9128",
                "clpid": "Brady-J-F"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Shapiro",
                "given_name": "Mikhail G.",
                "orcid": "0000-0002-0291-4215",
                "clpid": "Shapiro-M-G"
            },
            {
                "family_name": "Brady",
                "given_name": "John F.",
                "orcid": "0000-0001-5817-9128",
                "clpid": "Brady-J-F"
            },
            {
                "family_name": "Wang",
                "given_name": "Zhen-Gang",
                "orcid": "0000-0002-3361-6114",
                "clpid": "Wang-Zhen-Gang"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "The defining feature of active matter, self-propulsion requires constant consumption of energy to be maintained. As a result, active matter systems are inherently out of equilibrium and some principles that are accepted as common knowledge, particularly from thermodynamics, do not apply to the active matter systems. Arguably the most popular example is the motility-induced phase separation (MIPS) -- active matter can spontaneously phase separate into liquid-like dense phase and gas-like sparse phase even without any attractive interactions between the self-propelling constituents. In this thesis, I demonstrate the utility of a mechanical perspective in revealing and understanding the underlying physics of seemingly confounding behaviors of active matter systems. In Chapters 2 and 3, I consider the mechanics of a suspension of active colloidal particles when the transport properties (self-propelling speed and diffusivities) vary spatially. The mechanical analysis reveals the reverse-osmotic nature of active matter systems with a spatial variation in activity. I provide an explanation for why physical processes governed by the osmotic pressure of particles can appear in a reversed manner in active matter systems, e.g. a fluid can flow from regions of high concentration to low in a suspension of active colloids. In Chapter 4, I develop a mechanical theory of phase coexistence that applies to both equilibrium and nonequilibrium systems. By applying the mechanical theory to MIPS, I find phase coexistence conditions of the MIPS that allow a construction of a phase diagram, which excellently agrees with the results from computer simulations. The mechanical theory also allows access to the microscopic structure of phase interfaces. By investigating the interfacial structure, I discover interesting nonequilibrium interfacial behavior of the MIPS. 
I find that the width of the MIPS interface varies nonmonotonically with the activity of particles and provide a mechanical explanation for this phenomenon.",
        "doi": "10.7907/qef0-e420",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14517",
        "collection": "thesis",
        "collection_id": "14517",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:03162022-173632582",
        "primary_object_url": {
            "basename": "Abdel-haq_Reem_2022_thesis.pdf",
            "content": "final",
            "filesize": 44751420,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14517/1/Abdel-haq_Reem_2022_thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Gut Microbiome Modulates Microglia Physiology in Homeostatic and Disease States",
        "author": [
            {
                "family_name": "Abdel-Haq",
                "given_name": "Reem",
                "orcid": "0000-0002-7418-5736",
                "clpid": "Abdel-Haq-Reem"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Mazmanian",
                "given_name": "Sarkis K.",
                "orcid": "0000-0003-2713-1513",
                "clpid": "Mazmanian-S-K"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Gradinaru",
                "given_name": "Viviana",
                "orcid": "0000-0001-5868-348X",
                "clpid": "Gradinaru-V"
            },
            {
                "family_name": "Mazmanian",
                "given_name": "Sarkis K.",
                "orcid": "0000-0003-2713-1513",
                "clpid": "Mazmanian-S-K"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Chan",
                "given_name": "David C.",
                "orcid": "0000-0002-0191-2154",
                "clpid": "Chan-D-C"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "The gastrointestinal tract (GI) harbors a complex community of ~100 trillion bacteria, fungi, and viruses collectively referred to as the gut microbiome. Through direct and indirect signaling mechanisms, the gut microbiome exerts its effects on almost every organ system, including the brain. Constant, bi-directional communication along the gut-brain axis is required for the normal and healthy development of the host Central Nervous System (CNS). One of the cells in the CNS shaped by microbial-derived cues is microglia, the resident immune cells in the brain. Aberrant microglia activity is a driving force of several neurological diseases in which the gut microbiome plays a role, including Parkinson\u2019s disease (PD). \r\n\r\nIn this thesis, we explore the interplay between gut microbiota signaling and microglia physiology during homeostatic and disease states. We first detail how microbial signaling along the gut-brain axis shapes microglial development and function. Next, we explore how the gut microbiome composition influences microglial activation states in the context of disease. Leveraging a preclinical mouse model of PD, we show that dietary-driven changes to the gut microbiome through the use of prebiotics attenuates motor deficits and \u03b1-synuclein aggregation. These effects result from changes in microglial gene expression and activation status. Collectively, these findings have broad implications for the gut microbiome research community and highlight potential for development of microbiome-based therapies for diseases of the brain.",
        "doi": "10.7907/ht1j-2461",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14934",
        "collection": "thesis",
        "collection_id": "14934",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06022022-032024376",
        "primary_object_url": {
            "basename": "Thesis_ChristinaSu_2022.pdf",
            "content": "final",
            "filesize": 5427101,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14934/1/Thesis_ChristinaSu_2022.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Principles of Addressing Specificity in Promiscuous Ligand-Receptor Systems",
        "author": [
            {
                "family_name": "Su",
                "given_name": "Christina Janet",
                "orcid": "0000-0002-9223-9777",
                "clpid": "Su-Christina-Janet"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Chan",
                "given_name": "David C.",
                "orcid": "0000-0002-0191-2154",
                "clpid": "Chan-D-C"
            },
            {
                "family_name": "Elowitz",
                "given_name": "Michael B.",
                "orcid": "0000-0002-1221-0967",
                "clpid": "Elowitz-M-B"
            },
            {
                "family_name": "Goentoro",
                "given_name": "Lea A.",
                "orcid": "0000-0002-3904-0195",
                "clpid": "Goentoro-L-A"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>In multicellular organisms, a relatively small number of highly conserved signaling pathways are used to enable intercellular communication. While the underlying molecular components and interactions are increasingly well understood, a fundamental mystery is how the diverse cell types of the body can be so precisely coordinated by so few pathways. It has long been known that different cell types exhibit varied responses to molecular signals, and it is unclear how this cell type specificity arises. In this work, we take a different perspective on this question and explore how cell type specificity can be generated at the level of intracellular signal. We refer to this ability to selectively activate different cell types as \"addressing.\" By eliminating the complexity of considering downstream pathway effectors, we are able to more comprehensively understand how cell type specificity can arise in spite of\u2014or because of\u2014promiscuity in ligand-receptor interactions. We focus on the bone morphogenetic protein (BMP) pathway as an ideal example. This pathway is essential in development, is of therapeutic interest in an array of pathologies, and has proven amenable to theoretical and experimental analysis. We first describe a minimal model of the pathway and identify what types of response functions can be achieved. We show that each layer of computation, from the formation of signaling complexes to the activation of downstream second messenger, can provide nontrivial integrations of ligand inputs. We then extend this analysis to systems with multiple cell types that may vary in receptor expression profile. The diverse response functions of this pathway enable systems in which different cell types or sets of cell types may be addressed with high specificity. In particular, the BMP pathway can address multiple cell types with high capacity, flexibility, and robustness. 
Taken together, these results provide a framework for understanding how molecular promiscuity in signaling pathways can, in fact, enable cellular specificity in pathway responses.</p>",
        "doi": "10.7907/z7dv-m192",
        "publication_date": "2022",
        "thesis_type": "phd",
        "thesis_year": "2022"
    },
    {
        "id": "thesis:14204",
        "collection": "thesis",
        "collection_id": "14204",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05302021-051953086",
        "primary_object_url": {
            "basename": "CheeHuat(Linus)Eng_thesis.pdf",
            "content": "final",
            "filesize": 6063063,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14204/1/CheeHuat(Linus)Eng_thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Plus Ultra: Genome-Wide Spatial Transcriptomics with RNA seqFISH+",
        "author": [
            {
                "family_name": "Eng",
                "given_name": "Chee Huat (Linus)",
                "orcid": "0000-0002-2521-9696",
                "clpid": "Eng-Chee-Huat-Linus"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Cai",
                "given_name": "Long",
                "orcid": "0000-0002-7154-5361",
                "clpid": "Cai-Long"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Ismagilov",
                "given_name": "Rustem F.",
                "orcid": "0000-0002-3680-4399",
                "clpid": "Ismagilov-R-F"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Guttman",
                "given_name": "Mitchell",
                "orcid": "0000-0003-4748-9352",
                "clpid": "Guttman-M"
            },
            {
                "family_name": "Cai",
                "given_name": "Long",
                "orcid": "0000-0002-7154-5361",
                "clpid": "Cai-Long"
            }
        ],
        "local_group": [
            {
                "literal": "div_chem"
            }
        ],
        "abstract": "<p>Visualizing single cells and their organization in intact tissue is crucial to understanding their governing biological function. Even though single cell RNA sequencing has provided many insights into the heterogeneity and gene expression profiles across many tissue types, the dissociation process which loses the spatial information is hindering our deeper understanding of how these transcriptional distinct cell types are organized and interacting in their native tissue environment.</p>\r\n\r\n<p>The thesis begins by giving a background on how single cell RNA sequencing has transformed biology and the emergence of spatial technology such as sequential fluorescence in situ hybridization (seqFISH).  While spatial methods are useful for mapping the cell types identified from single cell RNA sequencing, the need for turning spatial technology such as seqFISH, which has high detection efficiency of the transcriptome with spatial information, into an in situ discovery tool is discussed as the scientific community\u2019s goal heads towards building spatial atlases for every human tissues and organs such as the brain.</p>\r\n \r\n<p>While seqFISH has high detection efficiency, it is still limited in the number of genes capable of profiling at once. The major obstacle is the optical crowding problems when more RNA species are targeted and imaged using a fluorescence microscope. In Chapter 2, we first investigated, if the RNA molecules are instead captured on a coverslip and profiled with sequential barcoding strategy, the FISH-based method will reliably characterize the transcriptome when molecular crowding is not an issue.</p>\r\n \r\n<p>Finally, in Chapter 3, we demonstrate the barcoding strategy to break through the molecular crowding limit of multiplexed FISH. 
Whereas the multiplexed FISH methods available in the field at that time could profile hundreds to a thousand genes, we succeeded in profiling 10,000 genes by RNA seqFISH+, an evolved version of seqFISH, in various intact tissue sections, turning seqFISH+ into a spatial discovery technology with its genome-wide coverage and high detection efficiency. The work described in this part of the thesis is highlighted in Nature Methods\u2019 Method of the Year 2020 article on spatially resolved transcriptomics.</p>",
        "doi": "10.7907/nvfe-5j74",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:14111",
        "collection": "thesis",
        "collection_id": "14111",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:03262021-160841703",
        "primary_object_url": {
            "basename": "thesis.pdf",
            "content": "final",
            "filesize": 3545856,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14111/1/thesis.pdf",
            "version": "v7.0.0"
        },
        "type": "thesis",
        "title": "Signal Amplification in Synthetic Bacterial Communication",
        "author": [
            {
                "family_name": "Parkin",
                "given_name": "James Michael",
                "orcid": "0000-0002-4058-2338",
                "clpid": "Parkin-James-Michael"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "orcid": "0000-0002-5899-7523",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Leadbetter",
                "given_name": "Jared R.",
                "orcid": "0000-0002-7033-0844",
                "clpid": "Leadbetter-J-R"
            },
            {
                "family_name": "Bois",
                "given_name": "Justin S.",
                "orcid": "0000-0001-7137-8746",
                "clpid": "Bois-Justin-S"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "orcid": "0000-0002-5785-7481",
                "clpid": "Murray-R-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Synthetic biology will one day enable embedded control of a variety of chemical and biological contexts, from the human gastrointestinal tract to crop roots. Groups of engineered organisms, also known as synthetic consortia, can inhabit niches of interest while monitoring and intervening according to their genetic design. However, the spatial structure of the deployment environments can obstruct coordination between cosortia members. The mechanisms engineered bacteria use to communicate must contend with these adversarial conditions to maximize group performance.</p>\r\n\r\n<p>Coordination between synthetic bacteria is typically achieved using small molecules that can traverse cell membranes through passive transport. Cell communicate by producing and sensing these small molecules. In cell-cell signaling relationships composed of a sender population and a receiver population, the concentration of signaling molecule sensed by the receiver cells depends on the spatial patterning of the two groups, the geometry of the diffusive environment, and the sender population\u2019s signal secretion rate.</p>\r\n\r\n<p>To make sender-receiver communication more robust to these environmental features, we introduce a third consortium strain that transiently amplifies local signaling molecule concentrations. These amplifier cells employ a synchronized pulse-generating circuit built using Lux-type quorum sensing components and an IFFL transcriptional architecture. When applied to sender-receiver consortia growing on semi-solid media, these amplifier cells respond to sender-secreted signaling molecules by contributing a small amount themselves. The support of amplifier cells enables communication over longer distances than can be achieved by sender cells alone and can partially recover coordination in small consortia where the sender population is too small to successfully signal its receiver population alone. 
We extend these results using simulation to investigate the benefit that amplifier cells confer to consortia of varying complexity.</p>",
        "doi": "10.7907/50p8-bd89",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:14084",
        "collection": "thesis",
        "collection_id": "14084",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:02192021-010538691",
        "primary_object_url": {
            "basename": "chour_william_2021_thesis.pdf",
            "content": "final",
            "filesize": 140220437,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14084/1/chour_william_2021_thesis.pdf",
            "version": "v6.0.0"
        },
        "type": "thesis",
        "title": "Molecular Technologies for Antigen-Based Immunity",
        "author": [
            {
                "family_name": "Chour",
                "given_name": "William",
                "orcid": "0000-0003-1817-0123",
                "clpid": "Chour-William"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Heath",
                "given_name": "James R.",
                "orcid": "0000-0001-5356-4385",
                "clpid": "Heath-J-R"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Shapiro",
                "given_name": "Mikhail G.",
                "orcid": "0000-0002-0291-4215",
                "clpid": "Shapiro-M-G"
            },
            {
                "family_name": "Heath",
                "given_name": "James R.",
                "orcid": "0000-0001-5356-4385",
                "clpid": "Heath-J-R"
            },
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Yang",
                "given_name": "Changhuei",
                "orcid": "0000-0001-8791-0354",
                "clpid": "Yang-Changhuei"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The presence and proliferation antigen-specific T cells is a defining characteristic of an adaptive immune response against various disease types (autoimmune, cancer, and infectious). The use of Class I and Class II peptide-major histocompatibility complex (pMHC) reagents to identify such cells, however, is technically difficult and expensive, and it has been challenging to refine synthesis protocols for higher yield and more efficient assembly to accommodate large-scale applications. This achievement would enable high-throughput capture of corresponding T cell receptors (TCR), which may be further used in clinical applications such as adoptive cell transfer therapies. Overcoming this hurdle requires the development and integration of various molecular technologies and analytical methods.</p>\r\n\r\n<p>Toward this end, the bulk of my thesis work, covered in Chapter 2, introduces these developments in the context of pMHCs, where the three subunits of each reagent are covalent linked together and expressed as a single protein. These single-chain trimer (SCT) technologies primarily consist of traditional DNA cloning and protein production techniques which have been streamlined for applications requiring output on the scale of 10<sup>2</sup>-10<sup>3</sup> of reagents. This chapter serves as the foundation for much of the methodology discussed throughout the rest of my thesis, and thus should serve as a reference point. The generated constructs are also functionally validated here, and potential future research directions are outlined.</p>\r\n\r\n<p>In Chapter 3, I explore the use of this technology in the context of COVID-19 to enumerate antigen specificity of the CD8+ T cell immune response. Class I SCTs were constructed to present peptides across several SARS-CoV-2 protein domains, using various HLA alleles to match haplotyped participant blood samples. 
These reagents were then used to capture SARS-CoV-2-specific T cells through flow and nanoparticle cytometry to demonstrate HLA-dependent, domain-dependent immune responses. Identified TCRs were cloned into T cells for confirmation of antigen specificity and functional cytotoxicity.</p>\r\n\r\n<p>In Chapters 4 and 5, I explore potential pMHC applications in cancer antigen contexts, covering both tumor-associated and tumor-specific antigens. Through various collaborations across the west coast (UCLA, Parker Institute, Fred Hutchinson Cancer Research Center), I make use of the SCT platform to showcase new assays to discover and rank key tumor targets (Chapter 4). Finally, Chapter 5 is a reproduction of our lab\u2019s published work concerning identification of antigen-specific CD8+ T cells from melanoma cancer patients.</p>\r\n\r\n<p>In summary, the adaptation of SCTs in a high-throughput format allows for the rapid enumeration of antigen-specific T-cell receptor sequences. As demonstrated in the contexts of COVID-19 and cancer, this SCT platform enables subsequent downstream applications, such as single-cell, antigen-specific immunophenotypic mapping/analysis and target discovery for personalized immunotherapies.</p>",
        "doi": "10.7907/z20t-nq62",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:14223",
        "collection": "thesis",
        "collection_id": "14223",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06012021-203020365",
        "primary_object_url": {
            "basename": "Shashank_Gandhi_Thesis_Final.pdf",
            "content": "final",
            "filesize": 80122487,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/14223/1/Shashank_Gandhi_Thesis_Final.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Molecular Mechanisms Underlying Cardiac Neural Crest Development in Avian Embryos",
        "author": [
            {
                "family_name": "Gandhi",
                "given_name": "Shashank",
                "orcid": "0000-0002-4081-4338",
                "clpid": "Gandhi-Shashank"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Bronner",
                "given_name": "Marianne E.",
                "orcid": "0000-0003-4274-1862",
                "clpid": "Bronner-M-E"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Parker",
                "given_name": "Joseph",
                "orcid": "0000-0001-9598-2454",
                "clpid": "Parker-Joseph"
            },
            {
                "family_name": "Zernicka-Goetz",
                "given_name": "Magdalena",
                "orcid": "0000-0002-7004-2471",
                "clpid": "Zernicka-Goetz-M"
            },
            {
                "family_name": "Bronner",
                "given_name": "Marianne E.",
                "orcid": "0000-0003-4274-1862",
                "clpid": "Bronner-M-E"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The neural crest is a multipotent, vertebrate-specific stem cell population that gives rise to diverse cell types in the developing embryo, including craniofacial cartilage, enteric ganglia, and cardiac septa. Neural crest cells that originate from a given axial level in the embryo give rise to a characteristic array of progeny and follow distinct pathways from those arising at other levels. One of these subpopulations, called the cardiac neural crest, originates in the dorsal hindbrain and migrates into the developing heart, where it forms the aorticopulmonary septum, cardiac ganglion, and part of the interventricular septum. Mutations in or loss of these cells causes heart defects that are among the most common birth defects in the general population. For my thesis, I sought to identify the mechanisms that underlie the formation of neural crest cells, and confer cardiac neural crest cells with their unique developmental potential.</p>\r\n\r\n<p>To enable interrogation of epistatic relationships between key neural crest genes during neural crest induction and crest specification, I first optimized the CRISPR-Cas9 system for genome editing in gastrula and neurula-stage chicken embryos. I then further improved the CRISPR toolbox by devising an all-in-one single-plasmid strategy that harnesses the self-cleavage properties of ribozymes for the simultaneous delivery of Cas9, gRNAs, and fluorescent reporters in transfected cells. This has enabled live tracking of wildtype and mutant neural crest cells as they migrate to their terminal locations.</p>\r\n\r\n<p>Prior to their induction at the neural plate border, precursors in the neural plate border are transcriptionally primed toward multiple cell fates, including neural tube, neural crest, epidermis, and placode. 
While this priming has been thought to involve epigenetic regulation, chromatin remodeler genes have been overlooked in the context of neural crest formation given their concomitant expression in surrounding cell types. By combining single-cell transcriptional profiling of the early chick embryonic hindbrain with temporally-controlled knockouts, I uncovered a novel bimodal mechanism whereby the chromatin remodeler gene <i>Hmga1</i> first regulates <i>Pax7</i>-dependent neural crest induction at the neural plate border, and later modulates Wnt signaling in the dorsal neural tube to control neural crest delamination. These results established <i>Hmga1</i> as a direct regulator of neural crest induction and emigration.</p>\r\n\r\n<p>Finally, given that amongst distinct neural crest subpopulations designated as cranial, cardiac/vagal, and trunk, only cardiac crest has the ability to contribute to heart development, and that neither trunk nor cranial neural crest subpopulations can rescue the loss of cardiac crest, I investigated the genetic logic that imbues cardiac crest with its unique ability to form cardiovascular derivatives. To this end, I combined surgical ablations, bulk and single-cell transcriptional profiling, RNA labeling, CRISPR-Cas9-mediated gene editing, transcription factor binding motif mutation analysis, and transgenic tissue grafting approaches to uncover and characterize a cardiac-neural-crest-specific subcircuit comprised of the transcription factors <i>Sox8</i>, <i>Tgif1</i>, and <i>Ets1</i>. 
I demonstrated that ectopic expression of this subcircuit in trunk neural crest cells reprogrammed them towards a cardiac-crest-like fate, and transplanting these reprogrammed cells in place of ablated cardiac crest restored cardiac-crest-like migration patterns and rescued outflow tract septation defects.</p>\r\n\r\n<p>Taken together, my thesis work has not only built a genome engineering toolbox for a key model system in developmental biology, but has also expanded our understanding of the genetic circuits that govern the formation of the cardiac neural crest and underlie its unique ability to contribute to the heart.</p>",
        "doi": "10.7907/y1e4-d090",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:13838",
        "collection": "thesis",
        "collection_id": "13838",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:07082020-113341068",
        "type": "thesis",
        "title": "Guiding Self-Organization in Active Matter with Spatiotemporal Boundary Conditions",
        "author": [
            {
                "family_name": "Ross",
                "given_name": "Tyler David",
                "orcid": "0000-0002-7872-3992",
                "clpid": "Ross-Tyler-David"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "orcid": "0000-0002-5899-7523",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Rothemund",
                "given_name": "Paul W. K.",
                "orcid": "0000-0002-1653-3202",
                "clpid": "Rothemund-P-W-K"
            },
            {
                "family_name": "Qian",
                "given_name": "Lulu",
                "orcid": "0000-0003-4115-2409",
                "clpid": "Qian-Lulu"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            },
            {
                "family_name": "Brady",
                "given_name": "John F.",
                "orcid": "0000-0001-5817-9128",
                "clpid": "Brady-J-F"
            },
            {
                "family_name": "Shapiro",
                "given_name": "Mikhail G.",
                "orcid": "0000-0002-0291-4215",
                "clpid": "Shapiro-M-G"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>In this thesis, I demonstrate that self-organized structures and forces can be guided by modulating the interactions between force-generating molecules in space and time. The physics of self-organizing systems is an open frontier. We do not have a complete set of principles that can describe how a dynamic structure forms based on the non-equilibrium dynamics of its constituent components. Yet, living systems appear to depend on some set of rules of self-organization in order to reliably carry out their mechanical functions. Force-generating active molecules in the form of motor proteins and filamentous polymers are responsible for performing fundamental tasks in living matter, such as locomotion and division. While it is known that the regulation of motor-filament interactions is necessary to achieve the dynamic structures that drive movement and propagation, the role of spatial and temporal patterning in self-organizing systems has not been explored. I design an artificial system of purified molecules in which the interactions between motors and filaments are toggled with light. By patterning molecular interactions in space and time, I show that it is possible to localize the formation of spherically symmetric asters, which can be moved, merged, and used to generate advective fluid flows. The ability to pattern molecular interactions in space and time offers a new perspective in the search for principles of active self-organization. Spatial and temporal control makes it possible to start distilling how the interactions between active molecules determine the mesoscopic behaviors of self-organized structures. These rules ultimately govern the physics of living matter and may eventually be harnessed to build new materials and cell-like machines.</p>",
        "doi": "10.7907/q85h-j730",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:13837",
        "collection": "thesis",
        "collection_id": "13837",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:07072020-154545363",
        "type": "thesis",
        "title": "Mechanism and Scaling of Eukaryotic Transcription Activation",
        "author": [
            {
                "family_name": "Quintero Cadena",
                "given_name": "Porfirio",
                "orcid": "0000-0003-0067-5844",
                "clpid": "Quintero-Cadena-Porfirio"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Rothenberg",
                "given_name": "Ellen V.",
                "orcid": "0000-0002-3901-347X",
                "clpid": "Rothenberg-E-V"
            },
            {
                "family_name": "Guttman",
                "given_name": "Mitchell",
                "orcid": "0000-0003-4748-9352",
                "clpid": "Guttman-M"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>Transcription activation is a universal process by which living cells adapt. Decades of work in this field have produced an intelligible paradigm of transcription activation that provides fundamental insights into its underlying molecular mechanisms. This thesis attempts to extend this paradigm to explain how transcription activation can be implemented across the diversity of molecular environments found in eukaryotic nuclei. Specifically, this diversity calls for an explanation of how this process scales throughout a range of genome sizes that spans five orders of magnitude, and of how to think about this subject in the increasingly relevant context of liquid-liquid phase separation. We leverage data from RNA-seq, smFISH, growth-rate measurements, fluorescence microscopy, computer simulations, and the literature to identify an appropriate and useful level of abstraction in which to grow our current paradigm. We propose that scaling and phase separation, two seemingly disparate aspects of transcription, are explained and intrinsically linked by a novel molecular state in which multiple RNA polymerases can bind the transcription complex. We provide support and rationale for this addition to the transcription model, and generate testable hypotheses that may further clarify the mechanism and evolution of eukaryotic transcription activation.</p>",
        "doi": "10.7907/m21w-8461",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:14260",
        "collection": "thesis",
        "collection_id": "14260",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:06082021-005042886",
        "type": "thesis",
        "title": "Statistical Mechanics of Problems in Transcription Regulation",
        "author": [
            {
                "family_name": "Morrison",
                "given_name": "Muir",
                "orcid": "0000-0002-0768-7234",
                "clpid": "Morrison-Muir"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Roukes",
                "given_name": "Michael Lee",
                "orcid": "0000-0002-2916-6026",
                "clpid": "Roukes-M-L"
            },
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Van Valen",
                "given_name": "David A.",
                "orcid": "0000-0001-7534-7621",
                "clpid": "Van-Valen-David-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_pma"
            }
        ],
        "abstract": "<p>As the quantity of sequenced genome data continues to multiply, our understanding of the transcriptional regulation of genomes has lagged behind. This deficit impinges on research throughout biology, from fundamental questions of how evolution proceeds to eminently practical questions such as how antibiotic resistance arises.</p>\r\n\r\n<p>In this thesis we present three threads that address the question of transcriptional regulation from distinct perspectives. The first thread focuses on the simplest nontrivial regulation motif common in bacteria. We analyze in turn a sampling of the myriad mathematical models previously proposed in the literature for this system. We attempt to shine light on the similarities and differences of the models\u2019 predictions, clarify their microscopic interpretations, and offer guidance as to situations when one model or another should be preferred or even distinguishable.</p>\r\n\r\n<p>The second thread considers a substantially more complicated genetic circuit, for which we build a minimal phenomenological model that retains intuitive microscopic meaning for all its parameters. The model neatly explains recent experimental observations of bistability in the circuit, and suggests natural generalizations to other metabolically important gene circuits with qualitatively similar architectures.</p>\r\n\r\n<p>Motivation for the third thread comes from even more complicated transcriptional regulation problems with a multitude of regulatory proteins and binding sites, where even enumerating all possible DNA-protein complexes manually is a formidable challenge. Here we propose a method to tackle this complexity that uses ideas from quantum field theory to encode assembly rules for macromolecular complexes. By specifying a small set of rules, we avoid manual enumeration of the much larger set of complexes, allowing the formalism to automatically generate this set for us.</p>",
        "doi": "10.7907/d042-rp26",
        "publication_date": "2021",
        "thesis_type": "phd",
        "thesis_year": "2021"
    },
    {
        "id": "thesis:13709",
        "collection": "thesis",
        "collection_id": "13709",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05182020-141933604",
        "primary_object_url": {
            "basename": "NeumannAdam2020Thesis.pdf",
            "content": "final",
            "filesize": 13775647,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/13709/1/NeumannAdam2020Thesis.pdf",
            "version": "v12.0.0"
        },
        "type": "thesis",
        "title": "Towards Single Molecule Imaging Using Nanoelectromechanical Systems",
        "author": [
            {
                "family_name": "Neumann",
                "given_name": "Adam Patrick",
                "orcid": "0000-0002-2961-7640",
                "clpid": "Neumann-Adam-Patrick"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Roukes",
                "given_name": "Michael Lee",
                "orcid": "0000-0002-2916-6026",
                "clpid": "Roukes-M-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Phillips",
                "given_name": "Robert B.",
                "orcid": "0000-0003-3082-2809",
                "clpid": "Phillips-R-B"
            },
            {
                "family_name": "Roukes",
                "given_name": "Michael Lee",
                "orcid": "0000-0002-2916-6026",
                "clpid": "Roukes-M-L"
            },
            {
                "family_name": "Beauchamp",
                "given_name": "Jesse L.",
                "orcid": "0000-0001-8839-4822",
                "clpid": "Beauchamp-J-L"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Sader",
                "given_name": "John E.",
                "orcid": "0000-0002-7096-0627",
                "clpid": "Sader-J-E"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>We incorporate nanoelectromechanical systems (NEMS) into a state-of-the-art commercial mass spectrometer (Q Exactive Plus with Orbitrap detection). This unique hybrid instrument is capable of ionizing molecules up to 4.5 MDa in their intact native state, isolating molecules of interest according to their mass-to-charge ratio, performing high resolution mass spectrometry (MS), and delivering those molecules to the NEMS. We use NEMS optimized for detecting the inertial mass of adsorbed species directly, which contrasts with indirect measurements of the mass-to-charge ratio performed with typical instruments. This unique form of mass spectrometry, NEMS-MS, with its single-molecule sensitivity, has promising applications to the fields of proteomics and native mass spectrometry, including deep proteomic profiling, single-cell proteomics, mass spectrometry-based imaging, or identifying viruses in their <i>in vivo</i> state.</p>\r\n\r\n<p>We analyze intact <i>E. coli</i> GroEL chaperonin, a noncovalent 801 kDa complex consisting of 14 identical subunits. GroEL was sent to NEMS operated with the first two vibrational modes monitored in real time. Molecules physisorbing to the NEMS cause an abrupt shift in its resonance frequencies. The change in resonance frequencies is used to calculate the mass of each molecule. A mass spectrum is compiled with a main peak of 846 kDa, close to the expected value, and a secondary peak resolved near twice the mass of GroEL.</p>\r\n<p>Measurements are then performed operating the first three modes simultaneously. Using a technique called inertial imaging, frequency shifts are used to calculate the first three mass moments: mass, position, and variance (size). 
This is used to distinguish between adsorbates arriving in a single, point-like distribution or a more extended distribution, thus demonstrating a rudimentary form of molecular imaging.</p>\r\n\r\n<p>Two new theories are presented for analyzing frequency-shift data. The first offers a more streamlined approach for calculating the mass moments. This approach is used to improve the mass spectrum of GroEL calculated using three-mode data, producing a main peak almost fully resolved at 805 kDa. An entirely different approach is presented that allows for obtaining the mass density distribution of an adsorbed molecule (i.e., imaging) with a higher number of modes.</p>",
        "doi": "10.7907/n4ap-7h91",
        "publication_date": "2020",
        "thesis_type": "phd",
        "thesis_year": "2020"
    },
    {
        "id": "thesis:13609",
        "collection": "thesis",
        "collection_id": "13609",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:12162019-183140887",
        "primary_object_url": {
            "basename": "Thesis_Dong-Wook_Kim_v3.pdf",
            "content": "final",
            "filesize": 17443112,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/13609/1/Thesis_Dong-Wook_Kim_v3.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Multimodal Analysis of Cell Types in a Hypothalamic Node Controlling Social Behavior in Mice",
        "author": [
            {
                "family_name": "Kim",
                "given_name": "Dong-Wook",
                "orcid": "0000-0002-5497-5853",
                "clpid": "Kim-Dong-Wook"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Anderson",
                "given_name": "David J.",
                "orcid": "0000-0001-6175-3872",
                "clpid": "Anderson-D-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Pachter",
                "given_name": "Lior S.",
                "orcid": "0000-0002-9164-6231",
                "clpid": "Pachter-Lior-S"
            },
            {
                "family_name": "Anderson",
                "given_name": "David J.",
                "orcid": "0000-0001-6175-3872",
                "clpid": "Anderson-D-J"
            },
            {
                "family_name": "Oka",
                "given_name": "Yuki",
                "orcid": "0000-0003-2686-0677",
                "clpid": "Oka-Yuki"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            }
        ],
        "local_group": [
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>The advent and recent advances of single-cell RNA sequencing (scRNA-seq) have yielded transformative insights into cellular diversity in the central nervous system (CNS) with unprecedented detail. However, due to current experimental and computational limitations on defining transcriptomic cell types (T-types) and the multiple phenotypic features of cell types in the CNS, an integrative and multimodal approach is required for the comprehensive classification of cell types.</p>\r\n\r\n<p>To this end, performing multimodal analysis of scRNA-seq data in the hypothalamus would be very beneficial: the hypothalamus, which controls homeostatic and innate survival behaviors that are known to be highly conserved across a wide range of species and encoded in hard-wired brain circuits, is likely to display a more straightforward relationship between transcriptomic identity, axonal projections, and behavioral activation. In my dissertation, I have focused on characterizing the cell types of a hypothalamic node controlling innate social behavior in mice, the ventrolateral subdivision of the ventromedial hypothalamus (VMHvl). VMHvl contains only ~4,000 neurons per hemisphere in mice, but due to its behavioral, anatomical, and molecular heterogeneity, which T-types in VMHvl are related to connectivity and behavioral function is largely unknown.</p>\r\n\r\n<p>In Chapter II, I describe my main thesis work, in which I performed scRNA-seq in VMHvl using two independent platforms: SMART-seq2 (~4,500 neurons sequenced) and 10x (~78,000 neurons sequenced). Specifically, 17 joint VMHvl T-types including several sexually dimorphic clusters were identified by canonical correlation analysis (CCA) in Seurat, and the majority of them were validated by multiplexed single-molecule FISH (seqFISH). The correspondence between transcriptomic identity and axonal projections or behavioral activation was also investigated. 
Immediate early gene analysis identified T-types exhibiting preferential responses to intruder males versus females but only rare examples of behavior-specific activation. Unexpectedly, many VMHvl T-types comprise a mixed population of neurons with different projection target preferences. Overall, our analysis revealed that, surprisingly, few VMHvl T-types exhibit a clear correspondence with behavior-specific activation and connectivity.</p>\r\n\r\n<p>In Chapter III, I discuss future directions for a deeper and better understanding of VMHvl cell types. Briefly, my previous data from whole-cell patch clamp recording in VMHvl slices suggested that there were at least 4 distinct electrophysiological cell types (E-types). Additionally, two distinct neuromodulatory effects on VMHvl were observed (persistently activated by vasopressin/oxytocin vs. silenced by nitric oxide) by monitoring population activity using two-photon Ca2+ imaging in slices. Based on the results from the first part and combined with advanced molecular techniques (e.g. Patch-seq and CRISPR-Cas9), we can further dissect the cellular diversity in VMHvl and its functional implications.</p>",
        "doi": "10.7907/RGVK-9962",
        "publication_date": "2020",
        "thesis_type": "phd",
        "thesis_year": "2020"
    },
    {
        "id": "thesis:11243",
        "collection": "thesis",
        "collection_id": "11243",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10232018-150005837",
        "primary_object_url": {
            "basename": "AngelesAlbores_David_2019.pdf",
            "content": "final",
            "filesize": 8920493,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/11243/1/AngelesAlbores_David_2019.pdf",
            "version": "v6.0.0"
        },
        "type": "thesis",
        "title": "A Theory of Genetic Analysis Using Transcriptomic Phenotypes",
        "author": [
            {
                "family_name": "Angeles-Albores",
                "given_name": "David",
                "orcid": "0000-0001-5497-8264",
                "clpid": "Angeles-Albores-David"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Newman",
                "given_name": "Dianne K.",
                "orcid": "0000-0003-1647-1918",
                "clpid": "Newman-D-K"
            },
            {
                "family_name": "Meyerowitz",
                "given_name": "Elliot M.",
                "orcid": "0000-0003-4798-5153",
                "clpid": "Meyerowitz-E-M"
            },
            {
                "family_name": "Thomson",
                "given_name": "Matthew",
                "orcid": "0000-0003-1021-1234",
                "clpid": "Thomson-Matthew"
            },
            {
                "family_name": "Sternberg",
                "given_name": "Paul W.",
                "orcid": "0000-0002-7699-0173",
                "clpid": "Sternberg-P-W"
            }
        ],
        "local_group": [
            {
                "literal": "WormBase"
            },
            {
                "literal": "div_bbe"
            }
        ],
        "abstract": "<p>This thesis deals with the conceptual and computational framework required to use transcriptomes as effective phenotypes for genetic analysis. I demonstrate that there are powerful theoretical reasons why Batesonian epistasis should feature prominently in transcriptional phenotypes. I also show how to compute and interpret the aggregate statistics for transcriptome-wide epistasis and transcriptome-wide dominance using whole-organism transcriptomic profiles of C. elegans mutants. Finally, I developed the WormBase Enrichment Suite for enrichment analysis of genomic data.</p>\r\n\r\n<p>RNA-seq as a tool has enormous potential because it relies on protocols that are fast, simple and increasingly cheap. In spite of their potential, transcriptomes have seen their use largely limited to single-factor experiments. Even when many transcriptomes are collected, the main analytic approach is to apply clustering algorithms that correlate responses but do not have any power to identify causal mechanisms.</p>\r\n\r\n<p>I demonstrate that if a complete genetic experimental design is used (in the form of a full two-factor matrix), transcriptomes can establish genetic interactions between a pair of genes without the need for clustering algorithms. Surprisingly, when we performed epistasis analyses of hypoxia pathway mutants in C. elegans we did not simply observe a generalized epistatic interaction between the mutants. In fact, the transcriptomes recapitulated the same Batesonian epistatic relationship that had been observed using classical phenotypes. In other words, we observed that the transcriptomic phenotype of one gene can be masked by the transcriptomic phenotype of a second gene, such that a double mutant of these two genes has exactly the same phenotype as a single mutant of the epistatic gene. Motivated by this observation, we developed methods to recognize and interpret Batesonian epistasis at the transcriptomic level. 
This method relies on the calculation of a single aggregate coefficient that we named the transcriptome-wide epistasis coefficient.</p>\r\n\r\n<p>The observation that Batesonian epistasis could be reproduced on a transcriptomic level was surprising. To explain how transcriptome-wide epistasis can arise, I studied a simplified model of transcriptional regulation using statistical mechanics. These studies demonstrate that epistatic analysis is equivalent to a perturbative analysis of the partition function of a promoter. Moreover, these studies revealed that a sufficient condition for Batesonian epistasis to occur is if the two genes encode variables that are transformed and multiplied together to form an effective single compound variable. Finally, these studies clearly demonstrate the connection between statistical (or generalized) epistasis and Batesonian epistasis and establish a physical basis for genetic logic.</p>\r\n\r\n<p>Genetic analyses of gene functional units can also be carried out using allelic series in tandem with complementation (also known as dominance) tests. I developed a statistical coefficient known as transcriptome-wide dominance to enable analyses of allelic series using expression profiles. A crucial aspect of allelic series is the ability to enumerate the independent phenotypes associated with an arbitrary set of alleles. I developed the concept of phenotypic classes as a transcriptomic analogue of classical phenotypes for this purpose. Briefly, a phenotypic class is a set of transcripts that are differentially expressed in a specific set of genotypes. Thus, an allelic series consisting of two mutant alleles (and a wild-type) can at most result in 7 phenotypic classes. However, some of these phenotypic classes may be artifactual as a result of the significant false positive and false negative rates that are associated with RNA-seq. 
I developed a simple algorithm that tries to identify phenotypic classes that are artifactual, though often these classes may also be identified through a critical evaluation of their biological implications. I applied these concepts to a small allelic series of the dpy-22 gene, which encodes a Mediator subunit in C. elegans, and identified 3\u20134 functional units along with their sequence requirements.</p>\r\n\r\n<p>Finally, I developed the WormBase Enrichment Suite by implementing a hypergeometric test on the tissue, gene and phenotype ontology for C. elegans. The importance of this tool derives mainly from its integration with WormBase, the repository of all C. elegans knowledge, which means that the databases that are tested will undergo continuous improvement and curation, and thus will yield the most accurate results.</p>",
        "doi": "10.7907/JRNS-NS05",
        "publication_date": "2019",
        "thesis_type": "phd",
        "thesis_year": "2019"
    }
]