[
    {
        "id": "thesis:7925",
        "collection": "thesis",
        "collection_id": "7925",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08172013-192316055",
        "primary_object_url": {
            "basename": "Sean_Keller_PhD_Thesis_2014.pdf",
            "content": "final",
            "filesize": 5183773,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/7925/1/Sean_Keller_PhD_Thesis_2014.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Robust Near-Threshold QDI Circuit Analysis and Design",
        "author": [
            {
                "family_name": "Keller",
                "given_name": "Sean Jason",
                "clpid": "Keller-Sean-Jason"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Wierman",
                "given_name": "Adam C.",
                "clpid": "Wierman-A-C"
            },
            {
                "family_name": "Emami",
                "given_name": "Azita",
                "clpid": "Emami-A"
            },
            {
                "family_name": "Harris",
                "given_name": "David Money",
                "clpid": "Harris-D-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "The two most important digital-system design goals today are to reduce power consumption and to increase reliability. Reductions in power consumption improve battery life in the mobile space and reductions in energy lower operating costs in the datacenter. Increased robustness and reliability shorten down time, improve yield, and are invaluable in the context of safety-critical systems. While optimizing towards these two goals is important at all design levels, optimizations at the circuit level have the furthest reaching effects; they apply to all digital systems. This dissertation presents a study of robust minimum-energy digital circuit design and analysis. It introduces new device models, metrics, and methods of calculation\u2014all necessary first steps towards building better systems\u2014and demonstrates how to apply these techniques. It analyzes a fabricated chip (a full-custom QDI microcontroller designed at Caltech and taped-out in 40-nm silicon) by calculating the minimum energy operating point and quantifying the chip\u2019s robustness in the face of both timing and functional failures.",
        "doi": "10.7907/79EJ-Q945",
        "publication_date": "2014",
        "thesis_type": "phd",
        "thesis_year": "2014"
    },
    {
        "id": "thesis:7226",
        "collection": "thesis",
        "collection_id": "7226",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10072012-230900231",
        "primary_object_url": {
            "basename": "Nikil-Mehta-2013.pdf",
            "content": "final",
            "filesize": 1724373,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/7226/1/Nikil-Mehta-2013.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "An Ultra-Low-Energy, Variation-Tolerant FPGA Architecture Using Component-Specific Mapping",
        "author": [
            {
                "family_name": "Mehta",
                "given_name": "Nikil",
                "clpid": "Mehta-Nikil"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Calhoun",
                "given_name": "Benton H.",
                "clpid": "Calhoun-B-H"
            },
            {
                "family_name": "Emami",
                "given_name": "Azita",
                "clpid": "Emami-A"
            },
            {
                "family_name": "Hajimiri",
                "given_name": "Ali",
                "clpid": "Hajimiri-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>As feature sizes scale toward atomic limits, parameter variation continues to increase, leading to increased margins in both delay and energy.  Parameter variation both slows down devices and causes devices to fail.  For applications that require high performance, the possibility of very slow devices on critical paths forces designers to reduce clock speed in order to meet timing.  For an important and emerging class of applications that target energy-minimal operation at the cost of delay, the impact of variation-induced defects at very low voltages mandates the sizing up of transistors and operation at higher voltages to maintain functionality.</p>  \r\n\r\n<p>With post-fabrication configurability, FPGAs have the opportunity to self-measure the impact of variation, determining the speed and functionality of each individual resource. Given that information, a delay-aware router can use slow devices on non-critical paths, fast devices on critical paths, and avoid known defects.  By mapping each component individually and customizing designs to a component's unique physical characteristics, we demonstrate that we can eliminate delay margins and reduce energy margins caused by variation.</p>    \r\n\r\n<p>To quantify the potential benefit we might gain from component-specific mapping, we first measure the margins associated with parameter variation, and then focus primarily on the energy benefits of FPGA delay-aware routing over a wide range of predictive technologies (45 nm--12 nm) for the Toronto20 benchmark set.  We show that relative to delay-oblivious routing, delay-aware routing without any significant optimizations can reduce minimum energy/operation by 1.72x at 22 nm.  We demonstrate how to construct an FPGA architecture specifically tailored to further increase the minimum energy savings of component-specific mapping by using the following techniques: power gating, gate sizing, interconnect sparing, and LUT remapping.  
With all optimizations considered we show a minimum energy/operation savings of 2.66x at 22 nm, or 1.68--2.95x when considered across 45--12 nm.  As there are many challenges to measuring resource delays and mapping per chip, we discuss methods that may make component-specific mapping more practical.  We demonstrate that a simpler, defect-aware routing achieves 70% of the energy savings of delay-aware routing.  Finally, we show that without variation tolerance, scaling from 16 nm to 12 nm results in a net increase in minimum energy/operation; component-specific mapping, however, can extend minimum energy/operation scaling to 12 nm and possibly beyond.</p>  \r\n",
        "doi": "10.7907/358S-CW22",
        "publication_date": "2013",
        "thesis_type": "phd",
        "thesis_year": "2013"
    },
    {
        "id": "thesis:7188",
        "collection": "thesis",
        "collection_id": "7188",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:08192012-145253489",
        "type": "thesis",
        "title": "GRAph Parallel Actor Language: A Programming Language for Parallel Graph Algorithms",
        "author": [
            {
                "family_name": "DeLorimier",
                "given_name": "Michael John",
                "clpid": "DeLorimier-Michael-John"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Desbrun",
                "given_name": "Mathieu",
                "clpid": "Desbrun-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Meiron",
                "given_name": "Daniel I.",
                "clpid": "Meiron-D-I"
            },
            {
                "family_name": "Shrobe",
                "given_name": "Howard",
                "clpid": "Shrobe-H"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "We introduce a domain-specific language, GRAph Parallel Actor Language, that enables parallel graph algorithms to be written in a natural, high-level form. GRAPAL is based on our GraphStep compute model, which enables a wide range of parallel graph algorithms that are high-level, deterministic, free from race conditions, and free from deadlock. Programs written in GRAPAL are easy for a compiler and runtime to map to efficient parallel field programmable gate array (FPGA) implementations. We show that the GRAPAL compiler can verify that the structure of operations conforms to the GraphStep model. We allocate many small processing elements in each FPGA that take advantage of the high on-chip memory bandwidth (5x the sequential processor) and process one graph edge per clock cycle per processing element. We show how to automatically choose parameters for the logic architecture so the high-level GRAPAL programming model is independent of the target FPGA architecture. We compare our GRAPAL applications mapped to a platform with four 65 nm Virtex-5 SX95T FPGAs to sequential programs run on a single 65 nm Xeon 5160. Our implementation achieves a total mean speedup of 8x with a maximum speedup of 28x. The speedup per chip is 2x with a maximum of 7x. The ratio of energy used by our GRAPAL implementation over the sequential implementation has a mean of 1/10 with a minimum of 1/80.",
        "doi": "10.7907/M3TW-7Y53",
        "publication_date": "2013",
        "thesis_type": "phd",
        "thesis_year": "2013"
    },
    {
        "id": "thesis:6481",
        "collection": "thesis",
        "collection_id": "6481",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:05312011-123940546",
        "primary_object_url": {
            "basename": "jwhite.phd.pdf",
            "content": "final",
            "filesize": 1446611,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/6481/1/jwhite.phd.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Applying Formal Methods to Distributed Algorithms Using Local-Global Relations  ",
        "author": [
            {
                "family_name": "White",
                "given_name": "Jerome S.",
                "clpid": "White-Jerome-S"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Murray",
                "given_name": "Richard M.",
                "clpid": "Murray-R-M"
            },
            {
                "family_name": "Doyle",
                "given_name": "John Comstock",
                "clpid": "Doyle-J-C"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Holzmann",
                "given_name": "Gerard J.",
                "clpid": "Holzmann-G-J"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>This thesis deals with the design and analysis of distributed systems in which homogeneous, autonomous agents collaborate to achieve a common goal. The class of problems studied includes consensus algorithms in which all agents eventually come to an agreement about a specific action. The thesis proposes a framework, called local-global, for analyzing these systems. A local interaction is an interaction among subsets of agents, while a global interaction is one among all agents in the system. Global interactions, in practice, are rare, yet they are the basis by which correctness of a system is measured. For example, if the problem is to compute the average of a measurement made separately by each agent, and all the agents in the system could exchange values in a single action, then the solution is straightforward: each agent gets the values of all others and computes the average independently. However, if the system consists of a large number of agents with unreliable communication, this scenario is highly unlikely. Thus, the design challenge is to ensure that sequences of local interactions lead, or converge, to the same state as a global interaction.</p>\r\n\r\n<p>The local-global framework addresses this challenge by describing each local interaction as if were a global one, encompassing all agents within the system. This thesis outlines the concept in detail, using it to design algorithms, prove their correctness, and ultimately develop executable implementations that are reliable. To this end, the tools of formal methods are employed: algorithms are modeled, and mechanically checked, within the PVS theorem prover; programs are also verified using the Spin model checker; and interface specification languages are used to ensure local-global properties are still maintained within Java and C# implementations. 
The thesis presents example applications of the framework and discusses a class of problems to which the framework can be applied.</p>",
        "doi": "10.7907/8FRW-ZF17",
        "publication_date": "2011",
        "thesis_type": "phd",
        "thesis_year": "2011"
    },
    {
        "id": "thesis:6159",
        "collection": "thesis",
        "collection_id": "6159",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10262010-082537998",
        "primary_object_url": {
            "basename": "submit_oct26.pdf",
            "content": "final",
            "filesize": 4906533,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/6159/1/submit_oct26.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "SPICE\u00b2: A Spatial, Parallel Architecture for Accelerating the Spice Circuit Simulator\r ",
        "author": [
            {
                "family_name": "Kapre",
                "given_name": "Nachiket Ganesh",
                "orcid": "0000-0002-2187-0406",
                "clpid": "Kapre-Nachiket-Ganesh"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Meiron",
                "given_name": "Daniel I.",
                "clpid": "Meiron-D-I"
            },
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "clpid": "Bruck-J"
            },
            {
                "family_name": "Trimberger",
                "given_name": "Steven",
                "clpid": "Trimberger-S"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Spatial processing of sparse, irregular floating-point computation using a single FPGA enables up to an order of magnitude speedup (mean 2.8X speedup) over a conventional microprocessor for the SPICE circuit simulator. We deliver this speedup using a hybrid parallel architecture that spatially implements the heterogeneous forms of parallelism available in SPICE. We decompose SPICE into its three constituent phases: Model-Evaluation, Sparse Matrix-Solve, and Iteration Control and parallelize each phase independently. We exploit data-parallel device evaluations in the Model-Evaluation phase, sparse dataflow parallelism in the Sparse Matrix-Solve phase and compose the complete design in streaming fashion. We name our parallel architecture SPICE\u00b2: Spatial Processors Interconnected for Concurrent Execution for accelerating the SPICE circuit simulator.  We program the parallel architecture with a high-level, domain-specific framework that identifies, exposes and exploits parallelism available in the SPICE circuit simulator. This design is optimized with an auto-tuner that can scale the design to use larger FPGA capacities without expert intervention and can even target other parallel architectures with the assistance of automated code-generation.  This FPGA architecture is able to outperform conventional processors due to a combination of factors including high utilization of statically-scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and overlapped processing of the control algorithms.</p>\r\n\r\n<p>We demonstrate that we can independently accelerate Model-Evaluation by a mean factor of 6.5X(1.4--23X) across a range of non-linear device models and Matrix-Solve by 2.4X(0.6--13X) across various benchmark matrices while delivering a mean combined speedup of 2.8X(0.2--11X) for the two together when comparing a Xilinx Virtex-6 LX760 (40nm) with an Intel Core i7 965 (45nm).  
With our high-level framework, we can also accelerate Single-Precision Model-Evaluation on NVIDIA GPUs, ATI GPUs, IBM Cell, and Sun Niagara 2 architectures.</p>\r\n\r\n<p>We expect approaches based on exploiting spatial parallelism to become important as frequency scaling slows down and modern processing architectures turn to parallelism (e.g., multi-core, GPUs) due to constraints of power consumption. This thesis shows how to express, exploit and optimize spatial parallelism for an important class of problems that are challenging to parallelize.</p>\r\n",
        "doi": "10.7907/QVZR-VB52",
        "publication_date": "2011",
        "thesis_type": "phd",
        "thesis_year": "2011"
    },
    {
        "id": "thesis:320",
        "collection": "thesis",
        "collection_id": "320",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-01242008-012650",
        "primary_object_url": {
            "basename": "Helia_Naeimi_PhD_Thesis.pdf",
            "content": "final",
            "filesize": 1357385,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/320/1/Helia_Naeimi_PhD_Thesis.pdf",
            "version": "v2.0.0"
        },
        "type": "thesis",
        "title": "Reliable Integration of Terascale Systems with Nanoscale Devices",
        "author": [
            {
                "family_name": "Naeimi",
                "given_name": "Helia",
                "clpid": "Naeimi-Helia"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "orcid": "0000-0001-9177-7699",
                "clpid": "DeHon-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "orcid": "0000-0001-9177-7699",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Heath",
                "given_name": "James R.",
                "orcid": "0000-0001-5356-4385",
                "clpid": "Heath-J-R"
            },
            {
                "family_name": "Ho",
                "given_name": "Tracey C.",
                "clpid": "Ho-Tracey"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Nanotechnology design has attracted considerable attention in recent years and seems to be the technology for the future generation of the electronic devices, either as scaled and more restricted conventional lithographic technology, or as emerging sublithographic technologies, such as nanowires, carbon nanotubes, NDR (Negative Differential Resistance) devices, or other nanotechnology devices. Each of these technologies provides one or more design benefits including feature-size scaling, high on\u2013off ratios, and faster devices. However, all of these techniques share their most challenging design issue: reliability. Providing reliability is becoming constantly more challenging due to increases in both the device failure rate and system complexity. This work develops techniques that make achieving reliability in such systems feasible with practical area overhead and considerable improvement in area overhead and system reliability compared to related techniques.</p>\r\n\r\n<p>Conventional reliability techniques focus on low defect and fault rates, i.e., single event upset (SEU). These techniques cannot simply be scaled to larger systems with more unreliable devices. If these techniques are directly applied to the high defect and fault rate of the nanotechnology regime, they suffer impractically high overhead, or they may not achieve the desired reliability. Our approach in this thesis exploits the following design patterns to achieve a considerable area reduction compared to related works and achieve high reliability:<br />\r\n(1) Fine-grained reliability: In this technique, the system is partitioned into fine\u2013grained blocks, and the reliability is provided for each block. This technique is used to contain the area overhead and bound the impact on the throughput.<br /> \r\n(2) Using alternative resources: This technique improves the design quality by sparing other resources when system is tight on one resource. 
In our work we replace some of the spatial redundancies with temporal redundancy to limit the area overhead.  We further improve the system throughput to limit the throughput cost as well.<br />\r\n(3) Defect pattern matching: With this technique, the defective resources are located and the design is reconfigured considering the defect pattern of the chip.  Then the design configuration is mapped to the chip. This technique isolates the defective resources and makes use of most of the defect-free resources.<br />\r\n(4) Global reliability: This technique is used to unify the reliability techniques used in different parts of the system. When using one unified technique to protect the system, the area overhead provided to protect one resource can be reused to protect other resources as well.</p>\r\n\r\n<p>In the present work, we report considerable improvement in the area overhead using the above techniques. We show that using Fine-Grained Reliability, Alternative Resources, and Defect Pattern Matching, high permanent defect rates (e.g., 10%), which are the result of imperfect manufacturing, can be tolerated with moderate area overhead (about 30% on average for typical designs). Again, Using Alternative Resources and Fine-Grained Reliability improve the area overhead of the transient fault-tolerant designs by close to an order of magnitude compared to recent reliable works. Finally, we report a fully reliable memory system that employs a Global Reliability scheme to tolerate permanent defects and transient faults, both in the memory and in the supporting logic, and still achieves 100 Gbit/cm<sup>2</sup> density for a fault rate of 10<sup>\u221218</sup> errors per bit per cycle and a 10% junction defect rate.</p>",
        "doi": "10.7907/P842-7B49",
        "publication_date": "2008",
        "thesis_type": "phd",
        "thesis_year": "2008"
    },
    {
        "id": "thesis:5260",
        "collection": "thesis",
        "collection_id": "5260",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-11092007-180524",
        "type": "thesis",
        "title": "Soft-Error Tolerant Quasi Delay-insensitive Circuits",
        "author": [
            {
                "family_name": "Jang",
                "given_name": "Wonjin",
                "clpid": "Jang-Wonjin"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "orcid": "0000-0001-8474-0812",
                "clpid": "Bruck-J"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "orcid": "0000-0001-9190-1290",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Ho",
                "given_name": "Tracey C.",
                "clpid": "Ho-Tracey"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>A hard error is an error that damages a circuit irrevocably; a soft error flips the logic states without causing any physical damage to the circuit, resulting in transient corruption of data. They result in transient, inconsistent corruption of data.</p>\r\n\r\n<p>The soft-error tolerance of logic circuits is recently getting more attention, since the soft- error rate of advanced CMOS devices is higher than before. As a response to the concern on soft errors, we propose a new method for making asynchronous circuits tolerant to soft errors. Since it relies on a property unique to asynchronous circuits, the method is different from what is done in synchronous circuits with triple modular redundancy. Asynchronous circuits have been attractive to the designers of reliable systems, because of their clock-less design, which makes them more robust to variations on computation time of modules. The quasi delay-insensitive (QDI) design style is one of the most robust asynchronous design styles for general computation; it makes one minimal assumption on delays in gates and wires. QDI circuits are easy to verify, simple, and modular, because the correct operation of a QDI circuit is independent of delays in gates and wires.</p>\r\n\r\n<p>Here, we shall overview how to design a QDI circuit, and what will happen if a soft error occurs on a QDI circuit. Then the crucial components of the method are shown: (1) a special kind of duplication for random logic (when each bit has to be corrected individually), (2) special protection circuitry for arbiter and synchronizer (as needed for example for external interrupts), (3) reconfigurable circuits using a special configuration unit, and (4) error correcting for memory arrays and other structures in which the data bits can be self- corrected. The solution of protecting random logic is compared with alternatives, which use other types of error correcting codes (e.g., parity code) in a QDI circuit. 
It turns out that the duplication generates efficient circuits more commonly than other possible constructions. Finally, the design of a soft-error tolerant asynchronous microprocessor is detailed and testing results of the soft-error tolerance of the microprocessor are shown.</p>\r\n\r\n",
        "doi": "10.7907/ZVFF-WE07",
        "publication_date": "2008",
        "thesis_type": "phd",
        "thesis_year": "2008"
    },
    {
        "id": "thesis:2118",
        "collection": "thesis",
        "collection_id": "2118",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-05262008-234258",
        "primary_object_url": {
            "basename": "thesis.pdf",
            "content": "final",
            "filesize": 917396,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/2118/1/thesis.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Throughput Optimization of Quasi Delay Insensitive Circuits via Slack Matching",
        "author": [
            {
                "family_name": "Prakash",
                "given_name": "Piyush",
                "clpid": "Prakash-Piyush"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Umans",
                "given_name": "Christopher M.",
                "clpid": "Umans-C-M"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "Though the logical correctness of an asynchronous circuit is independent of implementation delays, the cycle time of an asynchronous circuit is of great importance to the designer.  Oftentimes, the insertion of buffers to such circuits reduces the cycle time of the circuit without affecting the logical correctness of the circuit.  This optimization is called slack matching.  In this thesis the slack matching problem is formulated.  I show that this problem is NP-complete via a reduction from subset sum.  I describe two methods for expressing slack matching as a mixed integer linear program(MILP).  The first method is applicable to any QDI circuit, while the second method produces a smaller MILP for circuits comprised solely of half buffers.  These two formulations of slack matching were applied to the design of a fetch loop in an asynchronous micro-controller.  Slack matching reduced the cycle time of the circuit by a factor of 3.  For a circuit composed of 14 byte wide processes and a 8k instruction memory, 30s were required to generate the first MILP.  It was solved in 2s.  When the memory is modeled as a pipeline of half buffers, the second MILP could be formulated in 0.1s and solved in 0.6s.  This MILP had half the number of integer variables as the first formulation.",
        "doi": "10.7907/9HMY-RR92",
        "publication_date": "2008",
        "thesis_type": "phd",
        "thesis_year": "2008"
    },
    {
        "id": "thesis:2267",
        "collection": "thesis",
        "collection_id": "2267",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-05292008-231048",
        "primary_object_url": {
            "basename": "thesis.pdf",
            "content": "final",
            "filesize": 1612108,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/2267/1/thesis.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Adaptive Learning Algorithms and Data Cloning",
        "author": [
            {
                "family_name": "Pratap",
                "given_name": "Amrit",
                "clpid": "Pratap-Amrit"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "clpid": "Bruck-J"
            },
            {
                "family_name": "Perona",
                "given_name": "Pietro",
                "clpid": "Perona-P"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>This thesis is in the field of machine learning: the use of data to automatically learn a hypothesis to predict the future behavior of a system. It summarizes three of my research projects.</p>\r\n\r\n<p>We first investigate the role of margins in the phenomenal success of the Boosting Algorithms. AdaBoost (Adaptive Boosting) is an algorithm for generating an ensemble of hypotheses for classification. The superior out-of-sample performance of AdaBoost has been attributed to the fact that it can generate a classifier which classifies the points with a large margin of confidence. This led to the development of many new algorithms focusing on optimizing the margin of confidence. It was observed that directly optimizing the margins leads to a poor performance. This apparent contradiction has been the topic of a long unresolved debate in the machine-learning community. We introduce new algorithms which are expressly designed to test the margin hypothesis and provide concrete evidence which refutes the margin argument.</p>\r\n\r\n<p>We then propose a novel algorithm for Adaptive sampling under Monotonicity constraint. The typical learning problem takes examples of the target function as input information and produces a hypothesis that approximates the target as an output. We consider a generalization of this paradigm by taking different types of information as input, and producing only specific properties of the target as output. This is a very common setup which occurs in many different real-life settings where the samples are expensive to obtain. We show experimentally that our algorithm achieves better performance than the existing methods, such as Staircase procedure and PEST.</p>\r\n\r\n<p>One of the major pitfalls in machine learning research is that of selection bias. This is mostly introduced unconsciously due to the choices made during the learning process, which often lead to over-optimistic estimates of the performance. 
In the third project, we introduce a new methodology for systematically reducing selection bias. Experiments show that using cloned datasets for model selection can lead to better performance and reduce the selection bias.</p>",
        "doi": "10.7907/GV3D-AB69",
        "publication_date": "2008",
        "thesis_type": "phd",
        "thesis_year": "2008"
    },
    {
        "id": "thesis:1516",
        "collection": "thesis",
        "collection_id": "1516",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-04262007-131214",
        "primary_object_url": {
            "basename": "basset-thesis.pdf",
            "content": "final",
            "filesize": 2106825,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/1516/1/basset-thesis.pdf",
            "version": "v2.0.0"
        },
        "type": "thesis",
        "title": "CMOS Imaging Technology with Embedded Early Image Processing",
        "author": [
            {
                "family_name": "Basset",
                "given_name": "Christophe Jean-Michel",
                "clpid": "Basset-Christophe-Jean-Michel"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Perona",
                "given_name": "Pietro",
                "orcid": "0000-0002-7583-5809",
                "clpid": "Perona-P"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Perona",
                "given_name": "Pietro",
                "orcid": "0000-0002-7583-5809",
                "clpid": "Perona-P"
            },
            {
                "family_name": "Hajimiri",
                "given_name": "Ali",
                "orcid": "0000-0001-6736-8019",
                "clpid": "Hajimiri-A"
            },
            {
                "family_name": "Pain",
                "given_name": "Bedabrata",
                "clpid": "Pain-B"
            },
            {
                "family_name": "Mathur",
                "given_name": "Bimal",
                "clpid": "Mathur-B"
            },
            {
                "family_name": "Koch",
                "given_name": "Christof",
                "orcid": "0000-0001-6482-8067",
                "clpid": "Koch-C"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>As imaging technology evolves, so does the need for accurate, low-power and high-data-rate low-level image processing in a variety of computationally intensive vision applications. These applications include optical-flow computation, autonomous navigation, object avoidance or intercept, real-time target tracking, and recognition. To reach this goal, a single chip was developed, which functions as a camera able to preprocess the image in real time. It processes images through a convolution filter with a user-chosen kernel.</p>\r\n\r\n<p>One of the particulars of this project is to combine the processing unit with an active pixel sensors (APS) pixel array. This complementary metal-oxide semiconductor (CMOS) technology for building imager chips allows on-focal plane signal processing, as opposed to their charge-coupled device (CCD) counterparts that need to serially output the flow of pixels to an external processing chip. The filtering can therefore be implemented as a fast, low-power analog circuit.</p>\r\n\r\n<p>Convolution is achieved by matching a kernel to an image using a computation unit. The chip has an integrated imager array and a digital memory large enough to store a generic, up-loadable kernel. When recognizing or tracking a target, the uploaded kernel represents the template. Other convolution filters are implemented by setting the kernel to the set of parameters corresponding to the desired task. Filtering is performed through a column-parallel architecture of computing units, so real time computation can be achieved.</p>\r\n\r\n<p>Several versions of the convolution circuit are investigated. They have been fabricated, fully tested and characterized. A number of important design changes have occurred, either to address issues that could be improved on or to experiment with alternative approaches. Timed and geometrical amplifier controls have also been investigated. 
By implementing image arrays of different sizes, we also demonstrate the scalability of the architecture in the spatial domain to an arbitrarily sized imager. Test results show the analog convolution chip is a viable solution for highly integrated embedded early image processing.</p>",
        "doi": "10.7907/2GZN-T836",
        "publication_date": "2007",
        "thesis_type": "phd",
        "thesis_year": "2007"
    },
    {
        "id": "thesis:1960",
        "collection": "thesis",
        "collection_id": "1960",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-05222007-211909",
        "primary_object_url": {
            "basename": "thesis_xin.pdf",
            "content": "final",
            "filesize": 729119,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/1960/1/thesis_xin.pdf",
            "version": "v2.0.0"
        },
        "type": "thesis",
        "title": "Reflection and Its Application to Mechanized MetaReasoning About Programming Languages",
        "author": [
            {
                "family_name": "Yu",
                "given_name": "Xin",
                "clpid": "Yu-Xin"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Joshi",
                "given_name": "Rajeev",
                "clpid": "Joshi-R"
            },
            {
                "family_name": "Low",
                "given_name": "Steven H.",
                "clpid": "Low-S-H"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>It is well known that adding reflective reasoning can tremendously increase the power of a proof assistant. In order for this theoretical increase of power to become accessible to users in practice, the proof assistant needs to provide a great deal of infrastructure to support reflective reasoning. In this thesis we explore the problem of creating a practical implementation of such a support layer.</p> \r\n\r\n<p>Our implementation takes a specification of a logical theory (which is identical to how it would be specified if we simply intended to reason within this logical theory, instead of reflecting it) and automatically generates the necessary definitions, lemmas, and proofs that are needed to enable the reflected metareasoning in the provided theory.</p>\r\n\r\n<p>One of the key features of our approach is that the structure of a logic is preserved when it is reflected, including variables, meta variables, and binding structure. This allows the structure of proofs to be preserved as well, and there is a one-to-one map from proof steps in the original logic to proof steps in the reflected logic. The act of reflecting a language is automated; all definitions, theorems, and proofs are preserved by the transformation and all the key lemmas (such as proof and structural induction) are automatically derived.</p>\r\n\r\n<p>The principal representation used by the reflected logic is higher-order abstract syntax (HOAS). However, reasoning about terms in HOAS can be awkward in some cases, especially for variables. For this reason, we define a computationally equivalent variable-free de Bruijn representation that is interchangeable with the HOAS in all contexts. 
The de Bruijn representation inherits the properties of substitution and alpha-equality from the logical framework, and it is not complicated by administrative issues like variable renumbering.</p>\r\n\r\n<p>We further develop the concepts and principles of proofs, provability, and structural and proof induction. This work is fully implemented in the MetaPRL theorem prover. We illustrate with an application to [F...] as defined in the POPLmark challenge.</p>",
        "doi": "10.7907/S0HG-RT72",
        "publication_date": "2007",
        "thesis_type": "phd",
        "thesis_year": "2007"
    },
    {
        "id": "thesis:155",
        "collection": "thesis",
        "collection_id": "155",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-01132006-152609",
        "primary_object_url": {
            "basename": "main.pdf",
            "content": "final",
            "filesize": 1062451,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/155/1/main.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Rigorous Analog Verification of Asynchronous Circuits",
        "author": [
            {
                "family_name": "Papadantonakis",
                "given_name": "Karl Spyros",
                "clpid": "Papadantonakis-Karl-Spyros"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "This thesis shows that rigorous verification of some analog implementation of any Quasi-Delay-Insensitive (QDI) asynchronous circuit is possible.  That is, we show that in an accurate analog model, any behavior will adhere to the digital computation specifications under any possible noise and environment timing. Unlike a traditional simulation, we can analyze all of the infinitely many possible analog behaviors, in a time linear in the circuit size. A problem that arises in asynchronous circuit design is that the analog implementations of digital computations do not in general exhibit all properties demanded by the digital model assumed in circuit construction. For example, the digital model is atomic, in a sense we define. By contrast, analog models are non-atomic, and, as a result, we can give examples of real circuits with operational failures. There exist other attributes of analog models which can cause failures, and no complete classification exists. Ultimately there is only one way to solve this problem: we must show that all possible analog behaviors obey the atomic model. We focus on CMOS implementations, and the associated accepted bulk-scale model. Given any canonically-generated implementation of a general computation, we can rigorously verify it. The only exception to this rule is that restoring delay elements must be inserted into some implementations (fortunately, this change has no semantic effect on QDI circuits, by definition). Our theorem guarantees that when any possible analog behavior is properly observed, we obtain a valid, atomic digital execution. Several rigorous verifications have been produced, including one for an asynchronous pipeline circuit with dual-rail data.",
        "doi": "10.7907/4R8F-WF03",
        "publication_date": "2006",
        "thesis_type": "phd",
        "thesis_year": "2006"
    },
    {
        "id": "thesis:1591",
        "collection": "thesis",
        "collection_id": "1591",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-05032004-153842",
        "primary_object_url": {
            "basename": "marc.riedel.phd.pdf",
            "content": "final",
            "filesize": 2195990,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/1591/1/marc.riedel.phd.pdf",
            "version": "v2.0.0"
        },
        "type": "thesis",
        "title": "Cyclic Combinational Circuits",
        "author": [
            {
                "family_name": "Riedel",
                "given_name": "Marcus D.",
                "orcid": "0000-0002-3318-346X",
                "clpid": "Riedel-Marcus-D"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "clpid": "Bruck-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "clpid": "Bruck-J"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Hajimiri",
                "given_name": "Ali",
                "clpid": "Hajimiri-A"
            },
            {
                "family_name": "Viterbi",
                "given_name": "Andrew",
                "clpid": "Viterbi-A"
            },
            {
                "family_name": "Winfree",
                "given_name": "Erik",
                "clpid": "Winfree-E"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>A collection of logic gates forms a combinational circuit if the outputs can be described as Boolean functions of the current input values only. Optimizing combinational circuitry, for instance, by reducing the number of gates (the area) or by reducing the length of the signal paths (the delay), is an overriding concern in the design of digital integrated circuits.</p>\r\n\r\n<p>The accepted wisdom is that combinational circuits must have acyclic (i.e., loop-free or feed-forward) topologies. In fact, the idea that \"combinational\" and \"acyclic\" are synonymous terms is so thoroughly ingrained that many textbooks provide the latter as a definition of the former. And yet simple examples suggest that this is incorrect. In this dissertation, we advocate the design of cyclic combinational circuits (i.e., circuits with loops or feedback paths). We demonstrate that circuits can be optimized effectively for area and for delay by introducing cycles.</p>\r\n\r\n<p>On the theoretical front, we discuss lower bounds and we show that certain cyclic circuits are one-half the size of the best possible equivalent acyclic implementations. On the practical front, we describe an efficient approach for analyzing cyclic circuits, and we provide a general framework for synthesizing such circuits. On trials with industry-accepted benchmark circuits, we obtained significant improvements in area and delay in nearly all cases. Based on these results, we suggest that it is time to re-write the definition: combinational might well mean cyclic.</p>",
        "doi": "10.7907/410B-XR25",
        "publication_date": "2004",
        "thesis_type": "phd",
        "thesis_year": "2004"
    },
    {
        "id": "thesis:5393",
        "collection": "thesis",
        "collection_id": "5393",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:11192009-161338958",
        "primary_object_url": {
            "basename": "wong_catherine_grace_2004.pdf",
            "content": "final",
            "filesize": 1069347,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/5393/1/wong_catherine_grace_2004.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "High-Level Synthesis and Rapid Prototyping of Asynchronous VLSI Systems",
        "author": [
            {
                "family_name": "Wong",
                "given_name": "Catherine Grace",
                "clpid": "Wong-Catherine-Grace"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>This thesis introduces data-driven decomposition (DDD), a new method for the high-level synthesis of asynchronous VLSI systems and the first method to target high-performance asynchronous circuits. Given a sequential description of circuit behavior, DDD produces an equivalent network of communicating processes that can each be directly implemented as fine-grained asynchronous pipeline stages. Control and datapath are integrated within each pipeline stage of the final system.</p>\r\n\r\n<p>We present many aspects of the synthesis of asynchronous VLSI systems, including general circuit templates that DDD uses to estimate low-level performance and energy metrics while optimizing the concurrent system. We also introduce a new circuit model and new techniques for slack matching, a performance optimization that inserts pipelining into a system to modify asynchronous handshake dynamics and increase throughput. The entire method is then applied to a complex control unit from an asynchronous 8051 microcontroller, as an example.</p>\r\n\r\n<p>This thesis also introduces a new architecture for an asynchronous field-programmable gate array (FPGA). The architecture is cluster-based and, unlike most FPGA designs, contains an entirely delay-insensitive interconnect. The basic reconfigurable cells of this FPGA fit the asynchronous pipeline-stage circuit-template used by DDD, and the reconfigurable clusters include circuitry that implements features assumed by an optimization phase of DDD, which reduces the energy  consumption of the system.</p>",
        "doi": "10.7907/5N2N-0W58",
        "publication_date": "2004",
        "thesis_type": "phd",
        "thesis_year": "2004"
    },
    {
        "id": "thesis:4821",
        "collection": "thesis",
        "collection_id": "4821",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-12072001-160019",
        "primary_object_url": {
            "basename": "thesis-online.pdf",
            "content": "final",
            "filesize": 850716,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/4821/1/thesis-online.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Dynamic UNITY",
        "author": [
            {
                "family_name": "Zimmerman",
                "given_name": "Daniel Marc",
                "clpid": "Zimmerman-Daniel-Marc"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            },
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "orcid": "0000-0001-8474-0812",
                "clpid": "Bruck-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "Dynamic distributed systems, where a changing set of communicating processes must interoperate to accomplish particular computational tasks, are becoming extremely important. Designing and implementing these systems, and verifying the correctness of the designs and implementations, are difficult tasks. The goal of this thesis is to make these tasks easier.\r\n\r\nThis thesis presents a specification language for dynamic distributed systems, based on Chandy and Misra's UNITY language. It extends the UNITY language to enable process creation, process deletion, and dynamic communication patterns.\r\n\r\nThe thesis defines an execution model for systems specified in this language, which leads to a proof logic similar to that of UNITY. While extending UNITY logic to correctly handle systems with dynamic behavior, this logic retains the familiar UNITY operators and most of the proof rules associated with them. \r\n\r\nThe thesis presents specifications for three example dynamic distributed systems to demonstrate the use of the specification language, and full correctness proofs for two of these systems and a partial correctness proof for the third to demonstrate the use of the proof logic. \r\n\r\nThe thesis details a method for determining whether a system in the specification language can be transformed into an implementation in a standard programming language, as well as a method for performing this transformation on those specifications that can. This guarantees a correct implementation for any specification that can be so transformed. \r\n",
        "doi": "10.7907/AC6E-WE21",
        "publication_date": "2002",
        "thesis_type": "phd",
        "thesis_year": "2002"
    },
    {
        "id": "thesis:6263",
        "collection": "thesis",
        "collection_id": "6263",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:03022011-131111881",
        "primary_object_url": {
            "basename": "Penzes_pi_2002.pdf",
            "content": "final",
            "filesize": 105482562,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/6263/1/Penzes_pi_2002.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Energy-Delay Complexity of Asynchronous Circuits",
        "author": [
            {
                "family_name": "P\u00e9nzes",
                "given_name": "Paul Ivan",
                "clpid": "P\u00e9nzes-Paul-Ivan"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Hajimiri",
                "given_name": "Ali",
                "orcid": "0000-0001-6736-8019",
                "clpid": "Hajimiri-A"
            },
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            },
            {
                "family_name": "Nystr\u00f6em",
                "given_name": "Mika",
                "clpid": "Nystr\u00f6em-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "In this thesis, a circuit-level theory of energy-delay complexity is developed for asynchronous circuits. The energy-delay efficiency of a circuit is characterized using the metric Et^n , where E is the energy consumed by the computation, t is the delay of the computation, and n is a positive number that reflects a chosen trade-off between\r\nenergy and delay. Based on theoretical and experimental evidence, it is argued that for a circuit optimized for minimal Et^n, the consumed energy is independent, in first\r\napproximation, of the types of gates (NAND, NOR, etc.) used by the circuit and is solely dependent on n and the total amount of wiring capacitance switched during computation. Conversely, the circuit speed is independent, in first approximation, of the wiring capacitance and depends only on n and the types of gates used.\r\n\r\nThe complexity model allows us to compare the energy-delay efficiency of two circuits implementing the same computation. On the other hand, the complexity model itself does not say much about the actual transistor sizes that achieve the optimum. For this reason, the problem of transistor sizing of circuits optimized for Et^n is investigated, as well. A set of analytical formulas that closely approximate the optimal transistor sizes are explored. An efficient iteration procedure that can further\r\nimprove the original analytical solution is then studied. Based on these results, a novel transistor-sizing algorithm for energy-delay efficiency is introduced.\r\n\r\nIt is shown that the Et^n  metric for the energy-delay efficiency index n \u2265 0 characterizes any optimal trade-off between the energy and the delay of a computation. For\r\nexample, any problem of minimizing the energy of a system for a given target delay can be restated as minimizing Et^n for a certain n. 
The notion of minimum-energy function is developed and applied to the parallel and sequential composition of circuits in general and, in particular, to circuits optimized through transistor sizing and\r\nvoltage scaling. Bounds on the energy and delay of the optimized circuits are computed, and necessary and sufficient conditions are given under which these bounds are\r\nreached. Necessary and sufficient conditions are also given under which components of a design can be optimized independently so as to yield a global optimum when\r\ncomposed. Through these applications, the utility of the minimum-energy function is demonstrated. The use of this minimum-energy function yields practical insight into\r\nways of improving the overall energy-delay efficiency of circuits.\r\n",
        "doi": "10.7907/9jpj-5s67",
        "publication_date": "2002",
        "thesis_type": "phd",
        "thesis_year": "2002"
    },
    {
        "id": "thesis:3236",
        "collection": "thesis",
        "collection_id": "3236",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-08272001-155016",
        "primary_object_url": {
            "basename": "00ch0.pdf",
            "content": "final",
            "filesize": 144139,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/3236/1/00ch0.pdf",
            "version": "v2.0.0"
        },
        "type": "thesis",
        "title": "Why multicast protocols (don't) scale: an analysis of multipoint algorithms for scalable group communication",
        "author": [
            {
                "family_name": "Schooler",
                "given_name": "Eve Meryl",
                "clpid": "Schooler-Eve-Meryl"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Estrin",
                "given_name": "Deborah",
                "clpid": "Estrin-D"
            },
            {
                "family_name": "Hickey",
                "given_name": "Jason J.",
                "clpid": "Hickey-J-J"
            },
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "clpid": "Bruck-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "With the exponential growth of the Internet, there is a critical need to design efficient, scalable and robust protocols to support the network infrastructure.  A new class of protocols has emerged to address these challenges, and these protocols rely on a few key techniques, or micro-algorithms, to achieve scalability.  By scalability, we mean the ability of groups of communicating processes to grow very large in size.  We study the behavior of several of these fundamental techniques that appear in many deployed and emerging Internet standards:  Suppression, Announce-Listen, and Leader Election.\r\n\r\nThese algorithms are based on the principle of efficient multipoint communication, often in combination with periodic messaging.  We assume a loosely-coupled communication model, where acknowledged messaging among groups of processes is not required.  Thus, processes infer information from the periodic receipt or loss of messages from other processes.\r\n\r\nWe present an analysis, validated by simulation, of the performance tradeoffs of each of these techniques.  Toward this end, we derive a series of performance metrics that help us to evaluate these algorithms under lossy conditions:  expected response time, network usage, memory overhead, consistency attainable, and convergence time.  In addition, we study the impact of both correlated and uncorrelated loss on groups of communicating processes.\r\n\r\nAs a result, this thesis provides insights into the scalability of multicast protocols that rely upon these techniques.  We provide a systematic framework for calibrating as well as predicting protocol behavior over a range of operating conditions.  In the process, we establish a general methodology for the analysis of these and other scalability techniques.  
Finally, we explore a theory of composition; if we understand the behavior of these micro-algorithms, then we can bound analytically the performance of the more complex algorithms that rely upon them.",
        "doi": "10.7907/44QZ-R465",
        "publication_date": "2001",
        "thesis_type": "phd",
        "thesis_year": "2001"
    },
    {
        "id": "thesis:6147",
        "collection": "thesis",
        "collection_id": "6147",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:10152010-145548970",
        "primary_object_url": {
            "basename": "Nystrom_m_2001.pdf",
            "content": "final",
            "filesize": 7148606,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/6147/1/Nystrom_m_2001.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Asynchronous Pulse Logic",
        "author": [
            {
                "family_name": "Nystr\u00f6m",
                "given_name": "Mika",
                "clpid": "Nystr\u00f6m-Mika"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "DeHon",
                "given_name": "Andre",
                "clpid": "DeHon-A"
            },
            {
                "family_name": "Manohar",
                "given_name": "Rajit",
                "clpid": "Manohar-R"
            },
            {
                "family_name": "Hajimiri",
                "given_name": "Ali",
                "clpid": "Hajimiri-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>This thesis explores a new way of computing with CMOS digital circuits, single-track-handshake asynchronous pulse-logic (STAPL). These circuits are similar to quasi delay-insensitive (QDI) circuits, but the normal four-phase QDI handshake is replaced with a simpler two-phase pulsed handshake. While a delay-insensitive two-phase handshake requires complicated decoding circuits, the pulsed handshake maintains the simpler, electrically beneficial signaling senses of four-phase handshaking by using timing assumptions that are easy to meet.</p>\r\n\r\n<p>We cover many aspects of designing moderately large digital systems out of STAPL circuits, from the communicating-process level to the production-rule and transistor level.</p>\r\n\r\n<p>We study the theory of operation of pulsed asynchronous circuits, starting with simple pulse repeaters; hence we progress to a general theory of operation for pulsed asynchronous circuits. This theory is a generalization of the theory of operation of synchronous digital circuits.</p>\r\n\r\n<p>We then develop the family of STAPL circuits. This is a complete family of dataflow processes: the presented circuits can compute unconditionally as well as conditionally; they can also store state and arbitrate.</p>\r\n\r\n<p>Next, we present some aspects of automatic design-tools for compiling from a higher-level description to STAPL circuits. Many of these aspects apply equally well to tools for QDI circuits; in particular, we study boolean-simplification operations that may be used for improving the performance of slack-elastic asynchronous systems.</p>\r\n\r\n<p>Finally, a simple 32-bit microprocessor is presented as a demonstration that the circuits and design methods work as described. 
Comparisons are made, mainly with QDI asynchronous design-styles: SPICE simulations in 0.6-\u00b5m CMOS suggest that a system built out of automatically compiled STAPL circuits performs at about three times higher throughput (650-700 MHz in 0.6-\u00b5m CMOS) compared with a similar system built out of carefully hand-compiled QDI circuits; the STAPL system uses about twice the energy per operation and twice the area; in other words, the STAPL system improves on the QDI system by four to five times as measured by the Et^2 and At^2 metrics.</p>",
        "doi": "10.7907/B107-MW15",
        "publication_date": "2001",
        "thesis_type": "phd",
        "thesis_year": "2001"
    },
    {
        "id": "thesis:3858",
        "collection": "thesis",
        "collection_id": "3858",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-10022001-201911",
        "primary_object_url": {
            "basename": "thesis.pdf",
            "content": "final",
            "filesize": 827364,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/3858/1/thesis.pdf",
            "version": "v2.0.0"
        },
        "type": "thesis",
        "title": "Analog Computation and Learning in VLSI",
        "author": [
            {
                "family_name": "Koosh",
                "given_name": "Vincent Frank",
                "clpid": "Koosh-Vincent-Frank"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Goodman",
                "given_name": "Rodney M.",
                "clpid": "Goodman-R-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Goodman",
                "given_name": "Rodney M.",
                "clpid": "Goodman-R-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Diorio",
                "given_name": "Christopher J.",
                "clpid": "Diorio-Christopher-J"
            },
            {
                "family_name": "Koch",
                "given_name": "Christof",
                "orcid": "0000-0001-6482-8067",
                "clpid": "Koch-C"
            },
            {
                "family_name": "Bruck",
                "given_name": "Jehoshua",
                "orcid": "0000-0001-8474-0812",
                "clpid": "Bruck-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Nature has evolved highly advanced systems capable of performing complex computations, adaptation, and learning using analog components. Although digital systems have significantly surpassed analog systems in terms of performing precise, high speed, mathematical computations, digital systems cannot outperform analog systems in terms of power. Furthermore, nature has evolved techniques to deal with imprecise analog components by using redundancy and massive connectivity. In this thesis, analog VLSI circuits are presented for performing arithmetic functions and for implementing neural networks. These circuits draw on the power of the analog building blocks to perform low power and parallel computations.</p>\r\n\r\n<p>The arithmetic function circuits presented are based on MOS transistors operating in the subthreshold region with capacitive dividers as inputs to the gates. Because the inputs to the gates of the transistors are floating, digital switches are used to dynamically reset the charges on the floating gates to perform the computations. Circuits for performing squaring, square root, and multiplication/division are shown. A circuit that performs a vector normalization, based on cascading the preceding circuits, is shown to display the ease with which simpler circuits may be combined to obtain more complicated functions. Test results are shown for all of the circuits.</p>\r\n\r\n<p>Two feedforward neural network implementations are also presented. The first uses analog synapses and neurons with a digital serial weight bus. The chip is trained in loop with the computer performing control and weight updates. By training with the chip in the loop, it is possible to learn around circuit offsets. The second neural network also uses a computer for the global control operations, but all of the local operations are performed on chip. The weights are implemented digitally, and counters are used to adjust them. 
A parallel perturbative weight update algorithm is used. The chip uses multiple, locally generated, pseudorandom bit streams to perturb all of the weights in parallel. If the perturbation causes the error function to decrease, the weight change is kept, otherwise it is discarded. Test results are shown of both networks successfully learning digital functions such as AND and XOR.</p>",
        "doi": "10.7907/9B65-TB43",
        "publication_date": "2001",
        "thesis_type": "phd",
        "thesis_year": "2001"
    },
    {
        "id": "thesis:3095",
        "collection": "thesis",
        "collection_id": "3095",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-08112005-114144",
        "primary_object_url": {
            "basename": "Manohar_r_1998.pdf",
            "content": "final",
            "filesize": 7772553,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/3095/1/Manohar_r_1998.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "The impact of asynchrony on computer architecture",
        "author": [
            {
                "family_name": "Manohar",
                "given_name": "Rajit",
                "clpid": "Manohar-R"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Barr",
                "given_name": "Alan H.",
                "clpid": "Barr-A-H"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "The performance characteristics of asynchronous circuits are quite different from those of their synchronous counterparts. As a result, the best asynchronous design of a particular system does not necessarily correspond to the best synchronous design, even at the algorithmic level. The goal of this thesis is to examine certain aspects of computer architecture and design in the context of an asynchronous VLSI implementation.\n\nWe present necessary and sufficient conditions under which the degree of pipelining of a component can be modified without affecting the correctness of an asynchronous computation.\n\nAs an instance of the improvements possible using an asynchronous architecture, we present circuits to solve the prefix problem with average-case behavior better than that possible by any synchronous solution in the case when the prefix operator has a right zero. We show that our circuit implementations are area-optimal given their performance characteristics, and have the best possible average-case latency.\n\nAt the level of processor design, we present a mechanism for the implementation of precise exceptions in asynchronous processors. The novel feature of this mechanism is that it permits the presence of a data-dependent number of instructions in the execution pipeline of the processor.\n\nFinally, at the level of processor architecture, we present the architecture of a processor with an independent instruction stream for branches. The instruction set permits loops and function calls to be executed with minimal control-flow overhead.",
        "doi": "10.7907/xzwa-p598",
        "publication_date": "1999",
        "thesis_type": "phd",
        "thesis_year": "1999"
    },
    {
        "id": "thesis:321",
        "collection": "thesis",
        "collection_id": "321",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-01242008-074143",
        "primary_object_url": {
            "basename": "Massingill_bl_1998.pdf",
            "content": "final",
            "filesize": 5884120,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/321/1/Massingill_bl_1998.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "A structured approach to parallel programming",
        "author": [
            {
                "family_name": "Massingill",
                "given_name": "Berna Linda",
                "clpid": "Massingill-B-L"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Meiron",
                "given_name": "Daniel I.",
                "clpid": "Meiron-D-I"
            },
            {
                "family_name": "Van de Velde",
                "given_name": "Eric",
                "clpid": "van-de-Velde-E"
            },
            {
                "family_name": "Arvo",
                "given_name": "James R.",
                "clpid": "Arvo-J-R"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "Parallel programs are more difficult to develop and reason about than sequential programs. There are two broad classes of parallel programs: (1) programs whose specifications describe ongoing behavior and interaction with an environment, and (2) programs whose specifications describe the relation between initial and final states. This thesis presents a simple, structured approach to developing parallel programs of the latter class that allows much of the work of development and reasoning to be done using the same techniques and tools used for sequential programs. In this approach, programs are initially developed in a primary programming model that combines the standard sequential model with a restricted form of parallel composition that is semantically equivalent to sequential composition. Such programs can be reasoned about using sequential techniques and executed sequentially for testing. They are then transformed for execution on typical parallel architectures via a sequence of semantics-preserving transformations, making use of two secondary programming models, both based on parallel composition with barrier synchronization and one incorporating data partitioning. The transformation process for a particular program is typically guided and assisted by a parallel programming archetype, an abstraction that captures the commonality of a class of programs with similar computational features and provides a class-specific strategy for producing efficient parallel programs. Transformations may be applied manually or via a parallelizing compiler. Correctness of transformations within the primary programming model is proved using standard sequential techniques. 
Correctness of transformations between the programming models and between the models and practical programming languages is proved using a state-transition-based operational model.\n\nThis thesis presents: (1) the primary and secondary programming models, (2) an operational model that provides a common framework for reasoning about programs in all three models, (3) a collection of example program transformations with arguments for their correctness, and (4) two groups of experiments in which our overall approach was used to develop example applications. The specific contribution of this work is to present a unified theory/practice framework for this approach to parallel program development, tying together the underlying theory, the program transformations, and the program-development methodology.\n",
        "doi": "10.7907/5ma9-h225",
        "publication_date": "1998",
        "thesis_type": "phd",
        "thesis_year": "1998"
    },
    {
        "id": "thesis:341",
        "collection": "thesis",
        "collection_id": "341",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-01252008-095244",
        "primary_object_url": {
            "basename": "Sivilotti_pag_1998.pdf",
            "content": "final",
            "filesize": 6977574,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/341/1/Sivilotti_pag_1998.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "A method for the specification, composition, and testing of distributed object systems",
        "author": [
            {
                "family_name": "Sivilotti",
                "given_name": "Paolo A. G.",
                "clpid": "Sivilotti-P-A-G"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Arvo",
                "given_name": "James R.",
                "clpid": "Arvo-J-R"
            },
            {
                "family_name": "Bagrodia",
                "given_name": "Rajive",
                "clpid": "Bagrodia-R"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "The formation of a distributed system from a collection of individual components requires the ability for components to exchange syntactically well-formed messages. Several technologies exist that provide this fundamental functionality, as well as the ability to locate components dynamically based on syntactic requirements. The formation of a correct distributed system requires, in addition, that these interactions between components be semantically well-formed. The method presented in this thesis is intended to assist in the development of correct distributed systems.\n\nWe present a specification methodology based on three fundamental operators from temporal logic: initially, next, and transient. From these operators we derive a collection of higher-level operators that are used for component specification. The novel aspect of our specification methodology is that we require that these operators be used in the following restricted manner:\n\n\u2022 A specification statement can refer only to properties that are local to a single component.\n\u2022 A single component must be able to guarantee unilaterally the validity of the specification statement for any distributed system of which it is a part.\n\nSpecification statements that conform to these two restrictions we call certificates.\n\nThe first restriction is motivated by our desire for these component specifications to be testable in a relatively efficient manner. In fact, we describe a set of simplified certificates that can be translated into a testing harness by a simple parser with very little programmer intervention. The second restriction is motivated by our desire for a simple theory of composition: If a certificate is a property of a component, that certificate is also a property of any system containing that component.\n\nAnother novel aspect of our methodology is the introduction of a new temporal operator that combines both safety and progress properties. 
The concept underlying this operator has been used implicitly before; but by extracting this concept into a first-class operator, we are able to prove several new theorems about such properties. We demonstrate the utility of this operator and of our theorems by using them to simplify several proofs.\n\nThe restrictions imposed on certificates are severe. Although they have pleasing consequences as described above, they can also lead to lengthy proofs of system properties that are not simple conjunctions. To compensate for this difficulty, we introduce collections of certificates that we call services. Services facilitate proof reuse by encapsulating common component interactions used to establish various system properties.\n\nWe experiment with our methodology by applying it to several extended examples. These experiments illustrate the utility of our approach and convince us of the practicality of component-based distributed system development. This thesis addresses three parts of the development cycle for distributed object systems: (i) the specification of systems and components, (ii) the compositional reasoning used to verify that a collection of components satisfy a system specification, and (iii) the validation of component implementations.\n",
        "doi": "10.7907/z89g-gm27",
        "publication_date": "1998",
        "thesis_type": "phd",
        "thesis_year": "1998"
    },
    {
        "id": "thesis:91",
        "collection": "thesis",
        "collection_id": "91",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-01092008-085128",
        "primary_object_url": {
            "basename": "Boahen_ka_1997.pdf",
            "content": "final",
            "filesize": 15731637,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/91/1/Boahen_ka_1997.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Retinomorphic vision systems : reverse engineering the vertebrate retina",
        "author": [
            {
                "family_name": "Boahen",
                "given_name": "Kwabena Adu",
                "clpid": "Boahen-Kwabena-Adu"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Koch",
                "given_name": "Christof",
                "orcid": "0000-0001-6482-8067",
                "clpid": "Koch-C"
            },
            {
                "family_name": "Andersen",
                "given_name": "Richard A.",
                "orcid": "0000-0002-7947-0472",
                "clpid": "Andersen-R-A"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            },
            {
                "family_name": "Sterling",
                "given_name": "P.",
                "clpid": "Sterling-P"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "This thesis seeks to explain how the retina satisfies both top-down constraints (functional) and the bottom-up constraints (structural) by analyzing simple physical models of the retina and mimicking its structure and function in silicon. In particular, I examine spatiotemporal filtering in the outer plexiform layer of the vertebrate retina, and show how outer retina processing is augmented by further processing in the inner plexiform layer, creating an efficient implementation that encodes moving stimuli efficiently over a wide range of speeds.\r\n\r\nMy working hypothesis is that biological sensory systems seek to optimize both functional and structural constraints. On the functional side, they must maximize information uptake from the environment while they minimize redundancy in their outputs. On the structural side, they must maximize resolving power in space and time, by making the processing elements small and fast, while they minimize wiring and energy consumption. If structure and function did indeed coevolve, as I assume, studying how structural and functional constraints are optimized simultaneously is our only hope of understanding why nature picks the solutions that we observe.\r\n\r\nAddressing both structural and functional constraints requires combining science and engineering. Scientists study an existing structure, and seek to understand how it functions in an optimal or near-optimal fashion, based on theoretical grounds. Rarely does a scientist ask: Will the structure be more cost effective, more reliable, or more reproducible if a less-than-optimum function is chosen? Engineers, on the other hand, design an optimal implementation for some desired function, based on an existing set of standard primitives. Rarely does an engineer ask: Is this the most natural set of primitives to use for this particular function? Thus, neither discipline attempts to optimize both function and structure globally. 
In contrast, evolution, operating in a purely opportunistic fashion, continuously seeks increasingly elegant solutions that meet these constraints.\r\n\r\nFor these reasons, I have adopted a multidisciplinary engineering-science approach that combines analysis with synthesis. When tailored synergistically, this approach can shed light on questions about which neurobiologists care, while advancing the state of the art in sensory-system design.",
        "doi": "10.7907/96W6-N605",
        "publication_date": "1997",
        "thesis_type": "phd",
        "thesis_year": "1997"
    },
    {
        "id": "thesis:4036",
        "collection": "thesis",
        "collection_id": "4036",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-10112007-083903",
        "primary_object_url": {
            "basename": "Hofstee_hp_1995.pdf",
            "content": "final",
            "filesize": 4307477,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/4036/1/Hofstee_hp_1995.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Synchronizing processes",
        "author": [
            {
                "family_name": "Hofstee",
                "given_name": "H. Peter",
                "clpid": "Hofstee-H.-Peter"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Bagrodia",
                "given_name": "Rajive",
                "clpid": "Bagrodia-R"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "In this monograph we develop a mathematical theory for a concurrent language based on angelic and demonic nondeterminism. An underlying model is defined with sets of sets of sequences of synchronization actions. A refinement relation is defined for the model, and equivalence classes under this relation are identified with processes. Processes, together with the refinement relation, form a complete distributive lattice.\r\n\r\n\tWe define a language with parallel composition, sequential composition, angelic and demonic nondeterminism, and an operator that connects pairs of synchronization actions into synchronization statements and hides these actions from observation. Also, angelic and demonic iteration are defined. All operators are monotonic with respect to the refinement ordering. Many algebraic properties are proven from these definitions. We study duals of processes and prove that they can be related to the most demonic environment in which a process will not deadlock. We give a simple example to illustrate the use of duals.\r\n\r\n\tWe study classes of programs for which angelic choice can be implemented by probing the environment for its next action. To this end specifications of processes are extended with simple conditions on the environment. We give a more elaborate example to illustrate the use of these conditions and the compositionality of the method.\r\n\r\n\tFinally we briefly introduce an operational model that describes implementable processes only. This model mentions probes explicitly. Such a model may form a basis for a language that is less restrictive than ours, but that will also have less attractive algebraic properties.\r\n",
        "doi": "10.7907/G620-GG65",
        "publication_date": "1995",
        "thesis_type": "phd",
        "thesis_year": "1995"
    },
    {
        "id": "thesis:4136",
        "collection": "thesis",
        "collection_id": "4136",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-10172007-090528",
        "primary_object_url": {
            "basename": "Lee_tk_1995.pdf",
            "content": "final",
            "filesize": 6208750,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/4136/1/Lee_tk_1995.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "A General Approach to Performance Analysis and Optimization of Asynchronous Circuits",
        "author": [
            {
                "family_name": "Lee",
                "given_name": "Tak Kwan",
                "clpid": "Lee-Tak-Kwan"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Goodman",
                "given_name": "Rodney M.",
                "clpid": "Goodman-R-M"
            },
            {
                "family_name": "Burns",
                "given_name": "Steven",
                "clpid": "Burns-S"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "A systematic approach for evaluating and optimizing the performance of asynchronous VLSI circuits is presented. Index-priority simulation is introduced to efficiently find minimal cycles in the state graph of a given circuit. These minimal cycles are used to determine the causality relationships between all signal transitions in the circuit. Once these relationships are known, the circuit is then modeled as an extended event-rule system, which can be used to describe many circuits, including ones that are inherently disjunctive. An accurate indication of the performance of the circuit is obtained by analytically computing the period of the corresponding extended event-rule system.\r\n",
        "doi": "10.7907/ehzs-y537",
        "publication_date": "1995",
        "thesis_type": "phd",
        "thesis_year": "1995"
    },
    {
        "id": "thesis:4110",
        "collection": "thesis",
        "collection_id": "4110",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-10162007-093427",
        "primary_object_url": {
            "basename": "Van_der_goot_mr_1995.pdf",
            "content": "final",
            "filesize": 6021129,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/4110/1/Van_der_goot_mr_1995.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Semantics of VLSI synthesis",
        "author": [
            {
                "family_name": "Van der Goot",
                "given_name": "Marcel Rene",
                "clpid": "Van-der-Goot-M-R"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Sanders",
                "given_name": "Beverly",
                "clpid": "Sanders-B"
            },
            {
                "family_name": "Hofstee",
                "given_name": "H. Peter",
                "clpid": "Hofstee-H-P"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "We develop a new form of formal operational semantics, suitable for concurrent programming languages. The semantics directly supports sequential and parallel composition, rendezvous synchronization, shared variables, and non-determinism. Based on an abstract notion of program execution, a refinement relation is defined. We show how the refinement relation can be used to prove that one program implements another.\r\n\r\nWe use the operational semantics as a semantic framework for a synthesis method for asynchronous VLSI circuits. We define the semantics of the programming notations that are used, and use the refinement relation to prove the correctness of the program transformations that form the basis of the synthesis method. Among other transformations, we prove the correctness of the replacement of atomic synchronization actions by handshake protocols, and the transformation of a sequence of actions into a network of concurrently executing gates.\r\n",
        "doi": "10.7907/SR5V-KT18",
        "publication_date": "1995",
        "thesis_type": "phd",
        "thesis_year": "1995"
    },
    {
        "id": "thesis:4114",
        "collection": "thesis",
        "collection_id": "4114",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-10162007-111256",
        "primary_object_url": {
            "basename": "Leino_krm_1995.pdf",
            "content": "final",
            "filesize": 7306673,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/4114/1/Leino_krm_1995.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Toward reliable modular programs",
        "author": [
            {
                "family_name": "Leino",
                "given_name": "K. Rustan M.",
                "clpid": "Leino-K-M"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Nelson",
                "given_name": "Greg",
                "clpid": "Nelson-G"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Sanders",
                "given_name": "Beverly",
                "clpid": "Sanders-B"
            },
            {
                "family_name": "Nelson",
                "given_name": "Greg",
                "clpid": "Nelson-G"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Wilson",
                "given_name": "Richard M.",
                "clpid": "Wilson-R-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "Software is being applied in an ever-increasing number of areas. Computer programs and systems are becoming more complex and consisting of more delicately interconnected components. Errors surfacing in programs are still a conspicuous and costly problem. It's about time we employ some techniques that guide us toward higher reliability of practical programs. The goal of this thesis is just that.\n\nThis thesis presents a theory for verifying programs based on Dijkstra's weakest-precondition calculus. A variety of program paradigms used in practice, such as exceptions, procedures, object orientation, and modularity, are dealt with.\n\nThe thesis sheds new light on the theory behind programs with exceptions. It develops an elegant algebra, and shows it to be the foundation on which the semantics of exceptions rests. It develops a trace semantics for programs with exceptions, from which the weakest-precondition semantics is derived. It also proves a theorem on programming methodology relating to exceptions, and applies this theorem in the novel derivation of a simple program.\n\nThe thesis presents a simple model for object-oriented data types, in which concerns have been separated, resulting in the simplicity of the model.\n\nTo deal with large programs, this thesis takes a practical look at modularity and abstraction. It reveals a problem that arises in writing specifications for modular programs where previous techniques fail. The thesis introduces a new specification construct that solves that problem, and gives a formal proof of soundness for modular verification using that construct. The model is a generalization of Hoare's classical data refinement. However, there are more problems to be solved. The thesis reports on some of these problems and suggests some future directions toward more reliable modular programs.\n",
        "doi": "10.7907/ynt2-nn65",
        "publication_date": "1995",
        "thesis_type": "phd",
        "thesis_year": "1995"
    },
    {
        "id": "thesis:3284",
        "collection": "thesis",
        "collection_id": "3284",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-08302007-094049",
        "primary_object_url": {
            "basename": "Ko_tm_1993.pdf",
            "content": "final",
            "filesize": 3124950,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/3284/1/Ko_tm_1993.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "On the VLSI decompositions for complete graphs, DeBruijn graphs, hypercubes, hyperplanes, meshes, and shuffle-exchange graphs",
        "author": [
            {
                "family_name": "Ko",
                "given_name": "Tsz-Mei",
                "clpid": "Ko-T"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "McEliece",
                "given_name": "Robert J.",
                "clpid": "McEliece-R-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "McEliece",
                "given_name": "Robert J.",
                "clpid": "McEliece-R-J"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Posner",
                "given_name": "Edward C.",
                "clpid": "Posner-E-C"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "A C-chip VLSI decomposition of a graph G is a collection of C vertex-disjoint subgraphs of G which together contain all of G's vertices and a subset of its edges. If the vertex-disjoint subgraphs are isomorphic to each other, we call one of these isomorphic subgraphs a building block. The efficiency of a VLSI decomposition is defined to be the fraction of edges of G that are in the subgraphs. In this thesis, motivated by the need to construct large Viterbi decoders, we study VLSI decompositions for deBruijn graphs. We obtain some strong necessary conditions for a graph to be a building block for a deBruijn graph, and some slightly more restrictive sufficient conditions which allow us to construct some efficient building blocks for deBruijn graphs. By using the methods described in this thesis, we have found a 64-chip VLSI decomposition of the deBruijn graph B13 with efficiency 0.754. This decomposition is being used by JPL design engineers to build a single-board Viterbi decoder for the K = 15, rate 1/4 convolutional code which will be used on NASA's Galileo mission.\n\nFurthermore, we study VLSI decompositions for the families of complete graphs, hypercubes, hyperplanes, meshes, and shuffle-exchange graphs. In each of these cases, we obtain very efficient or even optimal decompositions. We also prove several general theorems that can be applied to obtain bounds on the efficiencies for VLSI decompositions of other complex graphs. In general, the results presented in this thesis are useful for implementing massively parallel computers.\n",
        "doi": "10.7907/s7w7-a995",
        "publication_date": "1993",
        "thesis_type": "phd",
        "thesis_year": "1993"
    },
    {
        "id": "thesis:2955",
        "collection": "thesis",
        "collection_id": "2955",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-07202007-132706",
        "primary_object_url": {
            "basename": "Hazewindus_pj_1992.pdf",
            "content": "final",
            "filesize": 6917964,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/2955/1/Hazewindus_pj_1992.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Testing delay-insensitive circuits",
        "author": [
            {
                "family_name": "Hazewindus",
                "given_name": "Pieter Johannes",
                "clpid": "Hazewindus-P-J"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "A method is developed to test delay-insensitive circuits, using the single stuck-at fault model. These circuits are synthesized from a high-level specification. Since the circuits are hazard-free by construction, there is no test for hazards in the circuit. Most faults cause the circuit to halt during test, since they cause an acknowledgement not to occur when it should. There are stuck-at faults that do not cause the circuit to halt under any condition. These are stimulating faults; they cause a premature firing of a production rule. For such a stimulating fault to be testable, the premature firing has to be propagated to a primary output. If this is not guaranteed to occur, then one or more test points have to be added to the circuit. Any stuck-at fault is testable, with the possible addition of test points. For combinational delay-insensitive circuits, finding test vectors is reduced to the same problem as for synchronous combinational logic. For sequential circuits, the synthesis method is used to find a test for each fault efficiently, to find the location of the test points, and to find a test that detects all faults in a circuit.\r\n\r\nThe number of test points needed to fully test the circuit is very low, and the size of the additional testing circuitry is small. A test derived with a simple transformation of the handshaking expansion yields high fault coverage. Adding tests for the remaining faults results in a small complete test for the circuit.",
        "doi": "10.7907/0d7v-9d09",
        "publication_date": "1992",
        "thesis_type": "phd",
        "thesis_year": "1992"
    },
    {
        "id": "thesis:6663",
        "collection": "thesis",
        "collection_id": "6663",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:09122011-094355148",
        "primary_object_url": {
            "basename": "Mahowald_ma_1992.pdf",
            "content": "final",
            "filesize": 33653072,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/6663/1/Mahowald_ma_1992.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "VLSI Analogs of Neuronal Visual Processing: A Synthesis of Form and Function",
        "author": [
            {
                "family_name": "Mahowald",
                "given_name": "Michelle A. (Misha)",
                "clpid": "Mahowald-Michelle-A-Misha"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            },
            {
                "family_name": "Perona",
                "given_name": "Pietro",
                "orcid": "0000-0002-7583-5809",
                "clpid": "Perona-P"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Allman",
                "given_name": "John Morgan",
                "clpid": "Allman-J-M"
            },
            {
                "family_name": "Pine",
                "given_name": "Jerome",
                "clpid": "Pine-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_biol"
            }
        ],
        "abstract": "This thesis describes the development and testing of a simple visual system fabricated using complementary metal-oxide-semiconductor (CMOS) very large scale integration (VLSI) technology. This visual system is composed of three subsystems. A silicon retina, fabricated on a single chip, transduces light and performs signal processing in a manner similar to a simple vertebrate retina. A stereocorrespondence chip uses bilateral retinal input to estimate the location of objects in depth. A silicon optic nerve allows communication between chips by a method that preserves the idiom of action potential transmission in the nervous system. Each of these subsystems illuminates various aspects of the relationship between VLSI analogs and their neurobiological counterparts. The overall synthetic visual system demonstrates that analog VLSI can capture a significant portion of the function of neural structures at a systems level, and concomitantly, that incorporating neural architectures leads to new engineering approaches to computation in VLSI. The relationship between neural systems and VLSI is rooted in the shared limitations imposed by computing in similar physical media. The systems discussed in this text support the belief that the physical limitations imposed by the computational medium significantly affect the evolving algorithm. Since circuits are essentially physical structures, I advocate the use of analog VLSI as a powerful medium of abstraction, suitable for understanding and expressing the function of real neural systems. The working chip elevates the circuit description to a kind of synthetic formalism. The behaving physical circuit provides a formal test of theories of function that can be expressed in the language of circuits.\r\n",
        "doi": "10.7907/4bdw-fg34",
        "publication_date": "1992",
        "thesis_type": "phd",
        "thesis_year": "1992"
    },
    {
        "id": "thesis:2863",
        "collection": "thesis",
        "collection_id": "2863",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-07122007-134330",
        "primary_object_url": {
            "basename": "Sivilotti_ma_1991.pdf",
            "content": "final",
            "filesize": 9071203,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/2863/1/Sivilotti_ma_1991.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Wiring Considerations in Analog VLSI Systems, with Application to Field-Programmable Networks",
        "author": [
            {
                "family_name": "Sivilotti",
                "given_name": "Massimo Antonio",
                "clpid": "Sivilotti-Massimo-Antonio"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            },
            {
                "family_name": "Barr",
                "given_name": "Alan H.",
                "clpid": "Barr-A-H"
            },
            {
                "family_name": "Bhatt",
                "given_name": "Sandeep Nautam",
                "clpid": "Bhatt-Sandeep-Nautam"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "This thesis develops a theoretical model for the wiring complexity of wide classes of systems, relating the degree of connectivity of a circuit to the dimensionality of its interconnect technology. This model is used to design an efficient, hierarchical interconnection network capable of accommodating large classes of circuits. Predesigned circuit elements can be incorporated into this hierarchy, permitting semi-customization for particular classes of systems (e.g., photoreceptors included on vision chips). A polynomial-time programming algorithm for embedding the desired circuit graph onto the prefabricated routing resources is presented, and is implemented as part of a general design tool for specifying, manipulating and comparing circuit netlists.\r\n\r\nThis thesis presents a system intended to facilitate analog circuit design. At its core is a VLSI chip that is electrically configured in the field by selectively connecting predesigned elements to form a desired circuit, which is then tested electrically. The system may be considered a hardware accelerator for simulation, and its large capacity permits testing system ideas, which is impractical using current means. A fast-turnaround simulator permitting rapid conception and evaluation of circuit ideas is an invaluable aid to developing an understanding of system design in a VLSI context.\r\n\r\nWe have constructed systems using both reconfigurable interconnection switches and laser-programmed interconnect. Prototypes capable of synthesizing circuits consisting of over 1000 transistors have been constructed. The flexibility of the system has been demonstrated, and data from parametric tests have proven the validity of the approach.\r\n\r\nFinally, this thesis presents several new circuits that have become key components in many analog VLSI systems. Fast, dense and provably safe one-phase latches and hierarchical arbiters are presented, as are a low-noise analog switch, an isotropic novelty filter, a dense, active high-resistance element, and a subthreshold differential amplifier with a large linear input range.",
        "doi": "10.7907/stj4-kh72",
        "publication_date": "1991",
        "thesis_type": "phd",
        "thesis_year": "1991"
    },
    {
        "id": "thesis:2835",
        "collection": "thesis",
        "collection_id": "2835",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-07092007-072640",
        "primary_object_url": {
            "basename": "Burns_sm_1991.pdf",
            "content": "final",
            "filesize": 6219416,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/2835/1/Burns_sm_1991.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Performance analysis and optimization of asynchronous circuits",
        "author": [
            {
                "family_name": "Burns",
                "given_name": "Steven Morgan",
                "clpid": "Burns-Steven-Morgan"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Franklin",
                "given_name": "Joel N.",
                "clpid": "Franklin-J-N"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "Analytical techniques are developed to determine the performance of asynchronous digital circuits. These techniques can be used to guide the designer during the synthesis of such a circuit, leading to a high-performance, efficient implementation. Optimization techniques are also developed that further improve this implementation by determining the optimal sizes of the low-level devices (CMOS transistors) that compose the circuit.\r\n",
        "doi": "10.7907/kez1-7q52",
        "publication_date": "1991",
        "thesis_type": "phd",
        "thesis_year": "1991"
    },
    {
        "id": "thesis:6862",
        "collection": "thesis",
        "collection_id": "6862",
        "cite_using_url": "https://resolver.caltech.edu/CaltechTHESIS:03222012-091423469",
        "primary_object_url": {
            "basename": "Su_w-k_1990.pdf",
            "content": "final",
            "filesize": 29003250,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/6862/1/Su_w-k_1990.pdf",
            "version": "v5.0.0"
        },
        "type": "thesis",
        "title": "Reactive-Process Programming and Distributed Discrete-Event Simulation",
        "author": [
            {
                "family_name": "Su",
                "given_name": "Wen-King",
                "clpid": "Su-Wen-King"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Sturtevant",
                "given_name": "Bradford",
                "clpid": "Sturtevant-B"
            },
            {
                "family_name": "Van de Velde",
                "given_name": "Eric",
                "clpid": "van-de-Velde-E"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>The same forces that spurred the development of multicomputers - the demand for\r\nbetter performance and economy - are driving the evolution of multicomputers in\r\nthe direction of more abundant and less expensive computing nodes - the direction\r\nof fine-grain multicomputers. This evolution in multicomputer architecture derives\r\nfrom advances in integrated circuit, packaging, and message-routing technologies,\r\nand carries far-reaching implications in programming and applications. This thesis\r\npursues that trend with a balanced treatment of multicomputer programming and\r\napplications. First, a reactive-process programming system - Reactive-C - is\r\ninvestigated; then, a model application- discrete-event simulation - is developed;\r\nfinally, a number of logic-circuit simulators written in the Reactive-C notation are\r\nevaluated.</p>\r\n\r\n<p>One difficulty m multicomputer applications is the inefficiency of many distributed\r\nalgorithms compared to their sequential counterparts. When better formulations\r\nare developed, they often scale poorly with increasing numbers of nodes,\r\nand their beneficial effects eventually vanish when many nodes are used. However,\r\nrules for programming are quite different when nodes are plentiful and cheap: The\r\nprimary concern is to utilize all of the concurrency available in an application, rather\r\nthan to utilize all of the computing cycles available in a machine. We have shown in\r\nour research that it is possible to extract the maximum concurrency of a simulation\r\nsubject, even one as difficult as a logic circuit, when one simulation element is assigned\r\nto each node. Despite the initial inefficiency of a straightforward algorithm,\r\nas the the number of nodes increases, the computation time decreases linearly until\r\nthere are only a few elements in each node. 
We conclude by suggesting a technique\r\nto further increase the available concurrency when there are many more nodes than\r\nsimulation elements.</p>",
        "doi": "10.7907/9qzd-kv20",
        "publication_date": "1990",
        "thesis_type": "phd",
        "thesis_year": "1990"
    },
    {
        "id": "thesis:630",
        "collection": "thesis",
        "collection_id": "630",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-02132007-153533",
        "type": "thesis",
        "title": "A Framework for Adaptive Routing in Multicomputer Networks",
        "author": [
            {
                "family_name": "Ngai",
                "given_name": "John Yee-Keung",
                "clpid": "Ngai-John-Yee-Keung"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Posner",
                "given_name": "Edward C.",
                "clpid": "Posner-E-C"
            },
            {
                "family_name": "Franklin",
                "given_name": "Joel N.",
                "clpid": "Franklin-J-N"
            },
            {
                "family_name": "Chandy",
                "given_name": "K. Mani",
                "clpid": "Chandy-K-M"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Message-passing concurrent computers, also known as multicomputers, such as the Caltech Cosmic Cube [47] and its commercial descendents, consist of many computing nodes that interact with each other by sending and receiving messages over communication channels between the nodes. The communication networks of the second-generation machines, such as the Symult Series 2010 and the Intel iPSC2 [2], employ an oblivious wormhole-routing technique that guarantees deadlock freedom. The network performance of this highly evolved oblivious technique has reached a limit of being capable of delivering, under random traffic, a stable maximum sustained throughput of \u2248 45 to 50% of the limit set by the network bisection bandwidth, while maintaining acceptable network latency. This thesis examines the possibility of performing adaptive routing as an approach to further improving upon the performance and reliability of these networks. In an adaptive multipath routing scheme, message trajectories are no longer deterministic, but are continuously perturbed by local message loading. Message packets will tend to follow their shortest-distance routes to destinations in normal traffic loading, but can be detoured to longer but less-loaded routes as local congestion occurs.</p>\r\n\r\n<p>A simple adaptive cut-through packet-switching framework is described, and a number of fundamental issues concerning the theoretical feasibility of the adaptive approach are studied. Freedom of communication deadlock is achieved by following a coherent channel protocol and by applying voluntary misrouting as needed. Packet deliveries are assured by resolving channel-access conflicts according to a priority assignment. 
Fairness of network access is assured either by sending round-trip packets or by having each node follow a local injection-synchronization protocol.</p>\r\n\r\n<p>The performance behavior of the proposed adaptive cut-through framework is studied with stochastic modeling and analysis, as well as through extensive simulation experiments for the 2D and 3D rectilinear networks. Theoretical bounds on various average network-performance metrics are derived for these rectilinear networks. These bounds provide a standard frame of reference for interpreting the performance results.</p>\r\n\r\n<p>In addition to the potential gain in network performance, the adaptive approach offers the potential for exploiting the inherent path redundancy found in richly connected networks in order to perform fault-tolerant routing. Two convexity-related notions are introduced to characterize the conditions under which our adaptive routing formulation is adequate to provide fault-tolerant routing, with minimal change in routing hardware. The effectiveness of these notions is studied through extensive simulations. The 2D octagonal-mesh network is suggested; this displays excellent fault-tolerant potential under the adaptive routing framework. Both performance and reliability behaviors of the octagonal mesh are studied in detail.</p>\r\n\r\n<p>A number of implementation issues are examined. Encoding schemes for packet headers that admit simple incremental updates while providing all necessary routing information in the first flit of a relatively narrow flit width are developed. A pipelined control structure that allows a packet to cut through an intermediate node with a minimum delay of two cycles is described. A distributed clocking scheme is developed that eliminates the problem of global clock-signal distribution. Under this clocking scheme, the adaptive routers can be tessellated to form a network of arbitrary size.</p>\r\n\r\n<p>[2] W.C. Athas and C.L. 
Seitz, \"Multicomputers: Message-Passing Concurrent Computers,\" IEEE Computer, August 1988, pp. 9-24.</p>\r\n\r\n<p>[47] C.L. Seitz, \"The Cosmic Cube,\" CACM, Vol. 28, No. 1, January 1985, pp. 22-33.</p>",
        "doi": "10.7907/a01h-0z81",
        "publication_date": "1989",
        "thesis_type": "phd",
        "thesis_year": "1989"
    },
    {
        "id": "thesis:809",
        "collection": "thesis",
        "collection_id": "809",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-02282008-091326",
        "primary_object_url": {
            "basename": "Athas_wc_1987.pdf",
            "content": "final",
            "filesize": 9194124,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/809/1/Athas_wc_1987.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Fine Grain Concurrent Computations",
        "author": [
            {
                "family_name": "Athas",
                "given_name": "William C., Jr.",
                "clpid": "Athas-William-C-Jr"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Feynman",
                "given_name": "Richard Phillips",
                "clpid": "Feynman-R-P"
            },
            {
                "family_name": "Franklin",
                "given_name": "Joel N.",
                "clpid": "Franklin-J-N"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>This thesis develops a computational model, a programming notation, and a set of programming principles to further and to demonstrate the practicality of programming fine grain concurrent computers.</p>\r\n\r\n<p>Programs are expressed in the computational model as a collection of <i>definitions</i> of autonomous computing agents called <i>objects</i>. In the execution of a program, the objects communicate data and synchronize their actions exclusively by message-passing. An object executes its definition only in response to receiving a message, and its actions may include sending messages, creating new objects, and modifying its own internal state. The number of actions that occur in response to a message is finite; Turing computability is achieved not within a single object, but through the interaction of objects.</p>\r\n\r\n<p>A new concurrent programming notation <i>Cantor</i> is used to demonstrate the cognitive process of writing programs using the object model. Programs for numerical sieves, sorting, the eight queens problem, and Gaussian elimination are fully described. Each of these programs involve up to thousands of objects in their exectuion. The general programming strategy is to first partition objects by their overall behavior and then to program the behaviors to be self-organizing.</p>\r\n\r\n<p>The semantics of Cantor are made precise through the definition of a formal semantics for Cantor and the object model. Objects are modelled as finite automata. The formal semantics is useful for proving program properties and for building frameworks to capture specific properties of object programs. 
The mathematical frameworks are constructed for building object graphs independently of program execution and for systematically removing objects that are irrelevant to program execution (garbage collection).</p>\r\n\r\n<p>The formal semantics are complemented by experiments that allow one to study the dynamics of the execution of Cantor programs on fine grain concurrent computers. The clean semantics of Cantor suggests simple metrics for evaluating the execution of concurrent programs for an ideal, abstract implementation. Program performance is also evaluated for environments where computing resources are limited. From the results of these experiments, hardware and software architectures for organizing fine grain message-passing computations are proposed, including support for fault tolerance and for garbage collection.</p>",
        "doi": "10.7907/63xc-r308",
        "publication_date": "1987",
        "thesis_type": "phd",
        "thesis_year": "1987"
    },
    {
        "id": "thesis:811",
        "collection": "thesis",
        "collection_id": "811",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-02282008-111427",
        "primary_object_url": {
            "basename": "Choo_y_1987.pdf",
            "content": "final",
            "filesize": 4187282,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/811/1/Choo_y_1987.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Logic from Programming Language Semantics",
        "author": [
            {
                "family_name": "Choo",
                "given_name": "Young-il",
                "clpid": "Choo-Young-il"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Thompson",
                "given_name": "Frederick B.",
                "clpid": "Thompson-F-B"
            },
            {
                "family_name": "Kechris",
                "given_name": "Alexander S.",
                "clpid": "Kechris-A-S"
            },
            {
                "family_name": "Moschovakis",
                "given_name": "Yiannis N.",
                "clpid": "Moschovakis-Yiannis-N"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Logic for reasoning about programs must proceed from the programming language semantics. It is our thesis that programs be considered as mathematical objects that can be reasoned about directly, rather than as linguistic expressions whose meanings are embedded in an intermediate formalism.</p>\r\n\r\n<p>Since the semantics of many programming language features (including recursion, type-free application, infinite structures, self-reference, and reflection) require models that are constructed as limits of partial objects, a logic for dealing with partial objects is required.</p>\r\n\r\n<p>Using the <i>D<sub>\u221e</sub></i> model of the \u03bb-calculus, a logic (called <i>continuous logic</i>) for reasoning about partial objects is presented. In continuous logic, the logical operations (negation, implication, and quantification) are defined for each of the finite levels and then extended to the limit, giving us a model of type-free logic.</p>\r\n\r\n<p>The triples of Hoare Logic are interpreted as partial assertions over the domain of partial states, and contradictions arising from rules for function definitions are analyzed. Recursive procedures and recursive functions are both proved using mathematical induction.</p>\r\n\r\n<p>A domain of infinite lists is constructed as a model for languages with lazy evaluation and it is compared to an ordinal-heirarchic construction. A model of objects and multiple inheritance is constructed where objects are self-referential states and multiple inheritance is defined using the notion of product of classes. The reflective processor for a language with environment and continuation reflection is constructed as the projective limit of partial reflective processors of finite height.</p>",
        "doi": "10.7907/r9hf-1b88",
        "publication_date": "1987",
        "thesis_type": "phd",
        "thesis_year": "1987"
    },
    {
        "id": "thesis:1154",
        "collection": "thesis",
        "collection_id": "1154",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-03262008-092115",
        "primary_object_url": {
            "basename": "El-hammamsy_ss_1986.pdf",
            "content": "final",
            "filesize": 5929842,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/1154/1/El-hammamsy_ss_1986.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "Coupled-Inductor Inversion, Rectification and Cycloconversion",
        "author": [
            {
                "family_name": "El-Hamamsy",
                "given_name": "Sayed-Amr",
                "clpid": "El-Hamamsy-Sayed-Amr"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Middlebrook",
                "given_name": "Robert David",
                "clpid": "Middlebrook-R-D"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Middlebrook",
                "given_name": "Robert David",
                "clpid": "Middlebrook-R-D"
            },
            {
                "family_name": "Caughey",
                "given_name": "Thomas Kirk",
                "clpid": "Caughey-T-K"
            },
            {
                "family_name": "Martel",
                "given_name": "Hardy Cross",
                "clpid": "Martel-H-C"
            },
            {
                "family_name": "Thompson",
                "given_name": "Peter M.",
                "clpid": "Thompson-Peter-M"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>A new PWM approach using three state switching of negatively coupled inductors can be applied to any basic dc-to-dc converter to form single-phase dc-to-ac inverters. Current reference programming gives improvements in linearity, small-signal dynamics, and pulsed-load response. The current programming loops of the flyback and boost inverters are stable at all operating points. New multiple output magnetic structures are introduced that apply space and time multiplexing of magnetics to give non-interacting outputs. The magnetics are analysed for different operating conditions. The inverters with two independent outputs are derived by use of the multiple output magnetics. These are used to form the three-phase versions of the inverters, The corresponding three-phase ac-to-dc rectifiers are also derived with close to ideal current waveforms as well as power factor correction capabilities. Finally, to complete the family of power converters the polyphase ac-to-ac cycloconverters are derived incorporating the qualities of both the inverters and the rectifiers.</p>\r\n",
        "doi": "10.7907/BHKX-4J16",
        "publication_date": "1986",
        "thesis_type": "phd",
        "thesis_year": "1986"
    },
    {
        "id": "thesis:1023",
        "collection": "thesis",
        "collection_id": "1023",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-03192008-143903",
        "primary_object_url": {
            "basename": "Li_pp_1986.pdf",
            "content": "final",
            "filesize": 8579970,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/1023/1/Li_pp_1986.pdf",
            "version": "v3.0.0"
        },
        "type": "thesis",
        "title": "A Parallel Execution Model for Logic Programming",
        "author": [
            {
                "family_name": "Li",
                "given_name": "Peyyun Peggy",
                "clpid": "Li-Peyyun-Peggy"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Abu-Mostafa",
                "given_name": "Yaser S.",
                "clpid": "Abu-Mostafa-Y-S"
            },
            {
                "family_name": "Van de Snepscheut",
                "given_name": "Jan L. A.",
                "clpid": "Van-de-Snepscheut-J-L-A"
            },
            {
                "family_name": "Kechris",
                "given_name": "Alexander S.",
                "orcid": "0000-0002-2226-0423",
                "clpid": "Kechris-A-S"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>The Sync Model, a parallel execution method for logic programming, is proposed. The Sync Model is a multiple-solution data-driven model that realizes AND-parallelism and OR-parallelism in a logic program assuming a message-passing multiprocessor system. AND parallelism is implemented by constructing a dynamic data flow graph of the literals in the clause body with an ordering algorithm. OR parallelism is achieved by adding special Synchronization signals to the stream of partial solutions and synchronizing the multiple streams with a merge algorithm.</p>\r\n\r\n<p>The Sync Model is proved to be sound and complete. Soundness means it only generates correct solutions and completeness means it generates all the correct solutions. The soundness and completeness of the Sync Model are implied by the correctness of the merge algorithm.</p>\r\n\r\n<p>A new class of interconnection networks, the Sneptree, is also presented. The Sneptree is an augmented complete binary tree which can simulate an unbounded complete binary tree optimally. Amongst different connection patterns of the Sneptree, some are regular and extensible so as to be well suited for VLSI implementation. A recursive method is presented to generate the H-structure layout of one type of the Sneptree, called the Cyclic Sneptree.  A message routing algorithm between any two leaf nodes of the Cyclic Sneptree is also presented. The routing algorithm, which is of O(n) complexity, gives a good approximation to the shortest path.</p>\r\n\r\n<p>The Sneptree is an ideal architecture for the Sync model, in which a dynamic process tree is constructed. With a simple mapping algorithm, the Sync Model can be mapped onto the Sneptree with highly-balanced load and low overhead.</p>\r\n",
        "doi": "10.7907/2ngs-bp80",
        "publication_date": "1986",
        "thesis_type": "phd",
        "thesis_year": "1986"
    },
    {
        "id": "thesis:1122",
        "collection": "thesis",
        "collection_id": "1122",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-03252008-140428",
        "type": "thesis",
        "title": "A VLSI Architecture for Concurrent Data Structures",
        "author": [
            {
                "family_name": "Dally",
                "given_name": "William James",
                "clpid": "Dally-William-James"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            },
            {
                "family_name": "Bryant",
                "given_name": "Randal E.",
                "clpid": "Bryant-R"
            },
            {
                "family_name": "Feynman",
                "given_name": "Richard Phillips",
                "clpid": "Feynman-R-P"
            },
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "McEliece",
                "given_name": "Robert J.",
                "clpid": "McEliece-R-J"
            }
        ],
        "local_group": [
            {
                "literal": "Caltech Distinguished Alumni Award"
            },
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>Concurrent data structures simplify the development of concurrent programs by encapsulating commonly used mechanisms for synchronization and communication into data structures. This thesis develops a notation for describing concurrent data structures, presents examples of concurrent data structures, and describes an architecture to support concurrent data structures.</p>\r\n\r\n<p>Concurrent Smailtalk (CST), a derivative of Smailtalk-80 with extensions for concurrency, is developed to describe concurrent data structures. CST allows the programmer to specify objects that are distributed over the nodes of a concurrent computer. These distributed objects have many <i>constituent objects</i> and thus can process many messages simultaneously. They are the foundation upon which concurrent data structures are built.</p>\r\n\r\n<p>The <i>balanced cube</i> is a concurrent data structure for ordered sets. The set is distributed by a balanced recursive partition that maps to the subcubes of a binary <i>n</i>-cube using a Gray code. A search algorithm, VW search, based on the distance properties of the Gray code, searches a balanced cube in <i>O</i>(log <i>N</i>) time. Because it does not have the root bottleneck that limits all tree-based data structures to <i>O</i>(1) concurrency, the balanced cube achieves <i>O</i>(<i>N</i>/log <i>N</i>) concurrency.</p>\r\n\r\n<p>Considering graphs as concurrent data structures, graph algorithms are presented for the shortest path problem, the max-flow problem, and graph partitioning. These algorithms introduce new synchronization techniques to achieve better performance than existing algorithms.</p>\r\n\r\n<p>A message-passing, concurrent architecture is developed that exploits the characteristics of VLSI technology to support concurrent data structures. Interconnection topologies are compared on the basis of dimension. It is shown that minimum latency is achieved with a very low dimensional network. 
A deadlock-free routing strategy is developed for this class of networks, and a prototype VLSI chip implementing this strategy is described. A message-driven processor complements the network by responding to messages with a very low latency. The processor directly executes messages, eliminating a level of interpretation. To take advantage of the performance offered by specialization while at the same time retaining flexibility, processing elements can be specialized to operate on a single class of objects. These <i>object experts</i> accelerate the performance of all applications using this class.</p>",
        "doi": "10.7907/f8d5-x741",
        "publication_date": "1986",
        "thesis_type": "phd",
        "thesis_year": "1986"
    },
    {
        "id": "thesis:1333",
        "collection": "thesis",
        "collection_id": "1333",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-04102008-142130",
        "primary_object_url": {
            "basename": "Lien_slc_1985.pdf",
            "content": "final",
            "filesize": 5827040,
            "license": "other",
            "mime_type": "application/pdf",
            "url": "/1333/1/Lien_slc_1985.pdf",
            "version": "v4.0.0"
        },
        "type": "thesis",
        "title": "Combining Computation with Geometry",
        "author": [
            {
                "family_name": "Lien",
                "given_name": "Sheue-Ling Chang",
                "clpid": "Lien-Sheue-Ling-Chang"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            },
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            },
            {
                "family_name": "Lewicki",
                "given_name": "George W.",
                "clpid": "Lewicki-G-W"
            },
            {
                "family_name": "Fender",
                "given_name": "Derek H.",
                "clpid": "Fender-D-H"
            },
            {
                "family_name": "Barr",
                "given_name": "Alan H.",
                "clpid": "Barr-A-H"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>This thesis seeks to establish mathematical principles and to provide efficient solutions to various time consuming operations in computer-aided geometric design. It contains a discussion of three major topics: (1) design validation by means of object interference detection, (2) object reconstruction through the union, intersection, and subtraction of two polyhedra, and (3) calculation of basic engineering properties such as volume, center of mass, or moments of inertia.</p>\r\n\r\n<p>Two criteria are presented for solving the problems of point-polygon enclosure and point-polyhedron enclosure in object interference detection. An algorithm for efficient point-polyhedron-enclosure detection is presented. Singularities encountered in point-polyhedron-enclosure detection are categorized and simple methods for resolving them are also included.</p>\r\n\r\n<p>A new scheme for representing solid objects, called skeletal polyhedron representation, is proposed. Also included are algorithms for performing set operations on polyhedra (or polygons) represented in skeletal polyhedron representation, algorithms for performing edge-edge intersection and face-face intersection in a set operation, and a perturbation method which can be used to resolve singularities for an easy execution of edge-edge intersection and face-face intersection.</p>\r\n\r\n<p>A symbolic method for calculating basic engineering properties (such as volume, center of mass, moments of inertia, and similar integral properties of geometrically complex solids) is proposed. The same method is generalized for computing the integral properties of a set combined polyhedron, and for computing the integral properties of an arbitrary polyhedron in m-dimensional (R<sup>m</sup>) space.</p>",
        "doi": "10.7907/n1qe-h846",
        "publication_date": "1985",
        "thesis_type": "phd",
        "thesis_year": "1985"
    },
    {
        "id": "thesis:3296",
        "collection": "thesis",
        "collection_id": "3296",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-08312006-094203",
        "type": "thesis",
        "title": "Space-time Algorithms: Semantics and Methodology",
        "author": [
            {
                "family_name": "Chen",
                "given_name": "Marina Chien-mei",
                "clpid": "Chen-Marina-Chien-mei"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Johnson",
                "given_name": "William Lewis",
                "clpid": "Johnson-W-L"
            },
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            },
            {
                "family_name": "Kechris",
                "given_name": "Alexander S.",
                "orcid": "0000-0002-2226-0423",
                "clpid": "Kechris-A-S"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Mead",
                "given_name": "Carver",
                "orcid": "0000-0003-4051-0462",
                "clpid": "Mead-C-A"
            }
        ],
        "local_group": [
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>A methodology for specifying concurrent systems is presented. A model of computation for concurrent systems is presented first. The syntax and semantics of the language CRYSTAL are introduced. The specification of a system is called a space-time algorithm since space and time are explicit parameters in the description. Fixed-point semantics is used for abstracting the behavior of a system from its implementation. The consistency between an implementation and its description can therefore be ensured using this method. Formal semantics for an arbitrary transistor network is given. An \"interpreter\" for space-time algorithms -- a hierarchical simulator -- for VLSI systems is presented. The framework can be viewed as a concurrent programming notation when describing communicating processes and as a hardware description notation when specifying integrated circuits.</p>",
        "doi": "10.7907/bfpj-t811",
        "publication_date": "1983",
        "thesis_type": "phd",
        "thesis_year": "1983"
    },
    {
        "id": "thesis:3534",
        "collection": "thesis",
        "collection_id": "3534",
        "cite_using_url": "https://resolver.caltech.edu/CaltechETD:etd-09142006-085516",
        "type": "thesis",
        "title": "The Extension of Object-Oriented Languages to a Homogeneous, Concurrent Architecture",
        "author": [
            {
                "family_name": "Lang",
                "given_name": "Charles Richard, Jr.",
                "clpid": "Lang-Charles-Richard"
            }
        ],
        "thesis_advisor": [
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            }
        ],
        "thesis_committee": [
            {
                "family_name": "Bryant",
                "given_name": "Randy",
                "clpid": "Bryant-R"
            },
            {
                "family_name": "Fox",
                "given_name": "Geoffrey C.",
                "clpid": "Fox-G-C"
            },
            {
                "family_name": "Johnsson",
                "given_name": "S. Lennart",
                "clpid": "Johnsson-S-Lennart"
            },
            {
                "family_name": "Kajiya",
                "given_name": "James Thomas",
                "clpid": "Kajiya-J-T"
            },
            {
                "family_name": "Martin",
                "given_name": "Alain J.",
                "clpid": "Martin-A-J"
            },
            {
                "family_name": "Seitz",
                "given_name": "Charles L.",
                "clpid": "Seitz-C-L"
            }
        ],
        "local_group": [
            {
                "literal": "Computer Science Technical Reports"
            },
            {
                "literal": "div_eng"
            }
        ],
        "abstract": "<p>A homogeneous machine architecture, consisting of a regular interconnection of many identical elements, exploits the economic benefits of VLSI technology. A concurrent programming model is presented that is related to object-oriented languages such as Simula and Smalltalk. Techniques are developed which permit the execution of general-purpose object-oriented programs on a homogeneous machine. Both the hardware architecture and the supporting software algorithms are demonstrated to scale their performance with the size of the system.</p>\r\n\r\n<p>The program objects communicate by passing messages. Objects may move about in the system and may have an arbitrary pointer topology. A distributed, on-the-fly garbage collection algorithm is presented which operates by message passing. Simulation of the algorithm demonstrates its ability to collect obsolete objects over the entire machine with acceptable overhead costs. Algorithms for maintaining the locality of object references and for implementing a virtual object capability are also presented.</p>\r\n\r\n<p>To ensure the absence of hardware bottlenecks, a number of interconnection strategies are discussed and simulated for use in a homogeneous machine. Of those considered, the Boolean N-cube connection is demonstrated to provide the necessary characteristics.</p>\r\n\r\n<p>The object-oriented machine will provide increased performance as its size is increased. It can execute a general-purpose, concurrent, object-oriented language where the size of the machine and its interconnection topology are transparent to the programmer.</p>",
        "doi": "10.7907/9EVC-2X08",
        "publication_date": "1982",
        "thesis_type": "phd",
        "thesis_year": "1982"
    }
]