Article records
https://feeds.library.caltech.edu/people/Wang-Zhiying/article.rss
A Caltech Library Repository Feed
Sat, 13 Apr 2024 00:25:36 +0000
Zigzag Codes: MDS Array Codes With Optimal Rebuilding
https://resolver.caltech.edu/CaltechAUTHORS:20130321-102330661
Authors: Itzhak Tamo (Tamo-I); Zhiying Wang (Wang-Zhiying); Jehoshua Bruck (Bruck-J, ORCID 0000-0001-8474-0812)
Year: 2013
DOI: 10.1109/TIT.2012.2227110
Maximum distance separable (MDS) array codes are widely used in storage systems to protect data against erasures. We address the rebuilding ratio problem, namely, in the case of erasures, what fraction of the remaining information needs to be accessed in order to rebuild exactly the lost information? It is clear that when the number of erasures equals the maximum number of erasures that an MDS code can correct, the rebuilding ratio is 1 (access all the remaining information). However, the interesting and more practical case is when the number of erasures is smaller than the erasure-correcting capability of the code. For example, consider an MDS code that can correct two erasures: what is the smallest amount of information that one needs to access in order to correct a single erasure? Previous work showed that the rebuilding ratio is bounded between 1/2 and 3/4; however, the exact value was left as an open problem. In this paper, we solve this open problem and prove that for the case of a single erasure with a two-erasure correcting code, the rebuilding ratio is 1/2. In general, we construct a new family of r-erasure correcting MDS array codes that has the optimal rebuilding ratio of 1/r in the case of a single erasure. Our array codes have efficient encoding and decoding algorithms (for the cases r=2 and r=3, they use finite fields of size 3 and 4, respectively) and an optimal update property.
https://authors.library.caltech.edu/records/c80wj-h6a11
Access Versus Bandwidth in Codes for Storage
https://resolver.caltech.edu/CaltechAUTHORS:20140425-151109319
Authors: Itzhak Tamo (Tamo-I); Zhiying Wang (Wang-Zhiying); Jehoshua Bruck (Bruck-J, ORCID 0000-0001-8474-0812)
Year: 2014
DOI: 10.1109/TIT.2014.2305698
Maximum distance separable (MDS) codes are widely used in storage systems to protect against disk (node) failures. A node is said to have capacity l over some field F if it can store l symbols of the field. An (n, k, l) MDS code uses n nodes of capacity l to store k information nodes. The MDS property guarantees resiliency to any n - k node failures. An optimal bandwidth (respectively, optimal access) MDS code communicates (respectively, accesses) the minimum amount of data during the repair process of a single failed node. It was shown that this amount equals a fraction of 1/(n - k) of the data stored in each node. In previous optimal bandwidth constructions, l scaled polynomially with k in codes whose asymptotic rate is less than 1. Moreover, in constructions with a constant number of parities, i.e., when the rate approaches 1, l scaled exponentially with k. In this paper, we focus on the case of linear codes with linear repair operations and a constant number of parities n - k = r, and ask the following question: given the capacity l of a node, what is the largest number of information disks k in an optimal bandwidth (respectively, access) (k + r, k, l) MDS code? We give an upper bound for the general case, and two tight bounds in the special cases of two important families of codes. The first is a family of codes with the optimal update property, and the second is a family with the optimal access property. Moreover, the bounds show that in some cases optimal-bandwidth codes have larger k than optimal-access codes, and therefore these two measures are not equivalent.
https://authors.library.caltech.edu/records/677nb-sgx08
Explicit Minimum Storage Regenerating Codes
https://resolver.caltech.edu/CaltechAUTHORS:20160930-131654865
Authors: Zhiying Wang (Wang-Zhiying); Itzhak Tamo (Tamo-Itzhak); Jehoshua Bruck (Bruck-J, ORCID 0000-0001-8474-0812)
Year: 2016
DOI: 10.1109/TIT.2016.2553675
In distributed storage, a file is stored in a set of nodes and protected by erasure-correcting codes. A regenerating code is a type of code with two properties: first, it can reconstruct the entire file in the presence of any r node erasures for some specified integer r; second, it can efficiently repair an erased node from any subset of remaining nodes with a given size. In the repair process, the amount of information transmitted from each node, normalized by the storage size per node, is termed the repair bandwidth (fraction). When the storage size per node is minimized, the repair bandwidth is lower bounded by 1/r, where r is the number of parity nodes. A code attaining this lower bound is said to have optimal repair. We consider codes with minimum storage size per node and optimal repair, called minimum storage regenerating (MSR) codes. In particular, if an MSR code has r parities and any r erasures occur, then by transmitting all the information from the remaining nodes, the original file can be reconstructed. On the other hand, if only one erasure occurs, only a fraction of 1/r of the information in each remaining node needs to be transmitted. If we view each node as a vector or a column over some field, then the code forms a 2-D array. Given the length of the column l and the number of parities r, we explicitly construct high-rate MSR codes. The number of systematic nodes of our construction is (r + 1) log_r l, which gives longer codes than previously known results. In addition, we construct MSR codes with other desirable properties: first, codes with low complexity when the information is updated, and second, codes with low access or storage-node I/O cost during repair.
https://authors.library.caltech.edu/records/17jmt-mh471
Optimal Rebuilding of Multiple Erasures in MDS Codes
https://resolver.caltech.edu/CaltechAUTHORS:20170119-080421044
Authors: Zhiying Wang (Wang-Zhiying); Itzhak Tamo (Tamo-Itzhak); Jehoshua Bruck (Bruck-J, ORCID 0000-0001-8474-0812)
Year: 2017
DOI: 10.1109/TIT.2016.2633411
Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. However, in practice, e erasures are a more likely failure event, for some 1≤e
https://authors.library.caltech.edu/records/bk2bc-4f252
Switch Codes: Codes for Fully Parallel Reconstruction
https://resolver.caltech.edu/CaltechAUTHORS:20170315-151626957
Authors: Zhiying Wang (Wang-Zhiying); Han Mao Kiah (Kiah-Han-Mao); Yuval Cassuto (Cassuto-Y, ORCID 0000-0001-6369-6699); Jehoshua Bruck (Bruck-J, ORCID 0000-0001-8474-0812)
Year: 2017
DOI: 10.1109/TIT.2017.2664867
Network switches and routers scale in rate by distributing the packet read/write operations across multiple memory banks. Rate scaling is achieved so long as sufficiently many packets can be written and read in parallel. However, due to the non-determinism of the read process, parallel pending read requests may contend for memory banks, and thus significantly lower the switching rate. In this paper, we provide a constructive study of codes that guarantee fully parallel data reconstruction without contention. We call these codes "switch codes," and construct three optimal switch-code families with different parameters. All the constructions use only simple XOR-based encoding and decoding operations, an important advantage when operating at ultra-high speeds. Switch codes achieve their good performance by spanning simultaneous disjoint local-decoding sets for all their information symbols. Switch codes may be regarded as an extreme version of the previously studied batch codes, where the switch version requires parallel reconstruction of all the information symbols.
https://authors.library.caltech.edu/records/323t5-36794
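The idea of disjoint local-decoding sets in the switch-codes abstract above can be illustrated with a toy sketch. This is not one of the paper's constructions: the three-bank layout, the names `encode`, `serve`, and `RECOVERY_SETS`, and the greedy request scheduler are all assumptions made for illustration. Two symbols a and b are stored in three banks as a, b, and a XOR b, so each symbol has two recovery sets that share no bank, and two simultaneous reads can be served with every bank used at most once.

```python
from functools import reduce


def xor_bytes(*blocks: bytes) -> bytes:
    """Bytewise XOR of one or more equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def encode(a: bytes, b: bytes) -> dict:
    """Hypothetical layout: two data banks plus one XOR parity bank."""
    return {"bank0": a, "bank1": b, "bank2": xor_bytes(a, b)}


# Each symbol has two recovery sets that do not share a bank (assumed layout).
RECOVERY_SETS = {
    "a": [("bank0",), ("bank1", "bank2")],
    "b": [("bank1",), ("bank0", "bank2")],
}


def serve(banks: dict, requests: list) -> list:
    """Serve all requests in one round, reading each bank at most once.

    Greedy choice of recovery sets; raises if two requests would contend.
    """
    used = set()
    results = []
    for symbol in requests:
        for recovery_set in RECOVERY_SETS[symbol]:
            if used.isdisjoint(recovery_set):
                used.update(recovery_set)
                results.append(xor_bytes(*(banks[name] for name in recovery_set)))
                break
        else:
            raise RuntimeError("bank contention: request cannot be served in parallel")
    return results


if __name__ == "__main__":
    a, b = b"\x01\x02", b"\x0f\xf0"
    banks = encode(a, b)
    # Two simultaneous reads of the same symbol use disjoint banks.
    print(serve(banks, ["a", "a"]) == [a, a])
```

In this sketch the second read of a is decoded as b XOR (a XOR b), so it never touches the bank already busy with the first read. A third simultaneous request would raise, since only three banks exist.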