Book Section records
https://feeds.library.caltech.edu/people/Abu-Mostafa-Y-S/book_section.rss
A Caltech Library Repository Feed
Tue, 16 Apr 2024 13:15:22 +0000

Connectivity Versus Entropy
https://resolver.caltech.edu/CaltechAUTHORS:20160107-155110636
Authors: Yaser S. Abu-Mostafa
Year: 1988
How does the connectivity of a neural network (number of synapses per
neuron) relate to the complexity of the problems it can handle (measured by
the entropy)? Switching theory would suggest no relation at all, since all Boolean
functions can be implemented using a circuit with very low connectivity (e.g.,
using two-input NAND gates). However, for a network that learns a problem
from examples using a local learning rule, we prove that the entropy of the
problem becomes a lower bound for the connectivity of the network.
https://authors.library.caltech.edu/records/ca7c6-x5f57

On the K-Winners-Take-All Network
https://resolver.caltech.edu/CaltechAUTHORS:20160107-160213913
Authors: E. Majani, R. Erlanson, Y. Abu-Mostafa
Year: 1989
We present and rigorously analyze a generalization of the Winner-Take-All Network: the K-Winners-Take-All Network. This network
identifies the K largest of a set of N real numbers. The
network model used is the continuous Hopfield model.
https://authors.library.caltech.edu/records/6wj5t-rrn63

A Method for the Associative Storage of Analog Vectors
https://resolver.caltech.edu/CaltechAUTHORS:20160107-161548455
Authors: Amir Atiya, Yaser Abu-Mostafa
Year: 1990
A method for storing analog vectors in Hopfield's continuous feedback
model is proposed. By analog vectors we mean vectors whose
components are real-valued. The vectors to be stored are set as
equilibria of the network. The network model consists of one layer
of visible neurons and one layer of hidden neurons. We propose
a learning algorithm that adjusts the positions of the
equilibria while guaranteeing their stability. Simulation
results confirm the effectiveness of the method.
https://authors.library.caltech.edu/records/htz96-3nv21

Analog Neural Networks as Decoders
https://resolver.caltech.edu/CaltechAUTHORS:20160119-162724779
Authors: Ruth Erlanson, Yaser Abu-Mostafa
Year: 1991
Analog neural networks with feedback can be used to implement K-Winners-Take-All (KWTA) networks. In turn, KWTA networks can be
used as decoders of a class of nonlinear error-correcting codes. By interconnecting
such KWTA networks, we can construct decoders capable
of decoding more powerful codes. We consider several families of interconnected
KWTA networks, analyze their performance in terms of coding
theory metrics, and consider the feasibility of embedding such networks in
VLSI technologies.
https://authors.library.caltech.edu/records/fzqcs-ntm16

A Method for Learning from Hints
https://resolver.caltech.edu/CaltechAUTHORS:20160128-163222557
Authors: Yaser S. Abu-Mostafa
Year: 1993
We address the problem of learning an unknown function by putting together several pieces of information (hints) that we know about the function. We introduce a method that generalizes learning from examples to learning from hints. A canonical representation of hints is defined and illustrated for new types of hints. All the hints are represented to the learning process by examples, and examples of the function are treated on equal footing with the rest of the hints. During learning, examples from different hints are selected for processing according to a given schedule. We present two types of schedules: fixed schedules that specify the relative emphasis of each hint, and adaptive schedules that are based on how well each hint has been learned so far. Our learning method is compatible with any descent technique we may choose to use.
https://authors.library.caltech.edu/records/mwmvb-va834

An algorithm for learning from hints
https://resolver.caltech.edu/CaltechAUTHORS:ABUijcnn93
Authors: Y. S. Abu-Mostafa
Year: 1993
DOI: 10.1109/IJCNN.1993.716969
To take advantage of prior knowledge (hints) about the function one wants to learn, we introduce a method that generalizes learning from examples to learning from hints. A canonical representation of hints is defined and illustrated. All hints are represented to the learning process by examples, and examples of the function are treated on equal footing with the rest of the hints. During learning, examples from different hints are selected for processing according to a given schedule. We present two types of schedules: fixed schedules that specify the relative emphasis of each hint, and adaptive schedules that are based on how well each hint has been learned so far. Our learning method is compatible with any descent technique.
https://authors.library.caltech.edu/records/c6enm-z1321

Financial Applications of Learning from Hints
https://resolver.caltech.edu/CaltechAUTHORS:20150305-151907939
Authors: Yaser S. Abu-Mostafa
Year: 1995
The basic paradigm for learning in neural networks is 'learning from examples' where a training set of input-output examples is used to teach the network the target function. Learning from hints is a generalization
of learning from examples where additional information
about the target function can be incorporated in the same learning process. Such information can come from common sense rules or special expertise. In financial market applications where the training data is very noisy, the use of such hints can have a decisive advantage. We demonstrate the use of hints in foreign-exchange trading of the U.S. Dollar versus the British Pound, the German
Mark, the Japanese Yen, and the Swiss Franc, over a period of 32 months. We explain the general method of learning from hints and how it can be applied to other markets. The learning model for this method is not restricted to neural networks.
https://authors.library.caltech.edu/records/x02vv-1w353

Monotonicity: Theory and Implementation
https://resolver.caltech.edu/CaltechAUTHORS:20190710-141334973
Authors: Joseph Sill, Yaser Abu-Mostafa
Year: 1997
DOI: 10.1007/978-1-4612-2018-3_6
We present a systematic method for incorporating prior knowledge (hints) into the learning-from-examples paradigm. The hints are represented in a canonical form that is compatible with descent techniques for learning. We focus in particular on the monotonicity hint, which states that the function to be learned is monotonic in some or all of the input variables. The application of monotonicity hints is demonstrated on two real-world problems: a credit card application task and a problem in medical diagnosis. We report experimental results which show that using monotonicity hints leads to a statistically significant improvement in performance on both problems. Monotonicity is also analyzed from a theoretical perspective. We consider the class M of monotonically increasing binary output functions. Necessary and sufficient conditions for monotonic separability of a dichotomy are proven. The capacity of M is shown to depend heavily on the input distribution.
https://authors.library.caltech.edu/records/5517z-zb676

Monotonicity Hints
https://resolver.caltech.edu/CaltechAUTHORS:20160223-161511946
Authors: Joseph Sill, Yaser S. Abu-Mostafa
Year: 1997
A hint is any piece of side information about the target function to be learned. We consider the monotonicity hint, which states that the function to be learned is monotonic in some or all of the input variables. The application of monotonicity hints is demonstrated on two real-world problems: a credit card application task and a problem in medical diagnosis. A measure of the monotonicity error
of a candidate function is defined and an objective function for the enforcement of monotonicity is derived from Bayesian principles. We report experimental results which show that using monotonicity hints leads to a statistically significant improvement in performance
on both problems.
https://authors.library.caltech.edu/records/g83x3-ng568

Incorporating Contextual Information in White Blood Cell Identification
https://resolver.caltech.edu/CaltechAUTHORS:20160224-143921726
Authors: Xubo Song, Yaser Abu-Mostafa, Joseph Sill, Harvey Kasdan
Year: 1998
In this paper we propose a technique for incorporating contextual information into object classification. In the real world there are cases where the identity of an object is ambiguous due to noise in the measurements on which the classification is based. The ambiguity can be reduced by utilizing extra information, referred to as context, which in our case is the identities of the accompanying objects. This technique is applied to white blood cell classification. Comparisons against a "no context" approach demonstrate the superior classification performance achieved by using context. In our particular application, it significantly reduces the false alarm rate and thus greatly reduces the cost of expensive clinical tests.
https://authors.library.caltech.edu/records/jkktk-ekg96

Image Recognition in Context: Application to Microscopic Urinalysis
https://resolver.caltech.edu/CaltechAUTHORS:20160229-163056107
Authors: Xubo Song, Joseph Sill, Yaser Abu-Mostafa, Harvey Kasdan
Year: 2000
We propose a new and efficient technique for incorporating contextual information into object classification. Most current techniques face the problem of exponential computation cost. In this paper, we propose a new general framework that incorporates partial context at a linear cost. This technique is applied to microscopic urinalysis image recognition, resulting in a significant improvement of the recognition rate over the context-free approach. This gain would have been impossible using conventional context incorporation techniques.
https://authors.library.caltech.edu/records/s3nvm-sh902

The Multilevel Classification Problem and a Monotonicity Hint
https://resolver.caltech.edu/CaltechAUTHORS:20190702-152246968
Authors: Malik Magdon-Ismail, Hung-Ching Chen, Yaser S. Abu-Mostafa
Year: 2002
DOI: 10.1007/3-540-45675-9_61
We introduce and formalize the multilevel classification problem, in which each category can be subdivided into different levels. We analyze the framework in a Bayesian setting using Normal class conditional densities. Within this framework, a natural monotonicity hint converts the problem into a nonlinear programming task with nonlinear constraints. We present Monte Carlo and gradient-based techniques for addressing this task, and show the results of simulations. Incorporation of monotonicity yields a systematic improvement in performance.
https://authors.library.caltech.edu/records/v3q22-et309

Emergent Specialization in Swarm Systems
https://resolver.caltech.edu/CaltechAUTHORS:20190702-150156265
Authors: Ling Li, Alcherio Martinoli, Yaser S. Abu-Mostafa
Year: 2002
DOI: 10.1007/3-540-45675-9_43
Distributed learning is the learning process of multiple autonomous agents in a varying environment, where each agent has only partial information about the global task. In this paper, we investigate the influence of different reinforcement signals (local and global) and team diversity (homogeneous and heterogeneous agents) on the learned solutions. We compare the learned solutions with those obtained by systematic search in a simple case study in which pairs of agents have to collaborate in order to solve the task without any explicit communication. The results show that policies which allow teammates to specialize find an adequate diversity of the team and, in general, achieve similar or better performances than policies which force homogeneity. However, in this specific case study, the achieved team performances appear to be independent of the locality or globality of the reinforcement signal.
https://authors.library.caltech.edu/records/mqcjd-3wx35

The maximum drawdown of the Brownian motion
https://resolver.caltech.edu/CaltechAUTHORS:MAGcife03
Authors: Malik Magdon-Ismail, Amir Atiya, Amrit Pratap, Yaser Abu-Mostafa
Year: 2003
DOI: 10.1109/CIFER.2003.1196267
The maximum drawdown (MDD) is defined as the maximum loss incurred from peak to bottom during a specified period of time. It is often preferred over other risk measures because of the tight relationship between large drawdowns and fund redemptions. A large drawdown can also indicate the start of a deterioration of an otherwise successful trading system, for example due to a market regime switch. Overall, the MDD is a very important risk measure. To use it more insightfully, its analytical properties must be understood. As a step in this direction, we present in this article some analytic results that we have developed. We hope further results will come from the research community analyzing this important measure.
https://authors.library.caltech.edu/records/2aprk-1ee98

Improving Generalization by Data Categorization
https://resolver.caltech.edu/CaltechAUTHORS:20190702-142717858
Authors: Ling Li, Amrit Pratap, Hsuan-Tien Lin, Yaser S. Abu-Mostafa
Year: 2005
DOI: 10.1007/11564126_19
In most learning algorithms, examples in the training set are treated equally. Some examples, however, carry more reliable or critical information about the target than others, and some may carry wrong information. According to their intrinsic margin, examples can be grouped into three categories: typical, critical, and noisy. We propose three methods, namely the selection cost, SVM confidence margin, and AdaBoost data weight, to automatically group training examples into these three categories. Experimental results on artificial datasets show that, although the three methods are quite different in nature, they give similar and reasonable categorizations. Results on real-world datasets further demonstrate that treating the three data categories differently in learning can improve generalization.
https://authors.library.caltech.edu/records/4q0t9-sqp86

Pruning training sets for learning of object categories
https://resolver.caltech.edu/CaltechAUTHORS:ANGcvpr05
Authors: Anelia Angelova, Yaser S. Abu-Mostafa, Pietro Perona (ORCID: 0000-0002-7583-5809)
Year: 2005
DOI: 10.1109/CVPR.2005.283
Training datasets for learning of object categories are often contaminated or imperfect. We explore an approach that automatically identifies examples that are noisy or troublesome for learning and excludes them from the training set. The problem is relevant to learning in a semi-supervised or unsupervised setting, as well as to learning when the training data is contaminated with wrongly labeled examples or when correctly labeled but hard-to-learn examples are present. We propose a fully automatic mechanism for noise cleaning, called 'data pruning', and demonstrate its success on learning of human faces. We do not assume that the data or the noise can be modeled or that additional training examples are available. Our experiments show that data pruning can improve generalization performance for algorithms with varying robustness to noise. It outperforms methods with regularization properties and is superior to commonly applied aggregation methods such as bagging.
https://authors.library.caltech.edu/records/5feec-dr959
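The core idea in the data-pruning record above, automatically excluding suspect examples before training, can be illustrated with a small stand-in. This is not the paper's algorithm (the paper explicitly assumes no model of the data or noise); it is a hypothetical minimal sketch in which an example is dropped when its label disagrees with the majority of its k nearest neighbours. The function names and the toy data are invented for illustration.

```python
# Hypothetical sketch of data pruning: keep only the training examples
# whose label agrees with the majority label of their k nearest neighbours.
# A simple stand-in for the idea of filtering mislabelled examples -- not
# the mechanism of the cited paper.

def euclidean(a, b):
    # Euclidean distance between two points given as coordinate tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def prune(points, labels, k=3):
    """Return the indices of examples to keep: those whose label matches
    the majority label among their k nearest neighbours."""
    keep = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Distances from example i to every other example.
        others = sorted(
            (euclidean(p, q), labels[j])
            for j, q in enumerate(points) if j != i
        )
        neighbour_labels = [lab_j for _, lab_j in others[:k]]
        majority = max(set(neighbour_labels), key=neighbour_labels.count)
        if majority == lab:
            keep.append(i)
    return keep

# Two well-separated clusters with one deliberately mislabelled point each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),          # class-0 region
          (10, 10), (10, 11), (11, 10), (11, 11)]  # class-1 region
labels = [0, 0, 0, 1,   # (1, 1) is mislabelled
          1, 1, 1, 0]   # (11, 11) is mislabelled
print(prune(points, labels, k=3))  # -> [0, 1, 2, 4, 5, 6]
```

On this toy set, the two mislabelled points (indices 3 and 7) are exactly the ones removed; a learner trained on the pruned set no longer sees the wrong labels, which is the effect the record attributes to data pruning.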