The Informational Complexity of Learning

Delivery time: available within 14 days

Price: 106.99

Perspectives on Neural Networks and Generative Grammar

ISBN-10: 0792380819
ISBN-13: 9780792380818
Author: Niyogi, Partha
Publisher: Springer Verlag GmbH
Extent: xxiii, 224 pages
Publication date: 30.11.1997
Product form: Hardback
Binding: hardcover (GEB)

This work brings together two very different learning problems within a single analytical framework. The first concerns learning functional mappings using neural networks; the second, learning natural language grammars in the principles-and-parameters tradition of Chomsky.


Description

Contents: List of Figures. Foreword. Preface. 1. Introduction. 2. Generalization Error for Neural Nets. 3. Active Learning. 4. Language Learning. 5. Language Change. 6. Conclusions. References.

Table of Contents

1. Introduction.- 1.1 The Components of a Learning Paradigm.- 1.1.1 Concepts, Hypotheses, and Learners.- 1.1.2 Generalization, Learnability, Successful Learning.- 1.1.3 Informational Complexity.- 1.2 Parametric Hypothesis Spaces.- 1.3 Technical Contents and Major Contributions.- 1.3.1 A Final Word.
2. Generalization Error for Neural Nets.- 2.1 Introduction.- 2.2 Definitions and Statement of the Problem.- 2.2.1 Random Variables and Probability Distributions.- 2.2.2 Learning from Examples and Estimators.- 2.2.3 The Expected Risk and the Regression Function.- 2.2.4 The Empirical Risk.- 2.2.5 The Problem.- 2.2.6 Bounding the Generalization Error.- 2.2.7 A Note on Models and Model Complexity.- 2.3 Stating the Problem for Radial Basis Functions.- 2.4 Main Result.- 2.5 Remarks.- 2.5.1 Observations on the Main Result.- 2.5.2 Extensions.- 2.5.3 Connections with Other Results.- 2.6 Implications of the Theorem in Practice: Putting in the Numbers.- 2.6.1 Rate of Growth of n for Guaranteed Convergence.- 2.6.2 Optimal Choice of n.- 2.6.3 Experiments.- 2.7 Conclusion.- 2-A Notations.- 2-B A Useful Decomposition of the Expected Risk.- 2-C A Useful Inequality.- 2-D Proof of the Main Theorem.- 2-D.1 Bounding the Approximation Error.- 2-D.2 Bounding the Estimation Error.- 2-D.3 Bounding the Generalization Error.
3. Active Learning.- 3.1 A General Framework for Active Approximation.- 3.1.1 Preliminaries.- 3.1.2 The Problem of Collecting Examples.- 3.1.3 In Context.- 3.2 Example 1: A Class of Monotonically Increasing Bounded Functions.- 3.2.1 Lower Bound for Passive Learning.- 3.2.2 Active Learning Algorithms.- 3.2.2.1 Derivation of an Optimal Sampling Strategy.- 3.2.3 Empirical Simulations, and Other Investigations.- 3.2.3.1 Distribution of Points Selected.- 3.2.3.2 Classical Optimal Recovery.- 3.2.3.3 Error Rates and Sample Complexities for Some Arbitrary Functions: Some Simulations.- 3.3 Example 2: A Class of Functions with Bounded First Derivative.- 3.3.1 Lower Bounds.- 3.3.2 Active Learning Algorithms.- 3.3.2.1 Derivation of an Optimal Sampling Strategy.- 3.3.3 Some Simulations.- 3.3.3.1 Distribution of Points Selected.- 3.3.3.2 Error Rates.- 3.4 Conclusions, Extensions, and Open Problems.- 3.5 A Simple Example.- 3.6 Generalizations.- 3.6.1 Localized Function Classes.- 3.6.2 The General ε-focusing Strategy.- 3.6.3 Generalizations and Open Problems.
4. Language Learning.- 4.1 Language Learning and the Poverty of Stimulus.- 4.2 Constrained Grammars: Principles and Parameters.- 4.2.1 Example: A 3-parameter System from Syntax.- 4.2.2 Example: Parameterized Metrical Stress in Phonology.- 4.3 Learning in the Principles and Parameters Framework.- 4.4 Formal Analysis of the Triggering Learning Algorithm.- 4.4.1 Background.- 4.4.2 The Markov Formulation.- 4.4.2.1 Parameterized Grammars and Their Corresponding Markov Chains.- 4.4.2.2 Markov Chain Criteria for Learnability.- 4.4.2.3 The Markov Chain for the 3-parameter Example.- 4.4.3 Derivation of the Transition Probabilities for the Markov TLA Structure.- 4.4.3.1 Formalization.- 4.4.3.2 Additional Properties of the Learning System.- 4.5 Characterizing Convergence Times for the Markov Chain Model.- 4.5.1 Some Transition Matrices and Their Convergence Curves.- 4.5.2 Absorption Times.- 4.5.3 Eigenvalue Rates of Convergence.- 4.5.3.1 Eigenvalues and Eigenvectors.- 4.5.3.2 Representation of T^k.- 4.5.3.3 Initial Conditions and Limiting Distributions.- 4.5.3.4 Rate of Convergence.- 4.5.3.5 Transition Matrix Recipes.- 4.6 Exploring Other Points.- 4.6.1 Changing the Algorithm.- 4.6.2 Distributional Assumptions.- 4.6.3 Natural Distributions: CHILDES Corpus.- 4.7 Batch Learning Upper and Lower Bounds: An Aside.- 4.8 Conclusions, Open Questions, and Future Directions.- 4-A Unembedded Sentences for Parametric Grammars.- 4-B Memoryless Algorithms and Markov Chains.- 4-C Proof of Learnability Theorem.- 4-C.1 Markov State Terminology.- 4-C.2 Canonical Decomposition.- 4-D Formal Proof.
5. Language Change.- 5.1 I…
