Regularized System Identification: Learning Dynamic Models from Data
| Field | Value |
| --- | --- |
| Superior document | Communications and Control Engineering Series |
| Place / Publishing House | Cham: Springer International Publishing AG, 2022. ©2022 |
| Year of Publication | 2022 |
| Edition | 1st ed. |
| Language | English |
| Series | Communications and Control Engineering Series |
| Physical Description | 1 online resource (394 pages) |
Table of Contents:
- Intro
- Preface
- Acknowledgements
- Contents
- Abbreviations and Notation
- Notation
- Abbreviations
- 1 Bias
- 1.1 The Stein Effect
- 1.1.1 The James-Stein Estimator
- 1.1.2 Extensions of the James-Stein Estimator
- 1.2 Ridge Regression
- 1.3 Further Topics and Advanced Reading
- 1.4 Appendix: Proof of Theorem 1.1
- References
- 2 Classical System Identification
- 2.1 The State-of-the-Art Identification Setup
- 2.2 $\mathcal{M}$: Model Structures
- 2.2.1 Linear Time-Invariant Models
- 2.2.2 Nonlinear Models
- 2.3 $\mathcal{I}$: Identification Methods-Criteria
- 2.3.1 A Maximum Likelihood (ML) View
- 2.4 Asymptotic Properties of the Estimated Models
- 2.4.1 Bias and Variance
- 2.4.2 Properties of the PEM Estimate as $N \to \infty$
- 2.4.3 Trade-Off Between Bias and Variance
- 2.5 $\mathcal{X}$: Experiment Design
- 2.6 $\mathcal{V}$: Model Validation
- 2.6.1 Falsifying Models: Residual Analysis
- 2.6.2 Comparing Different Models
- 2.6.3 Cross-Validation
- References
- 3 Regularization of Linear Regression Models
- 3.1 Linear Regression
- 3.2 The Least Squares Method
- 3.2.1 Fundamentals of the Least Squares Method
- 3.2.2 Mean Squared Error and Model Order Selection
- 3.3 Ill-Conditioning
- 3.3.1 Ill-Conditioned Least Squares Problems
- 3.3.2 Ill-Conditioning in System Identification
- 3.4 Regularized Least Squares with Quadratic Penalties
- 3.4.1 Making an Ill-Conditioned LS Problem Well Conditioned
- 3.4.2 Equivalent Degrees of Freedom
- 3.5 Regularization Tuning for Quadratic Penalties
- 3.5.1 Mean Squared Error and Expected Validation Error
- 3.5.2 Efficient Sample Reuse
- 3.5.3 Expected In-Sample Validation Error
- 3.6 Regularized Least Squares with Other Types of Regularizers
- 3.6.1 $\ell_1$-Norm Regularization
- 3.6.2 Nuclear Norm Regularization
- 3.7 Further Topics and Advanced Reading
- 3.8 Appendix
- 3.8.1 Fundamentals of Linear Algebra
- 3.8.2 Proof of Lemma 3.1
- 3.8.3 Derivation of Predicted Residual Error Sum of Squares (PRESS)
- 3.8.4 Proof of Theorem 3.7
- 3.8.5 A Variant of the Expected In-Sample Validation Error and Its Unbiased Estimator
- References
- 4 Bayesian Interpretation of Regularization
- 4.1 Preliminaries
- 4.2 Incorporating Prior Knowledge via Bayesian Estimation
- 4.2.1 Multivariate Gaussian Variables
- 4.2.2 The Gaussian Case
- 4.2.3 The Linear Gaussian Model
- 4.2.4 Hierarchical Bayes: Hyperparameters
- 4.3 Bayesian Interpretation of the James-Stein Estimator
- 4.4 Full and Empirical Bayes Approaches
- 4.5 Improper Priors and the Bias Space
- 4.6 Maximum Entropy Priors
- 4.7 Model Approximation via Optimal Projection
- 4.8 Equivalent Degrees of Freedom
- 4.9 Bayesian Function Reconstruction
- 4.10 Markov Chain Monte Carlo Estimation
- 4.11 Model Selection Using Bayes Factors
- 4.12 Further Topics and Advanced Reading
- 4.13 Appendix
- 4.13.1 Proof of Theorem 4.1
- 4.13.2 Proof of Theorem 4.2
- 4.13.3 Proof of Lemma 4.1
- 4.13.4 Proof of Theorem 4.3
- 4.13.5 Proof of Theorem 4.6
- 4.13.6 Proof of Proposition 4.3
- 4.13.7 Proof of Theorem 4.8
- References
- 5 Regularization for Linear System Identification
- 5.1 Preliminaries
- 5.2 MSE and Regularization
- 5.3 Optimal Regularization for FIR Models
- 5.4 Bayesian Formulation and BIBO Stability
- 5.5 Smoothness and Contractivity: Time- and Frequency-Domain Interpretations
- 5.5.1 Maximum Entropy Priors for Smoothness and Stability: From Splines to Dynamical Systems
- 5.6 Regularization and Basis Expansion
- 5.7 Hankel Nuclear Norm Regularization
- 5.8 Historical Overview
- 5.8.1 The Distributed Lag Estimator: Prior Means and Smoothing
- 5.8.2 Frequency-Domain Smoothing and Stability
- 5.8.3 Exponential Stability and Stochastic Embedding
- 5.9 Further Topics and Advanced Reading
- 5.10 Appendix
- 5.10.1 Optimal Kernel
- 5.10.2 Proof of Lemma 5.1
- 5.10.3 Proof of Theorem 5.5
- 5.10.4 Proof of Corollary 5.1
- 5.10.5 Proof of Lemma 5.2
- 5.10.6 Proof of Theorem 5.6
- 5.10.7 Proof of Lemma 5.5
- 5.10.8 Forward Representations of Stable-Splines Kernels
- References
- 6 Regularization in Reproducing Kernel Hilbert Spaces
- 6.1 Preliminaries
- 6.2 Reproducing Kernel Hilbert Spaces
- 6.2.1 Reproducing Kernel Hilbert Spaces Induced by Operations on Kernels
- 6.3 Spectral Representations of Reproducing Kernel Hilbert Spaces
- 6.3.1 More General Spectral Representation
- 6.4 Kernel-Based Regularized Estimation
- 6.4.1 Regularization in Reproducing Kernel Hilbert Spaces and the Representer Theorem
- 6.4.2 Representer Theorem Using Linear and Bounded Functionals
- 6.5 Regularization Networks and Support Vector Machines
- 6.5.1 Regularization Networks
- 6.5.2 Robust Regression via Huber Loss
- 6.5.3 Support Vector Regression
- 6.5.4 Support Vector Classification
- 6.6 Kernel Examples
- 6.6.1 Linear Kernels, Regularized Linear Regression and System Identification
- 6.6.2 Kernels Given by a Finite Number of Basis Functions
- 6.6.3 Feature Map and Feature Space
- 6.6.4 Polynomial Kernels
- 6.6.5 Translation Invariant and Radial Basis Kernels
- 6.6.6 Spline Kernels
- 6.6.7 The Bias Space and the Spline Estimator
- 6.7 Asymptotic Properties
- 6.7.1 The Regression Function/Optimal Predictor
- 6.7.2 Regularization Networks: Statistical Consistency
- 6.7.3 Connection with Statistical Learning Theory
- 6.8 Further Topics and Advanced Reading
- 6.9 Appendix
- 6.9.1 Fundamentals of Functional Analysis
- 6.9.2 Proof of Theorem 6.1
- 6.9.3 Proof of Theorem 6.10
- 6.9.4 Proof of Theorem 6.13
- 6.9.5 Proofs of Theorems 6.15 and 6.16
- 6.9.6 Proof of Theorem 6.21
- References
- 7 Regularization in Reproducing Kernel Hilbert Spaces for Linear System Identification
- 7.1 Regularized Linear System Identification in Reproducing Kernel Hilbert Spaces
- 7.1.1 Discrete-Time Case
- 7.1.2 Continuous-Time Case
- 7.1.3 More General Use of the Representer Theorem for Linear System Identification
- 7.1.4 Connection with Bayesian Estimation of Gaussian Processes
- 7.1.5 A Numerical Example
- 7.2 Kernel Tuning
- 7.2.1 Marginal Likelihood Maximization
- 7.2.2 Stein's Unbiased Risk Estimator
- 7.2.3 Generalized Cross-Validation
- 7.3 Theory of Stable Reproducing Kernel Hilbert Spaces
- 7.3.1 Kernel Stability: Necessary and Sufficient Conditions
- 7.3.2 Inclusions of Reproducing Kernel Hilbert Spaces in More General Lebesgue Spaces
- 7.4 Further Insights into Stable Reproducing Kernel Hilbert Spaces
- 7.4.1 Inclusions Between Notable Kernel Classes
- 7.4.2 Spectral Decomposition of Stable Kernels
- 7.4.3 Mercer Representations of Stable Reproducing Kernel Hilbert Spaces and of Regularized Estimators
- 7.4.4 Necessary and Sufficient Stability Condition Using Kernel Eigenvectors and Eigenvalues
- 7.5 Minimax Properties of the Stable Spline Estimator
- 7.5.1 Data Generator and Minimax Optimality
- 7.5.2 Stable Spline Estimator
- 7.5.3 Bounds on the Estimation Error and Minimax Properties
- 7.6 Further Topics and Advanced Reading
- 7.7 Appendix
- 7.7.1 Derivation of the First-Order Stable Spline Norm
- 7.7.2 Proof of Proposition 7.1
- 7.7.3 Proof of Theorem 7.5
- 7.7.4 Proof of Theorem 7.7
- 7.7.5 Proof of Theorem 7.9
- References
- 8 Regularization for Nonlinear System Identification
- 8.1 Nonlinear System Identification
- 8.2 Kernel-Based Nonlinear System Identification
- 8.2.1 Connection with Bayesian Estimation of Gaussian Random Fields
- 8.2.2 Kernel Tuning
- 8.3 Kernels for Nonlinear System Identification
- 8.3.1 A Numerical Example
- 8.3.2 Limitations of the Gaussian and Polynomial Kernel
- 8.3.3 Nonlinear Stable Spline Kernel
- 8.3.4 Numerical Example Revisited: Use of the Nonlinear Stable Spline Kernel
- 8.4 Explicit Regularization of Volterra Models
- 8.5 Other Examples of Regularization in Nonlinear System Identification
- 8.5.1 Neural Networks and Deep Learning Models
- 8.5.2 Static Nonlinearities and Gaussian Process (GP)
- 8.5.3 Block-Oriented Models
- 8.5.4 Hybrid Models
- 8.5.5 Sparsity and Variable Selection
- References
- 9 Numerical Experiments and Real World Cases
- 9.1 Identification of Discrete-Time Output Error Models
- 9.1.1 Monte Carlo Studies with a Fixed Output Error Model
- 9.1.2 Monte Carlo Studies with Different Output Error Models
- 9.1.3 Real Data: A Robot Arm
- 9.1.4 Real Data: A Hairdryer
- 9.2 Identification of ARMAX Models
- 9.2.1 Monte Carlo Experiment
- 9.2.2 Real Data: Temperature Prediction
- 9.3 Multi-task Learning and Population Approaches
- 9.3.1 Kernel-Based Multi-task Learning
- 9.3.2 Numerical Example: Real Pharmacokinetic Data
- References
- Appendix
- Index