Regularized System Identification: Learning Dynamic Models from Data.

Bibliographic Details
Superior document: Communications and Control Engineering Series
Contributors:
Place / Publishing House: Cham: Springer International Publishing AG, 2022.
©2022.
Year of Publication: 2022
Edition: 1st ed.
Language: English
Series: Communications and Control Engineering Series
Online Access:
Physical Description: 1 online resource (394 pages)
Table of Contents:
  • Intro
  • Preface
  • Acknowledgements
  • Contents
  • Abbreviations and Notation
  • Notation
  • Abbreviations
  • 1 Bias
  • 1.1 The Stein Effect
  • 1.1.1 The James-Stein Estimator
  • 1.1.2 Extensions of the James-Stein Estimator
  • 1.2 Ridge Regression
  • 1.3 Further Topics and Advanced Reading
  • 1.4 Appendix: Proof of Theorem 1.1
  • References
  • 2 Classical System Identification
  • 2.1 The State-of-the-Art Identification Setup
  • 2.2 ℳ: Model Structures
  • 2.2.1 Linear Time-Invariant Models
  • 2.2.2 Nonlinear Models
  • 2.3 ℐ: Identification Methods-Criteria
  • 2.3.1 A Maximum Likelihood (ML) View
  • 2.4 Asymptotic Properties of the Estimated Models
  • 2.4.1 Bias and Variance
  • 2.4.2 Properties of the PEM Estimate as N → ∞
  • 2.4.3 Trade-Off Between Bias and Variance
  • 2.5 𝒳: Experiment Design
  • 2.6 𝒱: Model Validation
  • 2.6.1 Falsifying Models: Residual Analysis
  • 2.6.2 Comparing Different Models
  • 2.6.3 Cross-Validation
  • References
  • 3 Regularization of Linear Regression Models
  • 3.1 Linear Regression
  • 3.2 The Least Squares Method
  • 3.2.1 Fundamentals of the Least Squares Method
  • 3.2.2 Mean Squared Error and Model Order Selection
  • 3.3 Ill-Conditioning
  • 3.3.1 Ill-Conditioned Least Squares Problems
  • 3.3.2 Ill-Conditioning in System Identification
  • 3.4 Regularized Least Squares with Quadratic Penalties
  • 3.4.1 Making an Ill-Conditioned LS Problem Well Conditioned
  • 3.4.2 Equivalent Degrees of Freedom
  • 3.5 Regularization Tuning for Quadratic Penalties
  • 3.5.1 Mean Squared Error and Expected Validation Error
  • 3.5.2 Efficient Sample Reuse
  • 3.5.3 Expected In-Sample Validation Error
  • 3.6 Regularized Least Squares with Other Types of Regularizers
  • 3.6.1 ℓ₁-Norm Regularization
  • 3.6.2 Nuclear Norm Regularization
  • 3.7 Further Topics and Advanced Reading
  • 3.8 Appendix
  • 3.8.1 Fundamentals of Linear Algebra
  • 3.8.2 Proof of Lemma 3.1
  • 3.8.3 Derivation of Predicted Residual Error Sum of Squares (PRESS)
  • 3.8.4 Proof of Theorem 3.7
  • 3.8.5 A Variant of the Expected In-Sample Validation Error and Its Unbiased Estimator
  • References
  • 4 Bayesian Interpretation of Regularization
  • 4.1 Preliminaries
  • 4.2 Incorporating Prior Knowledge via Bayesian Estimation
  • 4.2.1 Multivariate Gaussian Variables
  • 4.2.2 The Gaussian Case
  • 4.2.3 The Linear Gaussian Model
  • 4.2.4 Hierarchical Bayes: Hyperparameters
  • 4.3 Bayesian Interpretation of the James-Stein Estimator
  • 4.4 Full and Empirical Bayes Approaches
  • 4.5 Improper Priors and the Bias Space
  • 4.6 Maximum Entropy Priors
  • 4.7 Model Approximation via Optimal Projection
  • 4.8 Equivalent Degrees of Freedom
  • 4.9 Bayesian Function Reconstruction
  • 4.10 Markov Chain Monte Carlo Estimation
  • 4.11 Model Selection Using Bayes Factors
  • 4.12 Further Topics and Advanced Reading
  • 4.13 Appendix
  • 4.13.1 Proof of Theorem 4.1
  • 4.13.2 Proof of Theorem 4.2
  • 4.13.3 Proof of Lemma 4.1
  • 4.13.4 Proof of Theorem 4.3
  • 4.13.5 Proof of Theorem 4.6
  • 4.13.6 Proof of Proposition 4.3
  • 4.13.7 Proof of Theorem 4.8
  • References
  • 5 Regularization for Linear System Identification
  • 5.1 Preliminaries
  • 5.2 MSE and Regularization
  • 5.3 Optimal Regularization for FIR Models
  • 5.4 Bayesian Formulation and BIBO Stability
  • 5.5 Smoothness and Contractivity: Time- and Frequency-Domain Interpretations
  • 5.5.1 Maximum Entropy Priors for Smoothness and Stability: From Splines to Dynamical Systems
  • 5.6 Regularization and Basis Expansion
  • 5.7 Hankel Nuclear Norm Regularization
  • 5.8 Historical Overview
  • 5.8.1 The Distributed Lag Estimator: Prior Means and Smoothing
  • 5.8.2 Frequency-Domain Smoothing and Stability
  • 5.8.3 Exponential Stability and Stochastic Embedding
  • 5.9 Further Topics and Advanced Reading
  • 5.10 Appendix
  • 5.10.1 Optimal Kernel
  • 5.10.2 Proof of Lemma 5.1
  • 5.10.3 Proof of Theorem 5.5
  • 5.10.4 Proof of Corollary 5.1
  • 5.10.5 Proof of Lemma 5.2
  • 5.10.6 Proof of Theorem 5.6
  • 5.10.7 Proof of Lemma 5.5
  • 5.10.8 Forward Representations of Stable-Splines Kernels
  • References
  • 6 Regularization in Reproducing Kernel Hilbert Spaces
  • 6.1 Preliminaries
  • 6.2 Reproducing Kernel Hilbert Spaces
  • 6.2.1 Reproducing Kernel Hilbert Spaces Induced by Operations on Kernels
  • 6.3 Spectral Representations of Reproducing Kernel Hilbert Spaces
  • 6.3.1 More General Spectral Representation
  • 6.4 Kernel-Based Regularized Estimation
  • 6.4.1 Regularization in Reproducing Kernel Hilbert Spaces and the Representer Theorem
  • 6.4.2 Representer Theorem Using Linear and Bounded Functionals
  • 6.5 Regularization Networks and Support Vector Machines
  • 6.5.1 Regularization Networks
  • 6.5.2 Robust Regression via Huber Loss
  • 6.5.3 Support Vector Regression
  • 6.5.4 Support Vector Classification
  • 6.6 Kernel Examples
  • 6.6.1 Linear Kernels, Regularized Linear Regression and System Identification
  • 6.6.2 Kernels Given by a Finite Number of Basis Functions
  • 6.6.3 Feature Map and Feature Space
  • 6.6.4 Polynomial Kernels
  • 6.6.5 Translation Invariant and Radial Basis Kernels
  • 6.6.6 Spline Kernels
  • 6.6.7 The Bias Space and the Spline Estimator
  • 6.7 Asymptotic Properties
  • 6.7.1 The Regression Function/Optimal Predictor
  • 6.7.2 Regularization Networks: Statistical Consistency
  • 6.7.3 Connection with Statistical Learning Theory
  • 6.8 Further Topics and Advanced Reading
  • 6.9 Appendix
  • 6.9.1 Fundamentals of Functional Analysis
  • 6.9.2 Proof of Theorem 6.1
  • 6.9.3 Proof of Theorem 6.10
  • 6.9.4 Proof of Theorem 6.13
  • 6.9.5 Proofs of Theorems 6.15 and 6.16
  • 6.9.6 Proof of Theorem 6.21
  • References
  • 7 Regularization in Reproducing Kernel Hilbert Spaces for Linear System Identification
  • 7.1 Regularized Linear System Identification in Reproducing Kernel Hilbert Spaces
  • 7.1.1 Discrete-Time Case
  • 7.1.2 Continuous-Time Case
  • 7.1.3 More General Use of the Representer Theorem for Linear System Identification
  • 7.1.4 Connection with Bayesian Estimation of Gaussian Processes
  • 7.1.5 A Numerical Example
  • 7.2 Kernel Tuning
  • 7.2.1 Marginal Likelihood Maximization
  • 7.2.2 Stein's Unbiased Risk Estimator
  • 7.2.3 Generalized Cross-Validation
  • 7.3 Theory of Stable Reproducing Kernel Hilbert Spaces
  • 7.3.1 Kernel Stability: Necessary and Sufficient Conditions
  • 7.3.2 Inclusions of Reproducing Kernel Hilbert Spaces in More General Lebesgue Spaces
  • 7.4 Further Insights into Stable Reproducing Kernel Hilbert Spaces
  • 7.4.1 Inclusions Between Notable Kernel Classes
  • 7.4.2 Spectral Decomposition of Stable Kernels
  • 7.4.3 Mercer Representations of Stable Reproducing Kernel Hilbert Spaces and of Regularized Estimators
  • 7.4.4 Necessary and Sufficient Stability Condition Using Kernel Eigenvectors and Eigenvalues
  • 7.5 Minimax Properties of the Stable Spline Estimator
  • 7.5.1 Data Generator and Minimax Optimality
  • 7.5.2 Stable Spline Estimator
  • 7.5.3 Bounds on the Estimation Error and Minimax Properties
  • 7.6 Further Topics and Advanced Reading
  • 7.7 Appendix
  • 7.7.1 Derivation of the First-Order Stable Spline Norm
  • 7.7.2 Proof of Proposition 7.1
  • 7.7.3 Proof of Theorem 7.5
  • 7.7.4 Proof of Theorem 7.7
  • 7.7.5 Proof of Theorem 7.9
  • References
  • 8 Regularization for Nonlinear System Identification
  • 8.1 Nonlinear System Identification
  • 8.2 Kernel-Based Nonlinear System Identification
  • 8.2.1 Connection with Bayesian Estimation of Gaussian Random Fields
  • 8.2.2 Kernel Tuning
  • 8.3 Kernels for Nonlinear System Identification
  • 8.3.1 A Numerical Example
  • 8.3.2 Limitations of the Gaussian and Polynomial Kernel
  • 8.3.3 Nonlinear Stable Spline Kernel
  • 8.3.4 Numerical Example Revisited: Use of the Nonlinear Stable Spline Kernel
  • 8.4 Explicit Regularization of Volterra Models
  • 8.5 Other Examples of Regularization in Nonlinear System Identification
  • 8.5.1 Neural Networks and Deep Learning Models
  • 8.5.2 Static Nonlinearities and Gaussian Process (GP)
  • 8.5.3 Block-Oriented Models
  • 8.5.4 Hybrid Models
  • 8.5.5 Sparsity and Variable Selection
  • References
  • 9 Numerical Experiments and Real-World Cases
  • 9.1 Identification of Discrete-Time Output Error Models
  • 9.1.1 Monte Carlo Studies with a Fixed Output Error Model
  • 9.1.2 Monte Carlo Studies with Different Output Error Models
  • 9.1.3 Real Data: A Robot Arm
  • 9.1.4 Real Data: A Hairdryer
  • 9.2 Identification of ARMAX Models
  • 9.2.1 Monte Carlo Experiment
  • 9.2.2 Real Data: Temperature Prediction
  • 9.3 Multi-task Learning and Population Approaches
  • 9.3.1 Kernel-Based Multi-task Learning
  • 9.3.2 Numerical Example: Real Pharmacokinetic Data
  • References
  • Appendix
  • Index