Projection-Based Clustering Through Self-Organization and Swarm Intelligence: Combining Cluster Analysis with the Visualization of High-Dimensional Data
: | |
---|---|
Place / Publishing House: | Wiesbaden: Springer Fachmedien Wiesbaden GmbH, 2018. ©2018. |
Year of Publication: | 2018 |
Edition: | 1st ed. |
Language: | English |
Physical Description: | 1 online resource (210 pages) |
Table of Contents:
- Intro
- Acknowledgments
- Table of contents
- List of figures
- List of tables
- Zusammenfassung
- Abstract
- 1 Introduction
- 2 Fundamentals
- 2.1 Basic Definitions
- 2.2 Concepts of Graph Theory Applied to Patterns
- 2.3 Overview of Knowledge Discovery
- 2.3.1 Feature Selection
- 2.3.2 Preprocessing
- 2.3.3 Feature Extraction
- 2.3.3.1 Transformations
- 2.3.3.2 Dimensionality Reduction
- 2.3.4 Cluster Analysis
- 2.3.5 An Approach to Knowledge Acquisition
- 3 Approaches to Cluster Analysis
- 3.1 Common Clustering Methods
- 3.2 Structure of Natural Clusters
- 3.2.1 Types of Structures Sought by Clustering Algorithms
- 3.2.2 Quality of Clustering
- 3.2.2.1 Heatmaps
- 3.2.2.2 Silhouette plots
- 3.3 Problems with Clustering Methods
- 4 Methods of Projection
- 4.1 Common Approaches
- 4.1.1 Principal Component Analysis (PCA)
- 4.1.2 Independent Component Analysis (ICA)
- 4.1.3 Non-linear metric multidimensional scaling (MDS) techniques
- 4.1.4 Curvilinear Component Analysis (CCA)
- 4.1.5 t-Distributed Stochastic Neighbor Embedding (t-SNE)
- 4.1.6 Neighborhood Retrieval Visualizer (NeRV)
- 4.2 Emergent Self-Organizing Map (ESOM)
- 4.2.1 Visualizations of SOMs
- 4.2.2 Clustering with ESOM
- 4.3 Types of Projection Methods
- 5 Visualizing the Output Space
- 5.1 Examples
- 5.2 Structure Preservation
- 5.3 Generating a Topographic Map from the Generalized U*-matrix
- 5.3.1 Simplified ESOM
- 5.3.2 U*-Matrix Calculation
- 5.3.3 Topographic Map with Hypsometric Tints
- 5.3.4 Limitations
- 6 Quality Assessments of Visualizations
- 6.1 Common Quality Measures (QMs)
- 6.1.1 Classification Error (CE)
- 6.1.2 C Measure
- 6.1.3 Two Variants of the C Measure: Minimal Path Length and Minimal Wiring
- 6.1.4 Force Approach Error
- 6.1.5 König's Measure
- 6.1.6 Local Continuity Meta-Criterion (LCMC)
- 6.1.7 Mean Relative Rank Error (MRRE) and the Co-ranking Matrix
- 6.1.8 Precision and Recall
- 6.1.9 Rescaled Average Agreement Rate (RAAR)
- 6.1.10 Stress and the Shepard Diagram
- 6.1.11 Topographic Product
- 6.1.12 Topographic Function (TF)
- 6.1.13 Trustworthiness and Discontinuity (T&D)
- 6.1.14 U-ranking
- 6.1.15 Overall Correlations: Topological Index (TI) and Topological Correlation (TC)
- 6.1.16 Zrehen's Measure
- 6.2 Types of Quality Measures for Assessing Structure Preservation
- 6.2.1 Theoretical Assessment of Quality Measures
- 6.2.2 Practical Assessment of Quality Measures
- 6.3 Introducing the Delaunay Classification Error (DCE)
- 6.3.1 Summary
- 7 Behavior-based Systems in Data Science
- 7.1 Artificial Behavior Based on DataBots
- 7.1.1 Swarm-Organized Projection (SOP)
- 7.2 Swarm Intelligence for Unsupervised Machine Learning
- 7.3 Missing Links: Emergence and Game Theory
- 8 Databionic Swarm (DBS)
- 8.1 Projection with Pswarm
- 8.1.1 Motivation: Game Theory
- 8.1.2 Symmetry Considerations
- 8.1.3 Algorithm
- 8.1.4 Data-driven Annealing Scheme
- 8.1.5 Annealing Interval
- 8.1.6 Convergence
- 8.2 Comparing Pswarm with a Previously Developed Approach
- 8.2.1 Neighborhood Definition
- 8.2.2 Annealing Scheme
- 8.2.3 Swarm Intelligence and Self-Organization
- 8.3 Clustering on a Generalized U*-Matrix
- 9 Experimental Methodology
- 9.1 Data Sets
- 9.1.1 Atom
- 9.1.2 Chainlink
- 9.1.3 EngyTime
- 9.1.4 Golf Ball
- 9.1.5 Hepta
- 9.1.6 Iris
- 9.1.7 Leukemia
- 9.1.8 Lsun3D
- 9.1.9 S-shape
- 9.1.10 Swiss Banknotes
- 9.1.11 Target
- 9.1.12 Tetra
- 9.1.13 Tetragonula
- 9.1.14 Cuboid
- 9.1.15 Two Diamonds
- 9.1.16 Wine
- 9.1.17 Wing Nut
- 9.1.18 World Gross Domestic Product (World GDP)
- 9.2 Parameter Settings
- 9.2.1 Quality Measures (QMs)
- 9.2.2 Projection Methods
- 9.2.2.1 Swarm-Organized Projection (SOP)
- 9.2.2.2 Pswarm
- 9.2.3 Common clustering algorithms
- 9.3 Gene Ontology (GO)
- 9.3.1 Overrepresentation Analysis (ORA)
- 9.3.2 Filtering via ABC Analysis
- 10 Results on Pre-classified Data Sets
- 10.1 Comparison with Given Classifications
- 10.1.1 Recognition of the Absence of Clusters
- 10.2 Evaluation of Projections Using the Delaunay Classification Error (DCE)
- 10.3 Topographic Maps with Hypsometric Colors
- 11 DBS on Natural Data Sets
- 11.1 Types of Leukemia
- 11.2 World Gross Domestic Product (World GDP)
- 11.3 Tetragonula Bees
- 12 Knowledge Discovery with DBS
- 12.1 Hydrology
- 12.1.1 Knowledge Acquisition and Prediction in the Hydrology Data Set
- 12.2 Pain Genes
- 12.2.1 Prior Knowledge
- 12.2.2 Knowledge Acquisition in Clusters of Pain Genes
- 13 Discussion
- 14 Conclusion
- References
- Appendices
- Supplement A: Evaluation of Common QMs
- Supplement B: Wine Dataset Distance Distribution
- Supplement C: Generalized Umatrix of Pswarm and SOP
- Supplement D: DBS Visualizations of S-shape and uniform Cuboid
- Supplement E: U-Matrix Visualizations of ESOM Projections
- Supplement F: Statistical Tests in Hydrology
- Supplement G: 3D Prints of Generalized Umatrix Visualizations of DBS
- Supplement H: Contingency Table for Tetragonula Bees Clustering
- Index