Elementary Cluster Analysis: Four Basic Methods That (Usually) Work.

Bibliographic Details
Place / Publishing House: Denmark : River Publishers, 2022.
©2022.
Year of Publication: 2022
Edition: 1st ed.
Language: English
Physical Description: 1 online resource (518 pages)
LEADER 08642nam a22004333i 4500
001 50029156150
003 MiAaPQ
005 20240229073849.0
006 m o d |
007 cr cnu||||||||
008 240229s2022 xx o ||||0 eng d
020 |a 9788770224246  |q (electronic bk.) 
035 |a (MiAaPQ)50029156150 
035 |a (Au-PeEL)EBL29156150 
035 |a (OCoLC)1311313906 
040 |a MiAaPQ  |b eng  |e rda  |e pn  |c MiAaPQ  |d MiAaPQ 
050 4 |a QA278.55 
082 0 |a 519.53028557 
100 1 |a Bezdek, James C. 
245 1 0 |a Elementary Cluster Analysis :  |b Four Basic Methods That (Usually) Work. 
250 |a 1st ed. 
264 1 |a Denmark :  |b River Publishers,  |c 2022. 
264 4 |c ©2022. 
300 |a 1 online resource (518 pages) 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
505 0 |a Front Cover -- Elementary Cluster Analysis: Four Basic Methods that (Usually) Work -- Contents -- Preface -- List of Figures -- List of Tables -- List of Abbreviations -- Appendix A. List of Algorithms -- Appendix D. List of Definitions -- Appendix E. List of Examples -- Appendix L. List of Lemmas and Theorems -- Appendix V. List of Video Links -- I The Art and Science of Clustering -- 1 Clusters: The Human Point of View (HPOV) -- 1.1 Introduction -- 1.2 What are Clusters? -- 1.3 Notes and Remarks -- 1.4 Exercises -- 2 Uncertainty: Fuzzy Sets and Models -- 2.1 Introduction -- 2.2 Fuzzy Sets and Models -- 2.3 Fuzziness and Probability -- 2.4 Notes and Remarks -- 2.5 Exercises -- 3 Clusters: The Computer Point of View (CPOV) -- 3.1 Introduction -- 3.2 Label Vectors -- 3.3 Partition Matrices -- 3.4 How Many Clusters are Present in a Data Set? -- 3.5 CPOV Clusters: The Computer's Point of View -- 3.6 Notes and Remarks -- 3.7 Exercises -- 4 The Three Canonical Problems -- 4.1 Introduction -- 4.2 Tendency Assessment - (Are There Clusters?) -- 4.2.1 An Overview of Tendency Assessment -- 4.2.2 Minimal Spanning Trees (MSTs) -- 4.2.3 Visual Assessment of Clustering Tendency -- 4.2.4 The VAT and iVAT Reordering Algorithms -- 4.3 Clustering (Partitioning the Data into Clusters) -- 4.4 Cluster Validity (Which Clusters are "Best"?) -- 4.5 Notes and Remarks -- 4.6 Exercises -- 5 Feature Analysis -- 5.1 Introduction -- 5.2 Feature Nomination -- 5.3 Feature Analysis -- 5.4 Feature Selection -- 5.5 Feature Extraction -- 5.5.1 Principal Components Analysis -- 5.5.2 Random Projection -- 5.5.3 Sammon's Algorithm -- 5.5.4 Autoencoders -- 5.5.5 Relational Data -- 5.6 Normalization and Statistical Standardization -- 5.7 Notes and Remarks -- 5.8 Exercises -- II Four Basic Models and Algorithms -- 6 The c-Means (aka k-Means) Models -- 6.1 Introduction. 
505 8 |a 6.2 The Geometry of Partition Spaces -- 6.3 The HCM/FCM Models and Basic AO Algorithms -- 6.4 Cluster Accuracy for Labeled Data -- 6.5 Choosing Model Parameters (c, m, ||*||A) -- 6.5.1 How to Pick the Number of Clusters c -- 6.5.2 How to Pick the Weighting Exponent m -- 6.5.3 Choosing the Weight Matrix (A) for the Model Norm -- 6.6 Choosing Execution Parameters (V0, ε, ||*||err, T) -- 6.6.1 Choosing Termination and Iterate Limit Criteria -- 6.6.2 How to Pick an Initial V0 (or U0) -- 6.6.3 Acceleration Schemes for HCM (aka k-Means) and (FCM) -- 6.7 Cluster Validity With the Best c Method -- 6.7.1 Scale Normalization -- 6.7.2 Statistical Standardization -- 6.7.3 Stochastic Correction for Chance -- 6.7.4 Best c Validation With Internal CVIs -- 6.7.5 Crisp Cluster Validity Indices -- 6.7.6 Soft Cluster Validity Indices -- 6.8 Alternate Forms of Hard c-Means (aka k-Means) -- 6.8.1 Bounds on k-Means in Randomly Projected Downspaces -- 6.8.2 Matrix Factorization for HCM for Clustering -- 6.8.3 SVD: A Global Bound for J1(U, V; X) -- 6.9 Notes and Remarks -- 6.10 Exercises -- 7 Probabilistic Clustering - GMD/EM -- 7.1 Introduction -- 7.2 The Mixture Model -- 7.3 The Multivariate Normal Distribution -- 7.4 Gaussian Mixture Decomposition -- 7.5 The Basic EM Algorithm for GMD -- 7.6 Choosing Model and Execution Parameters for EM -- 7.6.1 Estimating c With iVAT -- 7.6.2 Choosing Q0 or P0 in GMD -- 7.6.3 Implementation Parameters ε, ||*||err, T for GMD With EM -- 7.6.4 Acceleration Schemes for GMD With EM -- 7.7 Model Selection and Cluster Validity for GMD -- 7.7.1 Two Interpretations of the Objective of GMD -- 7.7.2 Choosing the Number of Components Using GMD/EM With GOFIs -- 7.7.3 Choosing the Number of Clusters Using GMD/EM With CVIs -- 7.8 Notes and Remarks -- 7.9 Exercises -- 8 Relational Clustering - The SAHN Models -- 8.1 Relations and Similarity Measures. 
505 8 |a 8.2 The SAHN Model and Algorithms -- 8.3 Choosing Model Parameters for SAHN Clustering -- 8.4 Dendrogram Representation of SAHN Clusters -- 8.5 SL Implemented With Minimal Spanning Trees -- 8.5.1 The Role of the MST in Single Linkage Clustering -- 8.5.2 SL Compared to a Fitch-Margoliash Dendrogram -- 8.5.3 Repairing SL Sensitivity to Inliers and Bridge Points -- 8.5.4 Acceleration of the Single Linkage Algorithm -- 8.6 Cluster Validity for Single Linkage -- 8.7 An Example Using All Four Basic Models -- 8.8 Notes and Remarks -- 8.9 Exercises -- 9 Properties of the Fantastic Four: External Cluster Validity -- 9.1 Introduction -- 9.2 Computational Complexity -- 9.2.1 Using Big-Oh to Measure the Growth of Functions -- 9.2.2 Time and Space Complexity for the Fantastic Four -- 9.3 Customizing the c-Means Models to Account for Cluster Shape -- 9.3.1 Variable Norm Methods -- 9.3.2 Variable Prototype Methods -- 9.4 Traversing the Partition Landscape -- 9.5 External Cluster Validity With Labeled Data -- 9.5.1 External Paired-Comparison Cluster Validity Indices -- 9.5.2 External Best Match (Best U, or Best E) Validation -- 9.5.3 The Fantastic Four Use Best E Evaluations on Labeled Data -- 9.6 Choosing an Internal CVI Using Internal/External (Best I/E) Correlation -- 9.7 Notes and Remarks -- 9.8 Problems -- 10 Alternating Optimization -- 10.1 Introduction -- 10.2 General Considerations on Numerical Optimization -- 10.2.1 Iterative Solution of Optimization Problems -- 10.2.2 Iterative Solution of Alternating Optimization with (t, s) Schemes -- 10.3 Local Convergence Theory for AO -- 10.4 Global Convergence Theory -- 10.5 Impact of the Theory for the c-Means Models -- 10.6 Convergence for GMD Using EM/AO -- 10.7 Notes and Remarks -- 10.8 Exercises -- 11 Clustering in Static Big Data -- 11.1 The Jungle of Big Data -- 11.1.1 An Overview of Big Data. 
505 8 |a 11.1.2 Scalability vs. Acceleration -- 11.2 Methods for Clustering in Big Data -- 11.3 Sampling Functions -- 11.3.1 Chunk Sampling -- 11.3.2 Random Sampling -- 11.3.3 Progressive Sampling -- 11.3.4 Maximin (MM) Sampling -- 11.3.5 Aggregation and Non-Iterative Extension of a Literal Partition to the Rest of the Data -- 11.4 A Sampler of Other Methods: Precursors to Streaming Data Analysis -- 11.5 Visualization of Big Static Data -- 11.6 Extending Single Linkage for Static Big Data -- 11.7 Notes and Remarks -- 11.8 Exercises -- 12 Structural Assessment in Streaming Data -- 12.1 Streaming Data Analysis -- 12.1.1 The Streaming Process -- 12.1.2 Computational Footprints -- 12.2 Streaming Clustering Algorithms -- 12.2.1 Sequential Hard c-Means and Sebestyen's Method -- 12.2.2 Extensions of Sequential Hard c-Means: BIRCH, CluStream, and DenStream -- 12.2.3 Model-Based Algorithms -- 12.2.4 Projection and Grid-Based Methods -- 12.3 Reading the Footprints: Hindsight Evaluation -- 12.3.1 When You Can See the Data and Footprints -- 12.3.2 When You Can't See the Data and Footprints -- 12.3.3 Change Point Detection -- 12.4 Dynamic Evaluation of Streaming Data Analysis -- 12.4.1 Incremental Stream Monitoring Functions (ISMFs) -- 12.4.2 Visualization of Streaming Data -- 12.5 What's Next for Streaming Data Analysis? -- 12.6 Notes and Remarks -- 12.7 Exercises -- References -- Index -- About the Author -- Back Cover. 
588 |a Description based on publisher supplied metadata and other sources. 
590 |a Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.  
650 0 |a Cluster analysis. 
650 0 |a Cluster analysis--Data processing. 
655 4 |a Electronic books. 
776 0 8 |i Print version:  |a Bezdek, James C.  |t Elementary Cluster Analysis: Four Basic Methods That (Usually) Work  |d Denmark : River Publishers,c2022 
797 2 |a ProQuest (Firm) 
856 4 0 |u https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=29156150  |z Click to View