Principal Manifolds for Data Visualisation and Data Analysis
Principal Manifolds for Data Visualisation and
Dimension Reduction
By chapter (PDF files of Chapters):
Contents
Frontmatter
(Preface-Contents-List of Authors)
1 Developments
and Applications of Nonlinear
Principal Component Analysis – a Review
Uwe Kruger, Junping
Zhang, Lei Xie
1.1 Introduction
1.2 PCA Preliminaries
1.3 Nonlinearity Test for PCA Models
1.3.1 Assumptions
1.3.2 Disjunct Regions
1.3.3 Confidence Limits for Correlation Matrix
1.3.4 Accuracy Bounds
1.3.5 Summary of the Nonlinearity Test
1.3.6 Example Studies
1.4 Nonlinear PCA Extensions
1.4.1 Principal Curves and Manifolds
1.4.2 Neural Network Approaches
1.4.3 Kernel PCA
1.5 Analysis of Existing Work
1.5.1 Computational Issues
1.5.2 Generalization of Linear PCA?
1.5.3 Roadmap for Future Developments (Basics and
Beyond)
1.6 Concluding Summary
References
2 Nonlinear
Principal Component Analysis:
Neural Network Models and Applications
Matthias Scholz, Martin Fraunholz, Joachim Selbig
2.1 Introduction
2.2 Standard Nonlinear PCA
2.3 Hierarchical Nonlinear PCA
2.3.1 The Hierarchical Error Function
2.4 Circular PCA
2.5 Inverse Model of Nonlinear PCA
2.5.1 The Inverse Network Model
2.5.2 NLPCA Models Applied to Circular Data
2.5.3 Inverse NLPCA for Missing Data
2.5.4 Missing Data Estimation
2.6 Applications
2.6.1 Application of Hierarchical NLPCA
2.6.2 Metabolite Data Analysis
2.6.3 Gene Expression Analysis
2.7 Summary
References
3 Learning
Nonlinear Principal Manifolds
Hujun Yin
3.1 Introduction
3.2 Biological Background
3.2.1 Lateral Inhibition and Hebbian Learning
3.2.2 From Von Marsburg and Willshaw’s Model
to Kohonen’s SOM
3.2.3 The SOM Algorithm
3.3 Theories
3.3.1 Convergence and Cost Functions
3.3.2 Topological Ordering Measures
3.4 SOMs, Multidimensional Scaling
and Principal Manifolds
3.4.1 Multidimensional Scaling
3.4.2 Principal Manifolds
3.4.3 Visualisation Induced SOM (ViSOM)
3.5 Examples
3.5.1 Data
Visualisation
3.5.2 Document
Organisation and Content Management
References
4 Elastic Maps
and Nets for Approximating
Principal Manifolds and Their Application
to Microarray Data Visualization
Alexander N Gorban, Andrei Y Zinovyev
4.1 Introduction and Overview
4.1.1 Fréchet Mean and Principal Objects:
K-Means, PCA, what else?
4.1.2 Principal Manifolds
4.1.3 Elastic Functional and Elastic Nets
4.2 Optimization of Elastic Nets for Data
Approximation
4.2.1 Basic Optimization Algorithm
4.2.2 Missing Data Values
4.2.3 Adaptive Strategies
4.3 Elastic Maps
4.3.1 Piecewise Linear Manifolds and Data
Projectors
4.3.2 Iterative Data Approximation
4.4 Principal Manifold as Elastic Membrane
4.5 Method Implementation
4.6 Examples
4.6.1 Test Examples
4.6.2 Modeling Molecular Surfaces
4.6.3 Visualization of Microarray Data
4.7 Discussion
References
5 Topology-Preserving Mappings for Data Visualisation
Marian Peña, Wesam Barbakh, Colin Fyfe
5.1 Introduction
5.2 Clustering Techniques
5.2.1 K-Means
5.2.2 K-Harmonic Means
5.2.3 Neural Gas
5.2.4 Weighted K-Means
5.2.5 The Inverse Weighted K-Means
5.3 Topology Preserving Mappings
5.3.1 Generative Topographic Map
5.3.2 Topographic Product of Experts ToPoE
5.3.3 The Harmonic Topographic Map
5.3.4 Topographic Neural Gas
5.3.5 Inverse-Weighted K-Means
Topology-Preserving Map
5.4 Experiments
5.4.1 Projections in Latent Space
5.4.2 Responsibilities
5.4.3 U-matrix, Hit Histograms and Distance
Matrix
5.4.4 The Quality of the Map
5.5 Conclusions
References
6 The Iterative
Extraction Approach to Clustering
Boris Mirkin
6.1 Introduction
6.2 Clustering Entity-to-feature Data
6.2.1 Principal Component Analysis
6.2.2 Additive Clustering Model and ITEX
6.2.3 Overlapping and Fuzzy Clustering Case
6.2.4 K-Means and iK-Means Clustering
6.3 ITEX Structuring and Clustering for Similarity
Data
6.3.1 Similarity Clustering: a Review
6.3.2 The Additive Structuring Model and ITEX
6.3.3 Additive Clustering Model
6.3.4 Approximate Partitioning
6.3.5 One Cluster Clustering
6.3.6 Some Applications
References
7 Representing
Complex Data Using Localized Principal
Components with Application to Astronomical Data
Jochen Einbeck, Ludger Evers, Coryn Bailer-Jones
7.1 Introduction
7.2 Localized Principal Component Analysis
7.2.1 Cluster-wise PCA
7.2.2 Principal Curves
7.2.3 Further Approaches
7.3 Combining Principal Curves and Regression
7.3.1 Principal Component Regression and its
Shortcomings
7.3.2 The Generalization to Principal Curves
7.3.3 Using Directions Other than the Local Principal
Components
7.3.4 A Simple Example
7.4 Application to the Gaia Survey
7.4.1 The Astrophysical Data
7.4.2 Principal Manifold Based Approach
7.5 Conclusion
References
8 Auto-Associative Models, Nonlinear Principal Component
Analysis, Manifolds and Projection Pursuit
Stéphane Girard, Serge Iovleff
8.1 Introduction
8.2 Auto-Associative Models
8.2.1 Approximation by Manifolds
8.2.2 A Projection Pursuit Algorithm
8.2.3 Theoretical Results
8.3 Examples
8.3.1 Linear Auto-Associative Models and PCA
8.3.2 Additive Auto-Associative Models and Neural
Networks
8.4 Implementation Aspects
8.4.1 Estimation of the Regression Functions
8.4.2 Computation of Principal Directions
8.5 Illustration on Real and Simulated Data
References
9 Beyond The Concept of Manifolds: Principal Trees,
Metro Maps, and Elastic Cubic Complexes
Alexander N Gorban, Neil R Sumner, Andrei Y
Zinovyev
9.1 Introduction and Overview
9.1.1 Elastic Principal Graphs
9.2 Optimization of Elastic Graphs
for Data Approximation
9.2.1 Elastic Functional Optimization
9.2.2 Optimal Application of Graph Grammars
9.2.3 Factorization and Transformation of Factors
9.3 Principal Trees (Branching Principal Curves)
9.3.1 Simple Graph Grammar (“Add a Node”, “Bisect an
Edge”)
9.3.2 Visualization of Data Using “Metro Map”
Two-Dimensional Tree Layout
9.3.3 Example of Principal Cubic Complex: Product of
Principal Trees
9.4 Analysis of the Universal 7-Cluster Structure
of Bacterial Genomes
9.4.1 Brief Introduction
9.4.2 Visualization of the 7-Cluster Structure
9.5 Visualization of Microarray Data
9.5.1 Dataset Used
9.5.2 Principal Tree of Human Tissues
9.6 Discussion
References
10 Diffusion
Maps - a Probabilistic Interpretation
for Spectral Embedding and Clustering Algorithms
Boaz Nadler, Stephane Lafon, Ronald Coifman, Ioannis G
Kevrekidis
10.1 Introduction
10.2 Diffusion Distances
and Diffusion Maps
10.2.1 Asymptotics of the Diffusion Map
10.3 Spectral Embedding of Low Dimensional
Manifolds
10.4 Spectral Clustering of a Mixture of
Gaussians
10.5 Summary and Discussion
References
11 On Bounds
for Diffusion, Discrepancy
Steven B Damelin
11.1 Introduction
11.2 Energy, Discrepancy, Distance
and Integration on Measurable Sets in Euclidean
Space
11.3 Set Learning via Normalized Laplacian
Dimension Reduction and Diffusion Distance
11.4 Main Result: Bounds for Discrepancy,
Diffusion and Fill Distance Metrics
References
12 Geometric
Optimization Methods for the Analysis
Michel Journée, Andrew E Teschendorff, Pierre-Antoine Absil,
Simon Tavaré, Rodolphe Sepulchre
12.1 Introduction
12.2
12.3 Contrast Functions
12.3.1 Mutual Information [8, 10]
12.3.2 F-Correlation [14]
12.3.3 Non-Gaussianity [17]
12.3.4 Joint Diagonalization of Cumulant Matrices
[19]
12.4 Matrix Manifolds for
12.5 Optimization Algorithms
12.5.1 Line-Search Algorithms
12.5.2 FastICA
12.5.3 Jacobi Rotations
12.6 Analysis of Gene Expression Data by
12.6.1 Some Issues About the Application of
12.6.2 Evaluation of the Biological Relevance
of the Expression Modes
12.6.3 Results Obtained on the Breast Cancer
Microarray Data Set
12.7 Conclusion
References
13 Dimensionality Reduction and Microarray Data
David A Elizondo, Benjamin N Passow, Ralph Birkenhead,
Andreas Huemer
13.1 Introduction
13.2 Background
13.2.1 Microarray Data
13.2.2 Methods for Dimension Reduction
13.2.3 Linear Separability
13.3 Comparison Procedure
13.3.1 Data Sets
13.3.2 Dimensionality Reduction
13.3.3 Perceptron Models
13.4 Results
13.5 Conclusions
References
14 PCA and K-Means Decipher Genome
Alexander N Gorban, Andrei Y Zinovyev
14.1 Introduction
14.2 Required Materials
14.3 Genomic Sequence
14.3.1 Background
14.3.2 Sequences for the Analysis
14.4 Converting Text to a Numerical Table
14.5 Data Visualization
14.5.1 Visualization
14.5.2 Understanding Plots
14.6 Clustering and Visualizing Results
14.7 Task List and Further Information
14.8 Conclusion
References