Sammon mapping
Suppose that we consider a set of n objects. Each object is represented by one point in an m-dimensional (high-dimensional) space. The aim of Sammon mapping is to find n points in a d-dimensional space (with d < m), in such a way that the corresponding distances approximate the original ones as well as possible. We denote:
- $d_{ij}$, the distance between points $i$ and $j$ in the d-dimensional space;
- $d^{*}_{ij}$, the distance between points $i$ and $j$ in the original m-dimensional space.
Without loss of generality, only projections onto a 2-dimensional space are studied (d=2), since our interest is in data visualization.

There is a need for a criterion to decide whether one configuration is better than another. For that purpose, the error (stress) function E is considered, which measures the difference between the present configuration of n points in the d-dimensional space and the configuration of n points in the original m-dimensional space. The stress is given by the following formula:

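$$E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}$$

Here $d^{*}_{ij}$ is the distance between points $i$ and $j$ in the original m-dimensional space and $d_{ij}$ is the distance between their projections. E is in fact a badness-of-fit measure for the entire representation. The stress is non-negative, with 0 indicating a lossless mapping. For any reasonable configuration it lies in [0,1], since collapsing all points onto one location already gives E = 1; a configuration that greatly inflates the distances can exceed 1.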
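The stress is easy to evaluate directly from its definition. The following is a minimal sketch in plain Python, not a reference implementation: the function name `sammon_stress` and the toy data are hypothetical, and Euclidean distance is assumed in both spaces (the text above does not fix a particular metric).

```python
import math
from itertools import combinations


def sammon_stress(original, projected):
    """Sammon's stress between original points and their low-dimensional images.

    Assumes Euclidean distances in both spaces (an assumption of this sketch).
    """
    if len(original) != len(projected):
        raise ValueError("both configurations must contain the same number of points")
    if len(original) < 2:
        raise ValueError("at least two points are needed")
    total_original = 0.0  # normalising constant: sum of original distances
    weighted_error = 0.0  # sum of squared distance errors, each divided by d*_ij
    for i, j in combinations(range(len(original)), 2):
        d_star = math.dist(original[i], original[j])    # distance in m dimensions
        d_proj = math.dist(projected[i], projected[j])  # distance in d dimensions
        if d_star == 0.0:
            raise ValueError("coincident original points make the stress undefined")
        total_original += d_star
        weighted_error += (d_star - d_proj) ** 2 / d_star
    return weighted_error / total_original


# Hypothetical toy data: corners of a unit square lying in the z = 0 plane of 3-D space.
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
lossless_2d = [(x, y) for x, y, _ in points_3d]  # dropping z preserves every distance
collapsed_2d = [(0.0, 0.0)] * len(points_3d)     # every point mapped to the origin

print(sammon_stress(points_3d, lossless_2d))   # 0 up to rounding: lossless mapping
print(sammon_stress(points_3d, collapsed_2d))  # 1 up to rounding: all structure lost
```

Dividing each squared error by the original distance makes errors on small distances count more, which is why the mapping tends to preserve local neighbourhoods. `math.dist` requires Python 3.8 or later.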

