Cluster analysis by K-means (analysis)
From BioUML platform
Latest revision as of 18:15, 9 December 2020
- Analysis title
- Cluster analysis by K-means
- Provider
- Institute of Systems Biology
- Class
- ru.biosoft.analysis.ClusterAnalysis
- Plugin
- ru.biosoft.analysis (Common methods of data analysis plug-in)
Goal:
Genes are grouped into clusters so that those in one cluster exhibit maximal similarity, whereas those of different clusters are maximally dissimilar.
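For K-means, this goal is usually formalized as minimizing the within-cluster sum of squared distances to the cluster centres. The formula below is a sketch of that standard objective; the symbols (K for the cluster number, C_k for the k-th cluster, x_i for the expression profile of gene i, and \mu_k for the mean of cluster k) are notation assumed here and do not appear elsewhere on this page.

```latex
\min_{C_1,\dots,C_K} \; \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{x_i \in C_k} x_i
```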
Input:
A table of genes or probes with their expression values or calculated fold changes. Depending on the chosen algorithm, certain parameters must also be supplied.
Output:
A table with the same genes grouped into clusters.
Parameters:
- Experiment data - experimental data for analysis.
- Table - a table with experimental data stored in repository.
- Columns - the columns from the table which should be taken for the clustering analysis.
- Cluster algorithm - the version of the K-means algorithm to be applied [1-4].
- Cluster number - the number of clusters into which the input data will be divided.
- Output table - name and path in the repository under which the result table will be saved. If a table with the specified name and path already exists, it will be overwritten.
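To illustrate how the Table, Columns and Cluster number parameters interact, here is a minimal pure-Python sketch of a Lloyd-style K-means [3]. It is not the BioUML or R implementation: the function name `kmeans_lloyd`, the dictionary-based table layout, and the `max_iter` and `seed` arguments are all hypothetical and chosen only for this example.

```python
import random


def kmeans_lloyd(rows, columns, k, max_iter=100, seed=0):
    """Cluster table rows on the selected columns (illustrative sketch only).

    rows    -- dict mapping a gene/probe ID to a dict of column name -> value
               (stands in for the "Table" parameter)
    columns -- column names to cluster on (the "Columns" parameter)
    k       -- number of clusters (the "Cluster number" parameter)
    Returns a dict mapping each ID to a 1-based cluster number.
    """
    ids = list(rows)
    points = [[float(rows[g][c]) for c in columns] for g in ids]
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of rows")
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")

    rng = random.Random(seed)
    # Initial centres: k rows picked at random from the table.
    centres = [list(p) for p in rng.sample(points, k)]

    def nearest(p):
        # Index of the centre with the smallest squared Euclidean distance.
        return min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centres[j])),
        )

    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p) for p in points]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]

    return {g: lab + 1 for g, lab in zip(ids, labels)}


# Hypothetical four-gene table with two expression columns.
table = {
    "geneA": {"t1": 0.1, "t2": 0.2},
    "geneB": {"t1": 0.0, "t2": 0.1},
    "geneC": {"t1": 5.0, "t2": 5.1},
    "geneD": {"t1": 5.2, "t2": 4.9},
}
clusters = kmeans_lloyd(table, ["t1", "t2"], k=2)
```

In this toy table, geneA and geneB end up in one cluster and geneC and geneD in the other. Which of the two groups is numbered 1 depends on the random initial centres.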
Further details:
Clustering is performed with the K-means algorithm as implemented in R (http://www.r-project.org/).
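The algorithm versions cited in [1-4] differ, among other things, in when cluster centres are updated. Batch variants recompute all centres after a full assignment pass, while an online variant of the kind commonly associated with MacQueen [4] adjusts a centre as soon as a point is assigned to it. The sketch below shows only that incremental-mean update. It is a simplification written for illustration, not the R code, and `online_pass` and its arguments are hypothetical names.

```python
def online_pass(points, centres, counts):
    """One pass of online (incremental) centre updates, as a sketch.

    points  -- list of numeric vectors
    centres -- list of current centre vectors (modified in place)
    counts  -- number of points already averaged into each centre
               (modified in place)
    Returns the index of the centre each point was assigned to.
    """
    labels = []
    for p in points:
        # Assign the point to the nearest centre (squared Euclidean distance).
        j = min(
            range(len(centres)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])),
        )
        # Update that centre immediately with the running-mean formula.
        counts[j] += 1
        centres[j] = [m + (a - m) / counts[j] for m, a in zip(centres[j], p)]
        labels.append(j)
    return labels
```

Because each centre moves as points arrive, the result of an online pass can depend on the order of the rows, which is not the case for a batch update.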
References:
1. Forgy, E. W. (1965) Cluster analysis of multivariate data: efficiency vs interpretability of classifications. Biometrics 21, 768–769.
2. Hartigan, J. A. and Wong, M. A. (1979) A K-means clustering algorithm. Applied Statistics 28, 100–108.
3. Lloyd, S. P. (1957, 1982) Least squares quantization in PCM. Technical Note, Bell Laboratories. Published in 1982 in IEEE Transactions on Information Theory 28, 128–137.
4. MacQueen, J. (1967) Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, eds L. M. Le Cam & J. Neyman, 1, pp. 281–297. Berkeley, CA: University of California Press.