ArrayMining - Online Microarray Data Mining
Ensemble and Consensus Analysis Methods for Gene Expression Data
Contact
The project and the website are maintained by Enrico Glaab, School of Computer Science, Nottingham University, UK (webmaster@arraymining.net).

Golub et al. (1999) Leukemia data set
Short description: Analysis of patients with acute lymphoblastic leukemia (ALL, 1) or acute myeloid leukemia (AML, 0).
Sample types: ALL, AML
No. of genes: 7129
No. of samples: 72 (class 0: 25, class 1: 47)
Normalization: VSN (Huber et al., 2002)
References:
- Golub et al., Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring, Science (1999), 531-537
- Huber et al., Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics (2002) 18 Suppl.1 96-104

van't Veer et al. (2002) Breast cancer data set
Short description: Samples from breast cancer patients were subdivided into a "good prognosis" (0) and a "poor prognosis" (1) group depending on the occurrence of distant metastases within 5 years. The data set is pre-processed as described in the original paper and was obtained from the R package "DENMARKLAB" (Fridlyand and Yang, 2004).
Sample types: good prognosis, poor prognosis
No. of genes: 4348 (pre-processed)
No. of samples: 97 (class 0: 51, class 1: 46)
Normalization: see reference (van't Veer et al., 2002)
References:
- van't Veer et al., Gene expression profiling predicts clinical outcome of breast cancer, Nature (2002), 415, p. 530-536
- Fridlyand, J. and Yang, J.Y.H. (2004) Advanced microarray data analysis: class discovery and class prediction (https://genome.cbs.dtu.dk/courses/norfa2004/Extras/DENMARKLAB.zip)

Yeoh et al. (2002) Leukemia multi-class data set
Short description: A multi-class data set for the prediction of the disease subtype in pediatric acute lymphoblastic leukemia (ALL).
No. of samples: 327
Normalization: VSN (Huber et al., 2002)
References:
- Yeoh et al., Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell, March 2002, 1: 133-143
- Huber et al., Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics (2002) 18 Suppl.1 96-104

Alon et al. (1999) Colon cancer data set
Short description: Analysis of colon cancer tissues (1) and normal colon tissues (0).
Sample types: tumour, healthy
No. of genes: 2000
No. of samples: 62 (class 1: 40, class 0: 22)
Normalization: VSN (Huber et al., 2002)
References:
- U. Alon, N. Barkai, D. Notterman, K. Gish, S. Ybarra, D. Mack, and A. Levine, “Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays,” Proceedings of the National Academy of Sciences (1999), vol. 96, pp. 6745–6750
- Huber et al., Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics (2002) 18 Suppl.1 96-104

Singh et al. (2002) Prostate cancer data set
Short description: Analysis of prostate cancer tissues (1) and normal tissues (0).
Sample types: tumour, healthy
No. of genes: 2135 (pre-processed)
No. of samples: 102 (class 1: 52, class 0: 50)
Normalization: GeneChip RMA (GCRMA)
References:
- D. Singh, P.G. Febbo, K. Ross, D.G. Jackson, J. Manola, C. Ladd, P. Tamayo, A.A. Renshaw, A.V. D’Amico, J.P. Richie, et al. Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2): pp. 203–209, 2002
- Z. Wu and R.A. Irizarry. Stochastic Models Inspired by Hybridization Theory for Short Oligonucleotide Arrays. Journal of Computational Biology, 12(6): pp. 882–893, 2005

Shipp et al. (2002) B-Cell Lymphoma data set
Short description: Analysis of diffuse large B-cell lymphoma samples (1) and follicular B-cell lymphoma samples (0).
Sample types: DLBCL, follicular
No. of genes: 2647 (pre-processed)
No. of samples: 77 (class 1: 58, class 0: 19)
Normalization: VSN (Huber et al., 2002)
References:
- M.A. Shipp, K.N. Ross, P. Tamayo, A.P. Weng, J.L. Kutok, R.C.T. Aguiar, M. Gaasenbeek, M. Angelo, M. Reich, G.S. Pinkus, et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nature Medicine, 8(1): pp. 68–74, 2002
- Huber et al., Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics (2002) 18 Suppl.1 96-104

Shin et al. (2007) T-Cell Lymphoma data set
Short description: Analysis of cutaneous T-cell lymphoma (CTCL) samples from lesional skin biopsies. Samples are divided into lower-stage (stages IA and IB, 0) and higher-stage (stages IIB and III, 1) CTCL.
Sample types: lower_stage, higher_stage
No. of genes: 2922 (pre-processed)
No. of samples: 63 (class 1: 20, class 0: 43)
Normalization: VSN (Huber et al., 2002)
References:
- J. Shin, S. Monti, D. J. Aires, M. Duvic, T. Golub, D. A. Jones and T. S. Kupper, Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome. Blood, 110(8): pp. 3015, 2007
- Huber et al., Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics (2002) 18 Suppl.1 96-104

Armstrong et al. (2002) Leukemia data set
Short description: Comparison of three classes of leukemia samples: acute lymphoblastic leukemia (ALL, 0), acute myelogenous leukemia (AML, 1) and ALL with mixed-lineage leukemia gene translocation (MLL, 2).
Sample types: ALL, AML, MLL
No. of genes: 8560 (pre-processed)
No. of samples: 72 (class 0: 24, class 1: 28, class 2: 20)
Normalization: VSN (Huber et al., 2002)
References:
- S.A. Armstrong, J.E. Staunton, L.B. Silverman, R. Pieters, M.L. den Boer, M.D. Minden, S.E. Sallan, E.S. Lander, T.R. Golub, S.J. Korsmeyer. MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics, 30(1): pp. 41–47, 2002
- Huber et al., Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics (2002) 18 Suppl.1 96-104

Parametric Gene Set Analysis (PGSEA)
Short description: The Parametric Gene Set Enrichment Analysis method (PGSEA, also called PAGE) by Kim and Volsky (2005) uses a parametric statistical model to identify sets of functionally related genes that are significantly differentially expressed. PGSEA is a further development of the classical GSEA method (Mootha, 2003). It has been shown to detect a larger number of differentially expressed gene sets while requiring less computation: assuming a normal distribution avoids the heavy computation needed for the permutation-based random model of GSEA. Further details about the algorithm can be found in the reference paper.
References:
- Kim S. Y. and Volsky D. J. (2005). PAGE: Parametric Analysis of Gene Set Enrichment. BMC Bioinformatics, 6, 144
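
The parametric idea behind PGSEA can be illustrated with a short sketch. It follows the z-score described by Kim and Volsky (2005): the mean fold change of a gene set is compared with the mean over all genes, scaled by the overall standard deviation and the square root of the set size, and the p-value is read from the standard normal distribution. The function name `page_z_score`, the dictionary input format, and the use of the population standard deviation are assumptions made for illustration; this is not the code that runs on ArrayMining.

```python
import math
from statistics import NormalDist, mean, pstdev


def page_z_score(fold_changes, gene_set):
    """Return (z, two-sided p-value) for one gene set.

    fold_changes: dict mapping gene name -> log fold change between two
                  sample classes (hypothetical input format).
    gene_set:     iterable of gene names; names missing from
                  fold_changes are ignored.
    """
    all_values = list(fold_changes.values())
    mu = mean(all_values)        # mean fold change over all genes
    sigma = pstdev(all_values)   # assumption: population standard deviation
    members = [fold_changes[g] for g in gene_set if g in fold_changes]
    m = len(members)
    if m == 0 or sigma == 0:
        raise ValueError("gene set is empty or fold changes are constant")
    # Z = (S_m - mu) * sqrt(m) / sigma, with S_m the gene set mean
    z = (mean(members) - mu) * math.sqrt(m) / sigma
    # Normal model instead of sample permutations, as in the description
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p
```

A positive z indicates that the set is, on average, more highly expressed in the class used as the numerator of the fold change. Because no permutations are needed, scoring many gene sets is a single pass over the data.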

Multidimensional Scaling GSA (MDS-GSA)
Short description: The MDS-GSA method uses classical metric Multidimensional Scaling to represent the expression data of a gene set extracted from a microarray experiment in a one-dimensional space (the "meta-gene" expression vector). Since MDS requires a dissimilarity matrix as input, this matrix is pre-computed from the gene set data using the Euclidean distance. Please note that the scale and sign of the transformed data do not necessarily correspond to the usual value ranges for gene expression vectors.
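
As a rough illustration of the MDS-GSA description, the sketch below computes a one-dimensional classical MDS embedding of the samples: Euclidean distances between samples are squared and double-centred, and the leading eigenvector, scaled by the square root of its eigenvalue, serves as the meta-gene vector. The function names, the samples-by-genes list layout, and the use of power iteration in place of a full eigendecomposition are assumptions made to keep the example dependency-free; the ArrayMining implementation may differ.

```python
import math
import random


def double_centre(sq_dist):
    """Turn squared distances into the centred inner-product matrix B."""
    n = len(sq_dist)
    row_mean = [sum(row) / n for row in sq_dist]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq_dist[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def dominant_eigenpair(matrix, iterations=1000, seed=0):
    """Power iteration; adequate here because B is symmetric and
    positive semi-definite when the distances are Euclidean."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in matrix]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:            # all samples identical
            return 0.0, v
        v = [x / norm for x in w]
    w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
    eigenvalue = sum(a * b for a, b in zip(v, w))  # Rayleigh quotient
    return eigenvalue, v


def mds_meta_gene(samples):
    """samples: list of per-sample expression vectors for one gene set.
    Returns one coordinate per sample (sign and scale are arbitrary)."""
    sq_dist = [[math.dist(a, b) ** 2 for b in samples] for a in samples]
    eigenvalue, v = dominant_eigenpair(double_centre(sq_dist))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]
```

For three samples lying on a straight line, for example `[[0, 0], [3, 4], [6, 8]]`, the one-dimensional coordinates reproduce the pairwise distances exactly, up to an overall sign flip.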

Principal Component GSA (PC-GSA)
Short description: In the PC-GSA approach, the first principal component of the expression matrix for a gene set is used to summarize the data in a single vector (the "meta-gene" expression vector). Variables are scaled to have unit variance before applying the analysis. Please note that the scale and sign of the transformed data do not necessarily correspond to the usual value ranges for gene expression vectors.
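
The PC-GSA summary can be sketched in a similar way: each gene is centred and scaled to unit variance, and the samples are projected onto the first principal component of the resulting matrix. The function names, the samples-by-genes list layout, the interpretation of "variables" as genes, and the power-iteration eigenvector routine are assumptions for illustration only, not the ArrayMining code.

```python
import math
import random


def standardize_genes(samples):
    """Centre each gene (column) and scale it to unit sample variance."""
    n = len(samples)
    columns = []
    for gene in zip(*samples):
        mu = sum(gene) / n
        sd = math.sqrt(sum((x - mu) ** 2 for x in gene) / (n - 1))
        if sd == 0:
            raise ValueError("a gene with zero variance cannot be scaled")
        columns.append([(x - mu) / sd for x in gene])
    return [list(row) for row in zip(*columns)]


def leading_eigenvector(matrix, iterations=1000, seed=0):
    """Power iteration for a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in matrix]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def pc_meta_gene(samples):
    """samples: list of per-sample expression vectors for one gene set.
    Returns the first principal component score of every sample."""
    z = standardize_genes(samples)
    n, p = len(z), len(z[0])
    # Covariance of the standardized genes (their correlation matrix)
    cov = [[sum(z[i][j] * z[i][k] for i in range(n)) / (n - 1)
            for k in range(p)] for j in range(p)]
    loadings = leading_eigenvector(cov)
    return [sum(a * b for a, b in zip(row, loadings)) for row in z]
```

The returned scores are centred around zero and their sign depends on the orientation of the eigenvector, which is why the values need not resemble ordinary expression levels.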
Gene Set Analysis (GSA)
Using the web form below, users can test whether sets of functionally related genes
are significantly differentially expressed between microarray sample classes.
For instructions, click Help.