BOOK-CHAPTER

Microarray Data Analysis Using Neural Network Classifiers and Gene Selection Methods

Gaolin Zheng, E. Olusegun George, Giri Narasimhan

Year: 2006
Source: Kluwer Academic Publishers eBooks
Pages: 207-222
Publisher: Springer Science+Business Media

Abstract

Different research groups have conducted independent gene expression studies on tissue samples from human lung adenocarcinomas [Bhattacharjee et al. 2001; Beer et al. 2002]. In this paper we (a) investigate methods to integrate data obtained from independent studies, (b) experiment with different gene selection methods to find genes whose expression differs significantly among tumor stages, (c) study the performance of neural network classifiers with correlated weights, and (d) compare the performance of classifiers based on neural networks and their variants on gene expression data. Raw cell intensity data were preprocessed for our analyses. Affymetrix array comparison spreadsheets were used to extract the overlapping probe sets for the data integration study. We considered neural network classifiers with random weights selected from a univariate normal distribution and optimized using Bayesian methods. The performance of the neural network was further enhanced using ensemble techniques such as bagging and boosting. The performance of all the resulting classifiers was compared using the Michigan and Harvard data sets from the CAMDA website. Three gene selection methods were used to find significant genes that could discriminate between the various stages of lung cancer. Significant genes, which were mined from the Gene Ontology (GO) database using the GoMiner and AmiGO packages, were found to be involved in apoptosis, angiogenesis, and cell growth and differentiation. Neural networks enhanced with bagging exhibited the best performance among all the classifiers we tested.
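The bagging idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: the base learner here is a single sigmoid neuron trained by plain stochastic gradient descent (a stand-in for the Bayesian-optimized networks the authors describe), the data are synthetic two-gene expression values, and every name (`train_neuron`, `bagging_fit`, `make_toy_data`, and so on) is hypothetical.

```python
import math
import random


def make_toy_data(n_per_class=20, seed=1):
    """Synthetic stand-in for expression data: 2 'genes', 2 classes (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.3), rng.gauss(centre, 0.3)])
            y.append(label)
    return X, y


def train_neuron(X, y, rng, epochs=50, lr=0.5):
    """Fit one sigmoid neuron by stochastic gradient descent on log loss."""
    # Initial weights are drawn from a univariate normal distribution.
    w = [rng.gauss(0.0, 0.1) for _ in range(len(X[0]))]
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_neuron(model, x):
    """Return the class (0 or 1) predicted by a single trained neuron."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def bagging_fit(X, y, n_models=15, seed=0):
    """Train each base learner on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_neuron([X[i] for i in idx], [y[i] for i in idx], rng))
    return models


def bagging_predict(models, x):
    """Combine the base learners by majority vote."""
    votes = sum(predict_neuron(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


X, y = make_toy_data()
models = bagging_fit(X, y)
print(bagging_predict(models, [2.0, 2.0]), bagging_predict(models, [-2.0, -2.0]))
```

Each ensemble member sees a different bootstrap resample, so the majority vote averages out the variance of the individual learners. Boosting, the other ensemble technique named in the abstract, would instead reweight the training samples between rounds and is not shown here.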

Keywords:
Artificial neural network; Artificial intelligence; Computer science; Random subspace method; Naive Bayes classifier; Machine learning; Boosting (machine learning); Data mining; Univariate; Feature selection; Pattern recognition (psychology); Computational biology; Support vector machine; Biology; Multivariate statistics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.00
Refs: 20
Citation Normalized Percentile: 0.13

Topics

Gene expression and cancer classification
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Bioinformatics and Genomic Networks
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Machine Learning in Bioinformatics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
