Abstract

Visualization of High Throughput Genomic Data Using R and Bioconductor

Ruchi Yadav and Prachi Srivastava

DNA microarray technology aims to measure mRNA levels in particular cells or tissues for many genes simultaneously. Microarray experiments produce large datasets that require rigorous computational analysis to extract biological information and support conclusions. Every step, from printing of the microarray chip to hybridization and scanning, introduces variability in data quality, as a result of which the actual information is either lost or overrepresented. Computational analysis plays an important part in processing the biological information embedded in microarray results and in comparing gene expression results obtained from different samples under different conditions for biological interpretation. A basic yet challenging task is quality control and visualization of microarray gene expression data. One of the most popular platforms for microarray analysis is Bioconductor, an open source and open development software project for the analysis and comprehension of genomic data, based on the R programming language. This paper describes specific procedures for conducting quality assessment of Affymetrix GeneChip arrays using the GEO dataset GSE53890, and describes the quality control packages of Bioconductor with reference to visualization plots for detailed analysis. It may be helpful to researchers working on microarray analysis who need to perform quality control of Affymetrix chips and interpret the results scientifically.