Novel next generation sequencing tool - RNA Sequencing, advantages, challenges and opportunities


A typical RNA-Seq experiment

Fig. 1. A typical RNA-Seq experiment

General themes of RNA-seq workflows

     Computational protocol for RNA-seq (Fig. 2.).
  • Obtain raw data
  • Align (with reference genome)/assemble (without reference genome) reads
  • Process alignment with a tool specific to the goal
  • Post-process
  • Summarize and visualize

Fig. 2. RNA-Seq computational protocol (4)
An example:

RNA-Seq analysis: practical session using the Galaxy main server (5)

The dataset: Genome-wide analysis of allelic expression imbalance in human primary cells by high-throughput transcriptome resequencing.

1. Opening a session in Galaxy

   Galaxy is an open, web-based platform for data-intensive biomedical research.

2. Obtaining the data

3. Quality control of high throughput sequencing data

    FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming
   from high-throughput sequencing pipelines.
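One of the checks a quality control tool reports is the per-base sequence quality. The following is a minimal sketch of that idea, not FastQC's own code: it averages the Phred score at each read position of a FASTQ file. It assumes four-line FASTQ records and Phred+33 encoding (older Illumina pipelines used Phred+64); the function name and the two toy reads are made up for illustration.

```python
from io import StringIO


def per_position_mean_quality(fastq_handle, phred_offset=33):
    """Return the mean Phred quality at each read position.

    Assumes each record is exactly four lines (header, sequence, '+', quality).
    """
    totals = []  # sum of quality scores at each position
    counts = []  # number of reads that cover each position
    while True:
        header = fastq_handle.readline()
        if not header:            # end of file
            break
        if not header.strip():    # skip stray blank lines
            continue
        fastq_handle.readline()   # sequence line (not needed here)
        fastq_handle.readline()   # '+' separator line
        quality = fastq_handle.readline().rstrip("\n")
        for i, char in enumerate(quality):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - phred_offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


# Toy FASTQ with two made-up reads ('I' encodes Q40, '5' encodes Q20).
toy_fastq = StringIO("@read1\nACGT\n+\nIIII\n@read2\nACG\n+\n5II\n")
means = per_position_mean_quality(toy_fastq)
print(means)  # [30.0, 40.0, 40.0, 40.0]
```

A drop in the mean quality toward the 3' end of the reads is a common reason to trim reads before mapping.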

4. Loading fastq file onto Galaxy server

5. Mapping reads with TopHat

   TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes    
   using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice
   junctions between exons.
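Spliced aligners report their results in SAM/BAM format, where a read that spans an intron carries an `N` (skipped region) operation in its CIGAR string. The sketch below is not TopHat's junction-finding algorithm; it only shows, under that assumption, how junction coordinates can be read back from one alignment's POS and CIGAR fields. The function name and the example alignment are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def splice_junctions(pos, cigar):
    """Return (intron_start, intron_end) pairs for one alignment.

    pos is the 1-based leftmost reference position (SAM POS field);
    the returned coordinates are 1-based and inclusive.
    """
    junctions = []
    ref = pos  # next reference position to be consumed
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":  # skipped reference region, i.e. an intron
            junctions.append((ref, ref + length - 1))
        if op in "MDN=X":  # only these operations advance along the reference
            ref += length
    return junctions


# A 50-base read split over two exons separated by a 500-base intron.
print(splice_junctions(100, "20M500N30M"))  # [(120, 619)]
```

Collecting these pairs over all alignments and counting how many reads support each one gives a simple junction table of the kind a genome browser displays.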

6. Viewing the results with the Integrative Genomics Viewer (IGV)

7. Computing FPKM with Cufflinks

   Cufflinks performs transcript assembly and estimates FPKM (RPKM) values for RNA-Seq data. One important parameter of
   Cufflinks is the reference annotation, which tells Cufflinks the locations of the genes for which we want to compute
   expression. This argument appears as the Use Reference Annotation parameter in Galaxy.
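FPKM stands for fragments per kilobase of transcript per million mapped fragments. The sketch below implements only this textbook definition, FPKM = 10^9 × C / (N × L), where C is the number of fragments assigned to a transcript, N the total number of mapped fragments, and L the transcript length in base pairs. Cufflinks itself estimates abundances with a statistical model, so its values will generally differ from this simple calculation; the function name and the example numbers are made up.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments              -- fragments assigned to the transcript (C)
    transcript_length_bp   -- transcript length in base pairs (L)
    total_mapped_fragments -- all mapped fragments in the sample (N)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2 kb transcript, 20 million mapped fragments.
value = fpkm(500, 2000, 20_000_000)
print(value)  # 12.5
```

Dividing by both transcript length and sequencing depth is what makes values comparable between long and short transcripts and between samples sequenced to different depths.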

Advantages of RNA-seq compared with microarrays

  • RNA-seq does not need a reference sequence for the genes/genome being assayed
  • More sensitive for less abundant transcripts
  • Large dynamic range (10^5 vs. 10^2 for microarrays)
  • Allows the detection of nucleotide variation in the transcribed regions (SNP)
  • Quantitation of splicing
  • Can survey novel genes even if the genome model is still at an early stage
  • Reanalysis of data can become more valuable as genome annotation improves
  • High technical reproducibility

Table 1. Several advantages of RNA-seq compared with microarrays (6)

Reliance on genomic sequence
Background noise High Low
Sensitivity for less abundant transcripts
Dynamic range
Required amount of RNA

     Fig. 3. Quantifying expression levels: RNA-Seq and microarray compared (6)

An example:

To enhance the scientific community's understanding of the advantages and challenges of RNA-Seq, a comparison was carried out between an RNA-Seq approach (Illumina Genome Analyzer II) and a microarray-based approach (Affymetrix Rat Genome 230 2.0 arrays) for detecting differentially expressed genes (DEGs) in the kidneys of rats.
The results indicated that RNA-Seq was more sensitive in detecting genes with low expression levels, while similar gene expression patterns were observed for both platforms. Moreover, although the overlap of the DEGs was only 40-50%, the biological interpretation was largely consistent between the RNA-Seq and microarray data (3).
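The overlap between two DEG lists can be expressed in more than one way, and the study's exact definition is not given here. As one possible reading, the sketch below reports the shared genes as a fraction of each platform's own list. The function name and gene identifiers are invented placeholders, not data from the study.

```python
def overlap_fractions(degs_a, degs_b):
    """Return the shared genes as a fraction of list A and of list B."""
    a, b = set(degs_a), set(degs_b)
    shared = a & b
    frac_a = len(shared) / len(a) if a else 0.0
    frac_b = len(shared) / len(b) if b else 0.0
    return frac_a, frac_b


# Placeholder gene IDs, for illustration only.
rnaseq_degs = ["g1", "g2", "g3", "g4", "g5"]
microarray_degs = ["g3", "g4", "g6", "g7"]

fractions = overlap_fractions(rnaseq_degs, microarray_degs)
print(fractions)  # (0.4, 0.5)
```

Because the two lists usually differ in size, the two fractions differ as well, which is one reason an overlap is often quoted as a range.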