pipeline-resources

About bulk RNAseq pipeline

This pipeline is designed to process bulk RNAseq data. It is compatible with various library preparation methods, such as total RNAseq or mRNA sequencing. It helps users evaluate the quality of their samples or libraries, quantify reads per gene, find differentially expressed genes and pathways, and more. Please find the sample report here.

Source of the pipeline

This pipeline was originally adapted from the community-developed nf-core/rnaseq pipeline, version 1.4.2. Zymo Research made significant contributions in the adaptation effort. These mainly include the addition of downstream analyses, such as differential gene expression and pathway enrichment analysis, and improvements to the report and its documentation.

What is in the pipeline

This pipeline is built using Nextflow. A brief summary of the pipeline:

Raw read QC (FastQC)
UMI extraction when applicable (UMI-tools)
Adapter trimming (Trim Galore!)
Alignment (STAR)
UMI deduplication when applicable (UMI-tools)
Sample level QC:
- Various RNAseq related QC plots (RSeQC)
- 5’ or 3’ bias plot (Qualimap)
- Library complexity estimation (Preseq)
- Mark duplicate reads (Picard)
- Duplication rate QC (dupRadar)
- Biotype composition plot (featureCounts)
- ERCC spike-in plot (STAR and featureCounts)
Read counting (featureCounts)
Sample comparison
- Sample similarity matrix, sample MDS plot, heatmap of expression patterns of top genes (DESeq2)
- Differential gene expression analysis (DESeq2)
- Gene set enrichment analysis (g:Profiler)
Present all QC and analysis results in an interactive, comprehensive report (MultiQC)
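
The step order in the summary, including the two UMI steps that run only "when applicable", can be sketched in a few lines of Python. This is only an illustration of the ordering, not the pipeline's actual Nextflow code: the function `pipeline_stages` and the flag `has_umi` are hypothetical names introduced here, and the stage labels are copied from the list.

```python
from typing import List


def pipeline_stages(has_umi: bool) -> List[str]:
    """Return the top-level stages in the order given in the summary.

    `has_umi` is a hypothetical flag standing in for "when applicable";
    the real pipeline may decide this differently.
    """
    stages = ["Raw read QC (FastQC)"]
    if has_umi:
        # UMIs are extracted before adapter trimming.
        stages.append("UMI extraction (UMI-tools)")
    stages.append("Adapter trimming (Trim Galore!)")
    stages.append("Alignment (STAR)")
    if has_umi:
        # Deduplication runs after alignment.
        stages.append("UMI deduplication (UMI-tools)")
    stages += [
        "Sample level QC",
        "Read counting (featureCounts)",
        "Sample comparison",
        "Report (MultiQC)",
    ]
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(pipeline_stages(has_umi=True), start=1):
        print(f"{number}. {stage}")
```

With `has_umi=False` the two UMI-tools stages are simply left out and the remaining order is unchanged.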

For details, please find the source code here.

Citations

Coming soon …
A list of publications where this pipeline and its predecessors were used to analyze the data.