Introduction

This report presents the differentially methylated cytosines (DMCs) and regions (DMRs) identified in the analysis.

Workflow of data processing

A brief summary of the data workflow:

  1. Data filtering: within each group, a cytosine is kept only if it has \(read~depth \ge 5\) in \(\ge 2 ~samples\); otherwise it is removed for that group before downstream analyses.

  2. Detecting DMCs and DMRs: both DMCs and DMRs are detected with DSS. Significant DMCs and DMRs have \(FDR \le 0.05\) (if P values are provided by the statistical method) and \(absolute~methylation~difference \ge 0.1\).

  3. Annotation of DMRs: the genes overlapping any DMR are identified. Here a gene counts as overlapped if at least 1% of its gene region is covered by DMRs. The gene region is defined as the genomic region between the transcription start and end, plus the 1000 base pairs upstream.

  4. Functional enrichment analyses of the overlapped genes: the overlapped genes identified in the previous step are submitted to g:Profiler for functional enrichment analysis.

  5. Plots: plots such as heatmaps are generated to visualize the results.

  6. Downloads: all result files are available for download in the Downloads section.
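
The depth filter in step 1 can be sketched in Python as follows. This is only an illustration of the stated rule, not the pipeline's actual code; the function names, the constants, and the example sites and depths are hypothetical.

```python
MIN_DEPTH = 5      # minimum read depth per sample (threshold from step 1)
MIN_SAMPLES = 2    # minimum number of samples in a group reaching MIN_DEPTH


def keep_cytosine(depths, min_depth=MIN_DEPTH, min_samples=MIN_SAMPLES):
    """Return True if enough samples of one group cover this cytosine."""
    covered = sum(1 for d in depths if d >= min_depth)
    return covered >= min_samples


def filter_group(depth_by_site):
    """Keep the sites of one group that pass the depth filter.

    depth_by_site maps a site key, e.g. ("chr1", 100), to the list of
    read depths observed in the samples of that group.
    """
    return {site: d for site, d in depth_by_site.items() if keep_cytosine(d)}


# Hypothetical depths for one group with two samples.
example = {
    ("chr1", 100): [7, 5],   # both samples reach depth 5 -> kept
    ("chr1", 200): [12, 4],  # only one sample reaches depth 5 -> removed
}
kept = filter_group(example)
```

Because every group in this report has two samples, the rule amounts to requiring a depth of at least 5 in both samples of the group.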

Sample information

Here is a table of samples used for DMC/DMR analyses. The column group is used to group samples.

group sampleId
FFPE_NAT_Lung FFPE_NAT_Lung_1
FFPE_NAT_Lung FFPE_NAT_Lung_2
FFPE_Tumor_Lung FFPE_Tumor_Lung_1
FFPE_Tumor_Lung FFPE_Tumor_Lung_2
NAT_Liver NAT_Liver_1
NAT_Liver NAT_Liver_2
Tumor_Liver Tumor_Liver_1
Tumor_Liver Tumor_Liver_2

Distribution of methylation values

Before DMC/DMR detection, this section gives an overview of the methylation level distributions at both the sample and group levels.

Per sample

This figure uses violin plots to display the distribution of methylation values of all cytosines (or a sampled subset, for the sake of performance) in each sample.

Group means

This figure uses density plots to display the distribution of methylation values averaged within each group. The number of cytosines (n) used is shown at the top.

DMCs

Summary

The table below shows the number of DMCs identified for each comparison.

Column explanation:

  • cmp: the groups under comparison, separated by a comma.
  • method: the statistical method used.
  • input: the total number of cytosines fed into the statistical method.
  • statOuput: the total number of cytosine sites output by the statistical method; in most cases, this number equals the number of significant sites; if not, the statistical method may filter out some sites such as non-significant ones.
  • padj<=0.05: the number of significant sites.
  • padj<=0.05 & methDiff<=-0.1: the number of significant hypomethylated sites (the former group is less methylated than the latter). This is not applicable to comparisons involving more than 2 groups.
  • padj<=0.05 & methDiff>=0.1: the number of significant hypermethylated sites. For comparisons with more than 2 groups, methDiff is computed as the difference between the maximum and minimum methylation values of all compared groups.

cmp | method | input | statOuput | padj<=0.05 | padj<=0.05 & methDiff<=-0.1 | padj<=0.05 & methDiff>=0.1 | file
FFPE_NAT_Lung,FFPE_Tumor_Lung | dss | 1885477 | 63795 | 53442 | 22961 | 30481 | dms_dss.FFPE_NAT_Lung_vs_FFPE_Tumor_Lung.tsv.gz
FFPE_NAT_Lung,NAT_Liver | dss | 1509280 | 87282 | 78991 | 62115 | 16876 | dms_dss.FFPE_NAT_Lung_vs_NAT_Liver.tsv.gz
FFPE_NAT_Lung,Tumor_Liver | dss | 1753351 | 423235 | 422823 | 108756 | 314067 | dms_dss.FFPE_NAT_Lung_vs_Tumor_Liver.tsv.gz
FFPE_Tumor_Lung,NAT_Liver | dss | 1392530 | 130595 | 126246 | 86632 | 39614 | dms_dss.FFPE_Tumor_Lung_vs_NAT_Liver.tsv.gz
FFPE_Tumor_Lung,Tumor_Liver | dss | 1582367 | 376226 | 375807 | 92776 | 283031 | dms_dss.FFPE_Tumor_Lung_vs_Tumor_Liver.tsv.gz
NAT_Liver,Tumor_Liver | dss | 1596983 | 287429 | 285581 | 63641 | 221940 | dms_dss.NAT_Liver_vs_Tumor_Liver.tsv.gz
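
The directional counts in the table follow from the thresholds given in the column explanation. A minimal Python sketch of that classification is shown below; the function names and the "ns" (not significant) label are hypothetical, and the sign convention (methDiff as the former group minus the latter) is inferred from the description of the hypomethylated column rather than taken from the pipeline's code.

```python
FDR_CUTOFF = 0.05   # padj threshold from the column names
DIFF_CUTOFF = 0.1   # absolute methylation difference threshold


def classify_site(padj, meth_diff, fdr=FDR_CUTOFF, min_diff=DIFF_CUTOFF):
    """Label a site of a two-group comparison as 'hypo', 'hyper' or 'ns'.

    meth_diff is assumed to be (former group) - (latter group), so a
    negative value means the former group is less methylated.
    """
    if padj > fdr:
        return "ns"
    if meth_diff <= -min_diff:
        return "hypo"
    if meth_diff >= min_diff:
        return "hyper"
    return "ns"


def multi_group_meth_diff(group_means):
    """methDiff for more than 2 groups: maximum minus minimum group value."""
    return max(group_means) - min(group_means)


labels = [
    classify_site(0.01, -0.25),  # significant, former group lower
    classify_site(0.01, 0.30),   # significant, former group higher
    classify_site(0.20, 0.50),   # large difference but padj above cutoff
]
```

For every comparison in the table above, the hypomethylated and hypermethylated counts add up to the padj<=0.05 count (for example, 22961 + 30481 = 53442 in the first row).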

Heatmap

Below are heatmaps of methylation values of all significant DMCs from the comparisons. If more than 3000 DMCs are identified, the top 3000 are displayed.

DMRs

Summary

The table below summarizes the number of DMRs identified. Its columns are defined as in the DMC summary, with the following differences:

  • statOuput: this column contains the total number of DMRs reported by the statistical method.
  • methDiff: the difference in methylation is calculated over each DMR. The methylation value of each DMR is computed as the read-depth-weighted average over all cytosines it contains.
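
The read-depth-weighted average mentioned above can be illustrated with a short Python sketch. It is an assumption-laden example rather than the pipeline's implementation: the function name and the input values are hypothetical, and how values are combined across the samples of a group is not specified in this report.

```python
def region_methylation(meth_levels, depths):
    """Read-depth-weighted mean methylation over the cytosines of one DMR.

    meth_levels: methylation level (0..1) of each cytosine in the region.
    depths: read depth of each cytosine, in the same order.
    """
    if len(meth_levels) != len(depths):
        raise ValueError("meth_levels and depths must have the same length")
    total_depth = sum(depths)
    if total_depth == 0:
        raise ValueError("total read depth is zero")
    weighted_sum = sum(m * d for m, d in zip(meth_levels, depths))
    return weighted_sum / total_depth


# Two hypothetical cytosines: 0.8 at depth 10 and 0.5 at depth 30.
# (0.8 * 10 + 0.5 * 30) / (10 + 30) = 23 / 40
value = region_methylation([0.8, 0.5], [10, 30])
```

Weighting by depth means that well-covered cytosines influence the regional value more than poorly covered ones; an unweighted mean of the same two cytosines would be 0.65 instead of about 0.575.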

Some methods, such as DSS, do not compute P values for DMRs. In these cases the column padj<=0.05 is NA, and the counts of hypo- and hypermethylated regions are based only on methDiff.

cmp | method | input | statOuput | padj<=0.05 | padj<=0.05 & methDiff<=-0.1 | padj<=0.05 & methDiff>=0.1 | file
FFPE_NAT_Lung,FFPE_Tumor_Lung | dss | 1885477 | 5239 | NA | 1894 | 3289 | dmr_dss.FFPE_NAT_Lung_vs_FFPE_Tumor_Lung.tsv.gz
FFPE_NAT_Lung,NAT_Liver | dss | 1509280 | 7225 | NA | 5580 | 1547 | dmr_dss.FFPE_NAT_Lung_vs_NAT_Liver.tsv.gz
FFPE_NAT_Lung,Tumor_Liver | dss | 1753351 | 33723 | NA | 6957 | 26469 | dmr_dss.FFPE_NAT_Lung_vs_Tumor_Liver.tsv.gz
FFPE_Tumor_Lung,NAT_Liver | dss | 1392530 | 10587 | NA | 7800 | 2663 | dmr_dss.FFPE_Tumor_Lung_vs_NAT_Liver.tsv.gz
FFPE_Tumor_Lung,Tumor_Liver | dss | 1582367 | 29376 | NA | 6360 | 22690 | dmr_dss.FFPE_Tumor_Lung_vs_Tumor_Liver.tsv.gz
NAT_Liver,Tumor_Liver | dss | 1596983 | 21253 | NA | 3283 | 17894 | dmr_dss.NAT_Liver_vs_Tumor_Liver.tsv.gz

Heatmap

Below are heatmaps of methylation values of all significant DMRs from the comparisons. If more than 3000 DMRs are identified, the top 3000 are displayed. The methylation value of each DMR is computed as the read-depth-weighted average over all cytosines it contains.

Functional enrichment

Using g:Profiler, the genes that overlap DMRs are analyzed for functional enrichment. Further explanation of the Manhattan plot can be found in the g:Profiler documentation.
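
The gene list submitted to g:Profiler depends on the overlap rule from the workflow summary (at least 1% of the gene region, including 1000 base pairs upstream, covered by DMRs). The Python sketch below shows one way such a test could work. It is not the pipeline's code: the function names are hypothetical, and the half-open coordinates, the strand-aware placement of the upstream flank, and the merging of overlapping DMRs are assumptions made for the example.

```python
UPSTREAM_BP = 1000    # upstream extension of the gene region
MIN_COVERAGE = 0.01   # at least 1% of the gene region must be covered


def gene_region(tx_start, tx_end, strand, upstream=UPSTREAM_BP):
    """Extend a gene (half-open [tx_start, tx_end)) by its upstream flank."""
    if strand == "+":
        return max(0, tx_start - upstream), tx_end
    # Assumption: on the minus strand, upstream lies past the higher coordinate.
    return tx_start, tx_end + upstream


def covered_fraction(region, dmrs):
    """Fraction of region covered by the union of DMR intervals.

    All intervals are half-open and assumed to be on the same chromosome.
    """
    start, end = region
    covered = 0
    cursor = start  # positions left of cursor have already been counted
    for d_start, d_end in sorted(dmrs):
        lo = max(d_start, cursor)
        hi = min(d_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered / (end - start)


def is_overlapped_gene(tx_start, tx_end, strand, dmrs, min_cov=MIN_COVERAGE):
    """Apply the 1% coverage rule to one gene and a list of DMRs."""
    region = gene_region(tx_start, tx_end, strand)
    return covered_fraction(region, dmrs) >= min_cov


# Hypothetical plus-strand gene: region [10000, 20000), 10000 bp long.
hit = is_overlapped_gene(11000, 20000, "+", [(10050, 10200)])   # 150 bp, 1.5%
miss = is_overlapped_gene(11000, 20000, "+", [(19990, 20050)])  # 10 bp, 0.1%
```

In this example the first DMR lies entirely in the upstream flank yet still qualifies the gene, which shows why the 1000 bp extension matters for promoter-proximal DMRs.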