How to create a metagene profile plot
Entering edit mode
4 months ago

Hello :),

I am working with nascent RNA-seq (EU-seq). I have data for three types of samples: control, treatment A and treatment B. The treatment A leads to a strong down regulation of nascent transcription across the whole length of gene bodies; while, treatment B only leads to the down regulation of nascent transcription in the first ~25kb/35kb of gene bodies.

I would like to represent this difference between the two treatments and identify the exact point where treatment B stops having an effect on transcription. I was thinking of using a metagene profile similar to Figure3B in this paper ("MDC1 maintains active elongation complexes of RNA polymerase II"):

Metagene profile representing the average coverage normalized over spike-in and anchored at the TSS of non-overlapping protein-encoding genes with size above 90 kb (n = 1,785). Wavefront shows the end of the transcription progression wave at 10, 20, and 30 min.

I am also using a spike-in normalisation method. So, I basically have 6 BAM files with reads aligned to the mouse genome (2 replicas per condition) and 6 BAM files with reads aligned to the fly genome, which I am using as spike-in organism.

I am just not sure on how to go about creating the plot. I now that it is possible to calculate the coverage depth from a BAM file using samtools depth test.bam > test_coverage.txt.

I am not sure how I would then normalise it using the spike-in BAM files and how I would anchor it at the TSS.

I have also looked at the ngs.plot package (ngs.plot/github) and used the following code ngs.plot.r -G mm10 -R genebody -F rnaseq -C ctrl_1.bam -O metagene_CTRL -FL 300 -RB 0.01 for each of the mouse BAM files, but the plots I obtained are not exactly what I was looking for.

I even looked at the R library CoverageView, but without much success.

Would anyone have any advice on how to go about doing this? Any help would be greatly appreciated!

EU-seq metagene-profile RNA-seq • 451 views
Entering edit mode
4 months ago

This should be possible with deepTools. As per this discussion on github you would want to to compute scaling factors for each sample with the --scalingFactors argument of multiBamSummary, and then use those scaling factors as input to bamCoverage. You should be able to then follow the normal plotProfile workflow to generate the plot you need.

As a side note make sure you are aligning your reads to a joint mouse/fly genome instead of aligning them separately. If you do them separately you'll get a lot of mouse reads aligning to the fly genome (and vice versa) because there is some tolerance to mismatches and gaps in the aligners.

Entering edit mode

Thank you very much for your quick reply! I will have a look at the tools you mentioned.

Regarding the alignment of the reads, they were indeed aligned to a joint mouse/fly genome :)


Login before adding your answer.

Traffic: 1894 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6