Question: CAGE processing (from BAM to expression values per Gene/Isoform)
2.2 years ago, Tobias (130) wrote:

Currently, I am trying to analyze some CAGE data from the FANTOM5 consortium ( and for the data

Unfortunately, I am not that experienced in analyzing CAGE data. They do provide some processed data for both TSS expression and enhancers, as well as some processing software. However, I want to reprocess these data myself, starting at least from the BAM files. So, given CAGE BAM files, what would you suggest using to get a sensible expression value for each TSS (and isoform), and likewise for the enhancers?
For the former, I would usually just run cuffdiff with a GENCODE annotation, but I am unsure whether that is only sensible for RNA-Seq and not for CAGE. What would you use instead? I also still have no idea how to do the same for enhancers (assuming I have their annotation): should I count the number of tags falling into these regions? Also, do you know how the "activity" of an enhancer is related to eRNA abundance?
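
To make the "count the tags falling into these regions" idea concrete, here is a minimal sketch in plain Python. It is not the FANTOM5 pipeline. It assumes the reads have already been extracted from the BAM file as (chrom, start, end, strand) tuples in 0-based, half-open coordinates, and that the enhancer annotation is a list of (name, chrom, start, end) tuples. The function names are hypothetical. Only the 5' end of each read is counted, since that is the position a CAGE tag reports, and strand is ignored within a region because enhancers are typically transcribed in both directions.

```python
from bisect import bisect_left
from collections import defaultdict


def five_prime_end(start, end, strand):
    """Return the 0-based position of a read's 5' end (the CAGE tag start site).

    Coordinates are assumed to be 0-based and half-open, as in BED.
    """
    return start if strand == "+" else end - 1


def count_tags_in_regions(reads, regions):
    """Count reads whose 5' end falls inside each region, ignoring strand.

    reads:   iterable of (chrom, start, end, strand)
    regions: iterable of (name, chrom, start, end)
    Returns a dict mapping region name to tag count.
    """
    # Collect the 5' end positions per chromosome and sort them once.
    positions = defaultdict(list)
    for chrom, start, end, strand in reads:
        positions[chrom].append(five_prime_end(start, end, strand))
    for chrom in positions:
        positions[chrom].sort()

    counts = {}
    for name, chrom, start, end in regions:
        pos = positions.get(chrom, [])
        # Number of positions p with start <= p < end, via binary search.
        counts[name] = bisect_left(pos, end) - bisect_left(pos, start)
    return counts


# Toy data (made up for illustration only).
reads = [
    ("chr1", 100, 127, "+"),  # 5' end at 100
    ("chr1", 180, 207, "-"),  # 5' end at 206
    ("chr1", 300, 327, "+"),  # 5' end at 300
    ("chr2", 100, 127, "+"),
]
enhancers = [
    ("enh1", "chr1", 90, 210),
    ("enh2", "chr1", 310, 400),
    ("enh3", "chr3", 0, 10),
]
enhancer_counts = count_tags_in_regions(reads, enhancers)
```

For promoters, the same counting would normally be restricted to reads on the strand of the gene.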

Generally speaking, should I consider the mRNA abundance for a gene measured with CAGE in FANTOM5 as the level of transcription initiation or as the steady-state mRNA level after post-transcriptional regulation (by miRNAs)? Also, are other RNAs such as miRNAs or lncRNAs included?

It would be great if you could help me here.

2.2 years ago, Chirag Nepal (2.0k) wrote:

CAGE detects the promoters of coding genes, lncRNAs and miRNAs, as well as post-transcriptionally processed RNAs, eRNAs and so on.

1) If you want to process CAGE BAM files yourself, here is the tool:

2) CAGE detects signals from post-transcriptionally processed RNAs spread across exons, which have very different characteristic features; here is the paper:

3) CAGE detects miRNA promoters along with Drosha processing events, described here:

In general, the number of CAGE tags detected at a promoter reflects mRNA abundance and is used to quantify the gene expression level. CAGE tag counts at promoters correlate well with RNA-seq. You can also read the FANTOM5 papers themselves.
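
As a rough illustration of turning promoter tag counts into an expression value, the sketch below scales raw counts to tags per million (TPM), a simple library-size normalization commonly used for CAGE. The function name and the toy counts are assumptions for the example. If `total_tags` is not given, the sum of the supplied counts is used, which is only appropriate when the counts cover the whole library.

```python
def tags_per_million(counts, total_tags=None):
    """Scale raw CAGE tag counts to tags per million.

    counts:     dict mapping a promoter/TSS name to its raw tag count
    total_tags: total number of mapped tags in the library; if None,
                the sum of the given counts is used instead (assumption)
    """
    if total_tags is None:
        total_tags = sum(counts.values())
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return {name: count * 1_000_000 / total_tags for name, count in counts.items()}


# Toy counts for two hypothetical promoters.
raw = {"promoterA": 30, "promoterB": 10}
tpm_from_sum = tags_per_million(raw)
tpm_from_library = tags_per_million(raw, total_tags=1_000_000)
```

Passing the library's total mapped tag count explicitly keeps values comparable across samples even when only a subset of promoters is quantified.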



Many thanks for your answer. I am still wondering, however, whether I can get isoform expression levels out of the CAGE data. That would be very helpful to me! However, many different isoforms (in GENCODE) share the same TSS, so is it possible to separate these isoforms, and if so, how?
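
To see how much of this problem a given annotation has, one could group the annotated isoforms by their TSS. Since a CAGE tag only marks the 5' end of a transcript, isoforms that fall into the same group cannot be told apart by CAGE tag counts alone; at best the tags give one expression value per TSS. The sketch below assumes transcripts given as (transcript_id, chrom, start, end, strand) in 0-based, half-open coordinates. The transcript IDs and the function name are hypothetical.

```python
from collections import defaultdict


def group_isoforms_by_tss(transcripts):
    """Group transcript IDs by their annotated TSS.

    transcripts: iterable of (transcript_id, chrom, start, end, strand),
                 0-based half-open coordinates.
    Returns a dict mapping (chrom, tss_position, strand) to a list of IDs.
    """
    groups = defaultdict(list)
    for tx_id, chrom, start, end, strand in transcripts:
        # The TSS is the first base on "+" and the last base on "-".
        tss = start if strand == "+" else end - 1
        groups[(chrom, tss, strand)].append(tx_id)
    return dict(groups)


# Toy annotation (made up for illustration only).
transcripts = [
    ("tx1", "chr1", 1000, 5000, "+"),
    ("tx2", "chr1", 1000, 4000, "+"),  # same TSS as tx1, different 3' end
    ("tx3", "chr1", 1200, 5000, "+"),  # alternative TSS
    ("tx4", "chr1", 2000, 9000, "-"),
]
tss_groups = group_isoforms_by_tss(transcripts)
shared = {tss: ids for tss, ids in tss_groups.items() if len(ids) > 1}
```

Here `shared` lists the TSSs where the tag count is a combined signal for several isoforms.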

Reply written 2.1 years ago by Tobias (130)