CAGE processing (from BAM to expression values per Gene/Isoform)
8.3 years ago
Tobias ▴ 150

Currently, I am trying to analyze some CAGE data from the FANTOM5 consortium (fantom.gsc.riken.jp and for the data http://fantom.gsc.riken.jp/5/data/).

Unfortunately, I am not that experienced in analyzing CAGE data. They do provide some processed data for both TSS expression and enhancers, as well as some processing software. However, I want to reprocess these data myself, starting from the BAM files. So, given CAGE BAM files, what would you suggest using to get a sensible expression value for each TSS (and isoform), and the same for the enhancers?
For the former, I would usually just run cuffdiff with a GENCODE annotation, but I am unsure whether this is sensible only for RNA-Seq and not for CAGE. What would you use instead? I also have no idea how to do the same for enhancers (assuming I have their annotation): should I count the number of tags falling into these regions? Also, do you know how the "activity" of an enhancer is related to eRNA abundance?

Generally speaking, should I consider the mRNA abundance for a gene measured with CAGE in FANTOM5 as the level of transcription initiation or as the steady-state mRNA level after post-transcriptional regulation (by miRNAs)? Also, are other RNAs like miRNAs or lncRNAs included?

It would be great if you can help me here.

RNA-Seq next-gen Assembly sequencing CAGE • 2.9k views
8.3 years ago
Chirag Nepal ★ 2.4k

CAGE detects promoters of protein-coding genes, lncRNAs and miRNAs, as well as post-transcriptionally processed RNAs, eRNAs and so on.

  1. If you want to process CAGE BAM files yourself, here is the tool: http://www.ncbi.nlm.nih.gov/pubmed/25653163

  2. CAGE detects post-transcriptionally processed RNA signals spread across exons, and they have very different characteristic features, here is the paper: http://www.ncbi.nlm.nih.gov/pubmed/24002785

  3. CAGE detects miRNA promoters along with Drosha processing events, described here: http://www.ncbi.nlm.nih.gov/pubmed/26673698

In general, the number of CAGE tags detected at a promoter reflects mRNA abundance and is used to quantify the gene expression level. CAGE tag counts at promoters correlate well with RNA-seq. You can also read the FANTOM5 papers themselves.
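
To make the "count tags at the promoter" idea concrete, here is a minimal sketch in plain Python. It assumes the reads have already been pulled out of the BAM as BED-style tuples (chrom, start, end, strand; 0-based, half-open), and that promoter or enhancer regions come from your own annotation. The function names, the region names and the convention that a region strand of "." accepts tags from either strand are all made up for illustration; this is not the method used by the tool linked above.

```python
def five_prime_end(start, end, strand):
    """0-based position of the read's 5' end, i.e. the CAGE tag start site."""
    return start if strand == "+" else end - 1


def count_tags(reads, regions):
    """Count reads whose 5' end falls inside each region.

    reads:   iterable of (chrom, start, end, strand), 0-based half-open
    regions: dict name -> (chrom, start, end, strand); strand "." means
             "either strand" (an assumption, e.g. for enhancer regions)
    """
    counts = {name: 0 for name in regions}
    for chrom, start, end, strand in reads:
        pos = five_prime_end(start, end, strand)
        # Linear scan over regions: fine for a sketch, slow for real data.
        for name, (r_chrom, r_start, r_end, r_strand) in regions.items():
            same_strand = r_strand in (strand, ".")
            if chrom == r_chrom and r_start <= pos < r_end and same_strand:
                counts[name] += 1
    return counts


def tags_per_million(counts, total_tags):
    """Scale raw counts by library size (total mapped tags) to TPM."""
    return {name: c * 1e6 / total_tags for name, c in counts.items()}


# Toy data, invented for the example.
reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 105, 130, "+"),
    ("chr1", 180, 210, "-"),
    ("chr1", 500, 530, "-"),
]
regions = {
    "promoter_A": ("chr1", 90, 110, "+"),
    "enhancer_X": ("chr1", 200, 220, "."),
}
counts = count_tags(reads, regions)
tpm = tags_per_million(counts, total_tags=len(reads))
```

Only the 5' end of each read is used, since that is what marks the transcription start site in CAGE; the rest of the read is ignored. For real data you would want an interval index instead of the nested loop.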


Many thanks for your answer. I am still wondering, however, whether or not I can get out the isoform expression levels from the CAGE data. That would be very helpful to me! However, for many different isoforms (in GENCODE), we have the same TSS, so (how) is it possible to separate between these different isoforms?
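
To see how many annotated isoforms are affected by this, one option is to group them by their TSS. The sketch below assumes transcripts are given as (id, chrom, start, end, strand) tuples with 0-based, half-open coordinates, e.g. parsed from a GENCODE GTF beforehand; the tuple layout and function names are hypothetical. Isoforms that end up in the same group have the same 5' end, so a tag count at that position would apply to the group as a whole.

```python
from collections import defaultdict


def tss_of(tx_start, tx_end, strand):
    """0-based TSS position of a transcript given half-open coordinates."""
    return tx_start if strand == "+" else tx_end - 1


def group_isoforms_by_tss(transcripts):
    """Map (chrom, tss, strand) -> list of transcript IDs sharing that TSS."""
    groups = defaultdict(list)
    for tx_id, chrom, start, end, strand in transcripts:
        groups[(chrom, tss_of(start, end, strand), strand)].append(tx_id)
    return dict(groups)


# Toy annotation, invented for the example.
transcripts = [
    ("tx1", "chr1", 1000, 5000, "+"),
    ("tx2", "chr1", 1000, 4000, "+"),
    ("tx3", "chr1", 1500, 5000, "+"),
]
tss_groups = group_isoforms_by_tss(transcripts)
```

Here tx1 and tx2 share a TSS and fall into one group, while tx3 has its own.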
