Understanding FANTOM5 CAGE fields
5.2 years ago
simplitia ▴ 130

Hi, I recently downloaded data from FANTOM5 Phase 2.0 ( http://biomart.gsc.riken.jp/ ) with the goal of identifying TSS sites, but I cannot seem to find any documentation on the file format, and I'm a bit confused by the fields. [screenshot of the downloaded file]

For example, the FANTOM site has human CAGE data for many samples from different cell lines and tissues, but this file seems to suggest that it is combined from phase 1 and phase 2. Does this mean that all the data were somehow averaged and these are the peaks of the averages? Also, why does each row seem to be based on a different transcript?

thanks in advance.

RNA-Seq transcription tss • 1.5k views
5.2 years ago

Q1) That it is combined from phase 1 and 2 simply means that the peaks were called on the data sets from both this article (phase 1, stationary) and this article (phase 2, dynamic). It does not mean the peaks are averages; rather, all the data were pooled before the peaks were called, and the peaks were afterwards quantified in the individual samples.
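To illustrate the difference between pooling and averaging, here is a toy sketch in Python. The sample names, the counts, and the threshold-based `call_peaks` helper are all hypothetical and far cruder than the actual FANTOM5 peak calling; the sketch only shows the order of operations: pool the signal, call peaks once, then quantify each sample against those shared peaks.

```python
from collections import Counter

# Hypothetical per-sample CAGE tag counts: genomic position -> number of 5' ends
samples = {
    "sample_A": {100: 5, 101: 9, 102: 1, 250: 2},
    "sample_B": {100: 1, 101: 4, 251: 6, 252: 3},
}

def pool(samples):
    """Sum tag counts over all samples (pooling, not averaging)."""
    pooled = Counter()
    for counts in samples.values():
        pooled.update(counts)  # Counter.update adds counts per position
    return pooled

def call_peaks(pooled, min_count=3, max_gap=1):
    """Toy peak caller (an assumption): merge nearby positions with enough pooled signal."""
    positions = sorted(p for p, c in pooled.items() if c >= min_count)
    peaks = []
    for pos in positions:
        if peaks and pos - peaks[-1][1] <= max_gap:
            peaks[-1][1] = pos        # extend the current peak
        else:
            peaks.append([pos, pos])  # start a new peak
    return [tuple(p) for p in peaks]

def quantify(peaks, counts):
    """Count one sample's tags that fall inside each shared peak."""
    return {
        (start, end): sum(c for p, c in counts.items() if start <= p <= end)
        for start, end in peaks
    }

pooled = pool(samples)
peaks = call_peaks(pooled)
per_sample = {name: quantify(peaks, counts) for name, counts in samples.items()}
```

Every sample is quantified against the same peak set, which is why the rows of the file are identical across samples even though the expression values differ.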

Q2) Each row is not a different transcript but a different transcription start site (TSS), with an associated ID of the form pX@geneName and the genomic coordinates given in column 1. TSS detection is what CAGE (the method we used) is good for. For each TSS we have also annotated how it overlaps with known transcripts (that is what you see in column 4).
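As a rough sketch of how these fields could be pulled apart in Python: the `pX@geneName` form and the `Nbp_to_transcript` form are taken from this thread, while the `chrom:start..end,strand` coordinate layout, the BED-style width calculation, and all example values are assumptions that should be checked against the actual file.

```python
import re

# Assumed field layouts; verify against the downloaded file before relying on them.
COORD_RE = re.compile(r"^([^:]+):(\d+)\.\.(\d+),([+-])$")
TSS_ID_RE = re.compile(r"^p(\d+)@(.+)$")
ASSOC_RE = re.compile(r"^(\d+)bp_to_(.+)$")

def parse_coordinates(text):
    """Split 'chrom:start..end,strand' (assumed layout) into its parts."""
    match = COORD_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinates: {text!r}")
    chrom, start, end, strand = match.groups()
    return {"chrom": chrom, "start": int(start), "end": int(end), "strand": strand}

def parse_tss_id(text):
    """Split 'pX@geneName' into (rank X, gene name)."""
    match = TSS_ID_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised TSS id: {text!r}")
    return int(match.group(1)), match.group(2)

def parse_association(text):
    """Split 'Nbp_to_transcript' into (distance in bp, transcript label)."""
    match = ASSOC_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised association: {text!r}")
    return int(match.group(1)), match.group(2)

# Hypothetical row values, for illustration only
coords = parse_coordinates("chr1:1000..1029,+")
rank, gene = parse_tss_id("p1@GENE1")
distance, transcript = parse_association("0bp_to_ENST_EXAMPLE")

# Width of the TSS region, assuming BED-style (0-based, end-exclusive) coordinates
width = coords["end"] - coords["start"]
```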

Hope this answers your question.


Great, thanks, that is super helpful.

Here are a couple of follow-up questions.

  1. Using the example above, p1@LINC200277 is about 29 bp wide. Does that mean the TSS for this gene can be at any of the 29 bp in this range?
  2. Moreover, if it is a range, how is the distance in column 4 only a single number (0bp_to_ENST00xx)?

1) That means we found evidence of transcription start sites at all of those positions. 2) To make it easier for you, we took the peak (the position with the most TSS signal) and used that to calculate the distances.
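A minimal Python sketch of that idea, under stated assumptions: the per-position counts and the transcript 5' end below are made up, the helper names are hypothetical, and the tie-breaking rule and unsigned distance are choices made here rather than anything confirmed for the FANTOM5 annotation.

```python
def peak_position(counts):
    """Return the position with the most TSS signal (ties: lowest coordinate, an assumption)."""
    return min(counts, key=lambda pos: (-counts[pos], pos))

def distance_to_transcript_start(peak, transcript_start):
    """Unsigned distance in bp between the peak and an annotated transcript 5' end."""
    return abs(peak - transcript_start)

# Hypothetical CAGE 5'-end counts inside one 29 bp TSS region (positions 1000-1028)
region_counts = {1000: 2, 1001: 3, 1012: 40, 1013: 7, 1028: 1}

peak = peak_position(region_counts)
label = f"{distance_to_transcript_start(peak, 1012)}bp_to_transcript"
```

So a region can span many base pairs while the reported distance is a single number, because only the strongest position within the region is compared to the transcript.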
