Question

What's the best way to map CAGE-seq data to the genome?

0

9.9 years ago

dustar1986 ▴ 380

Hi,

I've just downloaded several mouse CAGE-seq datasets from the FANTOM5 database.

I tried mapping those rdna.fa files to mm10 with bowtie2 default settings, and got a fairly low mapping rate of around 40%, with the majority of reads hitting more than one locus.

I'm totally new to CAGE-seq data. Please forgive me if I ask something silly.

When I looked into the files, the reads seemed to have been trimmed already. Is this mapping rate normal?
Given the short length of each read, it's reasonable for reads to hit multiple genomic locations. But won't that produce false-positive results when measuring which transcripts are 'really' expressed?
Are there any specific parameters I should apply in Bowtie2?

bowtie CAGE-seq • 4.2k views

updated 2.5 years ago by Ram 43k • written 9.9 years ago by dustar1986 ▴ 380

0

Hi! This could also be a stupid question, but where did you manage to download the .bam files for the FANTOM5 data? I can only find bed files so far (also in the CAGEr package). Thanks in advance.

9.1 years ago by Vandelnokk ▴ 10

1

I guess here: http://fantom.gsc.riken.jp/5/datafiles/latest/basic/ ?

9.1 years ago by Vandelnokk ▴ 10

Ram · Accepted Answer · 2014-05-27

Hi Dadi,

What exactly do you mean by rDNA.fa files? rDNA normally stands for ribosomal DNA... I would recommend downloading the .bam files or just the ctss files.

Normally CAGE reads are trimmed the same way as normal sequencing reads, based on sequence quality (default q = 20). When aligning reads to the human genome we normally see a mapping rate of at least 80% and up to 95%, so I would expect a similar range for the mouse genome.
Yes, by default these multimappers are thrown out. But calculations have been made for this, and only a very small percentage of reads is missed, because the reads are still at least 27 nt long. There are indeed some problems, with pseudogenes for example, so normally these are thrown out of the analysis.
We just use bwa with default parameters and it works pretty well.
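
For anyone who wants to check where their own alignments fall relative to these numbers, here is a minimal Python sketch that tallies unmapped, confidently placed, and ambiguous reads from SAM text (for example the output of `samtools view -h`). It is only an illustration, not the FANTOM5 pipeline: the function names are hypothetical, and using MAPQ >= 10 as a stand-in for "uniquely mapped" is an assumption, since aligners assign MAPQ differently.

```python
from collections import Counter


def summarize_sam(lines, min_mapq=10):
    """Tally primary SAM records as unmapped, unique, or ambiguous.

    min_mapq is an assumed cutoff: reads below it are treated as likely
    multimappers. Adjust it for the aligner you actually used.
    """
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x900:
            continue  # secondary (0x100) or supplementary (0x800) record
        counts["total"] += 1
        if flag & 0x4:
            counts["unmapped"] += 1
        elif mapq >= min_mapq:
            counts["unique"] += 1
        else:
            counts["ambiguous"] += 1
    return counts


def mapping_rate(counts):
    """Fraction of reads with a primary alignment (unique or ambiguous)."""
    total = counts["total"]
    return (counts["unique"] + counts["ambiguous"]) / total if total else 0.0


# Toy records made up for illustration; they are not real FANTOM5 reads.
SEQ, QUAL = "A" * 27, "I" * 27
demo = [
    "@SQ\tSN:chr1\tLN:1000",
    f"r1\t0\tchr1\t100\t60\t27M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r2\t16\tchr1\t300\t0\t27M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r2\t256\tchr1\t700\t0\t27M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r3\t4\t*\t0\t0\t*\t*\t0\t0\t{SEQ}\t{QUAL}",
]

if __name__ == "__main__":
    c = summarize_sam(demo)
    print(dict(c), f"mapping rate = {mapping_rate(c):.2f}")
```

If most of your mapped reads land in the ambiguous bucket, as in the original question, that points at the input files or the reference rather than at the aligner settings.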

Ram · Accepted Answer · 2014-05-29

2

9.9 years ago

dustar1986 ▴ 380

Worked it out. FANTOM5 uses its own aligner, Delve: http://fantom.gsc.riken.jp/5/suppl/delve/delve.tgz

updated 4.3 years ago by Ram 43k • written 9.9 years ago by dustar1986 ▴ 380

0

I am trying to run delve as well, but I always get a segmentation fault error. Were you able to run delve?

5.5 years ago by sinhashruti • 0