Question: What's the best way to map CAGE-seq data to the genome?
5.0 years ago by
dustar1986290
USA
dustar1986290 wrote:

Hi,

I've just downloaded several mouse CAGE-seq datasets from the FANTOM5 database.

I tried mapping those rdna.fa files to mm10 using bowtie2 with default settings. I got a fairly low mapping rate of around 40%, and the majority of mapped reads hit more than one locus.

I'm totally new to CAGE-seq data. Please forgive me if I ask something silly.

1. When I looked into the files, the reads seemed to have been trimmed already. Is this mapping rate normal?

2. Because each read is short, it's reasonable for reads to hit multiple genomic locations. But won't that produce false positives when determining which transcripts are 'really' expressed?

3. Are there any specific parameters I should use with Bowtie2?

bowtie cage-seq • 2.3k views
ADD COMMENTlink modified 4.2 years ago by Vandelnokk10 • written 5.0 years ago by dustar1986290
5.0 years ago by
Floris Brenk890
USA
Floris Brenk890 wrote:

Hi Dadi,

What exactly do you mean by rDNA.fa files? rDNA normally stands for ribosomal DNA... I would recommend downloading the .bam files or just the ctss files.

1. CAGE reads are normally trimmed the same way as ordinary sequencing reads, based on sequence quality (default q=20). When aligning reads to the human genome we normally get a mapping rate of at least 80%, and up to 95%, so I would expect a similar range for the mouse genome.

2. Yes, by default these multimappers are thrown out. This has been estimated, though, and only a very small percentage of reads is lost, because the reads are still at least 27 nt long. There are indeed some problems, with pseudogenes for example, so these reads are normally excluded from the analysis.

3. We just use bwa with default parameters, and it works pretty well.
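
To check points 1 and 2 on your own alignments, here is a minimal Python sketch that reads SAM text and reports the overall mapping rate and the fraction of reads mapped with high confidence. It is only an illustration, not what FANTOM5 does. It assumes the aligner gives ambiguously placed reads (multimappers) a low MAPQ, which is the usual convention for bowtie2 and bwa. The `min_mapq=20` cutoff, the function names and the example records are my own assumptions.

```python
from collections import Counter

# FLAG bits defined by the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def summarize_sam(lines, min_mapq=20):
    """Count reads, mapped reads, and reads mapped with MAPQ >= min_mapq.

    Secondary and supplementary records are skipped so that each read
    is counted once (single-end data assumed).
    """
    counts = Counter(total=0, mapped=0, confident=0)
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        counts["total"] += 1
        if flag & UNMAPPED:
            continue
        counts["mapped"] += 1
        if mapq >= min_mapq:
            counts["confident"] += 1
    return counts


def rates(counts):
    """Turn the counts into fractions of all reads."""
    total = counts["total"]
    if total == 0:
        return {"mapped": 0.0, "confident": 0.0}
    return {
        "mapped": counts["mapped"] / total,
        "confident": counts["confident"] / total,
    }


# Made-up records for illustration only (tab-separated SAM fields).
EXAMPLE_SAM = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t42\t27M\t*\t0\t0\t*\t*",
    "r2\t16\tchr2\t200\t1\t27M\t*\t0\t0\t*\t*",
    "r2\t256\tchr5\t900\t1\t27M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t0\tchr1\t300\t30\t27M\t*\t0\t0\t*\t*",
]

if __name__ == "__main__":
    summary = summarize_sam(EXAMPLE_SAM)
    print(dict(summary), rates(summary))
```

For a real file, pass an open SAM file handle instead of `EXAMPLE_SAM` (a BAM file would first need converting to SAM text). If the "confident" fraction is much lower than the "mapped" fraction, most of your mapped reads are multimappers.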

 

ADD COMMENTlink written 5.0 years ago by Floris Brenk890

Thanks for your detailed explanation, Floris. That is extremely helpful. I think I should download the bam files (mm9) and re-map them to mm10.

ADD REPLYlink written 5.0 years ago by dustar1986290
5.0 years ago by
dustar1986290
USA
dustar1986290 wrote:

Worked it out. FANTOM5 uses its own aligner, Delve:

http://fantom.gsc.riken.jp/5/suppl/delve/delve.tgz

ADD COMMENTlink written 5.0 years ago by dustar1986290

I am trying to run Delve as well, but I always get a segmentation fault. Were you able to run Delve?

ADD REPLYlink written 6 months ago by sinhashruti0
4.2 years ago by
Vandelnokk10
Finland
Vandelnokk10 wrote:

Hi! This could also be a stupid question, but where did you manage to download the .bam files for the FANTOM5 data? I can only find bed files so far (also in the CAGEr package). Thanks in advance.

ADD COMMENTlink written 4.2 years ago by Vandelnokk10

I guess here: http://fantom.gsc.riken.jp/5/datafiles/latest/basic/ ?

ADD REPLYlink written 4.2 years ago by Vandelnokk10