I've just downloaded several mouse CAGE-seq datasets from the FANTOM5 database.
I tried mapping those rdna.fa files to mm10 using Bowtie2 with default settings. I got a rather low mapping rate of around 40%, with the majority of reads hitting more than one locus.
I'm totally new to CAGE-seq data. Please forgive me if I ask something silly.
1. When I looked into the files, the reads seemed to have been trimmed already. Is this mapping rate normal?
2. Because each read is short, it's reasonable for a read to hit multiple genomic locations. But won't that produce false positives when measuring which transcripts are 'really' expressed?
3. Are there any specific parameters I should apply in Bowtie2?
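
To make question 2 concrete, here is a minimal sketch of how the alignments could be split into unique and multi-mapped reads. It assumes the SAM output carries Bowtie2's optional AS:i (best alignment score) and XS:i (second-best score) tags, and it treats a read as multi-mapped when XS is at least as good as AS. That threshold is only a common heuristic, not something taken from the FANTOM5 pipeline. The function names and the example SAM lines are made up for illustration.

```python
from collections import Counter


def classify_sam_line(line):
    """Return 'unmapped', 'unique', or 'multi' for one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # SAM flag bit 0x4 means the read is unmapped
        return "unmapped"
    # Optional tags start at column 12 and look like NAME:TYPE:VALUE
    tags = {}
    for tag in fields[11:]:
        name, _type, value = tag.split(":", 2)
        tags[name] = value
    best = tags.get("AS")     # score of the reported alignment
    second = tags.get("XS")   # score of the next-best alignment, if any
    # Assumption: an equally good second hit means the read is ambiguous
    if best is not None and second is not None and int(second) >= int(best):
        return "multi"
    return "unique"


def summarize(lines):
    """Count reads per class, skipping SAM header lines (starting with '@')."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue
        counts[classify_sam_line(line)] += 1
    return counts


# Hypothetical records, not real FANTOM5 data
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t42\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tAS:i:0\tXN:i:0",
    "r2\t0\tchr2\t200\t1\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tAS:i:0\tXS:i:0",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII\tYT:Z:UU",
]

if __name__ == "__main__":
    for label, n in sorted(summarize(example).items()):
        print(label, n)
```

If this kind of split is reasonable, would it be acceptable to keep only the 'unique' class for expression calls, or does that throw away too much of the data?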