Very short reads (20-22 nt): bowtie and tophat not doing gapped alignment
8.8 years ago
manekineko ▴ 150

I have reads that are quite short (~22 nt), and I want to find which of them map to intron-exon boundaries.

I ran tophat with the segment length set to 10 but did not find any spliced reads. Having none at all seems very unlikely, so is something wrong with the options I'm using?

I also tried bowtie2 with the --local option, but I do not see any clipped aligned reads, at least not in IGV. Is there a way to count the number of trimmed reads that are aligned in --local mode, if any?
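One way to answer the counting question without relying on IGV is to look at the CIGAR column of the SAM output: soft-clipped (trimmed) alignments carry an `S` operation, and spliced alignments from a splice-aware mapper carry an `N`. The following is a minimal sketch only, assuming plain-text SAM lines as input; the function name `count_clipped` and the example records are made up for illustration.

```python
import re
from typing import Dict, Iterable

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_clipped(sam_lines: Iterable[str]) -> Dict[str, int]:
    """Tally mapped, soft-clipped (S) and spliced (N) alignments in SAM text."""
    counts = {"mapped": 0, "soft_clipped": 0, "spliced": 0}
    for line in sam_lines:
        if line.startswith("@"):  # header line, not an alignment
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # malformed record, skip it
            continue
        flag, cigar = int(fields[1]), fields[5]
        if flag & 0x4 or cigar == "*":  # read is unmapped
            continue
        ops = {op for _, op in CIGAR_OP.findall(cigar)}
        counts["mapped"] += 1
        if "S" in ops:
            counts["soft_clipped"] += 1
        if "N" in ops:
            counts["spliced"] += 1
    return counts


# Hypothetical records: full match, soft-clipped, spliced, unmapped.
seq, qual = "A" * 22, "I" * 22
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    f"r1\t0\tchr1\t100\t42\t22M\t*\t0\t0\t{seq}\t{qual}",
    f"r2\t0\tchr1\t200\t30\t15M7S\t*\t0\t0\t{seq}\t{qual}",
    f"r3\t16\tchr1\t300\t50\t12M85N10M\t*\t0\t0\t{seq}\t{qual}",
    f"r4\t4\t*\t0\t0\t*\t*\t0\t0\t{seq}\t{qual}",
]
print(count_clipped(example))
```

For a real run, the same function can be fed an open SAM file, for example `count_clipped(open("mapped.sam"))`. A BAM file would first need converting to SAM text.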

RNA-Seq • 4.4k views

20bp reads are too short for useful spliced alignments unless the introns are extremely short, on the order of a few dozen bp. That said, BBMap can map such reads spliced, with a command like this:

bbmap.sh in=reads.fq ref=ref.fa out=mapped.sam maxindel=100 k=10 slow

k=10 and slow are optional but recommended for mapping such short reads spliced. You can also add the "local" flag but that may reduce the number of reads spanning an intron.
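To see why reads this short are hard to place across a splice site, it can help to estimate how often each half of a split read would match the genome purely by chance. This is a back-of-the-envelope sketch under a uniform random-sequence model, which real genomes do not follow. The 3 Gbp genome size and the function name `expected_random_hits` are assumptions for illustration.

```python
def expected_random_hits(anchor_len: int, genome_size: int,
                         both_strands: bool = True) -> float:
    """Expected number of exact chance matches of an anchor_len-mer.

    Assumes every base is drawn independently and uniformly from A, C, G, T,
    so any given k-mer matches a given position with probability 1 / 4**k.
    """
    positions = genome_size * (2 if both_strands else 1)
    return positions / 4 ** anchor_len


ASSUMED_GENOME_SIZE = 3_000_000_000  # roughly human-sized, an assumption

# A 22 nt read split evenly across a junction leaves two 11 nt anchors.
for anchor in (8, 11, 14, 22):
    hits = expected_random_hits(anchor, ASSUMED_GENOME_SIZE)
    print(f"anchor of {anchor} nt: about {hits:.3g} chance matches")
```

Under these assumptions an 11 nt anchor is expected to match well over a thousand places by chance, while a full 22 nt read is expected to match far less than once. That difference is the reason the spliced placement is ambiguous unless the search is restricted to short introns.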


Try reducing the seed length: the read has to be longer than the seed for the alignment to work. Default seed lengths are around 20.


There is no seed option in tophat, or am I wrong? The closest option is the segment length, and I have already decreased it to 10.


bowtie is the aligner that uses the seed length. It takes the seed length via -L; if you run it through tophat, the option is called --b2-L, I think.

When people speak of spliced alignments, they mean alignments that span an intron, connecting two exons. If you want an intron-exon boundary, that is a job for a genomic aligner without splicing.

8.8 years ago
mark.ziemann ★ 1.9k

Hi manekineko, I've done simulations with 21 nt hairpin-derived reads for rice and human microRNAs, using 16 different mappers, and found that even with a 1 bp indel the reads are either unmapped or a large proportion of them map spuriously. Because performance was somewhat better for the smaller genome, you could get better results by mapping directly to a library of exon junctions.
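As a rough illustration of what a junction library could look like (a sketch with hypothetical names, not the pipeline used in the simulations above): for each pair of consecutive exons in a transcript, join the last `flank` bases of the upstream exon to the first `flank` bases of the downstream exon. With `flank` set to the read length minus one, any read that maps fully inside such a sequence must cross the junction. The sketch assumes 0-based, half-open exon coordinates on the forward strand and ignores exons shorter than `flank`.

```python
from typing import Dict, List, Tuple


def junction_sequences(chrom_seq: str,
                       exons: List[Tuple[int, int]],
                       flank: int) -> Dict[str, str]:
    """Build exon-exon junction sequences for one transcript.

    exons: sorted (start, end) pairs, 0-based and half-open.
    flank: bases taken from each side; read_length - 1 guarantees that a
           read contained in the sequence covers both exons.
    """
    library = {}
    for (s1, e1), (s2, e2) in zip(exons, exons[1:]):
        left = chrom_seq[max(s1, e1 - flank):e1]    # end of upstream exon
        right = chrom_seq[s2:min(e2, s2 + flank)]   # start of downstream exon
        library[f"junction_{e1}_{s2}"] = left + right
    return library


def to_fasta(library: Dict[str, str]) -> str:
    """Format the junction library as FASTA text for indexing by a mapper."""
    return "".join(f">{name}\n{seq}\n" for name, seq in library.items())


# Toy chromosome with two exons (A's and G's) separated by an intron (C's).
toy_chrom = "AAAAACCCCCGGGGGTTTTT"
toy_exons = [(0, 5), (10, 15)]
print(to_fasta(junction_sequences(toy_chrom, toy_exons, flank=3)))
```

The resulting FASTA can then be indexed like any small reference, and reads that align to it end-to-end are candidate junction reads.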

In addition to Brian's suggestion of BBmap, I would recommend trying

  • Bowtie2 with the --very-sensitive-local preset
  • Bowtie2 with the --very-sensitive preset
  • Bowtie1 with the --best --strata options
  • SMALT with k=10 s=1 parameters