Question: Very short reads (20-22), bowtie and tophat not doing gapped alignment
4.8 years ago by
manekineko130 wrote:

I have reads which are quite short (~22 nt), and I want to find which ones map to intron-exon boundaries.

I ran tophat with segment=10 but did not find any spliced reads. That is probably impossible, so something must be wrong with the options I'm using?

I also tried bowtie2 with the --local option, but I do not see any clipped aligned reads (at least I cannot see any in IGV). Is there a way to count the number of trimmed reads that are aligned in --local mode, if any?




rna-seq • 2.2k views
modified 4.7 years ago by mark.ziemann1.2k • written 4.8 years ago by manekineko130

20 bp reads are too short for useful spliced alignments unless the introns are extremely short, on the order of a few dozen bp. That said, BBMap can map such reads spliced, with a command like this: bbmap.sh in=reads.fq ref=ref.fa out=mapped.sam maxindel=100 k=10 slow

k=10 and slow are optional but recommended for mapping such short reads spliced.  You can also add the "local" flag but that may reduce the number of reads spanning an intron.

modified 4.7 years ago • written 4.7 years ago by Brian Bushnell17k

Try reducing the seed length. The read has to be longer than the seed length for the alignment to work, and default seed lengths are around 20.

written 4.8 years ago by Istvan Albert ♦♦ 83k

There is no seed option in tophat, or am I wrong? The closest option is the segment one, and I have already decreased it to 10.

written 4.8 years ago by manekineko130

bowtie is the aligner that uses seed lengths. It takes the seed length via -L; if you run it through tophat, the option is called --b2-L, I think.
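
As a rough illustration of the -L suggestion, here is a minimal Python sketch that assembles a bowtie2 command line with a shorter seed. The helper name `bowtie2_command`, the file names, and the default seed of 10 are placeholders chosen for this example, not values from the thread; the range check reflects the bowtie2 requirement that seeds be longer than 3 and shorter than 32.

```python
import shlex


def bowtie2_command(index, reads, out_sam, seed_len=10, local=True):
    """Build an argument list for bowtie2 with a custom seed length.

    All names and defaults here are illustrative assumptions.
    """
    # bowtie2 only accepts seed lengths greater than 3 and less than 32.
    if not 3 < seed_len < 32:
        raise ValueError("seed_len must be between 4 and 31")
    cmd = ["bowtie2"]
    if local:
        cmd.append("--local")  # allow soft clipping at the read ends
    cmd += ["-L", str(seed_len), "-x", index, "-U", reads, "-S", out_sam]
    return cmd


# Placeholder index and file names; print the command instead of running it.
demo_cmd = bowtie2_command("genome_index", "reads.fq", "out.sam")
print(shlex.join(demo_cmd))
```

The list can be handed to `subprocess.run(demo_cmd, check=True)` once the index and read files exist.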

written 4.8 years ago by Istvan Albert ♦♦ 83k

When people speak of spliced alignments, they mean reads that span an intron, connecting two exons. If you want an intron-exon boundary, that is a job for a genomic aligner without splicing.

written 4.7 years ago by karl.stamm3.6k
4.7 years ago by
mark.ziemann1.2k wrote:

Hi manekineko, I've run simulations with 21 nt hairpin-derived reads for rice and human microRNAs using 16 different mappers. Even with a 1 bp indel, the reads were either left unmapped or a large proportion of them mapped spuriously. Because performance was somewhat better for the smaller genome, you could get better results by mapping directly to a library of exon junctions.
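
A junction library of this kind could be assembled roughly as follows. This is only a sketch under stated assumptions: the exon sequences are taken to be already extracted in transcript order and orientation, and the function names, the record naming scheme, and the 4 nt minimum anchor are hypothetical choices made for illustration. Each junction keeps `read_len - min_anchor` bases on either side, so a read can only fit if it overlaps both exons by at least the anchor length.

```python
def junction_library(exons_by_transcript, read_len=22, min_anchor=4):
    """Build exon-exon junction sequences for short-read mapping.

    exons_by_transcript maps a transcript name to its exon sequences,
    listed in transcript order (an assumption of this sketch).
    """
    flank = read_len - min_anchor
    if flank <= 0:
        raise ValueError("min_anchor must be smaller than read_len")
    records = {}
    for tx, exons in exons_by_transcript.items():
        for i in range(len(exons) - 1):
            left = exons[i][-flank:]      # end of the upstream exon
            right = exons[i + 1][:flank]  # start of the downstream exon
            # Record where the junction sits inside the sequence.
            name = f"{tx}|junction{i + 1}|offset{len(left)}"
            records[name] = left + right
    return records


def to_fasta(records):
    """Format the junction records as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


# Toy exons, invented for the example.
demo_exons = {"tx1": ["A" * 30 + "CCCC", "GGGG" + "T" * 30, "ACGTACGT"]}
demo_junctions = junction_library(demo_exons)
print(to_fasta(demo_junctions))
```

The resulting FASTA can be indexed like any other reference. Reads that align across the recorded offset are junction candidates.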

In addition to Brian's suggestion of BBmap, I would recommend trying

- Bowtie2 with the "--very-sensitive-local" parameter

- Bowtie2 with the "--very-sensitive" parameter

- Bowtie1 with the "--best --strata" parameters

- SMALT with the k=10 s=1 parameters
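
To check whether a local-mode run clipped anything at all, one option is to count primary alignments whose CIGAR string contains a soft clip (S). The sketch below parses plain SAM text with the standard library only. The function name and the toy records are made up for illustration, and it assumes uncompressed SAM input rather than BAM.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_soft_clipped(sam_lines):
    """Count unmapped, mapped, and soft-clipped primary alignments."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, cigar = int(fields[1]), fields[5]
        if flag & 0x4 or cigar == "*":
            counts["unmapped"] += 1
            continue
        if flag & 0x900:
            continue  # ignore secondary and supplementary records
        counts["mapped"] += 1
        if any(op == "S" for _, op in CIGAR_OP.findall(cigar)):
            counts["soft_clipped"] += 1
    return counts


# Toy SAM records, invented for the example (22 nt reads).
SEQ, QUAL = "ACGTACGTACGTACGTACGTAC", "I" * 22
demo_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    f"r1\t0\tchr1\t100\t42\t22M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r2\t0\tchr1\t200\t30\t6S16M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r3\t4\t*\t0\t0\t*\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"r4\t16\tchr1\t300\t30\t15M7S\t*\t0\t0\t{SEQ}\t{QUAL}",
]
demo_counts = count_soft_clipped(demo_sam)
print(dict(demo_counts))
```

For a real file, pass an open handle instead of the list, for example `count_soft_clipped(open("mapped.sam"))`.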

written 4.7 years ago by mark.ziemann1.2k