Question: Soft Clipping
Gregor Rot (Zurich, Switzerland) wrote, 5.3 years ago:

I have short reads that end in a non-genomic sequence. Clipping the reads before mapping is possible but not ideal. Ideally I would use bowtie2 or STAR and let the aligner soft-clip the reads. This works, but the mapping is worse than when I map pre-clipped reads.

I can't seem to find any option to allow "more" soft clipping. Is there any way I can make STAR/bowtie2 soft-clip up to half the read (from the 3' or 5' end)? Or do the mappers already do that? Where can I read about how soft clipping works?

Thanks, Gregor


I think it is worth investigating what "the mapping is worse" means in your context. It is suspicious when cutting off the ends of reads leads to "worse" mapping, although, as I said, what the word "worse" means is essential.

The default expectation would be that after clipping more reads map overall, but the number of uniquely mapped reads decreases.

written 5.3 years ago by Istvan Albert

Exactly. Simply trimming the reads would increase the mapped percentage, because shorter reads are easier to align; however, more of them would have low mapping quality or simply be wrong.

Soft clipping is more sensitive than adapter removal; I think that has been shown pretty reliably.

written 16 months ago by predeus
Irsan (Amsterdam) wrote, 5.3 years ago:

With STAR version 2.3.0 you can trim, for example, 10 bases from the 3' or 5' end with the options --clip3pNbases 10 and --clip5pNbases 10. Have a look at the manual, it's all there.

I don't know if bowtie2 also has a built-in read trimmer, but if not, there are plenty of other tools available. One example is the FASTX-Toolkit.


Bowtie also has built-in trimming options:

-5/--trim5 <int>
       Trim <int> bases from high-quality (left) end of each read before alignment (default: 0).

-3/--trim3 <int>  
       Trim <int> bases from low-quality (right) end of each read before alignment (default: 0).
modified 5.3 years ago • written 5.3 years ago by Assa Yeroslaviz
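
To illustrate what fixed-length trimming of this kind amounts to, here is a minimal Python sketch. The function name hard_trim and its parameters are made up for this example; it is not how bowtie or STAR implement their options, only the same idea applied to one read and its quality string.

```python
def hard_trim(seq, qual, trim5=0, trim3=0):
    """Remove trim5 bases from the 5' (left) end and trim3 bases from the
    3' (right) end of a read, applying the same cut to the quality string.

    Hypothetical helper for illustration only.
    """
    if trim5 < 0 or trim3 < 0:
        raise ValueError("trim lengths must be non-negative")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Never let the right boundary cross the left one: over-trimming
    # simply yields an empty read.
    end = max(len(seq) - trim3, trim5)
    return seq[trim5:end], qual[trim5:end]


if __name__ == "__main__":
    print(hard_trim("ACGTACGTAC", "IIIIIIIIII", trim5=2, trim3=3))
```

The cut is the same for every read regardless of content, which is why it only helps when the length of the non-genomic part is known in advance.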

Have you tested whether this works properly for paired-end reads?

written 4.7 years ago by dvanic

I would like to point out that what you are referring to is hard clipping.

By definition, soft clipping is not done at a predefined length; rather, it is a modification of the scoring scheme that does not penalize mismatches at the ends of the read.

written 16 months ago by predeus
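
The scoring-scheme view of soft clipping can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the algorithm of bowtie2 or STAR: the read is compared to a reference window of the same length with no indels, and the match bonus (+2) and mismatch penalty (-6) are illustrative values. The function names are hypothetical.

```python
def best_local_segment(read, ref, match=2, mismatch=-6):
    """Return (start, end, score) of the highest-scoring contiguous stretch
    of an ungapped read-vs-reference comparison (Kadane-style scan).

    Bases outside read[start:end] are the ones a local aligner would
    soft-clip. Scores are illustrative assumptions.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    best_score, best_start, best_end = 0, 0, 0
    cur_score, cur_start = 0, 0
    for i, (r, g) in enumerate(zip(read, ref)):
        cur_score += match if r == g else mismatch
        if cur_score <= 0:
            # A non-positive running score can never help; restart after i.
            cur_score, cur_start = 0, i + 1
        elif cur_score > best_score:
            best_score, best_start, best_end = cur_score, cur_start, i + 1
    return best_start, best_end, best_score


def soft_clipped_cigar(read, ref, match=2, mismatch=-6):
    """Build a CIGAR-like string (S and M operations only) for the best segment."""
    start, end, score = best_local_segment(read, ref, match, mismatch)
    if score == 0:
        return "*", 0  # nothing aligns with a positive score
    parts = []
    if start:
        parts.append(f"{start}S")
    parts.append(f"{end - start}M")
    if len(read) - end:
        parts.append(f"{len(read) - end}S")
    return "".join(parts), score


if __name__ == "__main__":
    # 8 genomic bases followed by 6 non-genomic bases
    print(soft_clipped_cigar("ACGTACGTTTTTTT", "ACGTACGTGGCCAA"))
```

In this model the clip length is not a parameter at all: it falls out of where the running score stops improving, which is the point made above.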
Gregor Rot (Zurich, Switzerland) wrote, 5.3 years ago:

Thanks for all the answers. If I clip reads before mapping (I remove a certain sequence from the 3' end), more reads map (I only consider uniquely mapped reads) than if I don't clip them before mapping. I presume this is because I sometimes have to remove as much as half the read from the 3' end (>40 nucleotides), and soft clipping doesn't consider more than a few nucleotides? I still can't find any documentation on how soft clipping is performed, for either bowtie2 or STAR.

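
One possible explanation for the observation above is that the aligners' minimum-score filters, rather than a limit on clip length, reject reads that are clipped heavily. The Python sketch below only does the arithmetic. It assumes bowtie2 is run in --local mode (the mode in which it soft-clips) with --ma 2 and --score-min G,20,8, and that STAR uses --outFilterScoreMinOverLread 0.66 and --outFilterMatchNminOverLread 0.66. These defaults are quoted from memory of the manuals and should be checked against the installed versions; the 80-nt read length and the function names are hypothetical.

```python
import math


def bowtie2_local_min_score(read_len, const=20.0, coeff=8.0):
    """Minimum score for a --score-min setting of the form G,<const>,<coeff>:
    const + coeff * ln(read_len). Default values are assumptions."""
    return const + coeff * math.log(read_len)


def bowtie2_local_accepts(read_len, aligned_len, match_bonus=2):
    """True if a perfect match over aligned_len bases reaches the minimum score."""
    return aligned_len * match_bonus >= bowtie2_local_min_score(read_len)


def star_accepts(read_len, aligned_len, min_fraction=0.66):
    """True if the matched bases reach min_fraction of the read length
    (a simplification of STAR's OverLread filters for single-end reads)."""
    return aligned_len >= min_fraction * read_len


if __name__ == "__main__":
    read_len, aligned_len = 80, 40  # half of the read is non-genomic
    print("bowtie2 --local:", bowtie2_local_accepts(read_len, aligned_len))
    print("STAR, fraction 0.66:", star_accepts(read_len, aligned_len))
    print("STAR, fraction 0.5:", star_accepts(read_len, aligned_len, 0.5))
```

Under these assumptions an 80-nt read with 40 perfectly aligned bases clears the bowtie2 local threshold (score 80 against a minimum of about 55.1) but fails STAR's 0.66 fraction (40 < 52.8). Lowering both STAR fractions to 0.5 would let such a read through, at the cost of accepting shorter and therefore less specific alignments.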

The main problem is that I only see how much clipping is necessary when aligning the reads to the genome :) so I want to use soft clipping to align the reads. But it seems that soft clipping works only for a few nucleotides? Of course I can pre-clip the unmapped reads nucleotide by nucleotide and then re-map, but I was thinking there might be a more elegant way...

written 5.3 years ago by Gregor Rot
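
Once reads have been aligned with soft clipping enabled, the amount actually clipped from each read can be read back from the CIGAR field of the SAM output. The following is a minimal Python sketch; the function names are hypothetical, and it relies only on the SAM convention that S operations sit at the ends of the CIGAR (inside any H operations) and that the CIGAR is written relative to the forward reference strand.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (left, right) soft-clip lengths in reference orientation.
    An unavailable CIGAR ('*') yields (0, 0)."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    core = [x for x in ops if x[1] != "H"]  # ignore hard clips at the ends
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right


def read_end_clips(cigar, is_reverse):
    """Return (clip_5prime, clip_3prime) relative to the original read.
    For reverse-strand alignments (SAM FLAG 0x10) the two ends swap."""
    left, right = soft_clip_lengths(cigar)
    return (right, left) if is_reverse else (left, right)


if __name__ == "__main__":
    print(soft_clip_lengths("40M40S"))
    print(read_end_clips("40M40S", is_reverse=True))
```

Tabulating these lengths over all mapped reads gives the distribution of clipped bases at the 3' end without having to re-map pre-clipped reads one nucleotide at a time.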