Soft-trim for multistep alignment?
23 months ago

I'm having trouble processing sequencing data from an unusual use case. My reads contain a 20 bp variable region followed by a 20 bp constant region, and I would like to filter reads using a different maximum edit distance for each region, because sequencing quality drops off a bit once the constant region is reached (the run had too little PhiX spiked in). I've been using bowtie2 and samtools to stitch this together as a two-step process, but it hasn't been working.

My plan was to first soft-trim the variable region and filter out any reads that don't map to the constant region within the required edit distance, then (after writing the surviving reads back out to FASTQ) trim the constant region, align the variable region, and filter again. Unfortunately, bowtie2's --local mode isn't soft-clipping enough of the read for it to align: when I hard-trim down to the constant region I get >98% alignment, but running --local on the full reads results in no reads aligning. And if I hard-trim, I can't re-align to the variable region because it has already been clipped off.
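One way around the "hard-trimming loses the other region" problem is to hard-trim both regions from the original FASTQ into two separate files that keep the same read names, so each region can be aligned on its own and the results joined by read ID afterwards. The following is only a minimal sketch: it assumes the variable region is exactly the first 20 bases and the constant region the next 20, and the function names and file paths are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple

VARIABLE_LEN = 20  # assumption: variable region is the first 20 bp of each read
CONSTANT_LEN = 20  # assumption: constant region is the following 20 bp

Record = Tuple[str, str, str]  # (header, sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from plain 4-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times to consume one record per step
    for header, seq, _plus, qual in zip(*[stripped] * 4):
        yield header, seq, qual


def split_record(header: str, seq: str, qual: str) -> Tuple[Record, Record]:
    """Return (variable_part, constant_part), both keeping the original header."""
    end = VARIABLE_LEN + CONSTANT_LEN
    var = (header, seq[:VARIABLE_LEN], qual[:VARIABLE_LEN])
    const = (header, seq[VARIABLE_LEN:end], qual[VARIABLE_LEN:end])
    return var, const


def format_record(rec: Record) -> str:
    """Serialize one record back to 4-line FASTQ text."""
    header, seq, qual = rec
    return f"{header}\n{seq}\n+\n{qual}\n"


def split_fastq(in_path: str, var_path: str, const_path: str) -> None:
    """Write the variable and constant regions of every read to separate files."""
    with open(in_path) as fin, open(var_path, "w") as fvar, open(const_path, "w") as fconst:
        for rec in read_fastq(fin):
            var, const = split_record(*rec)
            fvar.write(format_record(var))
            fconst.write(format_record(const))
```

Calling something like `split_fastq("reads.fastq", "variable.fastq", "constant.fastq")` (hypothetical file names) would give two files that can each be aligned end-to-end with their own edit-distance cutoff, and because the headers are unchanged, the two sets of passing reads can be intersected by name.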

Any suggestions? I'm happy to use any aligner or other software to pull this off. I've been working on it for 4 days and I'm about to do something dumb and inefficient if I can't figure out a smart way to do it. I've also tried a couple of other aligners (bwa aln, STAR), without success.

alignment sequencing • 624 views

BBTools should help here, though you may need to combine more than one tool. clumpify.sh and bbduk.sh in particular may be useful.

The user guides can be found here: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/


Thanks, will check it out. I also found the FilterSamReads tool in Picard, which can remove reads from a BAM based on read ID. Hopefully one of these will get it going!
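To get a read-ID list for that kind of name-based filtering, the IDs of reads within a given edit distance could be pulled from the alignments first. Below is a rough sketch that reads SAM text (for example the output of `samtools view -h`) and uses the `NM:i:` tag that bowtie2 writes for aligned reads. The function name and the threshold are placeholders, and note that NM only covers the aligned part of the read, so soft-clipped bases from --local mode are not counted.

```python
from typing import Iterable, Set


def passing_read_ids(sam_lines: Iterable[str], max_edit_distance: int) -> Set[str]:
    """Collect names of mapped reads whose NM tag is <= max_edit_distance."""
    passed: Set[str] = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        nm = None
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NM:i:"):
                nm = int(tag[5:])
                break
        # reads without an NM tag are treated as failing (an assumption)
        if nm is not None and nm <= max_edit_distance:
            passed.add(fields[0])
    return passed
```

The resulting set can be written one name per line to a text file and handed to whichever name-based filter ends up being used.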


BBTools also has a tool for that, filterbyname.sh. If you run these scripts without any options, they print copious built-in help.
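For a small dataset, the same name-based filtering can also be sketched in plain Python, which makes it easy to keep only reads that passed two separate filters. This is an illustration rather than a replacement for the dedicated tools: the two ID sets are assumed to have been produced elsewhere (one per region), the function names are made up, and the read ID is taken to be the header text up to the first whitespace.

```python
from typing import Iterable, Iterator, Set


def read_id(header: str) -> str:
    """Extract the read name from a FASTQ header, e.g. '@r1 1:N:0:1' -> 'r1'."""
    return header[1:].split()[0]


def filter_fastq(lines: Iterable[str], keep_ids: Set[str]) -> Iterator[str]:
    """Yield FASTQ records (as 4-line strings) whose read ID is in keep_ids."""
    stripped = (line.rstrip("\n") for line in lines)
    # consume the input four lines at a time, one FASTQ record per step
    for header, seq, _plus, qual in zip(*[stripped] * 4):
        if read_id(header) in keep_ids:
            yield f"{header}\n{seq}\n+\n{qual}\n"


# Hypothetical inputs: IDs that passed the constant-region and
# variable-region filters, respectively.
constant_ok = {"r1", "r3"}
variable_ok = {"r1", "r2"}
keep = constant_ok & variable_ok  # reads that passed both filters
```

Passing an open FASTQ file handle and `keep` to `filter_fastq` and writing out the yielded records would leave only the reads that satisfied both edit-distance cutoffs.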
