Question: Removing targeted sequences from contigs
3.6 years ago, mz1101 wrote:


Can anybody suggest a tool that aligns a target sequence (10 kb) against contigs/scaffolds or long reads, removes that sequence from the contig, and, where the unwanted sequence is flanked by other sequence on both sides, splits the contig in two?

I could script this using alignment coordinates from BLAST or BWA-MEM, but I'd rather not reinvent the wheel if a tool already does this. Most contaminant (adapter) trimming tools are designed for short stretches of sequence.


alignment genome • 1.0k views
3.6 years ago, Brian Bushnell (Walnut Creek, USA) wrote:

You might try BBMask, from the BBMap package, which can mask a sequence using a SAM file, converting all covered bases to N or lowercase. It can additionally split the result into contiguous sequences of unmasked bases only and discard the masked regions, which sounds like what you are looking for:

    bbmask.sh in=sequence.fa sam=mapped.sam masklowentropy=f split=t out=split.fa
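For anyone who would rather script the split directly from alignment coordinates, as the question considers, here is a minimal Python sketch of the idea. It is not how BBMask is implemented. The function name `split_contig`, the `_partN` naming scheme, and the `min_len` parameter are all hypothetical, and the hits are assumed to be 0-based, end-exclusive intervals (BLAST tabular coordinates are 1-based and inclusive, so convert them first and swap start and end for minus-strand hits).

```python
def split_contig(name, seq, hits, min_len=1):
    """Cut the hit intervals out of seq and return the flanking pieces.

    name    -- contig identifier, used to label the pieces (hypothetical scheme)
    seq     -- contig sequence as a string
    hits    -- iterable of (start, end) intervals, 0-based, end-exclusive
    min_len -- pieces shorter than this are dropped (must be >= 1)
    """
    pieces = []
    pos = 0  # first base not yet emitted or removed
    for start, end in sorted(hits):
        # Clamp the hit to the contig boundaries.
        start = max(start, 0)
        end = min(end, len(seq))
        # Keep the unmasked stretch between the previous hit and this one.
        if start - pos >= min_len:
            pieces.append((f"{name}_part{len(pieces) + 1}", seq[pos:start]))
        # Advance past this hit; max() handles overlapping or nested hits.
        pos = max(pos, end)
    # Keep whatever remains after the last hit.
    if len(seq) - pos >= min_len:
        pieces.append((f"{name}_part{len(pieces) + 1}", seq[pos:]))
    return pieces


def to_fasta(pieces):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in pieces)
```

A hit in the middle of a contig yields two pieces, a hit at either end yields one trimmed piece, and a contig with no hits comes back whole. In practice you would probably raise `min_len` so that very short leftover fragments are discarded.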