Question: removing sequencing gaps
6.2 years ago by cdwilliam524 (United States) wrote:

Could anyone tell me how to remove sequencing gaps from a *.sam file? 

I use bwa to map the reference genome (fasta) to the metatranscriptome files (fastq), and then use "bwa sampe" to combine the *.fasta, *.sai, and *.fastq files into a *.sam file. However, the *.sam file contains many gaps between contigs. How can I remove them?

Thanks in advance! 

sequencing alignment • 1.7k views

I assume you mean that you mapped the short reads (fastq) to the reference genome (fasta), rather than the other way around.

Can you elaborate a bit on what you mean by gaps in the SAM file? I can think of a couple possible ways to interpret that.

written 6.2 years ago by Devon Ryan

Hi Devon,

I am new to bioinformatics, so yes, it should be the way you stated: the reads were mapped to the reference genome. Sorry for the confusion.

In the *.sam file I see sequences like NNNNNNNNNCNNNNNNNTNNNNNNAGNNNNCT... I assume the Ns are sequencing gaps, and I would like to remove them. I would also like to know the start and end positions of the mapped contigs (I am not sure I have worded this correctly).


written 6.2 years ago by cdwilliam524
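
For readers with the same question, here is a minimal Python sketch of how the start and end positions could be read from a SAM record. It is not from the thread. It assumes the standard tab-separated SAM columns, where column 4 (POS) is the 1-based leftmost mapped position and the end follows from POS plus the reference-consuming CIGAR operations (M, D, N, =, X). The function names and the example record are hypothetical. Note that Ns in the SEQ column are uncalled bases in the read itself; deleting them from SEQ would leave it inconsistent with the CIGAR and QUAL columns, so the sketch only reports the N fraction, which could be used to filter reads instead.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_span(sam_line):
    """Return (read, reference, start, end), 1-based inclusive.

    Returns None for header lines and unmapped reads.
    """
    if sam_line.startswith("@"):
        return None  # header line, not an alignment record
    fields = sam_line.rstrip("\n").split("\t")
    name, flag, rname = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # flag bit 0x4 marks an unmapped read
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                  if op in REF_CONSUMING)
    return name, rname, pos, pos + ref_len - 1


def n_fraction(sam_line):
    """Return the fraction of N bases in the SEQ column (column 10)."""
    seq = sam_line.rstrip("\n").split("\t")[9]
    if seq == "*" or not seq:
        return 0.0  # sequence not stored
    return seq.upper().count("N") / len(seq)


# Hypothetical record: 5 matches, a 2-base deletion, then 3 matches.
record = "read1\t0\tcontig1\t100\t60\t5M2D3M\t*\t0\t0\tACGTNNNA\tIIIIIIII"
print(aligned_span(record))
print(n_fraction(record))
```

A read could then be kept or dropped by comparing `n_fraction` against a threshold of your choosing, leaving the SAM record itself unmodified.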

