Question: BWA barcode trimming and labeling
11 months ago by
United States
igor4.3k wrote:

There is a -B flag in bwa aln:

Length of barcode starting from the 5’-end. When INT is positive, the barcode of each read will be trimmed before mapping and will be written at the BC SAM tag. For paired-end reads, the barcode from both ends are concatenated. [0]

However, this flag does not seem to be present in bwa mem. Is there a way to replicate this behavior in bwa mem? As far as I can tell, it doesn't affect the alignment itself, so technically it should be possible with both alignment algorithms.

Also, are there other aligners that support this behavior?

11 months ago by
planet earth
piet1.3k wrote:

In a properly designed workflow, read trimming or clipping should be kept separate from assembly or mapping. There are dozens of tools for read clipping, so this functionality should not be re-implemented within aligners. In simple cases, use 'seqtk'.

In my experience, barcode trimming is usually done by the sequencing service provider. You only need to care about it if something went wrong there, and in these pathological cases you will usually need a procedure tailored to that specific case.

However, you can exploit the fact that 'bwa mem' is able to read an interleaved FASTQ stream from stdin (note the '-p' option and the trailing '-').

my_favorite_read_trimmer <in1.fq> <in2.fq> | bwa mem -p refseq.fasta -
written 11 months ago by piet1.3k

I mentioned the -B flag because it lets you keep the barcode associated with the corresponding read. If you trim prior to alignment, you lose the barcode, which would defeat the entire purpose.

I am familiar with demultiplexing and read trimming. My question was about a specific task that is related, but completely different.
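
One possible workaround, as a minimal sketch rather than a tested pipeline: trim the barcode yourself, but move it into the FASTQ comment as a SAM-style BC:Z: tag, then let 'bwa mem -C' (which appends the FASTQ comment to the SAM output) carry it through. The function names and the interleaved-input assumption below are mine. Like bwa aln -B, the sketch concatenates the barcodes from both mates. Any existing header comment is dropped, because -C copies the comment verbatim and a comment that is not a valid SAM tag would produce a malformed record.

```python
import sys
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line FASTQ stream."""
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(lines) < 4:
            return
        yield tuple(lines)


def split_barcode(record, barcode_len):
    """Return (barcode, trimmed record); the barcode is the first barcode_len bases."""
    header, seq, plus, qual = record
    name = header.split()[0]  # keep only the read name, drop any existing comment
    return seq[:barcode_len], (name, seq[barcode_len:], plus, qual[barcode_len:])


def tag_interleaved(handle, barcode_len, out):
    """Trim both mates of an interleaved FASTQ and store the barcodes as BC:Z:."""
    records = read_fastq(handle)
    # zip over the same generator twice pairs consecutive records (mate 1, mate 2)
    for mate1, mate2 in zip(records, records):
        bc1, trimmed1 = split_barcode(mate1, barcode_len)
        bc2, trimmed2 = split_barcode(mate2, barcode_len)
        barcode = bc1 + bc2  # concatenated, as bwa aln -B does for paired-end reads
        for name, seq, plus, qual in (trimmed1, trimmed2):
            out.write(f"{name} BC:Z:{barcode}\n{seq}\n{plus}\n{qual}\n")


if __name__ == "__main__" and len(sys.argv) == 2:
    tag_interleaved(sys.stdin, int(sys.argv[1]), sys.stdout)
```

Saved under a hypothetical name such as tag_barcodes.py, and assuming a 6-base barcode, this could be used roughly as:

seqtk mergepe in1.fq in2.fq | python tag_barcodes.py 6 | bwa mem -C -p refseq.fasta -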

modified 11 months ago • written 11 months ago by igor4.3k

"bwa mem" is a local aligner, so technically it can align your reads even with the barcodes still attached (the barcode bases would typically end up soft-clipped). Do you have reads with different barcodes in a single FASTQ file?

written 11 months ago by piet1.3k

Yes. Otherwise, there is no point in having the barcode in the aligned file.

written 11 months ago by igor4.3k

If you stored the stripped reads in separate files, one per barcode, you could use read groups to keep track of the barcodes. See this (great) comment from John C: Read Group In Sam/Bam Files: What Do They Exactly Describe?
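
A rough sketch of that idea, under my own assumptions (single-end records already parsed into tuples, barcode at the 5' end, hypothetical function names): group the reads by barcode, trim it off, and build one read-group line per barcode for the -R option of bwa mem. The SAM format defines a BC field in @RG for the barcode sequence; the sample name is something you would have to supply yourself.

```python
from collections import defaultdict


def read_group_line(barcode, sample):
    r"""Build an @RG line with literal '\t' separators, the form bwa mem -R accepts."""
    return f"@RG\\tID:{barcode}\\tSM:{sample}\\tBC:{barcode}"


def split_by_barcode(records, barcode_len):
    """Group (header, seq, plus, qual) records by 5' barcode, trimming the barcode off."""
    groups = defaultdict(list)
    for header, seq, plus, qual in records:
        trimmed = (header, seq[barcode_len:], plus, qual[barcode_len:])
        groups[seq[:barcode_len]].append(trimmed)
    return dict(groups)
```

Each group would then be written to its own FASTQ file and aligned separately, passing the matching read_group_line(...) string to bwa mem via -R, and the resulting BAM files can be merged afterwards.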

written 11 months ago by piet1.3k