Question: BWA barcode trimming and labeling
igor (United States) wrote, 15 months ago:

There is a -B flag in bwa aln:

Length of barcode starting from the 5’-end. When INT is positive, the barcode of each read will be trimmed before mapping and will be written at the BC SAM tag. For paired-end reads, the barcode from both ends are concatenated. [0]

However, it does not seem to be present in bwa mem. Is there a way to replicate this behavior in bwa mem? As far as I can tell, the flag does not affect the alignment itself, so technically it should be possible with either alignment option.

Also, are there other aligners that support this behavior?

piet (planet earth) wrote, 15 months ago:

Read trimming or clipping should be kept separate from assembly or mapping in a properly designed workflow. There are dozens of tools for read clipping, so this functionality does not need to be re-implemented within aligners. In simple cases, use 'seqtk'.

In my experience, barcode trimming is usually done by the sequencing service provider. You only need to care about it if something went wrong there, and in these pathological cases you will usually need a procedure closely adapted to the case at hand.

However, you can exploit the fact that 'bwa mem' is able to read an interleaved FASTQ stream from stdin (note the -p option and the trailing '-').

my_favorite_read_trimmer <in1.fq> <in2.fq> | bwa mem -p refseq.fasta -

I mentioned the -B flag because it lets you keep the barcode associated with the corresponding read. If you trim prior to alignment, you lose the barcode, which defeats the entire purpose.

I am familiar with demultiplexing and read trimming. My question is about a specific task that is related, but not the same.

written 14 months ago by igor (modified 14 months ago)
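
One possible way to trim before alignment without losing the barcode is to move it into the FASTQ comment field and let 'bwa mem' copy that comment into the SAM record with its -C option. The following is only a minimal sketch under stated assumptions: the function names and the fixed barcode length are hypothetical, the input is assumed to be plain 4-line FASTQ with both mates in the same order, and the barcode qualities are simply discarded. For pairs, the two barcodes are concatenated, as the -B description above says 'bwa aln' does. Any existing header comment is dropped, because -C expects the comment to be a valid SAM tag such as BC:Z:CGTAC.

```python
import sys


def read_fastq(handle):
    """Yield (name, seq, qual) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        # Keep only the read name: drop '@' and any existing comment.
        name = header[1:].split()[0]
        yield name, seq, qual


def trim_pair(rec1, rec2, bc_len):
    """Trim bc_len bases from the 5' end of both mates and return an
    interleaved FASTQ string whose comments carry the concatenated barcode."""
    barcode = rec1[1][:bc_len] + rec2[1][:bc_len]
    out = []
    for name, seq, qual in (rec1, rec2):
        out.append(f"@{name} BC:Z:{barcode}\n{seq[bc_len:]}\n+\n{qual[bc_len:]}\n")
    return "".join(out)


def main(path1, path2, bc_len):
    """Write trimmed, barcode-tagged, interleaved reads to stdout."""
    with open(path1) as f1, open(path2) as f2:
        for rec1, rec2 in zip(read_fastq(f1), read_fastq(f2)):
            sys.stdout.write(trim_pair(rec1, rec2, bc_len))


# Only run as a command-line tool when all three arguments are given.
if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv[1], sys.argv[2], int(sys.argv[3]))
```

Saved under a hypothetical name such as trim_bc.py, this could be used as 'python trim_bc.py in1.fq in2.fq 6 | bwa mem -p -C refseq.fasta -', where 6 is an assumed barcode length. Whether the resulting BC tag matches what 'bwa aln -B' writes in every detail has not been checked here.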

"bwa mem" is a local aligner. Thus technically, it can align your reads even with barcodes present. Do you have reads with different barcodes in a single FASTQ file?

written 14 months ago by piet

Yes. Otherwise, there is no point in having the barcode in the aligned file.

written 14 months ago by igor

If you stored the trimmed reads in separate files, one per barcode, you could use read groups to keep track of the barcodes. See this (great) comment from John C: Read Group In Sam/Bam Files: What Do They Exactly Describe?

written 14 months ago by piet
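
The read-group suggestion could be sketched roughly as follows. This is an illustration only: the function names, the sample label, and the ID scheme are hypothetical, and single-end reads with a fixed-length 5' barcode are assumed. The first function bins trimmed reads by barcode; the second builds a string for the -R option of 'bwa mem', which takes a complete @RG header line with '\t' written literally.

```python
from collections import defaultdict


def split_by_barcode(records, bc_len):
    """Group (name, seq, qual) records by their 5' barcode, trimming it off."""
    groups = defaultdict(list)
    for name, seq, qual in records:
        groups[seq[:bc_len]].append((name, seq[bc_len:], qual[bc_len:]))
    return dict(groups)


def read_group_arg(barcode, sample):
    """Build a value for bwa mem's -R option. The backslash-t sequences are
    kept literal on purpose; bwa converts them to TAB characters itself."""
    return f"@RG\\tID:{sample}.{barcode}\\tSM:{sample}\\tBC:{barcode}"
```

Each barcode bin would then be written to its own FASTQ file and aligned in a separate 'bwa mem -R ...' run, so that every alignment carries an RG tag pointing back to its barcode.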