Question: Allow ambiguous sequence in bwa index or bwa mem to capture barcode information
2.3 years ago by
yesitsjess0 wrote:

Hi all!

How can I allow for ambiguous matching to a reference sequence? I have a 4-base barcode preceding my sequences which I need to preserve.

This is my pipeline:

bwa index amp.fa

samtools faidx amp.fa

bwa mem amp.fa file_R1.fastq file_R2.fastq > file.sam

samtools view -bS file.sam > file.bam

samtools sort file.bam > file.sorted.bam

samtools index file.sorted.bam

I then read the sorted BAM file using R with scanBam from Rsamtools and work with it there. Mostly just because I'm a lot more comfortable working with R.

The "amp.fa" file looks like this:

> amp


I'd hoped that the Ns would mean any reads aligning to "ATGCATGCATGCATGCATGCATGCATGC" would have the 4 preceding bases align to "NNNN", so I'd be able to see what they are.

Can anyone suggest an alternative way to do this? Or a tweak to allow the capture of any sequence preceding position 1 of the known sequence?

Many thanks in advance

sequencing alignment • 753 views
2.3 years ago by
Devon Ryan95k
Freiburg, Germany
Devon Ryan95k wrote:

Move the barcode to the read name.
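
One way this could look in practice is the rough sketch below, which trims the first 4 bases from each read and appends them to the read ID before running bwa mem. It is not something from the answer itself: the function names (read_fastq, move_barcode, rewrite_fastq, barcode_from_qname), the "_" separator, and the 4-line FASTQ layout are all assumptions for illustration. It also assumes headers without a trailing /1 or /2. For paired-end data, bwa mem expects both mates to carry the same name, so the same barcode tag would need to be added to the mate's header too; this sketch only handles one file.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) for each 4-line FASTQ record."""
    lines = iter(handle)
    while True:
        record = [line.rstrip("\n") for line in islice(lines, 4)]
        if len(record) < 4:
            return
        header, seq, _plus, qual = record
        yield header, seq, qual


def move_barcode(header, seq, qual, n=4, sep="_"):
    """Trim the first n bases and append them to the read ID.

    The barcode goes on the first whitespace-delimited token of the
    header, since that token is what ends up as the read name in the
    SAM/BAM output. The separator is an arbitrary choice.
    """
    barcode = seq[:n]
    read_id, _, comment = header.partition(" ")
    new_header = read_id + sep + barcode
    if comment:
        new_header += " " + comment
    # Trim the quality string by the same amount so it stays in sync.
    return new_header, seq[n:], qual[n:]


def rewrite_fastq(in_handle, out_handle, n=4):
    """Write a copy of the FASTQ with barcodes moved into the read names."""
    for header, seq, qual in read_fastq(in_handle):
        new_header, new_seq, new_qual = move_barcode(header, seq, qual, n)
        out_handle.write(f"{new_header}\n{new_seq}\n+\n{new_qual}\n")


def barcode_from_qname(qname, sep="_"):
    """Recover the barcode from a read name written by move_barcode."""
    return qname.rsplit(sep, 1)[-1]
```

After aligning the rewritten FASTQ, the barcode should travel with each alignment in the read name, so the qname field that scanBam returns in R would end in the barcode and could be split off there, for example on the last "_".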


Thanks - sorry, brain clearly not working

written 2.3 years ago by yesitsjess0