Question: Allow ambiguous sequence in bwa index or bwa mem to capture barcode information
yesitsjess wrote (16 months ago):

Hi all!

How can I allow for ambiguous matching to a reference sequence? I have a 4-base barcode preceding my sequences, which I need to preserve.

This is my pipeline:

bwa index amp.fa
samtools faidx amp.fa
bwa mem amp.fa file_R1.fastq file_R2.fastq > file.sam
samtools view -bS file.sam > file.bam
samtools sort file.bam > file.sorted.bam
samtools index file.sorted.bam

I then read the sorted BAM file using R with scanBam from Rsamtools and work with it there. Mostly just because I'm a lot more comfortable working with R.

The "amp.fa" file looks like this:

>amp
NNNNATGCATGCATGCATGCATGCATGCATGC

I'd hoped that the Ns would mean that, for any read aligning to "ATGCATGCATGCATGCATGCATGCATGC", the 4 preceding bases would align to "NNNN", so I'd be able to see what they are.

Can anyone suggest an alternative way to do this, or a tweak that would capture any sequence preceding position 1 of the known sequence?

Many thanks in advance

Tags: sequencing, alignment
Devon Ryan wrote (16 months ago):

Move the barcode to the read name.
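
For anyone landing here later, a rough sketch of what that could look like with plain awk, before running bwa mem. As far as I know, bwa replaces non-ACGT reference bases with random nucleotides when indexing, so the NNNN prefix cannot act as a wildcard. The function name move_barcode and the output file names are made up for illustration. It assumes 4-line FASTQ records, the barcode in the first 4 bases of R1, and read names without a /1 or /2 suffix. The same suffix is added to both mates so that the pair names still match.

```shell
# Sketch only, under the assumptions stated above (not from the original post).
# move_barcode is a hypothetical helper, not part of bwa or samtools.
move_barcode() {
    # $1 = input R1, $2 = input R2, $3 = output R1, $4 = output R2
    awk -v r2="$2" -v out1="$3" -v out2="$4" -v n=4 '
    NR % 4 == 1 {
        h1 = $0
        getline s1; getline p1; getline q1      # rest of the R1 record
        getline h2 < r2; getline s2 < r2        # matching R2 record
        getline p2 < r2; getline q2 < r2
        bc = substr(s1, 1, n)                   # barcode = first n bases of R1
        # drop any header comment so both mate names stay identical
        sub(/[ \t].*/, "", h1); sub(/[ \t].*/, "", h2)
        # R1: barcode moved into the name, trimmed from sequence and qualities
        print h1 "_" bc > out1; print substr(s1, n + 1) > out1
        print "+" > out1;       print substr(q1, n + 1) > out1
        # R2: same name suffix, sequence and qualities untouched
        print h2 "_" bc > out2; print s2 > out2
        print "+" > out2;       print q2 > out2
    }' "$1"
}
```

Called as move_barcode file_R1.fastq file_R2.fastq bc_R1.fastq bc_R2.fastq, the two output files would then go to bwa mem in place of the originals, with the NNNN prefix removed from amp.fa. The barcode ends up as the suffix of the read name (QNAME), which should be readable from the qname field that scanBam returns.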


Thanks - sorry, brain clearly not working

Reply written 16 months ago by yesitsjess