Question: Help with BWA: how to align paired as single and then combine again
0
gravatar for simplitia
19 months ago by
simplitia30
simplitia30 wrote:

Hi I'm trying to follow a work flow and I'm confused by instruction.

This is what the instruction says,

- Map reads with bwa as single end sequences then combine single reads into one file

How does one combine single reads into one file? What tried so far is this.

bwa mem -t 4 hg19 Kf_1.fq.gz  > Kf_1.sam;
bwa mem -t 4 hg19 Kf_2.fq.gz > Kf_2.sam;

I then tried to combine the sam file with picard,

java -jar picard.jar MergeSamFiles I=Kf_1.sam I=Kf_2.sam O=/sc/snpir.out/KF.sam

but this does not work because I assume headers are completely the same for both files? I get this error here,

 Cannot add sequence that already exists in SAMSequenceDictionary:

Does anyone have an idea how I can go about this? perhaps I'm not reading the instruction correctly??

thanks in advance!

bwa rna-seq alignment • 899 views
ADD COMMENTlink written 19 months ago by simplitia30

Map reads with bwa as single end sequences then combine single reads into one file

As written this does not make sense. If you aligned SE reads separately then you would have to combine the alignment files (not reads) into one as you are trying to do. Is this a published protocol somewhere that you can link?

ADD REPLYlink written 19 months ago by genomax70k

yes its weird I have a hard time following it; I think the authors meant merge alingments. its from a package here; https://github.com/rpiskol/SNPiR the instructions are on the bottom steps 3 and 4

ADD REPLYlink modified 19 months ago • written 19 months ago by simplitia30

This is an odd workflow, why you shouldn't map the pairs as pairs? Is this from some paper, some software?

bwa mem -t 4 hg19 Kf_1.fq.gz Kf_2.fq.gz > Kf.sam;
ADD REPLYlink written 19 months ago by h.mon26k

I have the same question so I'm bit at odd but I'm sure the authors had a good reason. Its from a paper and here is the github: https://github.com/rpiskol/SNPiR

ADD REPLYlink written 19 months ago by simplitia30

Try merging the .sam files with samtools:

bwa mem -t 4 hg19 Kf_1.fq.gz | samtools view -h -b -o - - | samtools sort -o Kf_1.bam;
bwa mem -t 4 hg19 Kf_2.fq.gz | samtools view -h -b -o - - | samtools sort -o Kf_2.bam;
samtools merge Kf.bam Kf_1.bam Kf_2.bam

And, if needed, convert bam to sam:

samtools view -h -o Kf.sam Kf.bam
ADD REPLYlink written 19 months ago by h.mon26k

thanks this looks good. One worry I have is would the merge bam still contain duplicate header info? If that is the case other tools from picard like MarkDuplicates is going to crash with the same error.

ADD REPLYlink written 19 months ago by simplitia30

would the merge bam still contain duplicate header info?

I don't know, but if it does, try the suggestion from this SAMtools mail list message.

ADD REPLYlink written 19 months ago by h.mon26k

Thanks. This is going to be more complicated than I expected. I will try your method out first since its the most straightforward. I will post back here on an update.

ADD REPLYlink written 19 months ago by simplitia30
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1038 users visited in the last hour