Question: How to align Trimmomatic unpaired reads with BWA?
Asked by mcff23, 4.3 years ago:

Hi everyone!

I have removed the adapters from my Illumina PE reads with Trimmomatic. The output was what I expected: sample.R1.trimmed.fastq, sample.R2.trimmed.fastq, sample.R1.unpaired.fastq and sample.R2.unpaired.fastq.

Then I aligned the trimmed.fastq pair with BWA just fine. But when I tried to align the unpaired reads I got this:

[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (4, 1, 1, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] skip orientation FR as there are not enough pairs
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[mem_sam_pe] paired reads have different names: "HWI-1KL178:67:HAE0RADXX:1:1101:2363:2000", "HWI-1KL178:67:HAE0RADXX:1:1101:11567:2000"

This is the command line:

bwa/bin/bwa mem -aM -t 6 ${REF_BWA_INDEX}/genome.fa ${SAMPLE}.R1.unpaired.fastq ${SAMPLE}.R2.unpaired.fastq > ${i}.sam

My goal is to align the trimmed and unpaired files separately because BWA does not support them together.

Thanks in advance!

Monica 

 

Tags: bwa unpaired reads trimmomatic

Answer by Istvan Albert (4.3 years ago):

Run each unpaired file separately, as single-end data:

bwa/bin/bwa mem -aM -t 6 ${REF_BWA_INDEX}/genome.fa ${SAMPLE}.R1.unpaired.fastq >R1.unpaired.sam

...
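A minimal sketch of that, looping over both unpaired files. When bwa mem is given two FASTQ files it treats them as mate 1 and mate 2 in the same order, which is why passing the two unpaired files together triggers the "paired reads have different names" error; one file per call avoids that. The function name align_unpaired, the BWA override variable and the output file names are made up for illustration; the flags are copied from the command above.

```shell
#!/usr/bin/env bash
# Sketch: align each Trimmomatic "unpaired" file on its own, as single-end reads.
# Assumptions (not from the original post): the BWA variable, the function
# name, and the <sample>.<mate>.unpaired.sam output naming.
align_unpaired() {
    local sample="$1" ref="$2" bwa="${BWA:-bwa}" mate
    for mate in R1 R2; do
        # One FASTQ per call, so bwa mem does not try to pair the reads.
        "$bwa" mem -aM -t 6 "$ref" "${sample}.${mate}.unpaired.fastq" \
            > "${sample}.${mate}.unpaired.sam" || return 1
    done
}
```

Usage would look like: align_unpaired sample ${REF_BWA_INDEX}/genome.fa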

Be careful when combining paired and unpaired data.

Information gleaned from a read pair usually cannot (and should not) be combined with that obtained from two unpaired reads. A read pair provides two measurements of the same DNA fragment (it is sequenced twice), whereas unpaired reads measure different DNA fragments.


Just a note that the latest bwa-mem supports this:

(seqtk mergepe sample.R?.trimmed.fastq; cat sample.R?.unpaired.fastq) | bwa mem -p ${REF_BWA_INDEX}/genome.fa -

i.e., you can merge paired and unpaired reads in one stream, as long as paired reads are next to each other.
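For illustration, this is roughly what "paired reads next to each other" means. The awk function below is a hypothetical stand-in that only shows the layout of an interleaved stream; it is not seqtk and is not claimed to reproduce seqtk mergepe's output exactly. It assumes plain 4-line FASTQ records and that both files list the same pairs in the same order, as Trimmomatic's paired outputs should.

```shell
#!/usr/bin/env bash
# Sketch: interleave two synchronized FASTQ files so each R1 record is
# immediately followed by its R2 mate.
# Assumptions: uncompressed input, exactly 4 lines per record, same read order.
interleave_fastq() {
    # $1 = R1 FASTQ, $2 = R2 FASTQ
    awk -v r2="$2" '{
        print                                   # current line of the R1 record
        if (NR % 4 == 0) {                      # R1 record complete
            for (i = 0; i < 4; i++) {           # emit the matching R2 record
                if ((getline line < r2) > 0) print line
            }
        }
    }' "$1"
}
```

The interleaved output can be followed by the unpaired files in the same stream, e.g. (interleave_fastq sample.R1.trimmed.fastq sample.R2.trimmed.fastq; cat sample.R1.unpaired.fastq sample.R2.unpaired.fastq), and piped into bwa mem -p with your index and "-" as the input.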

(reply by lh3, 4.3 years ago)

Thanks Istvan for your quick response!

I am kind of lost. My main goal here is to call variants. What do you suggest I do with these unpaired files once I have aligned them separately? I was going to merge them with the trimmed ones and then call the variants...

Do I have to take them into account, or should I only use the trimmed ones?

Thanks!

Monica

(reply by mcff23, 4.3 years ago)

Check the documentation of your variant caller to see whether it handles mixed paired and unpaired content. We usually discard the unpaired reads to keep things simple; they are typically no more than a few percent of the data, so dropping them won't actually affect the results.

(reply by Istvan Albert, 4.3 years ago)

Hi Istvan,

Could you please give a rough number for "a few percent"? About 8% of my reads were filtered out as unpaired. Will this amount of data loss affect the downstream analysis?

Thank you!

 

(reply by Emma, 4.2 years ago)

8% is not all that much, but it all depends on how much data you have left. The general rule is that it is better to get rid of bad data than to try to salvage it. In my opinion, better data, even if there is less of it, is more desirable than salvaged data.

That is because errors rarely come in isolation. We may think that we were able to fix everything by trimming off the bad bases, but perhaps something else drove those errors in some regions of the flowcell, and even the data that looks reliable is not.
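To put a number on "a few percent" for your own run, you can count reads in the four Trimmomatic output files. The sketch below is only one way to define the figure: unpaired reads as a share of all reads that survived trimming. The function name unpaired_percent is made up, and it assumes uncompressed FASTQ with exactly 4 lines per record. Trimmomatic also prints its own survival summary when it finishes, which is the easier place to look if you still have the log.

```shell
#!/usr/bin/env bash
# Sketch: percentage of surviving reads that ended up in the unpaired files.
# Assumptions: uncompressed FASTQ, 4 lines per record; function name is
# hypothetical. Line counts are used directly because the factor of 4 cancels.
unpaired_percent() {
    # $1 $2 = paired R1/R2 outputs, $3 $4 = unpaired R1/R2 outputs
    local paired unpaired
    paired=$(cat "$1" "$2" | wc -l)
    unpaired=$(cat "$3" "$4" | wc -l)
    awk -v p="$paired" -v u="$unpaired" 'BEGIN {
        if (p + u == 0) print "0.0"
        else printf "%.1f\n", 100 * u / (p + u)
    }'
}
```

For example: unpaired_percent sample.R1.trimmed.fastq sample.R2.trimmed.fastq sample.R1.unpaired.fastq sample.R2.unpaired.fastq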

(reply by Istvan Albert, 4.2 years ago)

Answer by drake.edwards (5 months ago):

If your unpaired reads are being generated by Trimmomatic's palindrome mode (i.e. the forward and reverse reads end up containing the same sequence after adapter trimming), try using the "keepBothReads" option of ILLUMINACLIP.
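As a rough sketch of where that option goes: keepBothReads is the last, optional field of the ILLUMINACLIP step, after the minimum adapter length. Everything specific below is an assumption for illustration: the input file names, the adapter file you pass in, the 2:30:10 thresholds (taken from the commonly quoted Trimmomatic example), the minimum adapter length of 8 (the documented default), and a single-word "trimmomatic" launcher on your PATH. If you run the jar through java directly, point TRIMMOMATIC at a small wrapper script instead.

```shell
#!/usr/bin/env bash
# Sketch: paired-end Trimmomatic call with keepBothReads switched on.
# ILLUMINACLIP fields:
#   adapters : seed mismatches : palindrome threshold : simple threshold
#            : minAdapterLength : keepBothReads
# Assumptions: input/output naming, threshold values, and the TRIMMOMATIC
# launcher variable are placeholders, not taken from the original post.
trim_keep_both() {
    local sample="$1" adapters="$2" trimmomatic="${TRIMMOMATIC:-trimmomatic}"
    "$trimmomatic" PE \
        "${sample}.R1.fastq" "${sample}.R2.fastq" \
        "${sample}.R1.trimmed.fastq" "${sample}.R1.unpaired.fastq" \
        "${sample}.R2.trimmed.fastq" "${sample}.R2.unpaired.fastq" \
        "ILLUMINACLIP:${adapters}:2:30:10:8:true"
}
```

With keepBothReads set to true, the reverse read is kept in palindrome mode even though it carries the same sequence as the forward read, so those pairs stay in the paired outputs rather than leaving only the forward read.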
