BBNorm read depth binning behaviour inconsistent
4.3 years ago

Hi,

I'm using BBNorm to separate my reads by kmer depth, and I encountered what seemed to me to be strange behaviour. When I adjust the bin depths, I am getting inconsistent binning of reads, at least as far as I understand the aim of the program.

For instance, if I set a high bin depth at 200 and a low bin depth at 10, I will get a different number of reads in the high depth bin compared to when I set it at 200 and 100. Is that expected? My understanding is that the high depth reads (>200) should all be put in the highdepth output, so why would a different low depth setting change how many were high depth?

This leads to cases where a 250 high depth bin contains more reads than a 200 high depth bin, simply because the low bin depth is different. The commands I ran to test this, with their output, are below.

bbnorm.sh in1=trim_dedup_Anacalosa_R1.fastq.gz in2=trim_dedup_Anacalosa_R2.fastq.gz outhigh=high250_td_Anacalosa.fastq.gz outlow=low100_td_Anacalosa.fastq.gz outmid=mid_td_Anacalosa.fastq.gz passes=1 highbindepth=250 lowbindepth=100

Total reads in:         74158538    74.090% Kept
Low bin reads:          48960818    66.022%
Mid bin reads:          7333512     9.889%
High bin reads:         17864208    24.089%

bbnorm.sh in1=trim_dedup_Anacalosa_R1.fastq.gz in2=trim_dedup_Anacalosa_R2.fastq.gz outhigh=test1high.fastq.gz outlow=test1low.fastq.gz outmid=test1mid.fastq.gz passes=1 highbindepth=200 lowbindepth=100

Total reads in:         74158538    74.081% Kept
Low bin reads:          48959146    66.020%
Mid bin reads:          5925252     7.990%
High bin reads:         19274140    25.990%


bbnorm.sh in1=trim_dedup_Anacalosa_R1.fastq.gz in2=trim_dedup_Anacalosa_R2.fastq.gz outhigh=test2high.fastq.gz outlow=test2low.fastq.gz outmid=test2mid.fastq.gz passes=1 highbindepth=200 lowbindepth=10

Total reads in:         74158538    74.076% Kept
Low bin reads:          21759858    29.342%
Mid bin reads:          36943402    49.817%
High bin reads:         15455278    20.841%

bbnorm.sh in1=trim_dedup_Anacalosa_R1.fastq.gz in2=trim_dedup_Anacalosa_R2.fastq.gz outhigh=test3high.fastq.gz outlow=test3low.fastq.gz outmid=test3mid.fastq.gz passes=1 highbindepth=250 lowbindepth=10

Total reads in:         74158538    74.092% Kept
Low bin reads:          21728446    29.300%
Mid bin reads:          38034948    51.289%
High bin reads:         14395144    19.411%
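
To make the comparison explicit, here is a minimal Python sketch that tabulates the four runs. The counts are copied from the output above; the names `runs`, `TOTAL_READS_IN` and `high_bin_shift` are just mine for illustration. It only does arithmetic on the reported numbers and says nothing about what BBNorm does internally.

```python
# Bin counts copied from the four bbnorm.sh runs above,
# keyed by (highbindepth, lowbindepth).
runs = {
    (250, 100): {"low": 48960818, "mid": 7333512, "high": 17864208},
    (200, 100): {"low": 48959146, "mid": 5925252, "high": 19274140},
    (200, 10): {"low": 21759858, "mid": 36943402, "high": 15455278},
    (250, 10): {"low": 21728446, "mid": 38034948, "high": 14395144},
}
TOTAL_READS_IN = 74158538

for (hbd, lbd), bins in runs.items():
    # In every run the three bins add up to the total reads in.
    assert sum(bins.values()) == TOTAL_READS_IN
    print(f"hbd={hbd} lbd={lbd} high={bins['high']:,}")


def high_bin_shift(hbd):
    """Change in high bin reads when lowbindepth goes from 10 to 100."""
    return runs[(hbd, 100)]["high"] - runs[(hbd, 10)]["high"]


print(high_bin_shift(200))
print(high_bin_shift(250))
```

So with the same highbindepth, raising lowbindepth from 10 to 100 moves several million reads into the high bin, and the 250/100 run ends up with more high bin reads than the 200/10 run.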

Any idea why it is behaving this way or what I'm misunderstanding?


This is one of those questions that may need input from Brian Bushnell for an authoritative answer, but the inline help says this:

outlow=<file>       Pairs in which both reads have a median below lbd go into this file.
outhigh=<file>      Pairs in which both reads have a median above hbd go into this file.

Perhaps it is one of the reads from the pair that is affecting the output.

You should also look at the Normalization parameters to see if you need to include something from there. Perhaps this one:

uselowerdepth=t     (uld) For pairs, use the depth of the lower read as the depth proxy.

There is a bbnorm guide here in case you had not seen it.


Thanks for the response, genomax. I had looked over the guide and the help, but the behaviour remains unclear to me. I tried uselowerdepth=t, but it had no effect: runs with =t and =f produced the same output.

I wondered about your point regarding the pairs, but it still doesn't add up to me. If the high bin depth hasn't changed, then all the read pairs that are above it for both reads ought to stay there. Raising the low bin depth might mean that more pairs will have one high and one low, but they should still only go to the mid output and the high depth bin output should be unchanged.
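
To spell out my reasoning, here is a minimal Python sketch of the binning rule as I read it from the inline help. This is only my interpretation, not BBNorm's actual implementation: `assign_bin`, `toy_pairs` and `high_pairs` are hypothetical names, the depth values are made up, and treating "below" and "above" as strict comparisons is an assumption.

```python
def assign_bin(median1, median2, lbd, hbd):
    """Assign a read pair to a bin following the inline help text.

    Assumption: a median exactly equal to lbd or hbd counts as neither
    below nor above, so such pairs fall into the mid bin.
    """
    if median1 < lbd and median2 < lbd:
        return "low"
    if median1 > hbd and median2 > hbd:
        return "high"
    return "mid"


# Made-up median kmer depths (read 1, read 2) for five pairs.
toy_pairs = [(5, 8), (50, 60), (50, 300), (220, 400), (300, 260)]


def high_pairs(lbd, hbd):
    """Return the toy pairs that land in the high bin."""
    return [p for p in toy_pairs if assign_bin(*p, lbd, hbd) == "high"]


# Under this reading, changing lbd only moves pairs between low and mid.
print(high_pairs(lbd=10, hbd=200))
print(high_pairs(lbd=100, hbd=200))
```

Both calls return the same pairs, which is why I expected the high bin count to depend only on highbindepth.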


I suggest that you email Brian directly (you can find his email in the BBMap documentation). Please come back and post an update if/when you hear from him.
