Question: How to determine appropriate value for abundance parameter?
Asked 2.7 years ago by amarin.cogburn (United States):

Up to this point I've mostly used the default of 3. How would one go about determining the optimal abundance threshold for a given dataset? I realize that k-mer size selection has the greater effect on assembly quality, and I've used Kmergenie effectively for that on bacterial isolates in the past.

Any input is appreciated.  Thanks.

minia • 833 views
Answered 2.7 years ago by Rayan Chikhi (France, Lille, CNRS):

Great question.

In general, you want to set an abundance threshold X so that every correct k-mer appears X times or more in the dataset, while not too many erroneous k-mers are seen X times or more. When you look at the k-mer abundance histogram (generated by Kmergenie or a k-mer counter), a reasonable abundance threshold is near the first "valley" (local minimum) in this histogram.
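As a rough illustration of the "first valley" heuristic described above, here is a minimal Python sketch. It is not how Kmergenie or Minia pick the value internally. It assumes the histogram is available as plain text with two whitespace-separated integer columns (abundance, number of distinct k-mers) and no header line; the function names `parse_histogram` and `first_valley` and the toy numbers are made up for the example.

```python
def parse_histogram(text):
    """Parse 'abundance count' lines into a list of (abundance, count) pairs.

    Assumption: two whitespace-separated integer columns, no header line.
    Lines with fewer than two fields are skipped.
    """
    hist = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        hist.append((int(fields[0]), int(fields[1])))
    hist.sort()  # order by abundance
    return hist


def first_valley(hist):
    """Return the abundance at the first local minimum of the counts.

    A point is treated as a valley when its count is not above the previous
    count and is strictly below the next one. Returns None if no such point
    exists (for example, when the counts only decrease).
    """
    for i in range(1, len(hist) - 1):
        prev_count = hist[i - 1][1]
        count = hist[i][1]
        next_count = hist[i + 1][1]
        if count <= prev_count and count < next_count:
            return hist[i][0]
    return None


# Toy histogram (invented numbers): an error peak at low abundance,
# a valley, then the main coverage peak.
toy_text = """\
1 9000
2 2500
3 700
4 300
5 450
6 900
7 1600
8 2100
9 1900
"""

hist = parse_histogram(toy_text)
print(first_valley(hist))  # prints 4
```

Real histograms are noisier than this toy example, so the value found this way is best treated as a starting point and checked against a plot of the histogram.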

For high-coverage datasets, the abundance threshold should be high (I can't give a specific number as it depends highly on the dataset but it's generally within the range 5-20). And for low-coverage datasets, 2 or 3 are generally good.

Kmergenie offers an experimental feature that determines an abundance parameter for you. It's not in the HTML report yet, but you can see it in the command line output. Give it a try! I've had good results with it so far.



Would that be the coverage cut-off metric? Thanks for the quick reply.

Reply written 2.7 years ago by amarin.cogburn

Yes, those terms are synonymous: "abundance threshold" (Minia) and "coverage cut-off" (Kmergenie, Velvet).

Reply written 2.7 years ago by Rayan Chikhi

Thanks for the clarification.

Reply written 2.7 years ago by amarin.cogburn