2
0
6.2 years ago
biogirl ▴ 190

Hi there,

I'm completely new to ddRADseq (as of this morning), but I'm trying to get my head around the theory. If I have a 30 Mb genome and use ddRADseq with ~6500 digestion sites, roughly how many segregating SNPs can I expect to find? WGS shows that there are roughly 20,000 SNPs separating each isolate.

Does this depend on the amount of the reference genome covered by the ddRAD?

Any help is greatly appreciated - thanks.

2
6.1 years ago
SNPsaurus ▴ 50

If you have 6500 digestion sites within your planned fragment size-selection range and sequence with 100 bp reads, then you will be sampling 6500 x 100 bp = 650 kb. If the SNPs are spaced about every 1.5 kb (30 Mb / 20,000 SNPs), you should end up with ~400 SNPs total that you assay (650 kb / 1.5 kb is about 430). If you sequence fragments that are 150-200 bp with 100 bp paired-end reads, you'll sample more of the genome and have more SNPs.
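The arithmetic above can be written out as a small Python sketch. It assumes SNPs are spread evenly along the genome and that every site contributes exactly one read's worth of bases; the function name `expected_snps` is just an illustrative choice.

```python
def expected_snps(n_sites, read_len_bp, genome_bp, genome_snps):
    """Rough count of SNPs that fall inside the sequenced bases.

    Assumes SNPs are evenly spaced and each site yields read_len_bp bases.
    """
    sampled_bp = n_sites * read_len_bp        # 6500 x 100 = 650,000 bp
    snp_spacing_bp = genome_bp / genome_snps  # 30 Mb / 20,000 = 1,500 bp
    return sampled_bp / snp_spacing_bp


estimate = expected_snps(6500, 100, 30_000_000, 20_000)
print(round(estimate))  # 433
```

Real SNPs cluster and restriction sites are not random, so this is only a ballpark figure.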

How did you arrive at 6500 digestion sites? Most ddRAD protocols cut with a 6-cutter and a 4-cutter enzyme, and 6500 is probably about the number of 6-cutter sites alone in your genome. With ddRAD, you have to find the subset of fragments that are flanked by the two enzyme sites and fall within the desired size range. Just checking... maybe you did all that.
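As a sanity check on the site count, here is a back-of-the-envelope Python sketch of how many recognition sites to expect by chance. It assumes equal base frequencies and a non-degenerate recognition sequence, which real genomes and enzymes often violate; `expected_sites` is a hypothetical helper name.

```python
def expected_sites(genome_bp, site_len):
    """Expected number of sites if each base is A, C, G or T with probability 1/4."""
    return genome_bp / 4 ** site_len


six_cutter = expected_sites(30_000_000, 6)   # one site per 4096 bp on average
four_cutter = expected_sites(30_000_000, 4)  # one site per 256 bp on average
print(round(six_cutter))   # 7324
print(round(four_cutter))  # 117188
```

Under these assumptions a 30 Mb genome has on the order of 7000 6-cutter sites, which is in the same range as the 6500 quoted.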

0

Maybe 6500 x 200? ddRAD-seq is generally done with paired-end (PE) sequencing.

0

Right, that's why I included the "If you sequence fragments that are 150-200 bp with 100 bp paired-end reads, you'll sample more of the genome and have more SNPs."

I thought it might be helpful to start with the simpler case, where every site yields exactly 100 bp, to show how it is done. Sequencing a size range (150-200 bp fragments with PE) is less exact, since the result depends on the distribution of fragment sizes.
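To illustrate the size-range case, here is a rough Python sketch. It assumes fragment lengths are uniformly distributed over 150-200 bp (purely hypothetical; a real library will have its own distribution), that 6500 fragments are sequenced, and that two overlapping paired-end reads cover each base of a fragment only once.

```python
def covered_bp(fragment_len, read_len, paired_end=True):
    """Distinct bases covered on one fragment; overlapping PE reads are not double counted."""
    n_reads = 2 if paired_end else 1
    return min(fragment_len, n_reads * read_len)


def mean_covered_bp(fragment_lengths, read_len, paired_end=True):
    """Average covered bases per fragment over a set of fragment lengths."""
    lengths = list(fragment_lengths)
    return sum(covered_bp(n, read_len, paired_end) for n in lengths) / len(lengths)


lengths = range(150, 201)                       # assumed uniform 150-200 bp
se_mean = mean_covered_bp(lengths, 100, False)  # 100.0 bp per fragment
pe_mean = mean_covered_bp(lengths, 100)         # 175.0 bp per fragment

n_fragments = 6500                  # assumed number of size-selected fragments
snp_spacing = 30_000_000 / 20_000   # 1500 bp between SNPs on average
print(round(n_fragments * se_mean / snp_spacing))  # 433
print(round(n_fragments * pe_mean / snp_spacing))  # 758
```

Because 100 bp paired-end reads overlap on fragments shorter than 200 bp, the gain over single-end reads is less than a straight doubling.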

0

Ah, I see! That makes complete sense, thank you for going through that.

Regarding the 6500 digestion sites: I would actually have more, because I'd use two cutters (as you mention). The 6500 was just for the one cutter. But thank you for your reply, the theory makes sense now.

1
6.1 years ago

Here is some code (adapted from Peterson et al.) to double-digest your genome.

Usage: edit the restriction sites and give the full path to your genome file. Then run:

RE_Digestion.py > Rest-sites.txt

If you would like to select fragments with specific size:

RE_Digestion.py | awk '{ if( ($3-$2 ) >=300 && ($3-$2) <= 500 ) print }' > 300_500_sites.txt
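For readers who want to see the idea behind an in silico double digest, here is a minimal Python sketch. It is not the RE_Digestion.py script referred to above; the function names are hypothetical, and the enzymes (EcoRI, GAATTC; MspI, CCGG) are only example choices. For simplicity it uses the start of each recognition site as the cut position, ignores the exact cut offset within the site, and does not parse FASTA files.

```python
import re


def double_digest(name, seq, site_a, site_b):
    """Yield (name, start, end) for fragments flanked by one site of each enzyme.

    Positions are 0-based starts of the recognition sites (a simplification).
    """
    seq = seq.upper()
    # A lookahead finds every occurrence, including overlapping ones.
    cuts = [(m.start(), "A") for m in re.finditer(f"(?={re.escape(site_a)})", seq)]
    cuts += [(m.start(), "B") for m in re.finditer(f"(?={re.escape(site_b)})", seq)]
    cuts.sort()
    # Neighbouring cut sites from different enzymes bound a ddRAD-type fragment.
    for (start, enz1), (end, enz2) in zip(cuts, cuts[1:]):
        if enz1 != enz2:
            yield name, start, end


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length (end - start) lies in [min_len, max_len]."""
    return [f for f in fragments if min_len <= f[2] - f[1] <= max_len]


toy = "AAGAATTCAAAACCGGTTTTCCGGAAGAATTC"  # toy sequence, not real data
fragments = list(double_digest("toy", toy, "GAATTC", "CCGG"))
for frag in fragments:
    print(*frag, sep="\t")
```

Printing name, start and end as tab-separated columns keeps the output compatible with the awk size filter above, which compares `$3-$2` against the chosen range.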