Question: calculating within population nucleotide diversity for a virus population
3.3 years ago by
vjmorley30
United States
vjmorley30 wrote:

I'm interested in using my next-gen sequencing data to calculate within-population nucleotide diversity in RNA virus populations. My samples were sequenced on an Illumina HiSeq, and each sample represents a diverse virus population. I have come across several packages for calculating nucleotide diversity, but they all seem to assume that each sample represents one individual rather than a population of individuals. Can anyone recommend software to use? Ideally I could input a .bam or .mpileup file and get back diversity statistics for the population.

Modified 2.9 years ago.
3.3 years ago by
skbrimer530
United States
skbrimer530 wrote:

Actually, I'm trying to do the same thing. There are a couple of quasispecies reconstruction programs out there, for example ShoRAH, ViSpA, and an older one, VICUNA. I also believe freebayes will call haplotypes, although I'm having some trouble getting from my VCF file to a haplotype list.


Did you ever get freebayes to work for calling haplotypes? Going by the documentation and posts like this, it sure seems like it should be possible, but I have yet to discover quite how, and all my requests for help, though highly viewed, have gone unanswered.

Reply written 2.9 years ago by mark.rose30

Sadly, no. I tried playing around with the minimum allele frequency, since different populations would show up as low-quality SNPs. You can set that with the -F flag; I believe the default is 0.1, but I was trying values as low as 0.005. Maybe I didn't go far enough? I'm sorry I'm not much help.

Also this project got pushed back for us so I haven't been working on it very much. I was going to try emailing Erik and ask for some guidance but never did.

Sorry again and good luck.

Reply written 2.9 years ago by skbrimer530
3.3 years ago by
apelin20470
Canada
apelin20470 wrote:

I would recommend PoPoolation. It lets you calculate Theta, Pi and Tajima's D per population from pooled NGS data (.mpileup input), and it even lets you calculate differentiation between populations and see which genes are undergoing adaptation.

2.9 years ago by
vjmorley30
United States
vjmorley30 wrote:

Update: In the end I found SNPGenie to be most useful for calculating Pi from RNA virus data. I tried PoPoolation, but found that it had trouble with the high coverage and large population sizes associated with RNA virus data.
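
For anyone wondering what Pi means for pooled data like this, here is a minimal Python sketch of the usual per-site estimate: the probability that two reads drawn without replacement at a site carry different alleles, averaged over sites. This is only an illustration and is not how SNPGenie or PoPoolation are implemented. The function names `site_pi` and `mean_pi` and the example counts are made up, each read is treated as an independent draw from the population, and sequencing error is ignored. Getting the per-site allele counts out of a .mpileup file is not shown.

```python
from typing import Dict, Iterable


def site_pi(counts: Dict[str, int]) -> float:
    """Per-site nucleotide diversity from allele read counts.

    Returns the fraction of read pairs (drawn without replacement)
    that carry different alleles at this site.
    """
    n = sum(counts.values())
    if n < 2:
        # Fewer than two reads: no pair can be compared.
        return 0.0
    # Number of ordered read pairs that carry the same allele.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def mean_pi(sites: Iterable[Dict[str, int]]) -> float:
    """Average per-site diversity over all sites given."""
    site_list = list(sites)
    if not site_list:
        return 0.0
    return sum(site_pi(s) for s in site_list) / len(site_list)


# Hypothetical allele counts at three sites of one sample.
example_sites = [
    {"A": 90, "G": 10},  # low-frequency variant
    {"C": 100},          # fixed site, contributes zero
    {"T": 50, "C": 50},  # two alleles at equal frequency
]

if __name__ == "__main__":
    print(mean_pi(example_sites))
```

With very deep coverage the n/(n-1) correction hardly matters, but low-frequency variants are hard to tell apart from sequencing errors, so some minimum count or frequency filter on the input counts is probably needed before trusting the numbers.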
