I'm interested in using my next-gen sequencing data to calculate within-population nucleotide diversity in RNA virus populations. My samples were sequenced on an Illumina HiSeq, and each sample represents a diverse virus population. I have come across several packages for calculating nucleotide diversity, but they all seem to assume that each sample represents one individual rather than a population of individuals. Can anyone recommend software to use? It would be great if I could input a .bam or .mpileup file and get back diversity statistics for the population.
Actually, I'm trying to do the same thing. There are a couple of quasispecies reconstruction programs out there. For example, there is ShoRAH (here), ViSpA (here), and an older one, VICUNA (here). I also believe freebayes will call haplotypes, although I'm having some trouble getting from my VCF file to a haplotype list.
I would recommend PoPoolation. It lets you calculate theta, pi, and Tajima's D per population from pooled NGS data (.mpileup). It also lets you calculate differentiation between populations and see which genes are undergoing adaptation.
Update: In the end I found SNPGenie to be most useful for calculating Pi from RNA virus data. I tried PoPoolation, but found that it had trouble with the high coverage and large population sizes associated with RNA virus data.
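For anyone who wants to sanity-check a tool's output, here is a minimal sketch of per-site pi computed directly from allele read counts, such as counts parsed from an .mpileup file. It is not how SNPGenie or PoPoolation do it internally. It assumes every read is an independent draw from the population and it ignores sequencing error and minimum-count filters. The function names and the input layout (one dict of allele counts per site) are made up for illustration.

```python
from typing import Dict, List


def site_pi(counts: Dict[str, int]) -> float:
    """Probability that two reads drawn without replacement at one site differ.

    `counts` maps each allele (e.g. "A", "C", "G", "T") to its read count.
    Assumption: reads are treated as independent samples from the population.
    """
    n = sum(counts.values())
    if n < 2:
        # Fewer than two reads: no pair to compare, so report zero diversity.
        return 0.0
    # Ordered pairs of reads that carry the same allele.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    # All ordered pairs of reads at this site.
    all_pairs = n * (n - 1)
    return 1.0 - same_pairs / all_pairs


def mean_pi(sites: List[Dict[str, int]]) -> float:
    """Average per-site pi over all sites supplied, including invariant ones."""
    if not sites:
        return 0.0
    return sum(site_pi(s) for s in sites) / len(sites)


if __name__ == "__main__":
    # Hypothetical counts for two sites: one polymorphic, one fixed.
    example = [{"A": 50, "G": 50}, {"A": 100}]
    print(mean_pi(example))
```

Invariant sites need to be included in the list passed to `mean_pi`, otherwise the average is taken over polymorphic sites only and pi is overestimated.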