I have deep amplicon sequencing data (Illumina) in a CSV file, with samples in rows, haplotypes (both sequences and names) in columns, and read counts as the values. I also have a separate FASTA file for the haplotypes. I read somewhere that someone calculated nucleotide diversity and ASV heterozygosity (I suspect this is haplotype diversity) using the pegas package in R. I have managed to get nucleotide diversity in both DnaSP and Arlequin, but I am curious how this was done in pegas. What puzzles me most is how haplotype frequency is calculated within a sample. I think read counts are a poor representation of haplotype frequency, though that may or may not be true depending on the assumptions.
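To make the concern concrete, here is a minimal Python sketch (not pegas, and not necessarily what the study I read did) that computes haplotype diversity as 1 - sum(p_i^2) and nucleotide diversity as the frequency-weighted mean pairwise p-distance for one sample, under two frequency assumptions: frequencies proportional to read counts, and equal weight for every haplotype detected. The haplotype names, sequences, and read counts are made up for illustration, and I left out the n/(n-1) small-sample correction because I am not sure what n should be for read data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def diversity(freqs, seqs):
    """Return (haplotype diversity, nucleotide diversity), both uncorrected."""
    # Haplotype diversity: probability that two random draws differ.
    h = 1.0 - sum(p * p for p in freqs.values())
    # Nucleotide diversity: frequency-weighted mean pairwise distance.
    pi = 0.0
    for i, j in combinations(freqs, 2):
        pi += 2.0 * freqs[i] * freqs[j] * p_distance(seqs[i], seqs[j])
    return h, pi


# Hypothetical toy data for a single sample (not real results).
seqs = {"hap1": "ACGTACGTAC", "hap2": "ACGTACGTAT", "hap3": "ACGAACGTAT"}
reads = {"hap1": 900, "hap2": 90, "hap3": 10}

# Assumption A: haplotype frequency is proportional to read count.
total = sum(reads.values())
freq_reads = {name: count / total for name, count in reads.items()}

# Assumption B: every detected haplotype gets equal weight (presence/absence).
present = [name for name, count in reads.items() if count > 0]
freq_presence = {name: 1.0 / len(present) for name in present}

h_reads, pi_reads = diversity(freq_reads, seqs)
h_pres, pi_pres = diversity(freq_presence, seqs)

print(f"read-count weights: H = {h_reads:.4f}, pi = {pi_reads:.5f}")
print(f"presence weights:   H = {h_pres:.4f}, pi = {pi_pres:.5f}")
```

With these toy numbers the two assumptions give quite different values for both statistics, which is exactly why I want to know which one was used.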
Can someone weigh in on this? It makes my head spin.