How can I use PopGenome set.synnonsyn() function on whole genome vcf data?
0
0
Entering edit mode
2.9 years ago
apc • 0

Hi everyone.

I would like to use PopGenome to calculate genome-wide Pn/Ps-ratio starting with a whole Genome-VCF file. I managed to do this for genes in candidate windows using PopGenome. But for the whole genome it did not work. When I try to verify synonymous and non-synonymous SNP's using the set.synnonsyn() function the output I get is not right, although the command runs without any error or warning. I think the problem is that I have several contigs. Maybe someone ran into the same issue and found a solution or a workaround. Here is the code I used (R version 3.6.0, PopGenome_2.7.5):

read in the VCF file:

genome<-readData("VCFScaffolds/", format="VCF", gffpath="GFFScaffolds", include.unknown=TRUE)

I have 10 chromosomes:

genome@region.names

Output: [1] "000008F|quiver" "000010F|quiver" "000016F|quiver" "000022F|quiver" [5] "000032F|quiver" "000037F|quiver" "000040F|quiver" "000045F|quiver" [9] "000047F|quiver" "000058F|quiver"

Now set syn. and nonsyn. SNPs:

genome_sNs<-set.synnonsyn(genome, ref.chr=genome.fasta)

Split data into coding sequences:

genome_sNs_split<-splitting.data(genome_sNs, subsites="coding", whole.data=FALSE)

Calculate diversity stats for all SNPs and syn. and nonsyn. SNPs separately

genome_sNs_split<-diversity.stats(genome_sNs_split)
genome_sNs_split_ns<-diversity.stats(genome_sNs_split, subsites="nonsyn")
genome_sNs_split_syn<-diversity.stats(genome_sNs_split, subsites="syn")

Extract nucleotide diversities

nuc_div_all<-genome_sNs_split@nuc.diversity.within[,1]
nuc_div_ns<-genome_sNs_split_ns@nuc.diversity.within[,1]
nuc_div_syn<-genome_sNs_split_syn@nuc.diversity.within[,1]

When I now compare the nucleotide diversity of gene_1 obtained from the whole genome analysis, like described here to the nucleotide diversity of gene_1 obtained from the candidate analysis where I read in a 12kb window via the read.VCF() function they match for all sites (nuc_div_all) but NOT for syn (nuc_div_ns), and nonsyn. (nuc_div_ns) subsites!

Does anyone have any idea what the problem might be? Any hints and ideas are appreciated!

vcf R set.synnonsyn synonymous-sites PopGenome • 693 views
ADD COMMENT

Login before adding your answer.

Traffic: 2555 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6