VCF file summary
6 weeks ago
rheab1230 ▴ 20

Hello everyone, I have VCF files for chr 1 to 22 and want to compute summary statistics for them: sample size, number of SNPs, mean, and standard deviation. Is there any software or tool that can plot graphs of these statistics? Thank you.
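
For the two simplest numbers asked about here (sample size and number of SNPs), a minimal Python sketch is shown below. It assumes a plain-text VCF in which the `#CHROM` header line lists the samples after the nine fixed columns, and it counts only biallelic single-base substitutions as SNPs. The example VCF and the function name `count_samples_and_snps` are made up for illustration and are not from any tool mentioned in this thread.

```python
import io

# Tiny made-up VCF, for illustration only (3 samples, 2 SNPs, 1 insertion).
EXAMPLE_VCF = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3
1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\t1/1
1\t200\t.\tC\tCT\t.\tPASS\t.\tGT\t0/1\t0/1\t0/0
1\t300\t.\tG\tT\t.\tPASS\t.\tGT\t0/0\t0/1\t0/0
"""


def count_samples_and_snps(handle):
    """Return (number of samples, number of biallelic SNP records)."""
    n_samples = 0
    n_snps = 0
    for line in handle:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Sample names follow the 9 fixed columns (CHROM ... FORMAT).
            n_samples = max(0, len(fields) - 9)
            continue
        ref, alt = fields[3], fields[4]
        # Count only single-base REF and single-base ALT as a SNP.
        if len(ref) == 1 and len(alt) == 1 and alt in "ACGT":
            n_snps += 1
    return n_samples, n_snps


n_samples, n_snps = count_samples_and_snps(io.StringIO(EXAMPLE_VCF))
print(n_samples, n_snps)
```

For a compressed file, the same function should work on a handle opened with `gzip.open(path, "rt")`; for 22 chromosomes, call it once per file and add up the SNP counts.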

vcf summary statistics SNP • 467 views

You should mention that you've tried the vcfstats software and are facing issues, and also point to your other post: error related to vcfstats


Yes, I tried vcfstats but it is showing an error. So now I am trying bcftools stats, and then plot-vcfstats to analyse the output produced by bcftools stats. But it is showing this error:

Parsing bcftools stats output: test.vchk
Plotting graphs: python plot.py
Traceback (most recent call last):
  File "plot.py", line 53, in <module>
    import matplotlib as mpl
ModuleNotFoundError: No module named 'matplotlib'
The command exited with non-zero status 256:
        python plot.py

 at /home/kxj190026/anaconda3/envs/chip-seq/bin/plot-vcfstats line 99.
        main::error("The command exited with non-zero status 256:\x{a}\x{9}python plot.py\x{a}\x{a}") called at /home/anaconda3/envs/chip-seq/bin/plot-vcfstats line 287
        main::plot(HASH(0x55a9a5d69e80)) called at /home/anaconda3/envs/chip-seq/bin/plot-vcfstats line 69

ModuleNotFoundError: No module named 'matplotlib'

Install matplotlib: https://matplotlib.org/stable/users/installing.html
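
The traceback shows plot-vcfstats running `python plot.py`, so matplotlib has to be importable by whichever `python` that command resolves to (here, presumably the one in the active conda environment). A quick check, as a sketch: the helper name `module_available` is made up for illustration, and the script should be run with the same `python` that plot-vcfstats uses.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


# Show which interpreter is running, then whether it can see matplotlib.
print("Interpreter:", sys.executable)
print("matplotlib available:", module_available("matplotlib"))
```

If this prints `False`, installing matplotlib into that same environment (for example with `conda install matplotlib` or `pip install matplotlib`) should resolve the `ModuleNotFoundError`.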


Thank you so much. I installed matplotlib and am able to do it now.


I got this as the summary for chr1.
I am not able to understand what these values mean. How do I extract the necessary information from this summary?
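
One way to pull the headline numbers out programmatically is to read the `SN` (summary numbers) lines of the bcftools stats text output, which are tab-separated as tag, set id, key, value. The sketch below assumes that layout; the excerpt and its counts are made-up placeholders, not the chr1 results from this thread, and `parse_summary_numbers` is a hypothetical helper name.

```python
# Made-up excerpt in the layout of the "SN" section of bcftools stats output.
# The counts are placeholders for illustration only.
EXAMPLE_STATS = """# SN\t[2]id\t[3]key\t[4]value
SN\t0\tnumber of samples:\t3
SN\t0\tnumber of records:\t120
SN\t0\tnumber of SNPs:\t100
SN\t0\tnumber of indels:\t20
"""


def parse_summary_numbers(text):
    """Map each SN key (without the trailing colon) to its integer value."""
    summary = {}
    for line in text.splitlines():
        if not line.startswith("SN\t"):
            continue  # skip comment lines and other sections
        _tag, _set_id, key, value = line.split("\t")[:4]
        summary[key.rstrip(":")] = int(value)
    return summary


summary = parse_summary_numbers(EXAMPLE_STATS)
print(summary["number of samples"], summary["number of SNPs"])
```

On a real file, the same function could be applied to the contents of the stats file (`test.vchk` in the log above), once per chromosome.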


I am not really able to understand how to get information from this summary. What do n and ts/tv mean? What does this summary tell me overall?


ts/tv is the ratio of transitions to transversions; see for instance: Ti Tv ratio and their usefulness in exome sequencing. n is the number of SNPs (2893523) or indels (109293), I think.
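
To make the definition concrete: a transition swaps a purine for a purine (A<->G) or a pyrimidine for a pyrimidine (C<->T), and every other single-base substitution is a transversion. A minimal sketch of the ratio, using made-up REF/ALT pairs and hypothetical helper names (`is_transition`, `ts_tv_ratio`):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if ref -> alt stays within purines or within pyrimidines."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for (ref, alt) pairs with ref != alt."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    return ts / tv if tv else float("nan")


# Made-up example: 4 transitions and 2 transversions.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(example))  # 4 / 2 = 2.0
```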
