Question: VCF with SNPs: count the number of sites with 100% missing genotypes
bwczech wrote (17 months ago):

Hey,

I am using VCFtools for SNP filtering. I need to count (or extract) the variants that have 100% missing genotypes (I have a multi-sample VCF).

In VCFtools I know the option --max-missing, but it only lets me set a maximum amount of missing data, not a minimum.

This is important for me because I have one big VCF with SNPs from 8 breeds, and I would like to know exactly how many SNPs are present in each breed.

Thank you in advance!

Jeremy Leipzig wrote (17 months ago):

Genomic loci that have 100% missing coverage (or only reference calls) would not show up in a VCF anyway.

chrchang523 wrote (17 months ago):

You can dump the list of SNPs which are less than 100% missing, and then --exclude that list.
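As a rough sketch of the "dump the list" step without any extra software, the awk function below prints the chromosome and position of every site that has at least one called genotype. The function name `called_positions` is made up for illustration, and the code assumes an uncompressed, tab-separated VCF in which GT is the first FORMAT subfield; a genotype is treated as called when its GT contains at least one allele index.

```sh
#!/bin/sh
# Hypothetical helper (not part of vcftools or PLINK).
# Prints CHROM<TAB>POS for every site that is less than 100% missing.
called_positions() {
    awk -F'\t' '
        /^#/ { next }                        # skip meta and header lines
        {
            for (i = 10; i <= NF; i++) {     # sample columns start at column 10
                split($i, f, ":")            # f[1] is the GT subfield
                if (f[1] ~ /[0-9]/) {        # an allele index means the call is not missing
                    print $1 "\t" $2
                    next                     # one called sample is enough
                }
            }
        }
    ' "$@"
}
```

Something like `called_positions breed.vcf > called.txt` would produce the list. vcftools documents an `--exclude-positions` option that takes a tab-separated chromosome and position file, so the list could probably be used there to leave only the entirely missing sites; check that your version supports it.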


Which command lets me exclude the 100% missing data?

(reply written 17 months ago by bwczech)
bwczech wrote (17 months ago):

See my question above. As I said, I am extracting each breed from a multi-sample file, and this produces empty variants. My VCF file contains many breeds. When I extract just one of them, the resulting file contains variants with no genotype calls, because the SNP at that position was detected in another breed.
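One way to get the per-breed number directly from the multi-sample file, without extracting each breed first, might look like the sketch below. The function name `count_breed_sites` and the idea of a plain text file with one sample name per line are assumptions made for illustration. The code assumes an uncompressed VCF with GT as the first FORMAT subfield and a non-empty sample list. Note that a homozygous reference call (0/0) still counts as called here.

```sh
#!/bin/sh
# Hypothetical helper: count the sites where at least one sample of a breed
# has a called genotype.
#   $1: text file with one sample name per line (the samples of one breed)
#   $2: uncompressed multi-sample VCF
count_breed_sites() {
    awk -F'\t' '
        NR == FNR { if ($1 != "") breed[$1] = 1; next }   # first file: sample names
        /^##/ { next }                                    # skip meta lines
        /^#CHROM/ {                                       # header: find the breed columns
            for (i = 10; i <= NF; i++) if ($i in breed) cols[i] = 1
            next
        }
        {
            for (i in cols) {
                split($i, f, ":")                         # f[1] is the GT subfield
                if (f[1] ~ /[0-9]/) { n++; break }        # site is not 100% missing in this breed
            }
        }
        END { print n + 0 }
    ' "$1" "$2"
}
```

Running `count_breed_sites breedA_samples.txt all_breeds.vcf` once per breed would give one count per breed; the number of sites that are 100% missing in a breed is then the total number of records minus that count.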

Pierre Lindenbaum wrote (17 months ago):

Using vcffilterjdk (http://lindenb.github.io/jvarkit/VcfFilterJdk.html), keep only the variants for which no genotype is called:

 java -jar dist/vcffilterjdk.jar -e 'return !variant.getGenotypes().stream().anyMatch(G->G.isCalled());' in.vcf

I cannot use it on my server, because I cannot install new software there. Is there any other way to get the 100% missing data, using bash, vcftools, Beagle or something else?

(reply written 17 months ago by bwczech)

Regarding "I can not install new softs":

You don't have to install it on the server; you can always install it just for yourself (in fact, it is installed wherever it is compiled).

(reply written 17 months ago by Pierre Lindenbaum)

But it will be impossible to run this on my files, which are on the server. I can't download them, because the file is too big :(

(reply written 17 months ago by bwczech)
wget -O - "http://a.c.d/path/to/file.vcf.gz" | gunzip -c |  java -jar dist/vcffilterjdk.jar -e 'return !variant.getGenotypes().stream().anyMatch(G->G.isCalled());'
(reply written 17 months ago by Pierre Lindenbaum)
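For a server where only standard tools are available, the same test can be approximated in awk. This is only a sketch: the function name `all_missing_records` is hypothetical, it assumes GT is the first FORMAT subfield, and it treats a genotype as called when GT contains at least one allele index, which is meant to mirror the `isCalled()` check used above rather than reproduce it exactly. It reads a file or standard input, keeps the header, and prints only the records in which every genotype is missing.

```sh
#!/bin/sh
# Hypothetical helper: keep the VCF header and the records where no sample
# has a called genotype. Reads the named files, or stdin when none is given.
all_missing_records() {
    awk -F'\t' '
        /^#/ { print; next }                 # pass meta and header lines through
        {
            for (i = 10; i <= NF; i++) {     # sample columns start at column 10
                split($i, f, ":")            # f[1] is the GT subfield
                if (f[1] ~ /[0-9]/) next     # a called genotype: drop this record
            }
            print                            # every genotype was missing
        }
    ' "$@"
}
```

Because it works on a stream, it can sit at the end of a pipeline, for example `gunzip -c in.vcf.gz | all_missing_records | grep -vc '^#'` to count the entirely missing sites, where `in.vcf.gz` stands for your own file.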