Question: VCF with SNPs - count the number of sites with 100% missing genotypes
bwczech wrote (22 months ago):

Hey,

I am using VCFtools for SNP filtering. I need to count (or extract) the variants that have 100% missing genotypes (I have a multisample VCF).

In VCFtools I know the option --max-missing, but with it I can only define a maximum missingness, not a minimum.

This matters to me because I have one big VCF with SNPs from 8 breeds, and I would like to know exactly how many SNPs are present in each breed.

Thank you in advance!

Jeremy Leipzig wrote (22 months ago):

Genomic loci that have 100% missing coverage (or only reference calls) would not show up in a VCF anyway.

chrchang523 wrote (22 months ago):

You can dump the list of SNPs which are less than 100% missing, and then --exclude that list.

bwczech replied (22 months ago):

Which command lets me exclude these 100% missing sites?
bwczech wrote (22 months ago):

See above. As I said, I am extracting each breed from the multisample file, and this produces empty variants. My VCF file contains many breeds. When I extract just one, the resulting file contains sites with no genotype call in that breed, even though a SNP was detected at the same position in another breed.
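
The per-breed count described here can be sketched in plain Python (standard library only). This is an illustration under assumptions, not a vcftools feature: it assumes tab-separated VCF text with a GT entry in the FORMAT column, treats a genotype as called when at least one allele is not `.`, and takes a hypothetical `breed_of` dictionary mapping sample names to breed labels. The function names are made up for the example.

```python
import re
from collections import Counter


def gt_called(sample_field, gt_index):
    """Return True if the GT subfield has at least one allele other than '.'."""
    parts = sample_field.split(":")
    if gt_index >= len(parts):
        return False
    alleles = re.split(r"[/|]", parts[gt_index])
    return any(a not in (".", "") for a in alleles)


def snps_per_breed(lines, breed_of):
    """Return (total_sites, Counter of sites with >= 1 called genotype per breed).

    breed_of is an assumed mapping {sample_name: breed_label}; samples that
    are not in the mapping are ignored.
    """
    total = 0
    called = Counter()
    samples = []
    for line in lines:
        if line.startswith("##") or not line.strip():
            continue  # meta-information or blank line
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10:
            continue  # no genotype columns on this line
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue  # nothing to test without a GT entry
        gt_index = fmt.index("GT")
        total += 1
        breeds_seen = set()
        for name, field in zip(samples, fields[9:]):
            breed = breed_of.get(name)
            if breed is not None and gt_called(field, gt_index):
                breeds_seen.add(breed)
        called.update(breeds_seen)  # each breed counted once per site
    return total, called
```

With these results, the number of sites that are 100% missing in a breed is `total - called[breed]`. Note that this counts sites with any called genotype, including homozygous reference; counting only sites where the breed carries an alternate allele would need a different test on the alleles.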

Pierre Lindenbaum wrote (22 months ago):

Using vcffilterjdk (http://lindenb.github.io/jvarkit/VcfFilterJdk.html), keep only the variants where no genotype is called:

 java -jar dist/vcffilterjdk.jar -e 'return !variant.getGenotypes().stream().anyMatch(G->G.isCalled());' in.vcf
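
Where jvarkit cannot be run, the same test (keep a site only when no sample has a called genotype) can be sketched in plain Python with the standard library. This is a rough equivalent, not the vcffilterjdk implementation: it assumes tab-separated VCF text with a GT entry in the FORMAT column and treats a genotype as called when at least one allele is not `.`. The names `is_called`, `all_missing_sites`, and `count_all_missing` are made up for illustration.

```python
import re


def is_called(sample_field, gt_index):
    """Return True if the GT subfield has at least one allele other than '.'."""
    parts = sample_field.split(":")
    if gt_index is None or gt_index >= len(parts):
        return False
    alleles = re.split(r"[/|]", parts[gt_index])
    return any(a not in (".", "") for a in alleles)


def all_missing_sites(lines):
    """Yield the VCF data lines in which no sample has a called genotype."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample columns, nothing to test
        fmt = fields[8].split(":")
        gt_index = fmt.index("GT") if "GT" in fmt else None
        if not any(is_called(s, gt_index) for s in fields[9:]):
            yield line


def count_all_missing(lines):
    """Count the sites that are 100% missing across all samples."""
    return sum(1 for _ in all_missing_sites(lines))
```

For a gzipped file, something like `count_all_missing(gzip.open("in.vcf.gz", "rt"))` would read it line by line without decompressing it to disk (`in.vcf.gz` is a placeholder name).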

bwczech replied (22 months ago):

I cannot use it on my server, because I cannot install new software. Is there any other way to get this 100% missing data, using bash, vcftools, Beagle, or another tool?

Pierre Lindenbaum replied (22 months ago):

"I can not install new softs"

You don't have to install it system-wide on the server; you can always install it just for yourself. (In fact, it is installed wherever it is compiled.)

bwczech replied (22 months ago):

But it will be impossible to run this on my files, which are on the server. I can't download them, because the files are too big :(
Pierre Lindenbaum replied (22 months ago):

 wget -O - "http://a.c.d/path/to/file.vcf.gz" | gunzip -c | java -jar dist/vcffilterjdk.jar -e 'return !variant.getGenotypes().stream().anyMatch(G->G.isCalled());'