keep samples that carry non-reference allele in list of variants snpeff/snpsift
1
0
Entering edit mode
2.2 years ago
curious ▴ 750

I am trying to get carriers of high impact variants of DUOX2 with the amazing snpeff/snpsift workflow

I annotate:

snpEff -Xmx64g "hg38" $sample.vcf.gz > $sample_annotated.vcf

I filter keeping variants with high impact in DUOX2

bcftools view  $sample_annotated.vcf  | SnpSift filter "( EFF[*].IMPACT = 'HIGH' ) & ( EFF[*].GENE = 'DUOX2' )" > $variant_filt.vcf

Now I want to keep only samples that have at least one non-homozygous reference genotype in any of the high impact variants. Is there a way to do this with snpeff or some other tool?

snpeff snpsift bcftools • 1.1k views
ADD COMMENT
2
Entering edit mode
2.2 years ago

after filtering the variant with snpEff, get the samples carrying an ALT using http://lindenb.github.io/jvarkit/BioAlcidaeJdk.html

java -jar ${JVARKIT_DIST}/bioalcidaejdk.jar -e 'stream().flatMap(V->V.getGenotypes().stream()).filter(G->G.isCalled() && G.getAlleles().stream().anyMatch(A->A.isNonReference())).map(G->G.getSampleName()).collect(Collectors.toSet()).stream().forEach(S->println(S));' filtered.snpsift.vcf > samples.txt

use samples with bcftools to extract those samples:

bcftools view --samples-file samples.txt -O u filtered.snpsift.vcf | bcftools view -i 'AC>0'
ADD COMMENT
0
Entering edit mode

That certainty looks like it is working, but I get error:

[WARN][SnpEffPredictionParser]no INFO[EFF] or no description. This VCF was probably NOT annotated with SnpEff (old version)

I know I could rerun snpEff with -formatEff, but your command is not using these annotations right? If I understand the assumption is that I am filtering on the snpeff annotations prior to your command and that your command is just passing genotypes correct?

ADD REPLY
1
Entering edit mode

this is just a warning. you can ignore this as your vcf is already filtered.

ADD REPLY
0
Entering edit mode

Thank you so much

ADD REPLY

Login before adding your answer.

Traffic: 2719 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6