SelectVariant JEXL expression to select private variants from multisample VCF
1
0
Entering edit mode
7.4 years ago
hellbio ▴ 520

Dear all,

I have a multi-sample VCF file from which i need to extract shared private variants(variants shared only a specific group) . I have used the below command to find shared variants.

java -Xmx4g -jar GenomeAnalysisTK.jar -T SelectVariants -R ref.fa --variant goodAA.final.vcf -select 'vc.getGenotype("LJ1A2").isHomVar() && vc.getGenotype("LJA3").isHomVar()' > shared.vcf

This gives the shared variants in LJ1A2 and LJA3. And I can use the below expression to find shared private variants

'vc.getGenotype("LJ1A2").isHomVar() && vc.getGenotype("LJA3").isHomVar()  &&  vc.getGenotype("JLA4").isHomRef()'

this gives the shared hom varaints in LJ1A2, LJA3 and REF in JLA4. i.e. shared private hom variants with respect to one control (JLA4).

How can specify all the other samples (100's) in the vcf file to have '.isHomRef()' except those specified to have '.isHomVar' ?

It would be complex to specify each and every sample manually when there are >200 samples. COuld someone suggest a smoother way of doing this by using any wild cards?

gatk jexl • 2.8k views
ADD COMMENT
0
Entering edit mode
7.4 years ago

try using vcffilterjs https://github.com/lindenb/jvarkit/wiki/VCFFilterJS or picard/FilterVCF using a javascript expression. The script could look like this. (no tested)

function accept(ctx) {
for(var i=0;i< ctx.getNSamples();++i)
{
var g = ctx.getGenotype(i);
var sample = g.getSampleName();
if ( (sample=="LJ1A2"  || sample=="LJA3") && g.isHomVar()) continue;
if( g.isHomRef() || g.isNoCall()) continue;
return false;
}
return true;
}

accept(variant);
ADD COMMENT
0
Entering edit mode

Tried to use the above script using vcffilterjs. but the installation failed with long error message with few shown below:

lindenb/jvarkit/util/Pedigree.java:92: error: illegal start of type
    default int compareTo(final Family o) {
    ^/jvarkit/src/main/java/com/github/lindenb/jvarkit/util/Pedigree.java:92: error: = expected
    default int compareTo(final Family o) {
            ^
/jvarkit/src/main/java/com/github/lindenb/jvarkit/util/Pedigree.java:92: error: ';' expected
    default int compareTo(final Family o) {
               ^
/jvarkit/src/main/java/com/github/lindenb/jvarkit/util/Pedigree.java:92: error: illegal start of type
    default int compareTo(final Family o) {
                         ^
/jvarkit/src/main/java/com/github/lindenb/jvarkit/util/Pedigree.java:92: error: <identifier> expected
    default int compareTo(final Family o) {

Then tried with picard, it worked but the output looks wierd which writes all the variants in the input to the output in a different format for the FORMAT columns as shown below:

GT:FT:PL        0/0:LowDP;LowGQ:0,151,3000      0/0:LowDP;LowGQ:0,193,3840

which does not output any private variants.

Could you suggest further.

ADD REPLY
0
Entering edit mode

error: illegal start of type

The manual says : java 1.8 is required. Did you check that ?

Then tried with picard

picard adds an item in the FILTER column.

ADD REPLY
0
Entering edit mode

jvarkit: Yes, i did used Java 8:

`$ java -version
 Picked up _JAVA_OPTIONS: -Xmx10g
 java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)

make vcffilterjs`

Picard: Yes, it has AllGtsFiltered and AllGtsFiltered;JavaScript items in FILTER column. Then i picked the variants with AllGtsFiltered in FILTER column . And it does the trick. Thanks!!!

ADD REPLY
0
Entering edit mode

I tried to get heterozygous variants by using if( g.isHomRef() || g.isNoCall() || g.isHetVar()) which is not valid. and could not find a valid expression in the documentation. Any suggestion?

ADD REPLY
0
Entering edit mode

and could not find a valid expression in the documentation

doc about genotype is here: https://samtools.github.io/htsjdk/javadoc/htsjdk/htsjdk/variant/variantcontext/Genotype.html

so, instead of g.isHetVar() , try g.isHet()

ADD REPLY
0
Entering edit mode

Tried if( g.isHomRef() || g.isNoCall() || g.isHet()) and if( g.isHomRef() || g.isNoCall() || g.isHetNonRef()) . In both cases it worked without any error but all the input variants have AllGtsFiltered which does not make sense. The idea is to have all the other samples to be 0/0 or 0/1 or ./.

ADD REPLY
0
Entering edit mode
g.isHomRef() || g.isNoCall() || g.isHet()

would select ANY genotype but the g.isHomVar()

ADD REPLY
0
Entering edit mode

i did not understand your previous post. The aim is that the selected samples should have 1/1 and the rest can be 0/1 or 0/0 or ./.

ADD REPLY
0
Entering edit mode

could you briefly explain.

ADD REPLY

Login before adding your answer.

Traffic: 1607 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6