Question: how to find ratio of cases vs control per variant from a vcf file
0
gravatar for abhishek.abhishekkumar
5 months ago by
Germany
abhishek.abhishekkumar10 wrote:

how to find ratio of cases vs control per variant from a vcf file

vcf file looks like -

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Co1 Co2 Ca1 Ca2
1   15455   .   A   C   2599    PASS    AC=1;AF=6.227e-04;AN=1606;BaseQRankSum=-6.731e+00;CCC=1606;ClippingRankSum=1.27;DP=31258;FS=1.355;GQ_MEAN=99.94;GQ_STDDEV=84.82;HWP=1.0000;InbreedingCoeff=-0.0006;MLEAC=1;MLEAF=6.227e-04;MQ=60.00;MQ0=0;MQRankSum=1.79;NCC=0;QD=15.47;ReadPosRankSum=-1.258e+00;VQSLOD=0.451;culprit=MQ   GT:AD:DP:GQ:PL  0/0:26,0:26:78:0,78,987 0/0:28,0:28:62:0,62,1170    0/0:49,0:49:99:0,120,1800   0/0:26,0:26:78:0,78,1007
1   15456   .   C   T   2276.10 PASS    AC=753;AF=0.469;AN=1606;BaseQRankSum=-5.629e+00;CCC=1606;ClippingRankSum=-4.790e-01;DB;DP=145744;FS=1.372;GQ_MEAN=1472.49;GQ_STDDEV=1218.73;HWP=0.7663;InbreedingCoeff=0.0124;MLEAC=753;MLEAF=0.469;MQ=59.72;MQ0=0;MQRankSum=0.072;NCC=0;POSITIVE_TRAIN_SITE;QD=19.85;ReadPosRankSum=0.753;VQSLOD=5.00;culprit=QD   GT:AD:DP:GQ:PL  0/1:109,66:175:99:1782,0,3496   0/1:116,93:209:99:2547,0,3609   0/1:133,115:248:99:3342,0,4025  0/1:106,84:190:99:2343,0,3387
1   154558  .   G   A   3620    PASS    AC=1;AF=6.227e-04;AN=1606;BaseQRankSum=-8.649e+00;CCC=1606;ClippingRankSum=-9.750e-01;DP=115668;FS=1.869;GQ_MEAN=123.41;GQ_STDDEV=97.83;HWP=1.0000;InbreedingCoeff=-0.0006;MLEAC=1;MLEAF=6.227e-04;MQ=59.74;MQ0=0;MQRankSum=-6.000e-03;NCC=0;QD=17.00;ReadPosRankSum=0.820;VQSLOD=2.75;culprit=InbreedingCoeff  GT:AD:DP:GQ:PL  0/0:117,0:117:99:0,120,1800 0/0:116,0:116:99:0,120,1800 0/0:201,0:201:99:0,120,1800 0/0:121,0:121:99:0,120,1800
case vs control vcf • 230 views
ADD COMMENTlink modified 5 months ago • written 5 months ago by abhishek.abhishekkumar10
2

Hi abhishek.abhishekkumar,

People will be more eager to help if you show what you tried and ensure us that you also have put effort in this question. We would be happy to point out your mistakes or put you back on track.

In addition, I have converted your post from a 'forum' to a 'question', because that's what it is.

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post that button is in your toolbar, see image below:

101010 Button

Finally: Interesting guidelines for posting can be found in the following posts:

Cheers,
Wouter

ADD REPLYlink modified 5 months ago • written 5 months ago by WouterDeCoster22k

Thanks for the multiple answers - Pierre, as many of your ideas, worked for me in past, so special thanks.

ADD REPLYlink modified 5 months ago • written 5 months ago by abhishek.abhishekkumar10

Please use ADD COMMENT or ADD REPLY to answer to previous reactions, as such this thread remains logically structured and easy to follow. I have now moved your post but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.

ADD REPLYlink written 5 months ago by WouterDeCoster22k
2
gravatar for Pierre Lindenbaum
5 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum99k wrote:

using bioalcidae: http://lindenb.github.io/jvarkit/BioAlcidae.html

script:

var sample2status={
    "M10475":0,
    "M10478":1,
    "M10500":1
    };
out.println("contig pos ref aff unaff");
while(iter.hasNext())
    {
    var vc = iter.next();
    var affected=0;
    var unaffected=0;
    var total=0.0;
    for(var sample in sample2status)
        {
        var genotype= vc.getGenotype(sample);
        if( genotype == null || !genotype.isCalled()) continue;
        var alleles = genotype.getAlleles();
        for(var i=0;i< alleles.size();++i)
            {
            var allele = alleles.get(i);
            if(allele.isNoCall()) continue;
            total++;
            if(allele.isReference()) continue;
            if(sample2status[sample]==1)
                {
                affected++;
                }
            else
                {
                unaffected++;
                }
            }
        }
    if(total==0) continue;
    out.println(vc.getContig()+" "+vc.getStart()+" "+vc.getReference().getDisplayString()+" "+(affected/total)+" "+(unaffected/total));
    }

invoke:

$ curl -s "https://raw.githubusercontent.com/arq5x/gemini/master/test/test.region.vep.vcf"  | java -jar dist/bioalcidae.jar -f script.js -F VCF | column -t 

contig  pos       ref  aff                  unaff
chr1    10001     T    0.3333333333333333   0.16666666666666666
chr1    10056     A    0.16666666666666666  0
chr1    10121     A    0                    0
chr1    10177     A    0.16666666666666666  0.16666666666666666
chr1    10180     T    0.3333333333333333   0
chr1    10234     C    0.16666666666666666  0.16666666666666666
chr16   72057282  A    0.3333333333333333   0.3333333333333333
chr16   72057435  C    0                    0.16666666666666666
chr16   72059269  T    1                    0

shameless self promotion, see also:

     Alleles
     +-----+-----+-----+-------+--------+----+----+-----+-------------+---------------+---------+-----------+
     | Idx | REF | Sym | Bases | Length | AC | AN | AF  | AC_affected | AC_unaffected | AC_male | AC_female |
     +-----+-----+-----+-------+--------+----+----+-----+-------------+---------------+---------+-----------+
     | 0   | *   |     | T     | 1      | 4  | 8  | 0.5 | 2           | 1             | 1       | 2         |
     | 1   |     |     | TC    | 2      | 4  | 8  | 0.5 | 2           | 1             | 1       | 2         |
     +-----+-----+-----+-------+--------+----+----+-----+-------------+---------------+---------+-----------+

enter image description here

ADD COMMENTlink modified 5 months ago • written 5 months ago by Pierre Lindenbaum99k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1277 users visited in the last hour