help needed for understanding plinkseq output
2
0
7.4 years ago
peris ▴ 120

Hi all,

I have the following plinkseq output. Can anyone help me understand it? Also, given that this is a case-control study with 1000 samples in each group, which test should I use?

NVAR    TEST    P    I    DESC
19    BURDEN    0.375    0.0434783    231/224
27    BURDEN    0.0440678    0.00340136    81/62
9    BURDEN    1    0.2    38/46
26    BURDEN    0.714286    0.166667    799/824
10    BURDEN    0.714286    0.166667    15/16
2    BURDEN    0.5    0.2    1/1
6    BURDEN    0.833333    0.2    2/4
26    BURDEN    0.102362    0.00793651    79/66
7    BURDEN    1    0.2    604/663
9    BURDEN    0.428571    0.05    533/542
14    BURDEN    0.00953029    0.000681199    12/3
5    BURDEN    0.538462    0.333333    5/4
36    BURDEN    0.000788777    5.63444e-05    706/615
1    BURDEN    0.833333    0.2    378/381

case-control plinkseq snp next-gen • 2.5k views
2
7.4 years ago

You need to read the manual for the software to know what it does. If you don't understand the procedure, then chances are your input data is malformed too, and the results will be wrong.

From the output, I'd guess it has performed "BURDEN" tests over segments of NVAR SNPs and reported a P < 0.01 hit over 14 SNPs, with 12 cases and 3 controls sharing the sequence.

The Plinkseq documentation page for association tests shows your BURDEN table with columns of genomic locations.

They explain that the I column represents the diversity at the locus, with high values indicating fewer unique sequence observations. Therefore your low-P, low-I regions are good candidates for something occurring more often in cases than in controls.

0

Thanks Karl.

0

Karl, can the DESC column exceed 1000/1000 in the above example?

For my result, with only 12 cases and 28 controls:

LOCUS                       POS       ALIAS NVAR   TEST         P    I    DESC
NM_000016   chr1:76190072..76229221    34,ACADM   64 BURDEN 1.0000000  0.20000000 313/147


Any suggestions?

0

I'm not sure what the DESC column is. Subject totals was a guess. It doesn't really make sense to have P = 1.0 when 313 cases and 147 controls carry a 64-SNP haplotype.

1
7.4 years ago
zx8754 10k

The LOCUS, POS, and ALIAS columns are missing from the output.

Interpretation: this is pseq burden test output (a burden test looks for an excess of rare alleles in cases compared to controls) for 14 loci, each containing NVAR SNPs. For example, at the locus on the first line, 19 SNPs are grouped together for the burden test.
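To make the idea of a burden test concrete, here is a minimal Python sketch. It assumes a very simple statistic (minor alleles in cases minus minor alleles in controls) and plain label permutation. The function name `burden_permutation_p` and its parameters are made up for illustration, and this is not how pseq itself is implemented.

```python
import random


def burden_permutation_p(allele_counts, is_case, n_perm=999, seed=0):
    """Empirical one-sided p-value for an excess of minor alleles in cases.

    allele_counts: per-sample total of rare minor alleles at the locus
    is_case:       per-sample booleans (True = case, False = control)
    Both the statistic and the permutation scheme are illustrative assumptions.
    """
    rng = random.Random(seed)

    def statistic(labels):
        # Minor alleles carried by cases minus those carried by controls.
        cases = sum(c for c, lab in zip(allele_counts, labels) if lab)
        controls = sum(c for c, lab in zip(allele_counts, labels) if not lab)
        return cases - controls

    observed = statistic(is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        # Shuffle case/control labels to simulate the null hypothesis.
        rng.shuffle(labels)
        if statistic(labels) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: all rare alleles sit in the five cases.
counts = [2, 1, 1, 2, 1, 0, 0, 0, 0, 0]
status = [True] * 5 + [False] * 5
print(burden_permutation_p(counts, status))
```

With the toy data the p-value comes out small because the rare alleles are concentrated in cases. If every sample carried the same number of alleles, every permutation would tie with the observed statistic and the p-value would be 1.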

"The P field is based on permutation, the empirical significance. The I field indicates the proportion of null replicates for which the best test statistic was tied. ... The DESC field contains the number of case/control minor alleles." - pseq gene-based tests
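Going by that description, a table like the one in the question can be loaded and the DESC field split into case and control minor allele counts. This is a rough Python sketch that assumes whitespace-separated columns with a header line; `parse_burden_rows` is a hypothetical helper, not part of pseq.

```python
def parse_burden_rows(text):
    """Parse whitespace-separated pseq association output into dicts.

    Assumes the first non-empty line is the header and that DESC looks like
    "cases/controls" (minor allele counts), as in the quoted documentation.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        rec = dict(zip(header, line.split()))
        case_alleles, control_alleles = (int(x) for x in rec["DESC"].split("/"))
        rows.append({
            "nvar": int(rec["NVAR"]),
            "test": rec["TEST"],
            "p": float(rec["P"]),
            "i": float(rec["I"]),
            "case_alleles": case_alleles,
            "control_alleles": control_alleles,
        })
    return rows


# Three rows copied from the output in the question.
sample = """NVAR TEST P I DESC
14 BURDEN 0.00953029 0.000681199 12/3
36 BURDEN 0.000788777 5.63444e-05 706/615
9 BURDEN 1 0.2 38/46
"""

rows = parse_burden_rows(sample)
# List loci from the smallest empirical p-value upwards.
for r in sorted(rows, key=lambda r: r["p"]):
    print(r["nvar"], r["p"], r["case_alleles"], r["control_alleles"])
```

Because DESC counts minor alleles rather than subjects, and the counts appear to be summed over all NVAR variants at the locus (the 313/147 line for 12 cases and 28 controls points that way), the numbers can be larger than the number of samples.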

pseq 0.09 version includes SKAT test as well.

0

Dear Tokhir,

Thanks for your answer. If you don't mind, could you please let me know when I should use the calpha test and when the burden test for a case-control study? I am new to human genetics as well as biostatistics, so I am trying to get an idea about this.

Also, I could not find this "pseq 0.09 version includes SKAT test as well" in the pseq notes.