8.9 years ago
cristina_sabiers
Well, I'm back with silly questions...
Is there an easy guide to understanding how to filter my VCF files? I have exome sequencing data. I filtered my VCF file to keep only SNPs and to exclude QD<2 (I need to start somewhere), even though I still have over 35,000 genes (too many for my brain to study).
QUAL ranges from 100 to 3000. If I'm not wrong, higher is better, but which ones can I discard... under 1000... 800...?
The header of my VCF table has many different fields:
 [1]CHROM   [2]POS  [3]REF  [4]ALT  [5]QUAL [6]GENE [7]GT   [8]GQ   [9]FILTER   [10]AF  [11]AO  [12]BKPTID  [13]CDF_LD  [14]CDF_MAPD    [15]CIEND   [16]CIPOS   [17]CONFIDENCE  [18]DP  [19]END [20]FAO [21]FDP [22]FR  [23]FRO [24]FSAF    [25]FSAR    [26]FSRF    [27]FSRR    [28]FWDB    [29]FXX [30]HOMLEN  [31]HOMSEQ  [32]HRUN    [33]HS  [34]LEN [35]MEINFO  [36]MLLD    [37]NS  [38]NUMTILES    [39]OALT    [40]OID [41]OMAPALT [42]OPOS    [43]OREF    [44]PRECISE [45]PRECISION   [46]QD  [47]RBI [48]REFB    [49]REVB    [50]RO  [51]SAF [52]SAR [53]SRF [54]SRR [55]SSEN    [56]SSEP    [57]SSSB    [58]STB [59]STBP    [60]SVLEN   [61]SVTYPE  [62]TYPE    [63]VARB    [64]FUNC    [65]SF
If anyone can provide a link where I can learn easily how to reduce my vcf file I would really appreciate it.
Thanks
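Before picking a QUAL cutoff, it can help to see how QUAL is actually distributed and how many distinct genes remain. The following is a minimal sketch, assuming the table is tab-delimited with header cells of the form `[5]QUAL` as in the column list above. The function names (`read_query_table`, `summarize`) and the example rows are hypothetical, not taken from the real file.

```python
import csv
import io
import re
import statistics

def read_query_table(text):
    """Parse a tab-delimited table whose header cells look like '[5]QUAL'."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    # Drop an optional leading '#' and the '[n]' index prefix from each header cell.
    header = [re.sub(r"^#?\s*\[\d+\]", "", cell.strip()) for cell in rows[0]]
    return [dict(zip(header, row)) for row in rows[1:] if row]

def summarize(records):
    """Count variants and distinct genes, and report the spread of QUAL."""
    quals = sorted(float(r["QUAL"]) for r in records)
    genes = {r["GENE"] for r in records if r["GENE"] not in ("", ".")}
    return {
        "n_variants": len(quals),
        "n_genes": len(genes),
        "qual_min": quals[0],
        "qual_median": statistics.median(quals),
        "qual_max": quals[-1],
    }

# Made-up rows, only to show the expected layout.
example = (
    "[1]CHROM\t[2]POS\t[5]QUAL\t[6]GENE\n"
    "chr1\t100\t150.0\tGENE_A\n"
    "chr1\t200\t900.0\tGENE_A\n"
    "chr2\t300\t3000.0\tGENE_B\n"
)
summary = summarize(read_query_table(example))
print(summary)
```

Looking at the median and the extremes first makes it easier to judge whether a cutoff such as 800 or 1000 would remove a handful of variants or most of them.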
Have a look here, especially if you used GATK.
https://software.broadinstitute.org/gatk/best-practices/
Thanks Fabio. I didn't generate these VCF files myself; sadly my PC can't handle that kind of job (I tried once and it was hell). I just got this VCF from a company, and I don't think they used GATK; they have their own program.
Thanks for the link.
That depends heavily on the aim of your analysis: looking for a causal variant, an eQTL study, population genetics, ...
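Whatever the aim, the starting filter described in the question (SNPs only, drop QD<2, optionally a QUAL cutoff) can be sketched in plain Python. This is only an illustration under stated assumptions: the VCF is uncompressed text, QD is stored in the INFO column, and records without a usable QD are dropped. The `min_qual=200` value and the example records are hypothetical and are not a recommended threshold; the right cutoff depends on the caller and on the goal of the study.

```python
def parse_info(info_field):
    """Turn 'QD=12.5;DP=80;FLAG' into {'QD': '12.5', 'DP': '80', 'FLAG': True}."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info

def is_snp(ref, alt):
    """True when REF and every ALT allele are single bases."""
    bases = set("ACGT")
    return ref.upper() in bases and all(a.upper() in bases for a in alt.split(","))

def filter_vcf(lines, min_qual, min_qd):
    """Yield header lines unchanged, plus SNP records passing both thresholds."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        ref, alt, qual = fields[3], fields[4], fields[5]
        info = parse_info(fields[7])
        if not is_snp(ref, alt):
            continue
        if qual == "." or float(qual) < min_qual:
            continue
        qd = info.get("QD")
        # Assumption: a record with no numeric QD value is discarded.
        if not isinstance(qd, str) or float(qd) < min_qd:
            continue
        yield line

# Made-up records, only to show the behaviour.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t1500\tPASS\tQD=12.0;DP=120",   # kept
    "chr1\t200\t.\tA\tG\t1500\tPASS\tQD=1.5;DP=900",    # QD below 2
    "chr1\t300\t.\tAT\tA\t2000\tPASS\tQD=20.0;DP=100",  # indel, not a SNP
    "chr2\t400\t.\tC\tT\t120\tPASS\tQD=8.0;DP=15",      # QUAL below cutoff
]
kept = list(filter_vcf(example, min_qual=200, min_qd=2.0))
print("\n".join(kept))
```

For a real file, the same function can be fed an open file handle instead of the `example` list. Dedicated tools handle compressed input and multi-sample files more robustly, so this sketch is mainly useful for understanding what each threshold removes.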