In one of my projects, I need to take the SNPs from TCGA (PAAD), convert them to PLINK format, and use them for further analysis. I ran into several issues:
- I was not able to capture all the sites while converting .maf to .vcf, even though I used exactly the same genome version as the reference build stated in the .maf file.
- I went ahead with the remaining sites, converted .maf to .vcf, and processed the files in PLINK. The problem then was a large amount of missing data at the individual level, so I ended up with no usable result.
- Which data type (SNPs or mutations) should I start from to work on the tumour and normal sample profiles in PAAD? Is it the .maf files, the copy number data, or GWAS data? I found many data type files at http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/PAAD/20160128/ Can anyone suggest which one to use?
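
To make the conversion step concrete, here is a minimal sketch of the kind of .maf to .vcf conversion described above. It is not the exact tool I ran. It assumes the MAF is tab-separated and uses the standard column names `Chromosome`, `Start_Position`, `Reference_Allele`, `Tumor_Seq_Allele2` and `Variant_Type`; the function name `maf_snps_to_vcf_lines` is hypothetical. It keeps only rows with `Variant_Type` equal to `SNP` and counts the rest, which is one way sites can go missing during conversion.

```python
import csv
import io


def maf_snps_to_vcf_lines(maf_text):
    """Convert the SNP rows of a tab-separated MAF into minimal, sites-only VCF lines.

    Assumes standard MAF column names (an assumption, check your file's header).
    Returns (vcf_lines, number_of_non_SNP_rows_skipped).
    """
    # MAF files may start with '#' comment lines (e.g. a version line); drop them.
    data_lines = (line for line in io.StringIO(maf_text) if not line.startswith("#"))
    rows = csv.DictReader(data_lines, delimiter="\t")

    sites = set()  # unique (chrom, pos, ref, alt) across all samples
    skipped = 0
    for row in rows:
        if row["Variant_Type"] != "SNP":
            skipped += 1  # indels and other types need left-padding, not handled here
            continue
        sites.add((
            row["Chromosome"],
            int(row["Start_Position"]),
            row["Reference_Allele"],
            row["Tumor_Seq_Allele2"],
        ))

    lines = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    # Plain tuple sort: chromosome as a string, then position as an integer.
    for chrom, pos, ref, alt in sorted(sites):
        lines.append(f"{chrom}\t{pos}\t.\t{ref}\t{alt}\t.\t.\t.")
    return lines, skipped
```

This sketch writes sites only, with no per-sample genotype columns, and chromosomes are sorted as strings rather than in karyotype order.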
I would appreciate any suggestions.