I am new to TCGA data analysis. I would really appreciate your suggestions on these questions:
a) From TCGA VCF files, I would like to generate Manhattan plots and QQ plots to detect associations between SNPs and traits. I understand that a Manhattan plot needs the following information:
CHR: chromosome (aliases chr, chromosome)
BP: nucleotide location (aliases bp, pos, position)
SNP: SNP identifier (aliases snp, rs, rsid, rsnum, id, marker, markername)
P: p-value for the association (aliases p, pval, p-value, pvalue, p.value)
"CHR", "BP", and "SNP" are in the VCF files, but where do I get the "P" (p-value) column from?
For QQ plots, where do the observed and expected p-values come from?
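To illustrate what I have so far for part a), here is a minimal sketch of how I would pull "CHR", "BP", and "SNP" out of the fixed columns of a VCF. It assumes a plain-text, tab-separated VCF; the function name `vcf_to_rows` and the two records are made up for illustration and are not TCGA data. The "P" column is the part I do not know how to fill in.

```python
import io

def vcf_to_rows(handle):
    """Yield (CHR, BP, SNP) tuples from the first three VCF columns.

    Hypothetical helper for illustration; it does not compute p-values.
    """
    for line in handle:
        # Skip meta-information lines (##...), the header line (#CHROM...),
        # and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, vid = fields[0], int(fields[1]), fields[2]
        # VCF uses "." for a missing ID; fall back to a chrom:pos label.
        if vid == ".":
            vid = f"{chrom}:{pos}"
        yield chrom, pos, vid

# Made-up records, only to show the expected shape of the input.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t12345\trs0000001\tA\tG\t.\tPASS\t.\n"
    "chr2\t67890\t.\tC\tT\t.\tPASS\t.\n"
)
rows = list(vcf_to_rows(example))
for chrom, pos, snp in rows:
    print(chrom, pos, snp)
```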
b) What type of plot would best present the number of variants per tumor within each cancer type from the VCF files? Also, where in the VCF files can I find the tumor and cancer type information for the SNPs?
Thank you very much! DK