Question: TCGA MAF to VCF file conversion
5 weeks ago, swethabiochem wrote:

Hello all, I want to convert a TCGA MAF file to VCF for analysis with PLINK. I ran the conversion with maf2vcf.pl and got a VCF file, but the converted file is missing the phenotype data and some gene IDs. Because of this, I am not able to proceed with the PLINK analysis. Please suggest how to do the conversion correctly or how to add the phenotype data. Thank you.

snp • 239 views
modified 5 weeks ago • written 5 weeks ago by swethabiochem

Hello Sir, thanks for the reply and suggestion. Below is the Perl command I used to convert the MAF to VCF:

perl maf2vcf.pl --input-maf /home/swetha/Desktop/GWAS/TCGA.HNSC.varscan.5296cf00-4d8c-4db3-80d7-930a4b44f90d.DR-10.0.somatic.maf --ref /home/swetha/vep/ensembl-vep-release-97/ GRCh38.d1.vd1.fa --output- /home/swetha/Desktop//GWAS/TCGA.vcf

It produced the file TCGA.HNSC.varscan.5296cf00-4d8c-4db3-80d7-930a4b44f90d.DR-10.0.somatic.vcf and a TSV sample file:

TCGA.HNSC.varscan.5296cf00-4d8c-4db3-80d7-930a4b44f90d.DR-10.0.somatic.pairs.tsv

Now I want to create a custom .fam file. How can I use this .tsv sample file to generate the .fam file for the individual samples? Please let me know the best way. If I generate a custom .fam file, will it cause any problem with how it links to the .bim and .bed files in further processing?

modified 5 weeks ago by Kevin Blighe (48k) • written 5 weeks ago by swethabiochem

You will generate the FAM file on your own, by studying how it is structured: https://www.cog-genomics.org/plink/1.9/formats#fam

modified 5 weeks ago • written 5 weeks ago by Kevin Blighe (48k)

I am still not able to generate the .fam file because I don't have the patients' family details; I only have gender phenotype values. I am new to this work and don't know how to proceed. Could you explain how I can do this? Thanks.

written 5 weeks ago by swethabiochem

You do not need an FID (family ID) in order to use PLINK. Also, this process is simply not something that is easy to explain on a web forum like Biostars. I have already provided enough details in my previous answer: linkage disequilibrium analysis

written 5 weeks ago by Kevin Blighe (48k)
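
For readers following the FAM discussion above, here is a minimal Python sketch of writing .fam lines for unrelated samples. The six columns (FID, IID, paternal ID, maternal ID, sex, phenotype) and the codes 0 and -9 for missing values come from the PLINK 1.9 format page linked earlier. Everything else is an assumption for illustration: the sample names are hypothetical, the pairs file is assumed to hold tab-separated sample IDs with optional '#' header lines, and FID is set equal to IID, which only matches PLINK's output if the VCF was imported that way (for example with --double-id).

```python
import csv
import io

# PLINK sex codes: 1 = male, 2 = female, 0 = unknown
SEX_CODES = {"male": "1", "female": "2"}


def read_pair_ids(pairs_text):
    """Collect sample IDs from a pairs TSV, keeping first-seen order.

    Assumption: tab-separated columns of sample IDs, with optional
    header lines that start with '#'.
    """
    ids = []
    for row in csv.reader(io.StringIO(pairs_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        for sample_id in row:
            if sample_id and sample_id not in ids:
                ids.append(sample_id)
    return ids


def build_fam_lines(sample_ids, sex_by_sample, pheno_by_sample):
    """Return one .fam line per sample, with no family information."""
    lines = []
    for sid in sample_ids:
        sex = SEX_CODES.get(sex_by_sample.get(sid, "").lower(), "0")
        pheno = str(pheno_by_sample.get(sid, -9))  # -9 = missing phenotype
        # FID = IID, and both parents set to 0 (unknown)
        lines.append(" ".join([sid, sid, "0", "0", sex, pheno]))
    return lines


# Hypothetical inputs, for illustration only
pairs_text = "#tumor\tnormal\nS1-T\tS1-N\nS2-T\tS2-N\n"
sex = {"S1-T": "male", "S1-N": "male", "S2-T": "female"}
pheno = {"S1-T": 2, "S1-N": 1}

fam_lines = build_fam_lines(read_pair_ids(pairs_text), sex, pheno)
print("\n".join(fam_lines))
```

The order of the lines still has to match the sample order in the PLINK fileset, so the ID list would normally be taken from the .fam that PLINK itself wrote rather than from the pairs file.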
5 weeks ago, Kevin Blighe (48k) wrote:

To better help, you should show all commands that you have used.

Some general advice: when converting a VCF to PLINK format, PLINK will not know what the phenotypes are. You need to create a custom FAM file that exactly matches the order of samples in your PLINK object; you can then use this FAM file in all downstream analyses via the --fam command-line parameter.
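
One way to check the "matches exactly" requirement is to compare the FID and IID columns of the .fam that PLINK wrote against the custom one, line by line. The sketch below is only an illustration under that reading; the function names and sample IDs are hypothetical.

```python
def fam_ids(fam_text):
    """Return the (FID, IID) pair from each non-empty .fam line."""
    return [tuple(line.split()[:2]) for line in fam_text.splitlines() if line.strip()]


def first_order_mismatch(plink_fam_text, custom_fam_text):
    """Return the 0-based index of the first differing sample, or None.

    If one file is a prefix of the other, the index of the first
    unmatched line is returned.
    """
    plink_ids = fam_ids(plink_fam_text)
    custom_ids = fam_ids(custom_fam_text)
    for i, (a, b) in enumerate(zip(plink_ids, custom_ids)):
        if a != b:
            return i
    if len(plink_ids) != len(custom_ids):
        return min(len(plink_ids), len(custom_ids))
    return None


# Hypothetical contents, for illustration only
plink_fam = "S1 S1 0 0 0 -9\nS2 S2 0 0 0 -9\n"
good_fam = "S1 S1 0 0 1 2\nS2 S2 0 0 2 1\n"
swapped_fam = "S2 S2 0 0 2 1\nS1 S1 0 0 1 2\n"

print(first_order_mismatch(plink_fam, good_fam))
print(first_order_mismatch(plink_fam, swapped_fam))
```

Only the ID columns are compared, because the sex and phenotype columns are exactly what the custom file is meant to change.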

Please note: PLINK is NOT guaranteed to store your samples in the same order as they appear in the VCF. You can control this, however, via the --indiv-sort file parameter when converting. Please read my previous answer, here: linkage disequilibrium analysis

Kevin

written 5 weeks ago by Kevin Blighe (48k)
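
As a rough illustration of the --indiv-sort file approach mentioned in the answer, the Python sketch below reads the sample order from a VCF header, produces the lines for an ordering file, and assembles a PLINK 1.9 command as an argument list without running it. Several points are assumptions rather than anything stated in the thread: the ordering file is assumed to hold FID and IID on each line, --double-id is assumed so that FID equals IID, and the file names and sample IDs are hypothetical.

```python
def vcf_sample_ids(vcf_text):
    """Return sample IDs in the order they appear on the #CHROM header line."""
    for line in vcf_text.splitlines():
        if line.startswith("#CHROM"):
            # The first 9 columns are fixed VCF fields; samples follow.
            return line.split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def indiv_sort_lines(sample_ids):
    """Lines for the ordering file (assumed layout: FID then IID)."""
    return [f"{sid} {sid}" for sid in sample_ids]


def plink_make_bed_args(vcf_path, order_path, out_prefix):
    """Argument list for a PLINK 1.9 conversion that applies the given order."""
    return [
        "plink",
        "--vcf", vcf_path,
        "--double-id",                      # assumption: FID = IID = VCF sample ID
        "--indiv-sort", "file", order_path,
        "--make-bed",
        "--out", out_prefix,
    ]


# Hypothetical VCF header, for illustration only
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1-T\tS1-N\n"
)
order_lines = indiv_sort_lines(vcf_sample_ids(vcf_text))
print("\n".join(order_lines))
print(" ".join(plink_make_bed_args("input.vcf", "order.txt", "converted")))
```

The list could be passed to subprocess.run once the ordering file has been written to disk; it is kept as data here so that the command can be inspected first.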