Question: TCGA MAF to VCF file conversion
11 months ago by
swethabiochem0 wrote:

Hello all, I want to convert a TCGA MAF file to VCF for analysis with PLINK. I ran the conversion with maf2vcf.pl and obtained a VCF file, but the converted file is missing the phenotype data and some gene IDs. Because of this, I cannot proceed with the PLINK analysis. Please suggest how to do the conversion correctly, or how to add the phenotype data. Thank you.

snp • 584 views
modified 11 months ago • written 11 months ago by swethabiochem0

Hello Sir, thanks for the reply and suggestion. Below is the Perl command I used to convert the MAF to VCF:

perl maf2vcf.pl --input-maf /home/swetha/Desktop/GWAS/TCGA.HNSC.varscan.5296cf00-4d8c-4db3-80d7-930a4b44f90d.DR-10.0.somatic.maf --ref /home/swetha/vep/ensembl-vep-release-97/ GRCh38.d1.vd1.fa --output- /home/swetha/Desktop//GWAS/TCGA.vcf

It produced TCGA.HNSC.varscan.5296cf00-4d8c-4db3-80d7-930a4b44f90d.DR-10.0.somatic.vcf and a TSV sample file named:

TCGA.HNSC.varscan.5296cf00-4d8c-4db3-80d7-930a4b44f90d.DR-10.0.somatic.pairs.tsv

Now I want to create a custom .fam file. How can I use this .tsv sample file to generate the .fam entries for each individual? Please let me know the best way. If I create a custom .fam file, will that cause any problem with its link to the .bim and .bed files in further processing?

modified 11 months ago by Kevin Blighe63k • written 11 months ago by swethabiochem0

You will need to generate the FAM file yourself, after studying how it is structured: https://www.cog-genomics.org/plink/1.9/formats#fam
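As a rough illustration of that structure, here is a minimal Python sketch that formats one FAM row. The function name `fam_line`, the sample ID, and the choice to reuse the sample ID as the family ID are assumptions made for this example, not something taken from your data. The six columns and their codes follow the PLINK 1.9 format page linked above.

```python
def fam_line(iid, sex="0", phenotype="-9", fid=None):
    """Return one whitespace-delimited .fam row (six columns)."""
    # Assumption: with no family information, reuse the sample ID as the family ID.
    fid = iid if fid is None else fid
    # Columns: FID, IID, father ID, mother ID, sex, phenotype.
    # A parent ID of '0' means the parent is not in the dataset.
    # Sex: 1 = male, 2 = female, 0 = unknown.
    # Phenotype: 1 = control, 2 = case, -9 or 0 = missing.
    return " ".join([fid, iid, "0", "0", str(sex), str(phenotype)])


# Hypothetical sample: a female case with no family information.
print(fam_line("SAMPLE_A", sex=2, phenotype=2))
```

Each sample gets one such row, and the rows must be written in the same order as the samples in the PLINK dataset.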

modified 11 months ago • written 11 months ago by Kevin Blighe63k

I still cannot generate the .fam file, because I don't have the patients' family details; I have only the gender phenotype values. I am new to this work and do not know how to proceed. Could you explain how I can do this? Thanks.

written 11 months ago by swethabiochem0

You do not have to have FID (family ID) in order to use PLINK. Also, this process is simply not something that is easy to explain across a web forum, like Biostars. I have provided enough details already in my previous answer: linkage disequilibrium analysis

written 11 months ago by Kevin Blighe63k
11 months ago by
Kevin Blighe63k wrote:

To better help, you should show all commands that you have used.

Some general advice: when converting a VCF to PLINK format, PLINK will not know what the phenotypes are. You need to create a custom FAM file that exactly matches the order of samples in your PLINK dataset. You can then use this FAM file in all downstream analyses via the --fam command line parameter.
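One way to keep the order identical is to start from the .fam file that PLINK itself wrote during conversion and fill in only the sex and phenotype columns. The Python sketch below assumes you already have the sex and phenotype codes keyed by sample ID; the function name `fill_fam`, the sample IDs, and the lookup dictionaries are hypothetical and exist only for this example.

```python
def fill_fam(fam_rows, sex_by_iid, pheno_by_iid):
    """Return new .fam rows in the same order, with sex and phenotype filled in."""
    out = []
    for row in fam_rows:
        fields = row.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {row!r}")
        fid, iid, father, mother = fields[:4]
        # Samples missing from the lookups keep the 'unknown' codes.
        sex = sex_by_iid.get(iid, "0")
        pheno = pheno_by_iid.get(iid, "-9")
        out.append(" ".join([fid, iid, father, mother, str(sex), str(pheno)]))
    return out


# Hypothetical rows as PLINK might write them after a VCF import.
existing = ["S2 S2 0 0 0 -9", "S1 S1 0 0 0 -9"]
custom = fill_fam(existing, {"S1": 1, "S2": 2}, {"S1": 1, "S2": 2})
print("\n".join(custom))
```

Because the rows are only edited in place and never re-sorted, the custom file stays aligned with the genotype data it is passed alongside.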

Please note: PLINK will NOT store your samples in the same order as they appear in the VCF. You can control this, however, via the --indiv-sort file parameter when converting. Please read my previous answer, here: linkage disequilibrium analysis

Kevin

written 11 months ago by Kevin Blighe63k