3 months ago
Peerzada • 0
Hello all,
I annotated a multi-sample VCF file using the Ensembl VEP command line. All the mutations from the VCF file are in the output, but the information about which sample carries which mutation is missing. How can I find out which mutation is in which sample?
Look at the genotypes (FORMAT/GT).
There is no column named FORMAT. The output only contains the position and the nucleotide change, as well as some other information such as amino acid changes.
What is the format of your output? Or better, what was your command for calling VEP?
The output format was a text file, and I used Ensembl VEP for annotation with the following command.
Output as VCF to keep the genotypes.
I kept the output as VCF, yet the output does not have sample information.
Also, I have this variant data from 1000g.
If the input protein.vcf doesn't have any genotypes, you cannot get the affected samples -- quod erat demonstrandum.
My input VCF file has genotypes as well as sample names for about 2500 individuals. It is mentioned there.
Please show us the output of:
But when I change the output to VCF format, I do not know how to analyze it, as there is no clear view of the samples. I tried to analyze it as an Excel file, but that created a lot of mess.
So THIS is your real problem. You have to learn to analyze/filter a VCF file. Have a look at SnpSift, etc. Search biostars.org.
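To make the FORMAT/GT hint concrete, here is a minimal Python sketch (standard library only) that lists, for each variant, the samples carrying a non-reference genotype. It is an illustration and not a replacement for SnpSift or similar tools. The function name `carriers_per_variant`, the sample names S1 to S3 and the demo records are made up for the example, and it assumes a VCF whose sample columns follow the FORMAT column, with the VEP annotation kept in INFO.

```python
def carriers_per_variant(lines):
    """Yield (chrom, pos, ref, alt, carriers) for each VCF record.

    `carriers` is the list of sample names whose GT contains at least
    one non-reference allele. `lines` can be any iterable of VCF text
    lines, for example an open file handle.
    """
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at column 10
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        if len(fields) < 10:
            # No FORMAT/sample columns: genotypes are not in this file.
            yield chrom, pos, ref, alt, []
            continue
        gt_index = fields[8].split(":").index("GT")
        carriers = []
        for name, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_index]
            alleles = gt.replace("|", "/").split("/")
            # "0" is the reference allele, "." is a missing call.
            if any(a not in ("0", ".") for a in alleles):
                carriers.append(name)
        yield chrom, pos, ref, alt, carriers


# Hypothetical three-sample records, used only to show the output shape.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t.\tPASS\tCSQ=G|missense_variant\tGT\t0|0\t0|1\t1|1",
    "1\t200\t.\tC\tT\t.\tPASS\tCSQ=T|synonymous_variant\tGT\t0|0\t0|0\t.|.",
]

for chrom, pos, ref, alt, carriers in carriers_per_variant(demo):
    print(chrom, pos, ref, alt, ",".join(carriers) or "-", sep="\t")
```

On a real file you would pass an open handle instead of `demo` (use `gzip.open(path, "rt")` if the file is compressed). Note that the sketch does not distinguish between ALT alleles at multi-allelic sites; it only reports whether a sample has any non-reference allele.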
Thank you, Sir. I will try to.
I tried using SnpSift and Jvarkit, but I am not able to find the command that will simplify my multi-sample annotated VCF file to get the sample information for each variant. I am very new to this. Kindly provide a solution.