Help on Oncomine VCF file "ID" nomenclature

3.2 years ago

lee_victoria • 0

Hi, I have RNA-seq data provided to me in VCF format and need help interpreting the "ID" column of the file. For example, a single gene, MDM4, has 4 distinct lines, with the IDs "MDM4.E1E2", "MDM4-MDM4.M2M11", "MDM4.E3E4" and "MDM4-MDM4.M7M10".

What does this nomenclature mean? Are these different splice variants or alternative transcripts of MDM4? When I run a differential expression (DEG) analysis, do I use the total READ COUNT of these 4 lines as the read count for MDM4, or analyse each line as a separate gene?

VCF RNAseq • 838 views

ADD COMMENT • link updated 3.2 years ago by Pierre Lindenbaum 166k • written 3.2 years ago by lee_victoria • 0

What does this nomenclature mean?

You need to tell us what software was used to insert those names in the ID column. It's usually recorded in the VCF header.
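A minimal sketch of how the header could be pulled out for inspection, assuming a plain-text or gzip-compressed VCF. The function name `read_vcf_header` is hypothetical, not part of any library:

```python
import gzip


def read_vcf_header(path):
    """Return the meta-information lines (those starting with '##') of a VCF."""
    # Assumption: files ending in .gz are gzip/bgzip-compressed text,
    # anything else is treated as plain text.
    opener = gzip.open if str(path).endswith(".gz") else open
    header = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                # The '#CHROM' column line and the records follow the header.
                break
            header.append(line.rstrip())
    return header
```

Calling `read_vcf_header("sample.vcf")` (with `sample.vcf` as a placeholder path) and printing the result shows the lines in which the producing software usually identifies itself.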

ADD REPLY • link 3.2 years ago by Pierre Lindenbaum 166k

Thanks for your reply! In the VCF header I can see the following, which may be what you are looking for:

##OncomineVariantAnnotationToolVersion=2.3.11 
##IonReporterExportVersion=0.1.5
##annotationSources=[clinvar_1, dbsnp_138, dgv_20130723, drugbank_1, hg19_esp6500_1, hg19_go_1218, hg19_pfam_26, hg19_phylop_1, hg19_refgene_63, namedVariants_1, Oncomine Variant Annotator v2.1, VARIANTKBc3130d6e_4ce5_4def_ac1e_36994ac536a5, canonical_refseq_v63_updated_OCP.txt]
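As a rough illustration, header lines of this `##key=value` form can be turned into a lookup table so the tool and its version are easy to report. This is only a sketch: `parse_meta_lines` is a hypothetical helper, and the `annotationSources` list is shortened here for readability.

```python
header_lines = [
    "##OncomineVariantAnnotationToolVersion=2.3.11",
    "##IonReporterExportVersion=0.1.5",
    # Shortened copy of the annotationSources line from the header above.
    "##annotationSources=[clinvar_1, dbsnp_138, hg19_refgene_63]",
]


def parse_meta_lines(lines):
    """Map each '##key=value' line to a dict entry, splitting bracketed lists."""
    # Simplification: repeated keys (e.g. several ##INFO lines) overwrite
    # each other, and structured '<...>' values are kept as raw strings.
    meta = {}
    for line in lines:
        key, _, value = line.lstrip("#").strip().partition("=")
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",")]
        meta[key] = value
    return meta


meta = parse_meta_lines(header_lines)
print(meta["OncomineVariantAnnotationToolVersion"])
print(meta["annotationSources"])
```

The keys containing "Oncomine" and "IonReporter" are the ones that name the software that wrote the file.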

ADD REPLY • link updated 3.2 years ago by Pierre Lindenbaum 166k • written 3.2 years ago by lee_victoria • 0