Multiple identical mutations in SnpEff result: ex1.genes.txt
2.8 years ago
Maya • 0

Hi, I am new here and struggling to get exome sequencing data in VCF format annotated. I ran

java -Xmx8g -jar snpEff.jar -v -stats ex1.html hg38 P2.vcf > out/P2.ann.vcf 

and among the outputs was the list of mutated genes (ex1.genes.txt). Several rows in ex1.genes.txt look identical for the same gene; see the A1CF gene below. How did this happen, and can I just ignore the repeated rows? Thanks in advance for any help!

The following table is formatted as tab separated values

#GeneName   GeneId  TranscriptId    BioType variants_impact_HIGH    variants_impact_LOW variants_impact_MODERATE    variants_impact_MODIFIER    variants_effect_3_prime_UTR_variant variants_effect_5_prime_UTR_premature_start_codon_gain_variant  variants_effect_5_prime_UTR_variant variants_effect_conservative_inframe_deletion   variants_effect_conservative_inframe_insertion  variants_effect_disruptive_inframe_deletion variants_effect_disruptive_inframe_insertion    variants_effect_downstream_gene_variant variants_effect_exon_loss_variant   variants_effect_frameshift_variant  variants_effect_initiator_codon_variant variants_effect_intron_variant  variants_effect_missense_variant    variants_effect_non_coding_transcript_exon_variant  variants_effect_non_coding_transcript_variant   variants_effect_splice_acceptor_variant variants_effect_splice_donor_variant    variants_effect_splice_region_variant   variants_effect_start_lost  variants_effect_start_retained_variant  variants_effect_stop_gained variants_effect_stop_lost   variants_effect_stop_retained_variant   variants_effect_synonymous_variant  variants_effect_upstream_gene_variant
A1BG    A1BG    NM_130786.4 protein_coding  0   7   8   44  2   0   0   0   0   0   0   7   0   0   0   12  8   0   0   0   0   0   0   0   0   0   0   7   23
A1BG-AS1    A1BG-AS1    NR_015380.2     0   0   0   50  0   0   0   0   0   0   0   19  0   0   0   7   0   6   0   0   0   0   0   0   0   0   0   0   18
A1CF    A1CF    NM_001198818.1  protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_001198819.1  protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_001198820.1  protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_001370130.1  protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_001370131.1  protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_014576.4 protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_138932.2 protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A1CF    A1CF    NM_138933.2 protein_coding  0   0   6   30  0   0   0   0   1   0   0   4   0   0   0   26  5   0   0   0   0   0   0   0   0   0   0   0   0
A2M A2M NM_000014.5 protein_coding  2   21  24  45  0   0   0   0   0   0   0   1   0   0   0   43  24  0   0   0   0   2   0   0   2   0   0   20  2
2.8 years ago
Ram 43k

The same nucleotide change can impact multiple transcripts of a gene in distinct ways. Variant annotation software generally has an option to list each such per-transcript effect separately, and that's what's happening here. The various NM_ entries are protein-coding transcripts of the A1CF gene, so each row is a different transcript rather than a duplicated mutation.
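To check this on your own file, here is a minimal Python sketch that groups the rows of a genes.txt-style table by gene and reports how many transcripts each gene has and whether all of them carry the same counts. The function names are made up for illustration, and it assumes the layout shown in the question: tab-separated, a header line starting with `#GeneName`, and count columns starting at the fifth column.

```python
from collections import defaultdict

def transcripts_per_gene(tsv_text):
    """Group rows of a genes.txt-style table as gene -> {transcript: counts}."""
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    # Skip any leading comment line; the column header starts with '#GeneName'.
    header_idx = next(i for i, ln in enumerate(lines) if ln.startswith("#GeneName"))
    header = lines[header_idx].lstrip("#").split("\t")
    genes = defaultdict(dict)
    for ln in lines[header_idx + 1:]:
        fields = ln.split("\t")
        row = dict(zip(header, fields))
        # Assumption: the first four columns are identifiers, the rest are counts.
        genes[row["GeneName"]][row["TranscriptId"]] = tuple(fields[4:])
    return genes

def summarize(genes):
    """Return (gene, number of transcripts, all transcripts have identical counts)."""
    return [(gene, len(tx), len(set(tx.values())) == 1) for gene, tx in genes.items()]

if __name__ == "__main__":
    # 'ex1.genes.txt' is the file name from the question.
    with open("ex1.genes.txt") as handle:
        for gene, n_tx, identical in summarize(transcripts_per_gene(handle.read())):
            print(gene, n_tx, "identical" if identical else "differs")
```

If a gene shows several transcripts with identical counts, the rows are per-transcript views of the same variants, so adding them up would count each variant several times. For a per-gene number, pick one transcript per gene (or take the maximum) instead of summing.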

Sometimes the same variant can have multiple effects on the same transcript, because a single region can serve multiple purposes. For example, an exonic splice-site variant can also be a missense or synonymous variant. That could be what's going on in entries with duplicated GeneId and TranscriptId columns.
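You can see both cases directly in the annotated VCF. In SnpEff's ANN INFO field, entries for different transcripts are separated by commas, fields within an entry by `|`, and multiple effects on one transcript are joined with `&`. A rough Python sketch that splits one INFO string into per-transcript effect lists follows; the function name is hypothetical and the example record is invented purely to show the shape of the field, it is not taken from your data.

```python
from collections import defaultdict

def effects_by_transcript(info):
    """Map transcript ID -> list of effect terms found in one VCF INFO string."""
    # Pull out the value of the ANN key; empty string if the record has no ANN.
    ann = next((kv[4:] for kv in info.split(";") if kv.startswith("ANN=")), "")
    result = defaultdict(list)
    for entry in filter(None, ann.split(",")):
        fields = entry.split("|")
        if len(fields) < 7:
            continue  # skip malformed entries
        # In the ANN layout, field 2 is the effect(s) and field 7 is the feature (transcript) ID.
        effects, transcript = fields[1], fields[6]
        result[transcript].extend(effects.split("&"))
    return dict(result)

# Invented example record, for illustration only.
example_info = (
    "AC=1;ANN="
    "A|missense_variant&splice_region_variant|MODERATE|A1CF|A1CF|transcript|"
    "NM_014576.4|protein_coding|5/12|c.1A>G|p.Met1Val|||||,"
    "A|intron_variant|MODIFIER|A1CF|A1CF|transcript|"
    "NM_138932.2|protein_coding|4/11|c.1+1A>G||||||"
)

if __name__ == "__main__":
    for transcript, effects in effects_by_transcript(example_info).items():
        print(transcript, effects)
```

Here the first transcript gets two effects from a single variant and the second gets one, which is how one VCF line can contribute to several rows and several effect columns in the genes.txt summary.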
