Alignment Annotation mistake?
6.0 years ago

We received the sequencing results for exon 7 of the APOBEC3G gene and found a change (rs369306100, c.1109A>C) that is present in all sequences (240 samples from Colombian individuals, n=240). However, this change has a very low MAF: C=0.00005/6 (ExAC), C=0.00008/1 (GO-ESP), C=0.00008/10 (TOPMED), among others. The sequence is clear: we have no unspecific PCR products, and the electropherogram is very clean. The products were sequenced on two separate occasions and gave the same results. Could you help us with this unexpected result? Could there be an error in this annotation? Can you give us some guidance for understanding these results? We appreciate your help. Thanks for your time. We will be waiting for your answer.

Thanks a lot.

Sergio Andres Castañeda, Universidad del Rosario, Bogotá, Colombia

SNP Assembly sequencing alignment • 1.3k views

Is the variant homozygous or heterozygous in your 240 samples? Can you add the primers you used for this PCR?


The change is heterozygous in all 240 samples (100%). The samples are from the Colombian population, from Cúcuta and Bogotá. This population is not isolated.

https://varsome.com/variant/hg19/APOBEC3G%3AAsp370Ala

https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=369306100


100% heterozygous does not suggest you have something population specific, it suggests an artefact is present.

I suspect that your primers also amplified another locus, and what you see is not a heterozygous SNP but a difference between two paralogous sequences (a paralogous sequence variant, if you like).

Could you share the primer sequences? Blatting the exon in which your SNP is present already shows other hits.
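
To put a rough number on why 100% heterozygotes looks like an artefact rather than a real polymorphism, here is a minimal sketch of a Hardy-Weinberg check. It assumes the 240 genotypes are independent draws from a randomly mating population, which is a simplification, and the function name is only illustrative.

```python
from math import log10

def hwe_expected_het(n_het, n_hom_ref, n_hom_alt):
    """Return (ALT allele frequency, expected heterozygote fraction under HWE)."""
    n = n_het + n_hom_ref + n_hom_alt
    q = (n_het + 2 * n_hom_alt) / (2 * n)   # observed ALT allele frequency
    return q, 2 * q * (1 - q)               # HWE expectation: 2pq

n_samples = 240
q, exp_het = hwe_expected_het(n_het=240, n_hom_ref=0, n_hom_alt=0)

# log10 of the probability that every one of the samples is heterozygous,
# assuming independent genotypes under HWE (a simplifying assumption).
log10_p_all_het = n_samples * log10(exp_het)

print(f"ALT frequency: {q}, expected het fraction: {exp_het}")
print(f"log10 P(all {n_samples} heterozygous) = {log10_p_all_het:.1f}")
```

If every sample is heterozygous, the allele frequency is 0.5, so about half of the samples should be homozygous for one allele or the other. Seeing none in 240 samples is essentially impossible for a true SNP, but it is exactly what two co-amplified paralogous copies would produce.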


In fact, this is how we found KCNJ18, due to sequencing KCNJ12 and always finding heterozygous variants.


rs369306100 [Homo sapiens]

A[A/C]CCTGAGTGGGAGGCTGCGGGCCATT
Chromosome: 22:39087095

Gene: APOBEC3G (GeneView)
Functional Consequence: missense
Validated: by cluster, by frequency
HGVS: CM000684.2:g.39087095A>C, NC_000022.10:g.39483100A>C, NC_000022.11:g.39087095A>C, NM_001349436.1:c.1076A>C, NM_001349437.1:c.908A>C, NM_021822.3:c.1109A>C, NP_001336365.1:p.Asp359Ala, NP_001336366.1:p.Asp303Ala, NP_068594.1:p.Asp370Ala, XP_006724353.1:p.Asp303Ala, XP_016884392.1:p.Asp370Ala, XP_016884393.1:p.Asp303Ala

Could you share the primer sequences?


You mean all 240 individuals are homozygous for the ALT allele?

Also, is this a country-wide sample? An indigenous, isolated population?


The change is heterozygous in all 240 samples (100%). The samples are from the Colombian population, from Cúcuta and Bogotá. This population is not isolated.

https://varsome.com/variant/hg19/APOBEC3G%3AAsp370Ala

https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=369306100


Hola amigo, the APOBEC gene cluster contains quite a few genes that each arose through gene duplication events. So, even if you are confident that your data are of high quality, you cannot rule out the possibility that your reads are mis-aligning to another of the APOBEC genes, in which the true variant may exist. Please send your primer sequences to Wouter - it is essential that your primer sequences align uniquely. This can be checked via in silico PCR.
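
For anyone unfamiliar with the idea, here is a minimal sketch of what an in silico PCR check does: look for the forward primer and the reverse complement of the reverse primer in each candidate template and report every product. The templates and primers below are toy sequences made up for illustration (the actual primers were not posted in this thread), and the matching is exact only. Real tools such as the UCSC In-Silico PCR allow mismatches and search the whole genome.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))

def find_all(text, pattern):
    """All start positions (0-based) of pattern in text, overlaps included."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits

def in_silico_pcr(templates, fwd, rev, max_len=2000):
    """Return (template name, start, end) for every exact-match amplicon."""
    fwd, rev_site = fwd.upper(), revcomp(rev)
    products = []
    for name, seq in templates.items():
        seq = seq.upper()
        for f in find_all(seq, fwd):
            for r in find_all(seq, rev_site):
                end = r + len(rev_site)
                # reverse primer site must lie downstream of the forward primer
                if r >= f + len(fwd) and end - f <= max_len:
                    products.append((name, f, end))
    return products

# Hypothetical primers and two toy "paralogous" templates that differ by
# one base between the primer sites.
fwd_primer = "CCTGAGTGGG"
rev_primer = "CTCACCTGGA"
templates = {
    "locus_1": "AA" + "CCTGAGTGGG" + "AGGCTGCGGGCCATTC" + "TCCAGGTGAG" + "GG",
    "locus_2": "TTT" + "CCTGAGTGGG" + "AGGCTGAGGGCCATTC" + "TCCAGGTGAG" + "CC",
}

products = in_silico_pcr(templates, fwd_primer, rev_primer)
for name, start, end in products:
    print(name, start, end)
if len(products) > 1:
    print("Primers are not unique: more than one locus would be amplified.")
```

A product from more than one locus means the Sanger trace is a mixture of both copies, and any base that differs between them will show up as a "heterozygous" peak in every sample.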


Completely agree. Thank you very much.

6.0 years ago
h.mon 35k

I will summarize the discussion above:

Most likely, there is a recent (or, for some other reason, very conserved) duplication involving the region you are interested in. The primers you designed amplified both copies, and because the copies are still very similar, the electropherograms came out really clean, with just this one seemingly heterozygous site.

If you BLAST online (select "More dissimilar sequences (discontiguous megablast)") the sequence from the refSNP site:

>gnl|dbSNP|rs369306100|allelePos=2|totalLen=102|taxid=9606|snpclass=1|alleles='A/C'|mol=Genomic|build=151
 CCTGAGTGGG AGGCTGCGGG CCATTCTCCA GGTGAGGGCT TCTTCCCTCT GCCCAGTGCC
 CCATCGGCCT CCCCCTCCTC CCCTCTCCCC TGCGCCGTGC

You will see there is a clone used in the Human Genome Project which has a duplication of this sequence. Likewise, there is a clone from Pongo abelii (Sumatran orangutan) with the same region also apparently duplicated. So there is some evidence for a duplication of this region.

You will have to design a new set of primers. Try to use the clone above to design primers that are unique to each of the duplicated regions. Another option is to use a high-fidelity / high-efficiency Taq to perform a really long PCR, and then use primer walking to sequence the whole molecule.
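
As a rough illustration of the primer redesign step, the sketch below lists the positions where two aligned copies differ and proposes forward primers whose 3' terminal base sits on such a difference, since a 3' mismatch is what discriminates between near-identical templates. The second copy and its two differences are hypothetical. In practice you would first align the real duplicated regions from the clone, and then check candidate primers with proper primer design software.

```python
def paralog_differences(seq_a, seq_b):
    """Positions (0-based) where two pre-aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]

def candidate_specific_primers(seq_a, seq_b, primer_len=20):
    """Forward primers on seq_a whose 3' terminal base lies on a paralog difference."""
    primers = []
    for pos, _, _ in paralog_differences(seq_a, seq_b):
        start = pos - primer_len + 1
        if start >= 0:  # skip differences too close to the 5' end
            primers.append((start, seq_a[start:pos + 1].upper()))
    return primers

# copy_a: first 30 bases of the refSNP flank quoted above.
# copy_b: a made-up paralog with two hypothetical differences.
copy_a = "CCTGAGTGGGAGGCTGCGGGCCATTCTCCA"
copy_b = "CCTGAGTGGGAGGCTGAGGGCCATTCTTCA"

diffs = paralog_differences(copy_a, copy_b)
primers = candidate_specific_primers(copy_a, copy_b, primer_len=12)

for pos, a, b in diffs:
    print(f"position {pos}: copy_a={a} copy_b={b}")
for start, primer in primers:
    print(f"candidate primer at {start}: {primer}")
```

The short primer length here is only to keep the toy example readable. Real primers would be longer and would also need to pass the usual melting temperature and specificity checks.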


Completely agree. Thank you very much.
