How to get sequence variant information from non-dbSNP sources using the Ensembl Perl API
5.9 years ago

Hi, this is my first post. I'm Fran from Spain, and I'm currently working on my final thesis for a master's degree in bioinformatics. I need to obtain transcript sequences that are affected by a mutation. To do this, I first get the original sequence of the transcript and then introduce the mutation by modifying the string.

I use the following function in order to perform the operation:

    # param 0 -> TranscriptVariationAllele object
    # return  -> Transcript sequence carrying the variant allele,
    #            including the 5' and 3' UTRs.
    sub get_variation_seq {
        my $tva = $_[0];
        # translateable_seq returns only the coding part of the transcript
        # (it removes introns and the 5' and 3' UTRs):
        # my $seq = $tva->transcript->translateable_seq;
        # seq includes the 5' and 3' UTRs.
        my $seq = $tva->transcript->seq->seq;
        # cdna_start/cdna_end are 1-based; substr expects a 0-based offset.
        my $variation_start = $tva->transcript_variation->cdna_start - 1;
        my $variation_end   = $tva->transcript_variation->cdna_end - 1;
        # If it is a deletion, feature_seq is '-', so we use '' instead
        # to build the final sequence.
        my $feature_seq = $tva->feature_seq eq "-" ? "" : $tva->feature_seq;

        print $tva->display_codon_allele_string . "\n";
        print $tva->transcript_variation->variation_feature->variation_name . "\t$variation_start-$variation_end\n";
        print "$seq\n";

        substr($seq, $variation_start, $variation_end - $variation_start + 1) = $feature_seq;
        print $seq . "\n";

        return $seq;
    }

This function receives a TranscriptVariationAllele object and returns the complete variant sequence, including the 5' and 3' UTRs. It works for dbSNP variations, but when I have to deal with COSMIC or HGMD-PUBLIC variations, $tva->feature_seq does not contain the variant sequence; it only contains a placeholder string such as "COSMIC".
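To make the coordinate handling in the function concrete, here is a minimal sketch of the same splice on a toy sequence, without any Ensembl objects. The helper name `splice_allele` is hypothetical (it is not part of the Ensembl API), and the insertion case assumes the coordinates are reported with start = end + 1, so that the replaced length is zero.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): splice an allele into a
# sequence using 1-based, inclusive cDNA coordinates.
sub splice_allele {
    my ($seq, $cdna_start, $cdna_end, $allele) = @_;
    $allele = '' if $allele eq '-';            # deletion: nothing is inserted
    my $offset = $cdna_start - 1;              # 1-based -> 0-based offset
    my $length = $cdna_end - $cdna_start + 1;  # 0 when start = end + 1 (assumed insertion convention)
    substr($seq, $offset, $length) = $allele;  # modifies the local copy only
    return $seq;
}

my $ref = 'ATGGCCTTA';
print splice_allele($ref, 4, 4, 'T'),   "\n";  # substitution at position 4
print splice_allele($ref, 4, 6, '-'),   "\n";  # deletion of positions 4-6
print splice_allele($ref, 4, 3, 'AAA'), "\n";  # insertion between positions 3 and 4
```

The three calls cover the substitution, deletion and insertion cases; the reference string itself is left unchanged because the helper works on a copy.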

How can I get the complete mutated sequence for non-dbSNP variations? Is there another way to do this?

Regards, Fran.

sequence perl ensembl variation allele • 1.3k views
5.9 years ago
Emily 23k

Unfortunately many variants from COSMIC and HGMD are not distributed with alleles, just the loci.


Thanks for your answer, Emily. In that case I would have to exclude variations whose source is different from dbSNP.
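One possible way to do that exclusion, sketched here under assumptions, is to test the allele string itself instead of the source name, so that any variant distributed with a real allele is kept. The helper `is_sequence_allele` is hypothetical, and the placeholder strings in the demo loop are illustrative: "COSMIC" is the value described in the question, while "HGMD" is only an assumed example of a similar placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the allele string is a plain nucleotide
# sequence or the deletion marker '-'.
sub is_sequence_allele {
    my ($allele) = @_;
    return 0 unless defined $allele && length $allele;
    return $allele =~ /^(?:-|[ACGTN]+)$/i ? 1 : 0;
}

# Illustrative values only; real placeholder strings may differ.
for my $allele ('A', 'ACGT', '-', 'COSMIC', 'HGMD') {
    printf "%-8s %s\n", $allele, is_sequence_allele($allele) ? 'usable' : 'skipped';
}
```

In the loop over TranscriptVariationAllele objects, a check such as `next unless is_sequence_allele($tva->feature_seq);` before calling get_variation_seq would then skip the variants that carry no allele.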