How to get sequence variant information from non-dbSNP sources using the Ensembl Perl API
5.9 years ago

Hi, this is my first post. I'm Fran from Spain, and I'm currently working on the final thesis for my master's degree in bioinformatics. I need to obtain the transcript sequences affected by a mutation. To do this, I first get the original sequence of the transcript and then introduce the mutation by modifying the string.

I use the following function in order to perform the operation:

# param 0 -> TranscriptVariationAllele object
# return -> Transcript sequence carrying the variant allele, including the 5' and 3' UTRs.
sub get_variation_seq{
    my $tva = $_[0];
    # translateable_seq returns the coding part of the transcript
    # (it removes introns and 5' and 3' utr)
    # my $seq = $tva->transcript->translateable_seq;
    # seq contains 5' and 3' regions.
    my $seq = $tva->transcript->seq->seq;
    my $variation_start = $tva->transcript_variation->cdna_start - 1;
    my $variation_end = $tva->transcript_variation->cdna_end - 1;
    # For a deletion, feature_seq is '-', so we use '' instead
    # to build the final sequence.
    my $feature_seq = $tva->feature_seq eq "-" ? "" : $tva->feature_seq;

    print $tva->display_codon_allele_string . "\n";
    print $tva->transcript_variation->variation_feature->variation_name . "\t$variation_start-$variation_end\n";
    print "$seq\n";
    substr($seq, $variation_start, $variation_end - $variation_start + 1) = $feature_seq;
    print $seq . "\n";

    return $seq;
}

This function receives a TranscriptVariationAllele object and returns the complete variant transcript sequence, including the 5' and 3' UTRs. This works for dbSNP variants, but for COSMIC or HGMD-PUBLIC variants, $tva->feature_seq does not contain the allele sequence; it only contains a placeholder string such as "COSMIC".
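
The splice step on its own can be checked without the API. This is a minimal sketch, assuming 1-based inclusive cDNA coordinates and assuming that insertions are reported with start = end + 1; the helper name apply_allele and the toy sequence are hypothetical and not part of Ensembl.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the 1-based inclusive range
# [$start, $end] of $seq with $allele ('-' means a deletion).
sub apply_allele {
    my ($seq, $start, $end, $allele) = @_;
    $allele = '' if $allele eq '-';
    # With start = end + 1 (assumed insertion convention),
    # the replaced length is 0, so the allele is inserted.
    substr($seq, $start - 1, $end - $start + 1) = $allele;
    return $seq;
}

print apply_allele('ATGGCCTAA', 4, 4, 'T'),   "\n";  # substitution
print apply_allele('ATGGCCTAA', 4, 6, '-'),   "\n";  # deletion
print apply_allele('ATGGCCTAA', 4, 3, 'AAA'), "\n";  # insertion
```

Taking the coordinates as arguments keeps the string logic testable separately from the database lookups.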

How can I get the complete mutated sequence for non-dbSNP variations? Is there any other way to do this?

Thank you in advance.

Greetings, Fran.

sequence perl ensembl variation allele • 1.3k views
5.9 years ago
Emily 23k

Unfortunately many variants from COSMIC and HGMD are not distributed with alleles, just the loci.
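
One way to act on this is to skip any allele that is not an actual sequence before editing the transcript. This is a minimal sketch, assuming that placeholder values such as "COSMIC" appear where a nucleotide string would; has_sequence_allele is a hypothetical helper, not an Ensembl API method, and the "HGMD_MUTATION" test value is an assumed example of such a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the allele is a deletion marker ('-')
# or a plain nucleotide string, false for placeholders like "COSMIC".
sub has_sequence_allele {
    my ($allele) = @_;
    return 0 unless defined $allele;
    return ($allele eq '-' || $allele =~ /\A[ACGTN]+\z/i) ? 1 : 0;
}

for my $allele ('A', '-', 'COSMIC', 'HGMD_MUTATION') {
    printf "%s\t%s\n", $allele,
        has_sequence_allele($allele) ? 'usable' : 'skip';
}
```

Inside get_variation_seq, a check like this on $tva->feature_seq could return early instead of splicing the placeholder text into the transcript.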

Thanks for your answer, Emily. In that case I would have to exclude variations whose source is different from dbSNP.

