How to get the correct strand of the gene?
0
0
Entering edit mode
3.3 years ago

Hi, I have prepared a perl script to read the stable Ensembl IDs from a file and write the sequences and strands of them to separate files. The $slice->seq() function works fine and I get the sequences, but$slice->strand() always returns 1 even for the genes that are on the reverse strand, like ENSG00000227359. I really need to get the correct strand. Am I missing something here? I copy my script here. I would appreciate your help.

use Bio::EnsEMBL::Registry;

use strict;
use warnings;

# warn user (from perspective of caller)
use Carp;

# use nice English (or awk) names for ugly punctuation variables
use English qw(-no_match_vars);

my $registry = 'Bio::EnsEMBL::Registry';$registry->load_registry_from_db(
-host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
-user => 'anonymous'
);

my $slice_adaptor =$registry->get_adaptor( 'Human', 'Core', 'Slice' );

my $files = 'strand_test.txt'; my$files_seq = 'sequence_test.txt';

# check if the file exists
if (-f $files) { unlink$files
or croak "Cannot delete $files:$!";
}

# use a variable for the file handle
my $OUTFILE; my$OUTFILE_seq;
# use the three arguments version of open
# and check for errors
open $OUTFILE, '>>',$files
or croak "Cannot open $files:$OS_ERROR";

open $OUTFILE_seq, '>>',$files_seq
or croak "Cannot open $files_seq:$OS_ERROR";

my $filename = "gene_list_test2.txt"; open(my$fh, '<:encoding(UTF-8)', $filename) or die "Could not open file '$filename' $!"; # my$slice = $slice_adaptor->fetch_by_gene_stable_id( 'ENSG00000099889', 0 ); # print "to the loop\n"; while (my$row = <$fh>) { my$slice;
chomp $row; print "$row\n";
print "ine\n";
# my $red_row = substr$row, 0, -1;
# print "$red_row\n"; eval{$slice = $slice_adaptor->fetch_by_gene_stable_id( "$row", 0 );
};
if (not ($@)) { print {$OUTFILE } $slice->strand() or croak "Cannot write to$files: $OS_ERROR"; print {$OUTFILE } "\n"
or croak "Cannot write to";

print { $OUTFILE_seq }$slice->seq()
or croak "Cannot write to $files_seq:$OS_ERROR";
print { $OUTFILE_seq } "\n" or croak "Cannot write to"; } else { print {$OUTFILE } "NOT THERE!\n"
or croak "Cannot write to";

print { $OUTFILE_seq } "NOT THERE!\n" or croak "Cannot write to"; print "----------------\n" } print$slice->strand();
print "\n+++++++++++++++++++\n";
}
print "done\n";

close $OUTFILE or croak "Cannot close$files: $OS_ERROR"; close$OUTFILE_seq
or croak "Cannot close $files_seq:$OS_ERROR";

Ensembl Core API gene rna perl • 834 views
0
Entering edit mode

Hello,

I don't have a solution for you, as I do not use the perl api until now. But I found this statement here:

Note that for historical reasons the fetch_by_gene_stable_id() method always returns a slice on the forward strand even if the gene is on the reverse strand.

fin swimmer

0
Entering edit mode

Thank you for your time. Yes, this sentence initiated my problem which itself is confusing because this is how I understood this sentence: "All the returned sequences are assumed to be on the forward strand and if they are on the reverse strand, the reverse complement of the sequence will be returned as the sequence" and this is why I needed the strand to get the correct sequence for the ones on the reverse strand. After all, there should be some way to get the strand of the genes per se, even if the sequences are as they should be. Thank you again for your comment.