BioPerl FASTA indexing problem with a non-model genome sequence
7.6 years ago

Hello all,

I am new to BioPerl and am trying to extract the mRNA transcript for every gene from a reference FASTA file and a GFF file. I used the BioPerl script from a similar question, "Extract Cds Fastas From A Gff Annotation + Reference Sequence" (by user @severin), and it works well for the mouse and rat reference genomes.

I am doing a three-species comparison. Two of the species are the reference mouse (Mus_musculus_GRMCM38.91) and Rattus norvegicus (Rattus_norvegicus.Rnor_5.0.71.dna_sm.toplevel.fa). The third is the black rat (Rattus rattus), whose whole genome we sequenced in our lab. The black rat genome was mapped to the Rattus norvegicus genome (Rattus_norvegicus.Rnor_5.0.71.dna_sm.toplevel.fa), and I want to extract the mRNA transcripts from the black rat assembly in the same way as above. Because the black rat genome was mapped to Rattus norvegicus, I am using the Rattus norvegicus GFF3 file.

However, when I run the BioPerl script on the black rat genome, I do not get the same output as I get for the reference mouse and rat genomes. I have checked my genome sequence with samtools: extracting the exon coordinates with samtools gives me the right sequence from the black rat. I suspect the problem is the index file generated by the BioPerl program, but I am not sure.

I would really appreciate some help in figuring this out. Please let me know which files would help. Thank you.
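One check that may help narrow this down, offered as a sketch and not a diagnosis: an index-based lookup can only return sequence when the name in the first column of the GFF matches a sequence ID in the FASTA file. The Python snippet below assumes the sequence ID is the first whitespace-delimited token of each `>` header line, and it lists IDs that appear in only one of the two files. The function names (`fasta_ids`, `gff_seqids`, `compare_ids`) and the toy records are hypothetical, made up for illustration; they are not part of the BioPerl script mentioned above.

```python
import io


def fasta_ids(handle):
    """Collect sequence IDs: the first whitespace-delimited token of each '>' header."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff_seqids(handle):
    """Collect the seqid (column 1) of every feature line in a GFF3 file."""
    ids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section follows; no more feature lines
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        ids.add(line.rstrip("\n").split("\t", 1)[0])
    return ids


def compare_ids(fasta_handle, gff_handle):
    """Report which IDs are shared and which appear in only one file."""
    f = fasta_ids(fasta_handle)
    g = gff_seqids(gff_handle)
    return {"shared": f & g, "gff_only": g - f, "fasta_only": f - g}


if __name__ == "__main__":
    # Hypothetical toy records; replace with open("genome.fa") and open("annotation.gff3").
    fasta = io.StringIO(">chr1 consensus sequence\nACGT\n>2\nGGCC\n")
    gff = io.StringIO(
        "##gff-version 3\n"
        "1\tsrc\texon\t1\t4\t.\t+\t.\tID=e1\n"
        "2\tsrc\texon\t1\t4\t.\t+\t.\tID=e2\n"
    )
    for label, ids in compare_ids(fasta, gff).items():
        print(label, sorted(ids))
```

If `gff_only` is non-empty for the black rat FASTA but empty for the Rattus norvegicus reference, the headers of the black rat file likely differ from the reference naming, which would be worth reporting along with the first few header lines of each file.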

bioperl mRNA fasta index reference • 1.4k views