Question: BioPerl FASTA indexing problem with a non-model genome sequence
9 days ago by shreyasibiswas88 (United States):

Hello all,

I am new to BioPerl and am trying to extract the mRNA transcript for every gene from a reference FASTA file and a GFF file. I used the BioPerl script from a similar question, "Extract Cds Fastas From A Gff Annotation + Reference Sequence" (by user @severin), and it works well for the mouse and rat reference genomes.

I want to do a three-species comparison. Two of the species are references: mouse (Mus_musculus_GRMCM38.91) and Rattus norvegicus (Rattus_norvegicus.Rnor_5.0.71.dna_sm.toplevel.fa). The third is a whole-genome sequence of black rat (Rattus rattus) that we sequenced in our lab. The black rat genome was mapped to the Rattus norvegicus genome (Rattus_norvegicus.Rnor_5.0.71.dna_sm.toplevel.fa), and I want to extract the mRNA transcripts from the black rat assembly in the same way. Because the assembly was mapped to Rattus norvegicus, I am using the Rattus norvegicus GFF3 file for it.

However, when I run the BioPerl script on the black rat genome, I do not get the same output as I get for the reference mouse and rat genomes. I have checked my genome sequence with samtools: extracting the exon coordinates with samtools gives the right sequence from the black rat. I suspect the problem is the index file generated by the BioPerl program, but I am not sure.

I would really appreciate some help figuring this out. Please let me know which files would help. Thank you.
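To make the task concrete, the extraction described above (collect the exons of each transcript from the GFF3, cut them out of the FASTA, join them, and reverse-complement minus-strand transcripts) can be sketched without BioPerl. This is a minimal Python sketch, not the script by @severin. All function names are hypothetical, and it assumes that exon features carry a `Parent` attribute, that GFF3 coordinates are 1-based and inclusive, and that each GFF3 seqid matches the first word of a FASTA header. The `missing_seqids` helper only reports sequence names present in the GFF3 but absent from the FASTA, which may or may not be the cause here.

```python
from collections import defaultdict

def read_fasta(lines):
    """Parse FASTA lines into {seq_id: sequence}; seq_id is the first word of the header."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")

def revcomp(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def exons_by_transcript(gff_lines):
    """Collect (seqid, start, end, strand) for every exon, keyed by Parent ID."""
    exons = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return exons

def missing_seqids(seqs, exons):
    """Seqids used by exons in the GFF3 that have no matching FASTA record."""
    used = {part[0] for parts in exons.values() for part in parts}
    return sorted(used - set(seqs))

def spliced_transcripts(seqs, exons):
    """Join exon sequences per transcript in genomic order; revcomp if on '-'."""
    out = {}
    for tid, parts in exons.items():
        seqid, strand = parts[0][0], parts[0][3]
        if seqid not in seqs:
            continue  # reported separately by missing_seqids()
        ordered = sorted(parts, key=lambda p: p[1])
        # GFF3 is 1-based inclusive, Python slices are 0-based half-open.
        rna = "".join(seqs[seqid][start - 1:end] for _, start, end, _ in ordered)
        out[tid] = revcomp(rna) if strand == "-" else rna
    return out
```

With real files, the inputs would be open file handles, for example `read_fasta(open("genome.fa"))` and `exons_by_transcript(open("annotation.gff3"))`, where both file names are placeholders. Note that this sketch loads the whole genome into memory rather than using an on-disk index.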
