Question: Create annotation file for mature miRNA sequences from miRBase
polag03 wrote:

Please help, I am new to sequence analysis. I have been trying to create a GFF/GTF annotation file for mature miRNA sequences obtained from miRBase so that I can analyze some sequencing data, but I have not found any directions on how to do this. The miRNA sequences are in FASTA format, and I also have my reference genome sequence. Please guide me. Thank you.

written 3.1 years ago by polag03 (modified 3.1 years ago)

Thank you, Prasad. I will try SAM2GFF and report back.

written 3.1 years ago by polag03
Prasad wrote:

GFF files for some organisms are already available in miRBase.

Alternatively, you can align all the mature miRNAs to the genome and convert the SAM output to GFF using SAM2GFF. Hope this helps.

written 3.1 years ago by Prasad

Hi Prasad, thanks for the reply. On a second look, the miRBase sequences are in FASTA format, while Bowtie takes FASTQ reads for alignment. Is there a way around this?

written 3.1 years ago by polag03
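
One common way around the FASTA/FASTQ question (not necessarily what was done in this thread) is to pass Bowtie's -f option, which tells it the query file is FASTA. Separately, miRBase distributes mature sequences in the RNA alphabet (U instead of T) and for all species in one file, so it can help to keep only the species of interest and convert U to T before aligning. The sketch below does that in Python. The function names, the sample headers, and the "xyz" species prefix are made up for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def prepare_mature(lines, species_prefix):
    """Keep records whose ID starts with '<species_prefix>-' and return
    FASTA text with the sequences converted from RNA (U) to DNA (T)."""
    out = []
    for header, seq in read_fasta(lines):
        name = header.split()[0]  # first word of the header is the miRNA ID
        if name.startswith(species_prefix + "-"):
            dna = seq.upper().replace("U", "T")
            out.append(f">{name}\n{dna}\n")
    return "".join(out)


# Illustrative input only; the header text after the ID is a placeholder.
sample = [
    ">cel-let-7-5p placeholder description",
    "UGAGGUAGUAGGUUGUAUAGUU",
    ">xyz-miR-0 placeholder description",
    "ACGU",
]
print(prepare_mature(sample, "cel"), end="")
```

For a real run, the same function can be fed an open file handle for the miRBase mature FASTA file, and the returned text written to a new FASTA file for the aligner.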

I got that resolved, Prasad, and now have the SAM file from the alignment. But when I tried to run the Perl script to convert the SAM file to GFF, I got a fatal error: "Unable to open input file <filename.sam>".

My command:

perl scampi_sam_to_gffv1.pl -i inputfile.sam -o outputfile.gff

written 3.1 years ago by polag03

Is the SAM file not in the current directory? Have you tried ./inputfile.sam to make the location more explicit?

written 3.1 years ago by genomax

Yes, it is in the current directory. I also tried specifying the path explicitly after I got the error, but I still got the same error.

written 3.1 years ago by polag03
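
When a script reports that it cannot open a file that appears to be present, it can help to check the path independently of that script. The following is a minimal Python sketch, assuming nothing about how the Perl script opens its input. The function name diagnose_input is hypothetical. It reports the path exactly as given (which exposes stray spaces or other hidden characters), the absolute path it resolves to, and whether the file exists, is a regular file, is readable, and is non-empty.

```python
import os


def diagnose_input(path):
    """Return a list of human-readable findings about an input file path."""
    findings = [
        f"path as given: {path!r}",  # repr() exposes hidden whitespace
        f"resolved to: {os.path.abspath(path)!r}",
    ]
    if not os.path.exists(path):
        # Look for names in the same folder that differ only by case or
        # by leading/trailing whitespace.
        folder = os.path.dirname(os.path.abspath(path))
        wanted = os.path.basename(path).strip().lower()
        if os.path.isdir(folder):
            near = [n for n in os.listdir(folder) if n.strip().lower() == wanted]
            if near:
                findings.append(f"similar names present: {near}")
        findings.append("does not exist")
    elif not os.path.isfile(path):
        findings.append("exists but is not a regular file")
    elif not os.access(path, os.R_OK):
        findings.append("exists but is not readable by this user")
    elif os.path.getsize(path) == 0:
        findings.append("readable but empty")
    else:
        findings.append("readable, non-empty")
    return findings


# Uses the file name from the command in this thread.
for finding in diagnose_input("inputfile.sam"):
    print(finding)
```

If the last line printed is "readable, non-empty", the file itself is probably not the problem, and the way the script receives or parses its arguments would be the next thing to look at.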

Can you post a few lines of your SAM file?

head inputfile.sam

written 3.1 years ago by genomax

Sure. Here is the output:

@HD     VN:1.0  SO:unsorted
@SQ     SN:Chromosome01 LN:34959721
@SQ     SN:Chromosome02 LN:32431396
@SQ     SN:Chromosome03 LN:29412403
@SQ     SN:Chromosome04 LN:28749345
@SQ     SN:Chromosome05 LN:28438989
@SQ     SN:Chromosome06 LN:27939960
@SQ     SN:Chromosome07 LN:27069033
@SQ     SN:Chromosome08 LN:34011518
@SQ     SN:Chromosome09 LN:29417918

written 3.1 years ago by polag03 (modified 3.1 years ago by genomax)

That looks like a proper SAM file, assuming you see alignments further down in the file. Is that correct?

written 3.1 years ago by genomax
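
Whether a SAM file has alignments further down can be checked without reading it by eye. In the SAM format, the second column is a bitwise FLAG, and bit 0x4 set means the read is unmapped. The Python sketch below counts mapped and unmapped records on that basis. The function name count_mapped and the example records are illustrative only.

```python
def count_mapped(sam_lines):
    """Return (mapped, unmapped) record counts from an iterable of SAM lines."""
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")  # SAM columns are tab-separated
        flag = int(fields[1])
        if flag & 0x4:  # bit 0x4: segment unmapped
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


# Illustrative records: one unmapped, one mapped to a made-up position.
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "\t".join(["readA", "4", "*", "0", "0", "*", "*", "0", "0", "ACGT", "IIII"]),
    "\t".join(["readB", "0", "Chromosome01", "100", "255", "4M", "*", "0", "0", "ACGT", "IIII"]),
]
mapped, unmapped = count_mapped(example)
print(f"mapped={mapped} unmapped={unmapped}")
```

On a real file, passing an open file handle (for example, open("inputfile.sam")) in place of the list gives the counts for the whole alignment.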

Yes, it is. Here is more of the file:

cel-lin-4-3p    4       *       0       0       *       *       0       0       ACACCTGGGCTCTCCGGGTACC  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
cel-lin-4-5p    4       *       0       0       *       *       0       0       TCCCTGAGACCTCAAGTGTGA   IIIIIIIIIIIIIIIIIIIII   XM:i:0
cel-miR-1-5p    4       *       0       0       *       *       0       0       CATACTTCCTTACATGCCCATA  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
cel-let-7-5p    4       *       0       0       *       *       0       0       TGAGGTAGTAGGTTGTATAGTT  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
cel-let-7-3p    4       *       0       0       *       *       0       0       CTATGCAATTTTCTACCTTACC  IIIIIIIIIIIIIIIIIIIIII  XM:i:0

written 3.1 years ago by polag03 (modified 3.1 years ago by genomax)
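
The five records shown above all have FLAG 4, reference name * and position 0, which in the SAM format marks a read as unmapped, so records like these carry no genomic coordinates to put into a GFF file. For records that do map, the conversion itself is small. The Python sketch below is not the SAM2GFF Perl script mentioned in this thread. It is a minimal illustration, assuming single-end alignments, with hypothetical function names, a source label of "bowtie" and a feature type of "miRNA" chosen for the example. The end coordinate is the start plus the number of reference bases the CIGAR string consumes (operations M, D, N, = and X), minus one. The strand comes from FLAG bit 0x10.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(cigar):
    """Number of reference bases covered by an alignment with this CIGAR."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def sam_to_gff3(sam_lines, source="bowtie", feature_type="miRNA"):
    """Yield GFF3 lines for the mapped records in an iterable of SAM lines."""
    yield "##gff-version 3"
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        f = line.rstrip("\n").split("\t")
        name, flag, chrom, pos, cigar = f[0], int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or chrom == "*" or cigar == "*":
            continue  # unmapped: no coordinates to report
        end = pos + reference_span(cigar) - 1  # SAM POS and GFF3 are both 1-based
        strand = "-" if flag & 0x10 else "+"
        yield "\t".join(
            [chrom, source, feature_type, str(pos), str(end), ".", strand, ".",
             f"Name={name}"]
        )


# Illustrative records: the mapped position below is made up.
example = [
    "\t".join(["cel-lin-4-5p", "4", "*", "0", "0", "*", "*", "0", "0",
               "TCCCTGAGACCTCAAGTGTGA", "I" * 21]),
    "\t".join(["demo-miR", "16", "Chromosome01", "100", "255", "22M", "*", "0", "0",
               "ACACCTGGGCTCTCCGGGTACC", "I" * 22]),
]
for gff_line in sam_to_gff3(example):
    print(gff_line)
```

The attribute column uses Name rather than ID because a mature miRNA can align to more than one locus, and GFF3 IDs are expected to be unique within a file.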