Bulk RNA-seq with genome but without gene annotation
7 weeks ago
Diego ▴ 10

Hi all,

I need to analyse RNA-seq data from two species for which a genome assembly is available. The problem is that neither genome comes with a GTF/GFF file, so there is no information about genes. I was wondering how to analyse the data.

Intuitively, I would run Trinity in genome-guided mode to assemble a transcriptome and then quantify it with kallisto. However, that still leaves me without gene IDs. Which tool would you use to annotate the genome? Trinotate? Is there a preferred way?

Thanks in advance,

Diego


Choice of gene annotation software is going to depend on the organisms you are studying. Eukaryotes? Prokaryotes?

7 weeks ago

I don't think genome annotation is easy at all; in fact, I think it is an extremely hard problem.

Another option is:

  • GMAP with gff3 output (provided you have a transcript set to map; if not, create one with Trinity)
  • visualize the transcripts on the genome to check accuracy
  • map reads to the genome, e.g. with STAR or HISAT2
  • count reads with featureCounts
  • DESeq2 etc. for differential expression
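As a rough illustration of how the steps above could be chained, here is a minimal Python sketch that only builds the command lines (nothing runs unless you call `run`). The file names, the `gmap_db`/`hisat2_index` names and the choice of `-g Parent` for featureCounts are assumptions for this example, not something tested on your data; check each flag against the version of the tool you have installed.

```python
import shlex
import subprocess


def annotation_free_pipeline(genome_fa, transcripts_fa, reads_1, reads_2,
                             sample="sample1", threads=4):
    """Return (argv, stdout_path) steps; stdout_path is None when not redirected.

    All output names below are placeholders chosen for this sketch.
    """
    db_dir, db_name = "gmap_db", "genome"      # hypothetical GMAP database location
    gff3 = "transcripts.gff3"                  # annotation produced by GMAP
    sam = f"{sample}.sam"
    t = str(threads)
    return [
        # 1. Build the GMAP genome database.
        (["gmap_build", "-D", db_dir, "-d", db_name, genome_fa], None),
        # 2. Map the transcript set to the genome, writing gene-style GFF3 to stdout.
        (["gmap", "-D", db_dir, "-d", db_name, "-t", t,
          "-f", "gff3_gene", transcripts_fa], gff3),
        # 3. Index the genome and map the paired-end reads with HISAT2.
        (["hisat2-build", genome_fa, "hisat2_index"], None),
        (["hisat2", "-p", t, "-x", "hisat2_index",
          "-1", reads_1, "-2", reads_2, "-S", sam], None),
        # 4. Count reads over exon features, grouped by their Parent attribute
        #    (assumption: the GMAP exon rows carry Parent=<mRNA id>).
        #    Recent Subread releases may also need --countReadPairs with -p.
        (["featureCounts", "-T", t, "-p", "-t", "exon", "-g", "Parent",
          "-a", gff3, "-o", f"{sample}.counts.txt", sam], None),
    ]


def run(steps):
    """Execute the steps in order, redirecting stdout where requested."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    # Print the commands for inspection instead of running them.
    for argv, stdout_path in annotation_free_pipeline(
            "genome.fa", "Trinity.fasta", "reads_1.fq.gz", "reads_2.fq.gz"):
        line = shlex.join(argv)
        print(line if stdout_path is None else f"{line} > {stdout_path}")
```

The resulting count table would then go into DESeq2 in R. Printing the commands first makes it easy to check them by eye before committing to a long run.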

MAKER is also very good, but harder to set up and use than GMAP.

7 weeks ago
ahmad mousavi ▴ 710

Hello,

This is not hard, but it can be a long process. To get at least minimal information about your transcriptome, I have used the following steps:

1- Trinity (to get the assembly file).

2- RSEM to estimate gene expression.

3- edgeR or DESeq2 to find DEGs.

4- Trinotate to annotate your Trinity assembly against known databases (at this point you can try only UniProt and Pfam, not nr). It uses blastx or blastp against the Swiss-Prot entries, not all of TrEMBL. When I analysed plants, I used only the Viridiplantae entries rather than all proteins, as further information for my Trinity dataset.

5- Retrieve the GO information for your UniProt IDs from uniprot.org and then try the WEGO website for GO enrichment.
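To make steps 4 and 5 a bit more concrete, here is a small Python sketch of one way to get from BLAST hits to a list of UniProt accessions per Trinity gene. It assumes default tabular BLAST+ output (`-outfmt 6`), Swiss-Prot subject IDs of the form `sp|ACCESSION|NAME`, and Trinity transcript IDs ending in `_i<number>`; the function names are made up for this example and the demo rows are toy data, not real results.

```python
import csv
import io
import re

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                  "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

_ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def trinity_gene_id(transcript_id):
    """Strip the isoform suffix: TRINITY_DN1_c0_g1_i2 -> TRINITY_DN1_c0_g1."""
    return _ISOFORM_SUFFIX.sub("", transcript_id)


def uniprot_accession(sseqid):
    """Extract the accession from 'sp|P12345|NAME'; return other IDs unchanged."""
    parts = sseqid.split("|")
    return parts[1] if len(parts) >= 3 else sseqid


def best_hits(handle):
    """Keep the highest-bitscore hit per query transcript."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        score = float(hit["bitscore"])
        qid = hit["qseqid"]
        if qid not in best or score > best[qid][1]:
            best[qid] = (uniprot_accession(hit["sseqid"]), score)
    return best


def gene_to_accessions(best):
    """Collapse transcript-level best hits to Trinity gene level."""
    genes = {}
    for transcript_id, (accession, _score) in best.items():
        genes.setdefault(trinity_gene_id(transcript_id), set()).add(accession)
    return genes


if __name__ == "__main__":
    # Toy rows with made-up IDs and scores, only to show the input layout.
    demo = (
        "TRINITY_DN1_c0_g1_i1\tsp|P11111|A_ARATH\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200\n"
        "TRINITY_DN1_c0_g1_i1\tsp|P22222|B_ARATH\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-60\t250\n"
        "TRINITY_DN1_c0_g1_i2\tsp|P22222|B_ARATH\t93.0\t100\t7\t0\t1\t300\t1\t100\t1e-55\t230\n"
    )
    for gene, accessions in gene_to_accessions(best_hits(io.StringIO(demo))).items():
        print(gene, ",".join(sorted(accessions)))
```

The accession list per gene can then be pasted into the uniprot.org ID mapping to pull GO terms. Keeping only the single best hit is a simplification; you may prefer an e-value cutoff as well.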

By the way, MAKER is another option, but I think it would be time-consuming for you and not worth running for a single dataset.

Hope it works for you.
