Aligning RNA-seq data for gene prediction
7 months ago
karthic ▴ 100

Hi,

I have RNA-seq data for five tissue samples and I am planning to use them for gene prediction. Should I align them all to the genome together in one step, or align each tissue separately and merge the BAM files later?

Thanks, Karthic

gene prediction RNA-Seq • 261 views

Alternative plan:

  • make a genome-guided transcriptome assembly (e.g. with Trinity)
  • use the generated transcripts as evidence for gene prediction
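For illustration, the first step of that plan could look roughly like the sketch below. It only builds the command line and does not run anything. The flags are the ones Trinity documents for genome-guided mode; the BAM file name, intron size, memory, and CPU values are placeholders I made up, so adjust them to your genome and machine. The input must be a coordinate-sorted BAM of the reads aligned to the genome.

```python
import shlex


def trinity_genome_guided_cmd(sorted_bam, max_intron=10000,
                              max_memory="50G", cpu=8,
                              out_dir="trinity_gg_out"):
    """Build (but do not run) a genome-guided Trinity command line.

    All default values here are illustrative assumptions, not
    recommendations from the thread.
    """
    return [
        "Trinity",
        "--genome_guided_bam", sorted_bam,              # coordinate-sorted alignments
        "--genome_guided_max_intron", str(max_intron),  # upper bound on intron length
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--output", out_dir,
    ]


# Hypothetical input file name, for demonstration only.
cmd = trinity_genome_guided_cmd("all_tissues.sorted.bam")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell or passed to `subprocess.run(cmd, check=True)` once the paths are real.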

I have always known Trinity as a de novo transcriptome assembler.


Trinotate, part of the broader Trinity workflow, performs functional annotation: https://github.com/griffithlab/rnaseq_tutorial/wiki/Trinotate-Functional-Annotation


Thanks. I will go through that.


"Should I align them all together in one step"

If you have a reference available, then you are likely not doing gene prediction. And if you don't, then you should be assembling the data as suggested by @Michael.


We have assembled the genome, and there is no other annotation available for this species. We have RNA-seq and Iso-Seq data for some tissues. I am currently figuring out how I should prepare the files.


Was the genome assembly done independently of the RNA-seq data? What do you mean by "prepare the files"?


Yes, independently of the RNA-seq data. By "prepare the files" I mean gathering the evidence for gene prediction from the RNA-seq and Iso-Seq data.

Should the RNA-seq data be assembled with Trinity and the transcripts given as input to tools like AUGUSTUS or GeneMark? Or should the reads be mapped to the genome with tools like HISAT2/TopHat, turned into transcript models with StringTie/Cufflinks, and then given as input to AUGUSTUS, GeneMark, etc.?
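To make the second (mapping) route concrete, here is a minimal sketch for one paired-end sample using HISAT2, samtools, and StringTie. It only builds the command lines. The index name, read file names, sample name, and thread count are hypothetical placeholders; the flags themselves are standard ones (`--dta` tells HISAT2 to report alignments tailored for transcript assemblers such as StringTie).

```python
import shlex


def mapping_route_cmds(index, reads_1, reads_2, sample, threads=8):
    """Return the commands for: align -> sort -> assemble, for one sample.

    Nothing is executed here; file names are derived from `sample`,
    which is a naming convention assumed for this sketch.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    gtf = f"{sample}.gtf"
    return [
        # Spliced alignment of paired-end reads to the genome index.
        ["hisat2", "-p", str(threads), "--dta", "-x", index,
         "-1", reads_1, "-2", reads_2, "-S", sam],
        # Coordinate-sort the alignments (StringTie needs sorted input).
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # Assemble transcript models from the sorted alignments.
        ["stringtie", bam, "-p", str(threads), "-o", gtf],
    ]


# Hypothetical names, for demonstration only.
for step in mapping_route_cmds("genome_index", "liver_R1.fastq.gz",
                               "liver_R2.fastq.gz", "liver"):
    print(shlex.join(step))
```

The resulting GTF (or hints derived from the BAM) is what would then be handed to the gene predictor as evidence.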

7 months ago
liorglic ▴ 340

I think you can map everything at once. If you are worried that you might lose isoform information, you can also map each tissue separately and merge only at the end. I have had pretty good experience with GAWN for this type of analysis, so check it out.
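As a rough sketch of the "separately, then merge at the end" option: once each tissue has its own sorted BAM (and, if you assemble per tissue, its own GTF), the merge step could be built like this. The tissue names and file names are placeholders I invented. `samtools merge` expects the input BAMs to be sorted the same way, and `stringtie --merge` combines per-sample GTFs into one non-redundant set.

```python
import shlex


def merge_bams_cmd(tissue_bams, merged_bam="all_tissues.merged.bam", threads=8):
    """Build a samtools command that merges per-tissue sorted BAMs."""
    if not tissue_bams:
        raise ValueError("need at least one BAM file to merge")
    return ["samtools", "merge", "-@", str(threads), merged_bam, *tissue_bams]


def merge_gtfs_cmd(tissue_gtfs, merged_gtf="all_tissues.merged.gtf"):
    """Build a StringTie command that merges per-tissue transcript models."""
    if not tissue_gtfs:
        raise ValueError("need at least one GTF file to merge")
    return ["stringtie", "--merge", "-o", merged_gtf, *tissue_gtfs]


# Five hypothetical tissues, matching the five samples in the question.
tissues = [f"tissue{i}" for i in range(1, 6)]
print(shlex.join(merge_bams_cmd([f"{t}.sorted.bam" for t in tissues])))
print(shlex.join(merge_gtfs_cmd([f"{t}.gtf" for t in tissues])))
```

Merging GTFs rather than BAMs keeps the per-tissue assemblies intact, which is the point of mapping separately if isoform information matters.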


Thank you. I will check it out.
