Question: RNA-seq expression analysis among several non-model species (next step?)
2.3 years ago, cmpolania wrote:

I'm currently analyzing RNA-seq data from four species in one genus, and I would love a little help with deciding my next steps.

My eventual goal: to find secreted proteins/secondary metabolites that are significantly expressed across 4 species of fungus grown in culture: either expressed in one species only, or co-expressed in all 4. (This is a discovery-based project; there is no null hypothesis.)

Starting data: I started with RNA-seq reads, assembled genomes, a .gtf annotation for each genome, and functional annotation information (Swiss-Prot, SignalP, Pfam, etc.) for each genome. The functional annotation files hold protein_ids and corresponding descriptions.

What I have done so far: I've aligned the reads from each species to its respective genome (including the .gtf annotations in order to keep gene_ids constant) using HISAT2, then assembled transcripts and quantified expression using StringTie.

What I have now: one StringTie output for each species, each listing gene_id, transcript_id, and FPKM/TPM values.

The advice I need: What should my next step be? Since I'm not looking for differential expression, I assume my next analyses should be done on each species individually. How can I associate my protein_ids with my gene_ids? How can I go from FPKM values to deciding whether or not a gene is significantly expressed in a species? Are FPKM values enough, or is some further normalization (e.g. log transformation) still needed? Should gene clusters be found, and why would that be important? Once I find (for example) a gene that produces an interesting secondary metabolite, how would I find out whether there are analogs in the other species?

I'm feeling a little lost when it comes to what to do next.

rna-seq alignment • 680 views
2.3 years ago, Hussain Ather (National Institutes of Health, Bethesda, MD) wrote:

Maybe you could try making a density plot of the FPKM values for each species? The distribution will probably show a natural cutoff that separates background-level genes from genuinely expressed ones.
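
A rough sketch of that idea in plain Python, in case it helps. It assumes the FPKM values for one species have already been pulled out of the StringTie output into a list (that parsing step is not shown). The log2(FPKM + 1) transform, the bin count, the smoothing window, and the "deepest dip between two peaks" rule are all my own assumptions about how one might read a cutoff off the density, not an established method, and the function names are made up for illustration. The demo data is synthetic.

```python
import math
import random


def log2_fpkm(values, pseudocount=1.0):
    """Log2-transform FPKM values; the pseudocount keeps zeros finite."""
    return [math.log2(v + pseudocount) for v in values]


def histogram_density(xs, n_bins=30):
    """Equal-width histogram, returned as (bin_centers, densities)."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for x in xs:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, [c / (len(xs) * width) for c in counts]


def smooth(ys, window=3):
    """Centered moving average to damp bin-to-bin noise."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        chunk = ys[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def valley_cutoff(centers, density, min_rel_depth=0.1):
    """Return the bin center at the deepest dip between two peaks, else None."""
    if len(density) < 3:
        return None
    # A dip only counts if it is deeper than a fraction of the tallest peak.
    best_i, best_depth = None, min_rel_depth * max(density)
    for i in range(1, len(density) - 1):
        depth = min(max(density[:i]), max(density[i + 1:])) - density[i]
        if depth > best_depth:
            best_i, best_depth = i, depth
    return None if best_i is None else centers[best_i]


def estimate_cutoff(fpkm_values, n_bins=30):
    """Chain the steps; the result is on the log2(FPKM + 1) scale."""
    centers, density = histogram_density(log2_fpkm(fpkm_values), n_bins)
    return valley_cutoff(centers, smooth(density))


# Synthetic demo data (made up): a near-zero background plus an expressed mode.
random.seed(0)
fpkm = [random.uniform(0, 0.5) for _ in range(300)]
fpkm += [2 ** random.gauss(5, 1) for _ in range(700)]
cutoff = estimate_cutoff(fpkm)
print("log2(FPKM + 1) cutoff:", cutoff)
```

If the distribution for a species has no clear dip, `estimate_cutoff` returns `None`, which is itself a hint that a single threshold may not be meaningful for that sample. To convert a cutoff back to FPKM, use `2 ** cutoff - 1`. It is still worth plotting the density and checking the suggested value by eye.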
