Question: RNA-seq expression analysis among several non-model species (next step?)
3 months ago by cmpolania wrote:

I'm currently analyzing RNA-seq data from four species in one genus, and I would love a little help with deciding my next steps.

My eventual goal: to find secreted proteins/secondary metabolites that are significantly expressed among 4 species of fungus in culture, either expressed in one species only or co-expressed in all 4. (This is a discovery-based project; there's no null hypothesis.)

Starting data: I started with RNA-seq reads, assembled genomes, a .gtf annotation for each genome, and functional annotation information (Swiss-Prot, SignalP, Pfam, etc.) for each genome. The functional annotation files hold protein_ids and corresponding descriptions.

What I have done so far: I've aligned the reads from each species to their respective genomes with HISAT2, then assembled transcripts and quantified expression with StringTie (supplying the .gtf annotations in order to keep gene_ids constant).

What I have now: one StringTie output for each species, each with gene_id, transcript_id, and FPKM/TPM values.

The advice I need: What should be my next step? Since I'm not looking for differential expression, I'm assuming that my next analyses should be on individual species. How can I associate my protein_ids and my gene_ids? How can I go from FPKM values to deciding whether or not a gene is significantly expressed in a species? Are FPKM values enough, or is there some kind of normalization that should still be done (log transformation)? Should gene clusters be found, and how would that be important? Once I find (for example) a gene that produces an interesting secondary metabolite, how would I find if there are analogs in the other species?

I'm feeling a little lost when it comes to what to do next.

3 months ago by Hussain Ather (National Institutes of Health, Bethesda, MD) wrote:

Maybe you could try making a density plot of the FPKM values for each species? The shape of the distribution would probably suggest a cutoff between background and genuinely expressed genes.
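As a rough illustration of the density-plot idea, here is a minimal Python sketch using only the standard library. It log-transforms the FPKM values, bins them into a histogram (a stand-in for a smoothed density), and picks the emptiest bin between the two tallest peaks as a candidate cutoff. The function names, the pseudocount of 1, the bin count, and the toy values are all assumptions made for illustration; none of them come from StringTie or from the original post. On real data the histogram will be noisy, so the result should be checked against an actual plot rather than trusted blindly.

```python
import math


def log2_transform(fpkm_values, pseudocount=1.0):
    """Return log2(FPKM + pseudocount) so that zero FPKM values stay defined."""
    return [math.log2(v + pseudocount) for v in fpkm_values]


def histogram(values, n_bins=20):
    """Count values into n_bins equal-width bins; return (counts, edges)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return counts, edges


def valley_cutoff(values, n_bins=20):
    """Crude cutoff: centre of the emptiest bin between the two tallest local peaks.

    Returns None when two separated peaks cannot be found.
    """
    counts, edges = histogram(values, n_bins)
    padded = [-1] + counts + [-1]
    # A bin is a local peak if it is higher than its left neighbour
    # and at least as high as its right neighbour.
    peaks = [
        i for i in range(n_bins)
        if padded[i + 1] > padded[i] and padded[i + 1] >= padded[i + 2]
    ]
    if len(peaks) < 2:
        return None
    left, right = sorted(sorted(peaks, key=lambda i: counts[i], reverse=True)[:2])
    if right - left < 2:
        return None  # no bin lies between the two peaks
    valley = min(range(left + 1, right), key=lambda i: counts[i])
    return (edges[valley] + edges[valley + 1]) / 2


# Made-up FPKM values: a background group near zero and an expressed group.
toy_fpkm = [0, 0, 0, 0.1, 0.2, 0.1, 0, 30, 40, 50, 45, 35, 60]
toy_log = log2_transform(toy_fpkm)
toy_cutoff = valley_cutoff(toy_log, n_bins=10)
expressed = [f for f, l in zip(toy_fpkm, toy_log) if toy_cutoff is not None and l > toy_cutoff]
print("cutoff on log2(FPKM + 1) scale:", toy_cutoff)
print("values above cutoff:", expressed)
```

The cutoff is on the log2(FPKM + 1) scale, so it has to be compared with transformed values. Because the first emptiest bin is chosen, the cutoff tends to sit close to the background peak; a kernel density estimate from a plotting library would give a smoother curve to judge this by eye.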
