Question: DE for a de novo species
Hi there, I'm trying to find out which genes are differentially expressed in an infected plant over a time course, using RNA-seq data, and at the same time which genes in the pathogen may be responsible for pathogenicity. As controls, I have RNA-seq data from both the non-infected plant and the pathogen on its own.

But my plant has only a draft genome with no annotation (GTF) file, and my fungal pathogen has not been sequenced yet. My questions are:

1- What is the best work plan to analyze the data?

2- Can I map with TopHat without an annotation file?

3- Or must I do a de novo assembly with Trinity, then use tools like TransDecoder and Trinotate to find genes, then use a tool like MAKER to build an annotation, and only then look for DE genes?

I would be thankful if someone could answer.

Hi there, this sounds to me like quite a project!

Why? You have a draft (hence most likely incomplete) reference plant genome without annotation, a fungal infection time course, and a fungal pathogen that has not been sequenced...

Here are a few suggestions on where you could start:

  • Trinity assembly of the RNA-seq reads, then align the assembled transcripts to the reference
  • map the reads to the plant reference to counter-screen/decontaminate (using bwa mem or bowtie2)
  • classify the non-mapping reads/transcripts, for example with Kraken
  • test for what differs between your conditions, for example with DESeq2
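
To make the counter-screening step a bit more concrete, here is a rough Python sketch, not a finished pipeline. It assumes you already have SAM output from bwa mem or bowtie2 against the plant draft, and it only relies on the SAM FLAG field (bit 0x4 means the read is unmapped). The function name and the toy records are made up for illustration. Keep in mind that with an incomplete draft, unmapped reads are only candidates for fungal origin: some will be plant reads from regions missing in the assembly.

```python
def split_by_mapping_status(sam_lines):
    """Return (mapped, unmapped) lists of read names from SAM text lines."""
    mapped, unmapped = [], []
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Ignore secondary (0x100) and supplementary (0x800) records
        # so that each read is counted once.
        if flag & 0x900:
            continue
        if flag & 0x4:  # 0x4 = segment unmapped
            unmapped.append(name)
        else:
            mapped.append(name)
    return mapped, unmapped


# Hypothetical toy records, only to show the expected input shape.
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tscaffold_1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGCA\tIIIII",
    "read3\t16\tscaffold_2\t250\t60\t5M\t*\t0\t0\tGGCAT\tIIIII",
    "read1\t256\tscaffold_9\t40\t0\t5M\t*\t0\t0\t*\t*",
]

plant_like, fungal_candidates = split_by_mapping_status(toy_sam)
print("mapped to plant draft:", plant_like)
print("unmapped (to classify further):", fungal_candidates)
```

The unmapped names are what you would then pull out of the FASTQ files and hand to a classifier such as Kraken.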

I guess the first problem is to discriminate plant reads from fungal reads, in order to get a better idea of how the infection progresses over time?
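
Once reads are split by origin, one simple way to look at this is the fraction of fungal reads per time point. A minimal Python sketch, assuming you have per-sample counts of plant-assigned and fungus-assigned reads; the sample labels and numbers below are invented purely for illustration and are not real data:

```python
def fungal_fraction(counts):
    """Map each sample to the fraction of its assigned reads that are fungal.

    counts: dict of sample label -> (plant_reads, fungal_reads)
    """
    fractions = {}
    for sample, (plant, fungal) in counts.items():
        total = plant + fungal
        # Guard against empty samples to avoid dividing by zero.
        fractions[sample] = fungal / total if total else 0.0
    return fractions


# Hypothetical counts for three time points (made-up numbers).
toy_counts = {"0h": (1000, 0), "24h": (900, 100), "72h": (600, 400)}

fractions = fungal_fraction(toy_counts)
for sample, frac in fractions.items():
    print(f"{sample}: {frac:.1%} fungal")
```

A rising fraction would suggest the fungus is growing in the tissue, but it is only a crude proxy since it also depends on how well each read could be assigned.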

