Question: Sorting reads from host-pathogen interaction
6 months ago, cwbenson1993 wrote:

I am working on RNA-seq data for a host-pathogen interaction between a grass species and its fungal parasite. The ultimate goal is to do differential expression analysis and functional enrichment to see which genes and pathways are involved in parasitism.

I have:

  1. Draft genome of the fungus
  2. RNA-seq reads from non-infected grass
  3. RNA-seq reads from infected grass (contains grass and fungal transcripts)
  4. RNA-seq reads from the fungus growing in culture

I built the fungal transcriptome using only the reads from the culture-grown fungus, and the grass transcriptome using only the non-infected reads. Now I'm thinking it would be useful to rebuild those transcriptomes to include reads from the infected tissue, to capture transcripts that are unique to the host-pathogen interaction.

Is there a way to filter the infected reads into grass and fungal groups using the resources I currently have?

Perhaps I could align the infected grass reads (#3) to the fungal transcriptome, and use only the unmapped reads to rebuild the grass transcriptome? Maybe I could then run BLAST, BBDuk, or some other tool on the unmapped reads to filter out remaining fungal reads before using them to build the grass transcriptome.

Tags: rna-seq assembly

Valid approach indeed. I would consider aligning them to the fungal genome as well, in order to filter out the fungal ones.

written 6 months ago by lieven.sterck

Hey lieven.sterck,

Thanks for the response! I've considered using BBSplit to sort the reads further, but unfortunately I don't have a genomic sequence for the plant.

Does anyone know a tool that can sort RNA-seq data using the genome of one of the host-pathogen species?

written 6 months ago by cwbenson1993

Can't you just align them to the fungal genome and then use the ones that do not map (i.e. the ones likely to be plant reads)?
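
A minimal sketch of that split, assuming the infected reads have already been aligned to the fungal genome with a splice-aware aligner and written out as SAM. Unmapped reads carry the 0x4 bit in the SAM FLAG field; on real BAM files `samtools view -f 4` selects the same records. The function name, read names and contig name below are made up for illustration.

```python
UNMAPPED = 0x4          # SAM FLAG bit: this read did not align
NOT_PRIMARY = 0x900     # secondary (0x100) or supplementary (0x800) record


def split_by_mapping(sam_lines):
    """Return (mapped_names, unmapped_names) from an iterable of SAM lines."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & NOT_PRIMARY:
            continue  # count each read once, via its primary record
        (unmapped if flag & UNMAPPED else mapped).append(name)
    return mapped, unmapped


# Toy single-end SAM records (hypothetical names, not real data)
toy_sam = [
    "@HD\tVN:1.6",
    "read1\t0\tcontig1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t16\tcontig1\t300\t60\t50M\t*\t0\t0\t*\t*",
]

fungal_like, plant_candidates = split_by_mapping(toy_sam)
print("likely fungal:", fungal_like)
print("plant candidates:", plant_candidates)
```

The unmapped set is only "likely plant": anything missing from the fungal genome assembly ends up there too.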

written 6 months ago by lieven.sterck

That would be the way to go.

written 6 months ago by genomax

6 months ago, cwbenson1993 wrote:

It's the novel transcripts that I'm concerned about. If reads don't map to either the fungal or the plant transcriptome, then they correspond to a transcript that is expressed specifically during the host-pathogen interaction, and it could come from either the plant or the fungus. For example, if I map infected grass reads to the fungal transcriptome and use the unmapped reads to build the grass transcriptome, I would still have the novel fungal transcripts present in my grass assembly.

I don't know if it's possible to further sort the unmapped reads using the fungal genome, or maybe it's not even worth troubling myself over.


Not worth troubling yourself over, I would say ;-)

You will likely always end up with more or less of a mixture of sequence origins.

On the other hand, if you map to the fungal genome you should be able to remove all fungal-derived reads (regardless of the stage of infection at which they are expressed), since all of these reads should derive from somewhere in the genome, including the 'novel ones' in your de novo transcriptome. I understand that you only have a draft genome, so some might slip through at this stage, but that's nothing to make a big fuss about, I think.
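
For paired-end data, one conservative way to apply this is to keep a pair as a plant candidate only when neither mate maps to the fungal genome. The sketch below assumes SAM input from such an alignment; the function name and the toy records are hypothetical, and on real files `samtools view -f 12` (read unmapped and mate unmapped) would select the same pairs.

```python
from collections import defaultdict

UNMAPPED = 0x4          # SAM FLAG bit: this read did not align
NOT_PRIMARY = 0x900     # secondary (0x100) or supplementary (0x800) record


def plant_candidate_pairs(sam_lines):
    """Return sorted names of pairs in which no mate aligned to the fungus."""
    mate_unmapped = defaultdict(list)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & NOT_PRIMARY:
            continue  # judge each mate by its primary record only
        mate_unmapped[name].append(bool(flag & UNMAPPED))
    # A pair stays in the plant set only if every mate is unmapped
    return sorted(n for n, flags in mate_unmapped.items() if all(flags))


# Toy paired-end records (hypothetical names, not real data)
toy_pairs = [
    "pairA\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",              # both mates unmapped
    "pairA\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "pairB\t73\tcontig1\t100\t60\t50M\t=\t100\t0\t*\t*",  # one mate maps
    "pairB\t133\tcontig1\t100\t0\t*\t=\t100\t0\t*\t*",
    "pairC\t99\tcontig1\t200\t60\t50M\t=\t350\t200\t*\t*",   # both mates map
    "pairC\t147\tcontig1\t350\t60\t50M\t=\t200\t-200\t*\t*",
]

print(plant_candidate_pairs(toy_pairs))
```

Dropping a pair as soon as one mate hits the fungus trades a few lost plant reads for a cleaner grass assembly; with a draft genome some fungal pairs will still get through.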

written 6 months ago by lieven.sterck

Fantastic! Thanks for all the help!

written 6 months ago by cwbenson1993