Question: 20% of unknown-source contamination RNA-Seq Illumina
crimsontabaq (Russia, Kazan) wrote 22 months ago:

We're working with a de novo transcriptome of a non-model organism. Some genome assemblies exist for it, but we decided not to use them for GE analysis because they aren't annotated and look suspicious in some ways. Still, we mapped the rRNA/mitRNA-filtered reads to each of them, and 7-20% of the reads did not map to any variant of the subject genome, so we suspect contamination. We also mapped the unaligned reads to the genomes of every other organism we work with, but there were no serious matches (0-1%).

What could it be? Is it definitely contamination of some kind, or could it be something else?

Should we still work with all the reads, or keep only those that align to a genome and assemble the de novo transcripts from them?

UPD: The organism is the basidiomycete fungus Lentinula edodes.
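
For reference, the unaligned share quoted in the question can be recomputed from an aligner's SAM output. The snippet below is only a minimal sketch, not the poster's actual pipeline: it assumes plain-text SAM input, and the function name `unmapped_fraction` and the toy records are hypothetical. It relies solely on the SAM FLAG field, where bit 0x4 marks an unmapped read and bits 0x100 and 0x800 mark secondary and supplementary alignments.

```python
def unmapped_fraction(sam_lines):
    """Return (unmapped, total) counts over primary SAM records."""
    unmapped = 0
    total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & 0x100 or flag & 0x800:
            continue  # secondary/supplementary: this read is counted elsewhere
        total += 1
        if flag & 0x4:
            unmapped += 1
    return unmapped, total


# Hypothetical toy records: two mapped reads, one unmapped, one secondary hit.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tcontig1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t16\tcontig2\t42\t60\t5M\t*\t0\t0\tTTGCA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGCC\tIIIII",
    "r1\t256\tcontig3\t7\t0\t5M\t*\t0\t0\t*\t*",
]
unmapped, total = unmapped_fraction(example)
print(f"{unmapped}/{total} reads unmapped ({100 * unmapped / total:.1f}%)")
```

With paired-end data each mate is its own record, so this counts reads rather than fragments.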

Michael Dondrup (Bergen, Norway) wrote 22 months ago:

You should give full details of your organism. I think it is too simplistic to treat a whole-body sample as a single organism. You are likely picking up a symbiont or a parasite. Use a metatranscriptomics approach, with BLAST against nt/nr and MEGAN, to find the source.


I've updated the OP. How do you BLAST reads? There are hundreds of millions of them, 30-45 bp long. Did you mean blasting the assembled contigs?

(reply written 22 months ago by crimsontabaq)

You can use one of the faster alternatives to BLAST and run it only on the leftover reads, but simply using the contigs should be good enough. In addition, you can use Kraken. Also consider that 80-93% aligned is quite OK for a non-model organism, depending on the completeness of the genomes.
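
If you do want to search raw reads rather than contigs, a small random subset of the leftover reads is usually enough to see what dominates. The following is a rough sketch under stated assumptions: the unaligned reads are already in a plain four-line-per-record FASTQ file, and the helper names (`read_fastq`, `sample_reads`, `to_fasta`) and the toy reads are made up for illustration. It reservoir-samples the reads in one pass and writes them as FASTA, which any BLAST-like tool can take as a query.

```python
import io
import random


def read_fastq(handle):
    """Yield (name, sequence) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        yield header.strip()[1:], seq


def sample_reads(records, k, seed=0):
    """Reservoir-sample up to k records in a single pass (Algorithm R)."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = rec
    return reservoir


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Hypothetical toy input standing in for a file of unaligned reads.
toy_fastq = "".join(
    f"@read{i}\n{'ACGT' * 8}\n+\n{'I' * 32}\n" for i in range(5)
)
subset = sample_reads(read_fastq(io.StringIO(toy_fastq)), k=3)
print(to_fasta(subset), end="")
```

For a real file, replace the `io.StringIO` object with `open("unaligned.fq")` (a placeholder file name) and raise `k` to something like a few thousand reads.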

(reply written 22 months ago by Michael Dondrup)

If this rate of unaligned reads is fine, am I supposed to use all reads (aligned and unaligned) in further work?

(reply written 21 months ago by crimsontabaq)
Brian Bushnell (Walnut Creek, USA) wrote 22 months ago:

I'd also suggest BBSketch as an alternative to BLAST. It's really fast, and can compare your reads to everything in nt or RefSeq:

sendsketch.sh in=reads.fq reads=1m nt

or

sendsketch.sh in=reads.fq reads=1m refseq
jrj.healey (United Kingdom) wrote 22 months ago:

Screen your reads with Kraken. If your organism is a microbe, it'll QC your data. If it isn't, Kraken might at least pick up some of the contaminants, if they are microbial.
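
Once Kraken has run, the quickest way to spot a contaminant is to rank taxa by their share of reads. Below is a hedged sketch of that step, assuming the standard six-column, tab-separated Kraken report (percentage of reads in the clade, reads in the clade, reads assigned directly, rank code, taxonomy ID, indented name). The function name `top_taxa` is hypothetical, and the toy report is invented for illustration, with placeholder names and taxonomy IDs rather than real results.

```python
def top_taxa(report_lines, rank="G", n=5):
    """Return the n largest taxa at a given rank code as (name, percent, reads)."""
    hits = []
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a report row
        percent, clade_reads, _direct, rank_code, taxid, name = fields[:6]
        if rank_code == rank:
            hits.append((int(clade_reads), float(percent), name.strip(), taxid))
    hits.sort(reverse=True)  # largest clades first
    return [(name, pct, reads) for reads, pct, name, _taxid in hits[:n]]


# Invented toy report; names and taxonomy IDs are placeholders.
toy_report = [
    " 20.00\t200\t200\tU\t0\tunclassified",
    " 80.00\t800\t0\t-\t1\troot",
    " 50.00\t500\t10\tG\t111\t      GenusA",
    " 25.00\t250\t5\tG\t222\t      GenusB",
    "  5.00\t50\t50\tS\t333\t        GenusB speciesX",
]
for name, pct, reads in top_taxa(toy_report):
    print(f"{name}\t{pct:.2f}%\t{reads} reads")
```

Passing `rank="S"` ranks species instead of genera. Report variants with extra columns would need the column unpacking adjusted.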
