Interpretation of sparse reads
14 months ago


I have alignment results in which some reads are sparse: a few reads (a couple to a dozen) in an otherwise uncovered region, as in this picture:

enter image description here

The target is a virus, so I expect low coverage, since the majority of the reads are human. The mapping quality is 10. I filtered the reads with blastn and blastx, and the hits are viral. However, the virus is dsDNA, so the reads might well be human sequence that was wrongly assigned.

Can I trust that these reads are truly viral? Or is it a case of "two reads in the sky don't make a summer", so to say?

Thank you

assembly next-gen interpretation alignment • 239 views
14 months ago
colindaven ★ 2.9k

Identifying viruses in NGS short read data is very, very tricky.

Is this RNA-seq or WGS or ... ?

  • Did you align against a combined human+virus database at the same time?
  • Or did you align subtractively: remove all human reads first, then align against the virus?
  • We have a pipeline for this kind of stuff which you might be interested in (it's mainly for bacteria, but we can find some viral signatures in the unmapped reads using a MQ of 30 - )
  • MQ10 is not very high or specific
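
To make the point about MQ10 concrete: in the SAM format, MAPQ is defined as a Phred-scaled probability that the mapping position is wrong. The short sketch below converts a MAPQ value to that nominal probability. The function name is made up for illustration, and real aligners compute MAPQ in different ways, so treat the numbers as rough guides rather than calibrated error rates.

```python
def mapq_to_error_probability(mapq: int) -> float:
    """Nominal probability that the mapping position is wrong.

    Uses the SAM definition MAPQ = -10 * log10(P(wrong position)).
    Aligners differ in how they estimate MAPQ, so this is approximate.
    """
    return 10 ** (-mapq / 10)


for mq in (10, 30):
    p = mapq_to_error_probability(mq)
    print(f"MQ{mq}: nominal P(wrong position) = {p:.3f}")
```

Under this definition, MQ10 means roughly a 1 in 10 chance that a read is misplaced, while MQ30 means roughly 1 in 1000, which is why a handful of MQ10 reads is weak evidence on its own.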

So the answer is: no one knows. Possible further steps:

  • qPCR on the same samples, or perhaps on the libraries, with specific primers to detect that virus?
  • More sequencing depth?
  • Amplicon sequencing after amplifying the virus?
  • Check the literature for the parameters used.

Hello, this is WGS; I aligned against a combined human+virus reference. I actually selected reads with MQ 30, but kept the MQ 10 reads visible in IGV to have a more thorough look. I will read about your pipeline, thank you.
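
For anyone wanting to compare how many reads survive at MQ 10 versus MQ 30 per reference, here is a minimal sketch in plain Python that tallies primary mapped reads from SAM text lines. It is only an illustration, not the pipeline mentioned above: the function name, the reference names `chr1` and `virus_genome`, and the example records are all hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_reads_by_reference(sam_lines: Iterable[str], min_mapq: int = 30) -> Counter:
    """Count primary mapped reads per reference with MAPQ >= min_mapq."""
    counts: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        if flag & 0x4 or rname == "*":  # unmapped read
            continue
        if flag & 0x900:  # secondary (0x100) or supplementary (0x800)
            continue
        if mapq >= min_mapq:
            counts[rname] += 1
    return counts


# Hypothetical records: one human read, two viral reads, one unmapped read.
seq, qual = "A" * 50, "I" * 50
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    f"r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t{seq}\t{qual}",
    f"r2\t0\tvirus_genome\t10\t10\t50M\t*\t0\t0\t{seq}\t{qual}",
    f"r3\t16\tvirus_genome\t200\t42\t50M\t*\t0\t0\t{seq}\t{qual}",
    f"r4\t4\t*\t0\t0\t*\t*\t0\t0\t{seq}\t{qual}",
]

print(count_reads_by_reference(example_sam, min_mapq=30))
print(count_reads_by_reference(example_sam, min_mapq=10))
```

If the viral count drops sharply when moving from MQ 10 to MQ 30, the low-quality reads are probably the ambiguous ones that map almost as well to human sequence.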

