20 months ago
Ales
After sequencing I got two FASTQ files with forward and reverse reads. I would like to know what percentage of the reads belong to the virus I'm interested in. I mapped the reads with bowtie2 to the reference genome I used in the assembly and got 2%. I then tried another method, a metagenomic analysis with Kaiju, and got 30%. Why are these numbers so different? Which method should I choose, and why?
Is this a metagenomic dataset? What do you mean by "the reference genome I used in the assembly"?

If you have a genome available for the virus you are interested in, then aligning the data to it is a reasonably good way of identifying those reads. Depending on where the sample comes from, aligning to just the viral genome can produce some misassignments (both false positives and false negatives).
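If it helps to check how the alignment percentage is being counted, here is a minimal Python sketch that works on the SAM output of the aligner. The function name `mapped_read_percentage` and the toy records are made up for illustration; the only thing taken from the SAM format itself is the meaning of the flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary). It counts each mate of a pair as one read and ignores secondary and supplementary records, so it may not match the figure your aligner prints if that figure is defined differently.

```python
def mapped_read_percentage(sam_lines):
    """Return the percentage of primary SAM records that are mapped.

    Header lines (starting with '@') and blank lines are skipped.
    Secondary (0x100) and supplementary (0x800) records are ignored
    so that each read is counted only once.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        if not flag & 0x4:  # bit 0x4 set means the read is unmapped
            mapped += 1
    return 100.0 * mapped / total if total else 0.0


# Toy records (hypothetical): one mapped pair, one unmapped pair,
# and one secondary alignment that must not be counted twice.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t99\tvirus\t1\t42\t4M\t=\t10\t13\tACGT\tIIII",
    "r1\t147\tvirus\t10\t42\t4M\t=\t1\t-13\tACGT\tIIII",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r2\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t355\tvirus\t50\t1\t4M\t=\t10\t-37\tACGT\tIIII",
]

print(mapped_read_percentage(toy_sam))
```

On real data you would pass an open SAM file (or the text output of `samtools view -h` on a BAM file) instead of the toy list.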