Hello All,
I have isolated a new organism and had its 16S rRNA gene sequenced. Based on 16S sequencing, its closest homolog is Acinetobacter baumannii, with 99% identity. I then did WGS, and based on WGS its closest homolog is Enterobacter cloacae. Why is there this difference? To confirm, I did some further tests:
1) I tried different assemblers, viz. SPAdes, IDBA, Ray, A6, MaSuRCA, Kiki and MEGAHIT. All the assemblers produced results, but when I BLASTed the assembled genome from each assembly against the 16S sequence (the one sequenced initially), only the SPAdes assembly shows 85% identity and 63% query cover. The same assembly shows 99% identity and 63% query cover with Enterobacter cloacae (the closest homolog according to WGS).
2) When I align the FASTQ reads with HISAT, using Enterobacter cloacae and Acinetobacter as the reference genomes, approximately 87.39% and 0.05% of the reads map, respectively.
Please let me know where I am going wrong. What should I do?
What do "do the WGS" and "based on WGS the closest homolog is" mean? You have to explain in more detail.
Maybe your isolate is not an isolate after all - I have seen this happen with a certain frequency. Try blobtools to explore the taxonomic composition of the putative isolate.
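Before setting up blobtools, a quick first look is the GC content of each contig, since Acinetobacter and Enterobacter genomes differ noticeably in GC content and a mixed sample often shows two groups of contigs. The following is only a rough sketch and not a replacement for blobtools; the function names are my own, and the two short sequences at the bottom are made-up toy data, not your assembly.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(seq):
    """GC percentage over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def gc_by_contig(handle, min_len=0):
    """Map contig name -> GC% for contigs of at least min_len bases."""
    return {
        name: round(gc_percent(seq), 1)
        for name, seq in read_fasta(handle)
        if len(seq) >= min_len
    }


# Toy input (hypothetical contigs); with real data, pass open("contigs.fasta").
demo = StringIO(">contig_1\nATATATATGC\n>contig_2\nGCGCGCGCAT\n")
result = gc_by_contig(demo)
for contig, gc in result.items():
    print(contig, gc)
```

If the long contigs fall into two clearly separated GC groups, that would point towards more than one organism in the sample; a single tight group does not prove purity, so blobtools is still worth running.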
WGS means whole genome sequencing and we constructed the species tree based on the whole genome sequence.
I am certain that the isolate I am sequencing is my isolate: when I grew it on a plate it was a pure culture, Gram staining also shows it is a pure culture, and moreover the ability for which it was isolated is still present.
aggsandy8, since the WGS data also contain the 16S sequence (to my understanding), did you try comparing the sequence from the 16S sequencing with the 16S sequence recovered from the WGS data?
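One way to make that comparison concrete is to BLAST the original 16S sequence against the assembled contigs and summarise identity and query cover per contig. The sketch below assumes blastn was run with the custom tabular format `-outfmt "6 qseqid sseqid pident length qstart qend qlen"`; the function names are my own, and the two hit lines are invented toy values, not results from this thread.

```python
from collections import defaultdict


def merged_length(intervals):
    """Total number of query positions covered by possibly overlapping intervals."""
    total, cur_start, cur_end = 0, None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def summarize_hits(lines):
    """Map contig -> (length-weighted % identity, % query cover)."""
    hsps = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Column order assumed: qseqid sseqid pident length qstart qend qlen
        _, sseqid, pident, length, qstart, qend, qlen = line.split("\t")
        lo, hi = sorted((int(qstart), int(qend)))
        hsps[sseqid].append((lo, hi, float(pident), int(length), int(qlen)))

    summary = {}
    for sseqid, rows in hsps.items():
        qlen = rows[0][4]
        covered = merged_length([(r[0], r[1]) for r in rows])
        aligned = sum(r[3] for r in rows)
        identity = sum(r[2] * r[3] for r in rows) / aligned
        summary[sseqid] = (round(identity, 2), round(100.0 * covered / qlen, 1))
    return summary


# Toy BLAST lines (hypothetical): one 1500 bp query with two overlapping HSPs.
demo_lines = [
    "my16S\tcontig_7\t99.0\t900\t1\t900\t1500",
    "my16S\tcontig_7\t98.0\t145\t801\t945\t1500",
]
summary = summarize_hits(demo_lines)
for contig, (identity, cover) in summary.items():
    print(contig, identity, cover)
```

Overlapping HSPs are merged before computing query cover so that the same stretch of the 16S sequence is not counted twice. A query cover well below 100% would suggest that the 16S gene is fragmented or only partly assembled in that contig, which is common for rRNA operons in short-read assemblies.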
Please select a more descriptive title for your question. "Problem with WGS" does not tell users who can help you what your problem is about.
Something like "difference between closest homolog based on 16S and WGS" would be much more informative.