I have a question about which host reference to use. I know that for nearly all of the public reference genomes, Hg38 or older was used. The newest T2T assembly is supposed to be nearly complete, and is recommended by many sites.
So if I take my human sample and remove the reads matching T2T, that should give me a host-free set of reads?
My confusion is this: if I map the host-removed reads to a reference and then check them with NCBI BLAST, I may get one or several human hits that match better than the reference I mapped to. I often get, say, human at 99% while the reference I mapped to is at 98.5%; many times they match identically at, say, 99+%. My assumption is that since BLAST uses Hg38, human hits on BLAST should be ignored? Should I just go with the T2T host-removed reads?
I have previously chopped up my reference sequence into fragments (50, 75, 100, 150 bp) and mapped them against Hg38 and T2T. Hg38 mapped about 5,000 sequences and T2T mapped about 10,000 sequences.
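For anyone who wants to repeat the fragment test above, here is a minimal sketch of the chopping step in Python. The function names, the FASTA header format, and the default of non-overlapping tiles are assumptions for illustration; the post does not say how the pieces were actually cut.

```python
def tile_sequence(seq, frag_len, step=None):
    """Yield (start, fragment) pairs, each fragment exactly frag_len bases.

    step defaults to frag_len (non-overlapping tiles). That default is an
    assumption; a smaller step gives overlapping fragments.
    """
    if step is None:
        step = frag_len
    # A trailing piece shorter than frag_len is dropped.
    for start in range(0, len(seq) - frag_len + 1, step):
        yield start, seq[start:start + frag_len]


def fragments_to_fasta(name, seq, frag_lengths=(50, 75, 100, 150), step=None):
    """Return FASTA text with tiles of every requested length."""
    lines = []
    for frag_len in frag_lengths:
        for start, frag in tile_sequence(seq, frag_len, step):
            # Hypothetical header format: <name>_<length>bp_<0-based start>
            lines.append(f">{name}_{frag_len}bp_{start}")
            lines.append(frag)
    return "\n".join(lines) + "\n"
```

The resulting FASTA can then be mapped against each host reference separately, and the number of mapped fragments compared per fragment length.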
Hi GenoMax, I saw what you posted but couldn't find how to respond to your comment. I understand the point you are making, that something needs to be done for treatment, though it's not as simple as that. #1, I had to confirm that Naegleria fowleri was present, which I did. Then I needed to separate those reads from the rest of the samples. I have 16S samples that also show a high bacterial load on top of the Naegleria fowleri. If that load is real, then these bacterial species need to be treated as well, with the correct antibiotics. I already know that portions of this Naegleria fowleri genome match numerous bacterial and other genomes, and I have read that Naegleria fowleri also appears to show up in 16S results. Other parasites, such as malaria, also promote bacteremia, as the immune system is muted, allowing bacteria to proliferate. My assumption is that if Naegleria fowleri is treated, the immune system may suddenly kick back in, so they must be treated together.
After thinking about your post for some time I had posted a comment, which I then removed. On second thought I did not think it was related to bioinformatics, but it looks like you did read it.
What does this mean? You have 16S amplicons independently prepared from the same samples? What does "high bacterial load" mean?
What number (and % of reads) remain after all the steps you have gone through? In NGS experiments there tends to be a certain (small) % of reads remaining that can't be easily explained.
Going to make an exception and say the following. Are you a physician able to make this decision as a part of the treatment? If not, the person in that role should be making this decision, right?
Yes, I read it, I read everything I can.
I don't have 16S on all of the samples that I have NGS for, but I have 2 that went through several different 16S analyses. Before I went to the NGS samples, I did a fair number of 16S samples: V1-V2, V1-V3, V3-V4, V7, and PacBio full length. As for high bacterial load: for the whole blood samples, the 16S results showed high levels of pathogenic bacteria as well as numerous nitrogen-fixing bacteria. Off the top of my head, for clinical blood samples I believe 100 reads/M is the cutoff for a positive result, so we are talking well in excess of that. But I know some of the hits in the 16S samples are actually Naegleria fowleri Karachi NF001. For example, Plasmodium ovale was showing up in some of the samples, as well as some Mycobacterium, both of which align with portions of the NF001 genome. I'm going to re-examine the 16S samples after I finish the NGS samples.
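To make the reads/M comparison concrete, here is a small sketch of the arithmetic. The 100 reads per million cutoff is only the figure recalled above, not a verified clinical standard, and the function names are hypothetical.

```python
def reads_per_million(taxon_reads, total_reads):
    """Normalize a taxon's read count to reads per million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return taxon_reads * 1_000_000 / total_reads


def exceeds_cutoff(taxon_reads, total_reads, cutoff_rpm=100.0):
    """True if the taxon reaches the cutoff.

    cutoff_rpm=100 is the value recalled in the post, treated here as an
    assumption rather than an established threshold.
    """
    return reads_per_million(taxon_reads, total_reads) >= cutoff_rpm
```

For example, 250 reads assigned to a taxon out of 2,000,000 total reads works out to 125 reads/M, which would pass a 100 reads/M cutoff.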
Previously I was using several of the online bioinformatics platforms, with the assumption that if they identified various species, I should be able to map the same species to the sample. The problem was that I wasn't able to reproduce via mapping and assembly what they were identifying. I suspect that if I run the samples through the online platforms after removing the reads matching the NF001 genome, they will show whatever bacteria are present.
Obviously a doctor will make the final decisions. #1 is that what I'm finding hasn't been reported before. It is considered not contagious, yet I have 4 humans and 2 dogs with the same NF001, showing it's contagious between humans and animals. Death is 97% within 2 weeks, yet subject #1 is at 2.5 years. Whole blood is not tested, as the consensus is that counts are too low in blood to be statistically valid for diagnosis, yet the counts I'm finding are MASSIVE.
As for the bacteria issue, I have never seen anything reported regarding Naegleria fowleri/bacteria co-infection. But amoebas do have similarities, and since most people die so fast, they probably didn't look too far. The Plasmodium species that cause malaria are known to cause bacteremia, as the parasite essentially shuts down the immune system, allowing the bacteria to proliferate. Though it's also not been documented, it's highly unlikely that you could do a blood culture for bacteria in the presence of Naegleria fowleri, as their primary food source is bacteria, and Naegleria fowleri is the dominant species by far.
I'm just doing the actual work to determine what is present. It's highly unlikely that I'll find a doctor to do anything with this, as it's way beyond their knowledge, even for infectious disease doctors, since there are only a handful of cases in the USA.
The CDC is the main provider of Naegleria fowleri analysis, for which I believe the standard is PCR targeting the ITS1/5.8S/ITS2 region. But I'm unsure whether that test is valid for this strain. I'm assuming they have a standard reference, which in general would have coverage near 100% at near 100% match. I found a utility online that provided a blastn link for the PCR target matched against all of the Naegleria fowleri strains; for the NF001 strain I believe it was 66% coverage at 100% match, showing this strain was completely missing one of the targets. I doubt that 66% coverage would give a positive result, but I could be wrong.
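For reference, a coverage figure like that 66% can be recomputed from the query start/end coordinates of the BLAST hits. This is a minimal sketch that merges overlapping hit intervals on the query; the function name and the input format (1-based inclusive coordinate pairs, as in the qstart/qend columns of blastn tabular output) are assumptions, and it is not necessarily how the online utility computes its number.

```python
def query_coverage(hsps, query_len):
    """Percent of the query covered by at least one hit.

    hsps: iterable of (qstart, qend) pairs, 1-based inclusive.
    Overlapping hits are merged so no base is counted twice.
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    intervals = sorted((min(s, e), max(s, e)) for s, e in hsps)
    covered = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # Close the previous merged interval and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps the current interval: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / query_len
```

If one of three similarly sized target segments has no hit at all, the merged coverage lands near two thirds even when every aligned base matches at 100%.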
In essence, as to your question of whether this is related to bioinformatics: it sure looks like it to me.
I haven't quite gotten to my remaining reads yet. Rather than using more aggressive mapping to pull the Naegleria fowleri reads, I'm concentrating only on reads that match NF001 perfectly. Anything other than a perfect match has a high chance of being human, and then the assembler tries to assemble NF001 and human together. So my unknown reads are those that do not map perfectly to NF001, Hg38, or T2T, and that do not map to T2T with HISAT2 at default settings. I'm running the unknown reads through metaSPAdes with error correction, and then I'm going to separate out the perfect reads again. This should give me a dataset nearly clean of NF001, with most of the human reads removed.
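One way to express the "perfect match only" filter is on the SAM output of the mapper. The sketch below is an assumption about how it could be done, not the exact procedure used here: it treats a record as perfect when the read is mapped, the CIGAR is a single full-length match, and the NM:i edit-distance tag is 0. It assumes the aligner writes NM:i (HISAT2 and Bowtie2 do); records without the tag are counted as not perfect. The function names are hypothetical.

```python
import re

# A single M or = operation spanning the whole read, e.g. "150M".
_FULL_MATCH_CIGAR = re.compile(r"^(\d+)[M=]$")


def is_perfect_alignment(sam_line):
    """True if a SAM record is mapped end to end with edit distance 0."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return False  # not a complete alignment record
    flag, cigar, seq = int(fields[1]), fields[5], fields[9]
    if flag & 0x4:  # 0x4 = read unmapped
        return False
    match = _FULL_MATCH_CIGAR.match(cigar)
    if match is None:  # clipping, indels, or splicing present
        return False
    if seq != "*" and int(match.group(1)) != len(seq):
        return False
    return "NM:i:0" in fields[11:]


def perfect_read_names(sam_lines):
    """Names of reads with at least one perfect alignment record."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        if is_perfect_alignment(line):
            names.add(line.split("\t", 1)[0])
    return names
```

With paired-end data both mates share a name, so this keeps a pair if either mate is perfect; requiring both mates to be perfect would be a stricter variant.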