Should we perform a wet-lab experiment such as PCR to confirm a bioinformatics analysis?
9.4 years ago
Yang Li ▴ 70

Hi,

Using an in-house program that I developed, the following information can be extracted from the NGS results:

ref_gi | species | genus | %coverage | reads_hit

Now the results suggest a virus with more than 90% coverage. I then assembled the reads guided by the reference and obtained a nearly complete genome of this virus (confirmed by BLAST). My colleagues suggested that I perform a PCR.

The table above also shows 40%-60% coverage for several candidate viruses. For those, I think I need to perform a PCR to confirm them. But why do I feel no need to confirm the virus with high coverage? The problem of performing a PCR to confirm a bioinformatics analysis sounds like statistical significance vs. biological significance.
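For concreteness, "coverage" here means breadth of coverage: the fraction of the reference genome touched by at least one aligned read. A minimal sketch of that computation (a hypothetical interval-based model, not the poster's actual program) using only the Python standard library:

```python
# Sketch: compute coverage breadth (% of a reference genome covered by at
# least one read) from a list of read alignment intervals, by merging
# overlapping intervals. Pure stdlib; toy data, not real alignments.

def coverage_breadth(genome_length, alignments):
    """alignments: list of (start, end) 0-based half-open intervals on the reference."""
    if genome_length <= 0:
        return 0.0
    covered = 0
    current_start = current_end = None
    for start, end in sorted(alignments):
        if current_end is None or start > current_end:
            # New disjoint region: bank the previous merged region first.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent read: extend the current region.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_length

# Toy example: a 1000 bp reference with three reads.
reads = [(0, 400), (350, 700), (900, 1000)]
print(coverage_breadth(1000, reads))  # 80.0
```

A 90%+ breadth from many independent reads is much harder to explain away than a 40-60% breadth, which is one way to frame the triage between the high-coverage hit and the mid-coverage candidates.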

I cannot answer this question myself. If we need to perform a PCR to confirm the bioinformatics analysis every time, then NGS merely provides clues for downstream wet-lab experiments. With a PCR result I would have more confidence; can I have enough confidence in the analysis without one?

Any help will be highly appreciated.

Yang Li

next-gen • Metagenomics

Because of my job, I know that many clinical samples cannot be characterized. One case that happened to me: an infant suffered from diarrhea of unknown cause. Colleagues here finally identified a sapovirus infection after seven days. During those seven days, the doctors used every method available to keep the baby alive, at heavy cost to the child. I asked myself: can we find the cause more quickly?

NGS provides an alternative method for virus identification and classification. The aim of the pipeline is that, with a single command, taxonomy and genome-coverage information can be obtained from clinical metagenomic data. Knowing what is present in the sample is the basis for all downstream analysis. However, I could not find an existing tool for clinical metagenomics data that reports coverage information, which I consider a very important parameter, so I decided to develop one. After finishing it, I used several public datasets to test it and optimized its parameters based on the results. It finally works.
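The report step of such a pipeline could be sketched as below. Everything here is a hypothetical illustration (the function name, the toy inputs); a real pipeline would derive the alignment intervals from an aligner such as BWA or Bowtie rather than hard-coded tuples.

```python
# Sketch: aggregate per-read reference hits into a tab-separated table with
# the columns described in the post: ref_gi, species, genus, %coverage,
# reads_hit. All names and inputs are made up for illustration.
from collections import defaultdict

def make_report(hits, ref_info, ref_lengths):
    """hits: list of (ref_gi, start, end) alignments, 0-based half-open.
    ref_info: ref_gi -> (species, genus); ref_lengths: ref_gi -> genome length."""
    by_ref = defaultdict(list)
    for gi, start, end in hits:
        by_ref[gi].append((start, end))
    rows = []
    for gi, intervals in by_ref.items():
        # Merge overlapping intervals to get breadth of coverage.
        covered, cur_s, cur_e = 0, None, None
        for s, e in sorted(intervals):
            if cur_e is None or s > cur_e:
                if cur_e is not None:
                    covered += cur_e - cur_s
                cur_s, cur_e = s, e
            else:
                cur_e = max(cur_e, e)
        covered += cur_e - cur_s
        species, genus = ref_info[gi]
        pct = 100.0 * covered / ref_lengths[gi]
        rows.append((gi, species, genus, round(pct, 1), len(intervals)))
    rows.sort(key=lambda r: -r[3])  # highest-coverage candidates first
    return rows

# Toy demonstration with two made-up references.
hits = [("gi1", 0, 50), ("gi1", 40, 100), ("gi2", 0, 30)]
info = {"gi1": ("VirusA", "GenusA"), "gi2": ("VirusB", "GenusB")}
lengths = {"gi1": 100, "gi2": 100}
for row in make_report(hits, info, lengths):
    print("\t".join(map(str, row)))
```

Sorting by breadth puts the 90%+ hits at the top of the report, which matches the triage described in the question.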

Your words taught me a lesson: "The more skeptical you are of your own results early on, the easier it ends up being to convince others (including reviewers) later." I will figure out what to do next. You have my thanks.


Very interesting. I certainly wish you luck with this; it has the potential to be very helpful!


Thank you. I will release it on GitHub when it is ready, and I would appreciate your comments.

9.4 years ago

The answer depends on what you did before and what you want to do with the results later. As a general principle, for a paper you want multiple lines of evidence from different methods that all point toward the same general conclusion. If you read any molecular biology paper, you'll see they typically perform a half dozen or more loosely related experiments with different biases to show that protein X interacts with virus Y (or whatever). So if the NGS results follow up on a number of other lines of evidence, all indicating the presence of the virus in your samples, then doing PCR (or a Southern or Northern, depending on the virus and whether you're old-school) is probably not necessary.

However, if you don't have other lines of evidence supporting the presence of the virus, then you have to worry that it is just contamination introduced at some step of your library prep (this is an increasingly acknowledged issue; it affects basically everyone at this point). In that case, an independent wet-bench experiment can at least confirm that if the virus is a random contaminant, the contamination must have happened during sample collection or storage.

One of the sayings that I learned in grad school is that you always need to be your most critical reviewer. The more skeptical you are of your own results early on, the easier it ends up being to convince others (including reviewers) later.

9.4 years ago
Ram 43k

"With an in-house program developed by myself" can potentially be a huge mistake. Always go for trusted tools and best practices, and that will ensure the bioinformatics analyses you perform are not merely initial filters for downstream wet bench work. Used properly, computational biology will save you tons of resources. Relying on tools that have not been tested/validated will lead you down a dangerous path.


Thank you for the suggestions. The program, or rather the pipeline, consists of Perl/Shell/Python scripts that tie open-source tools together. Several public datasets have been used to test it. As Ryan said, "The more skeptical you are of your own results early on, the easier it ends up being to convince others (including reviewers) later." I will test it with more public clinical metagenomics data and make it more helpful.

