Question

Viral Genome reconstruction

4.1 years ago

Carla • 0

Hi! I have some reads from sequencing a viral genome. We are analyzing the reads that have two matches: we calculate the distance between the two matches and plot histograms of these distances. How can I tell whether the regions that do not match (cuts) are due to chance or not?

lectures genome virus sequencing matches • 1.0k views

ADD COMMENT • link 4.1 years ago by Carla • 0

Now, that is an extension of the other question. I think you need to explain a bit more about the background. What do you mean by regions that do not match (cuts), and what do you mean by "chance or not"? Evolution involves some random variation plus selection.

If only a single event is observed, no conclusions can be drawn about its probability. Take the existence of life on planets. As of now, we know of exactly one planet with life. We cannot calculate the probability that life exists on other planets from that single observation; the observation itself is a given, but it tells us nothing about the others.

The same applies to your example: it all depends on the background distribution.

ADD REPLY • link 4.1 years ago by Michael 56k

Hi! We consider the regions that do not match to be places where the polymerase has not been able to act. We therefore think it is possible that the fragments resulting from replication are either rejoined at random or rejoined in the original order without the regions that are lost.

ADD REPLY • link 4.1 years ago by Carla • 0

Ok, so now a polymerase has been added to the picture (which one? Polymerases can have length limitations and biases of their own). It's still not very clear to me, though. You should probably focus first on formulating the biological question properly and describing possible experiments to carry out.

After that, one could still start thinking from scratch about an appropriate statistical or bioinformatics framework.

ADD REPLY • link 4.1 years ago by Michael 56k

Hi! It is about Avibirnavirus, which has its own polymerase, called VP1. It seems that the areas that are not replicated are those bound to another viral protein, VP3, which has a protective action on the genome. VP3 requires a minimum size of 9 bp to bind.

Initially we ran BLAST on the sequencing reads, and we are currently working with the reads that were discarded after that BLAST step. Of these discarded reads, we use only those that had two matches, as I mentioned previously. We calculated the distance between the two matches and made histograms of these distances. As you noted, they show a linearly decreasing density, with a maximum at 0 and a minimum at the length of the genome.
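A linearly decreasing density of this kind is what one would expect if the two match positions were independent and uniformly distributed along the genome. The short derivation below makes that explicit. The symbols X, Y, D and L are notation introduced here for illustration, and the uniform, independent placement is an assumption, not something established in this thread.

```latex
% Assumption: X, Y are independent and uniform on [0, L], with L the genome length.
% D = |X - Y| is the distance between the two matches.
\begin{align*}
  P(D > d) &= \frac{(L - d)^2}{L^2}, \qquad 0 \le d \le L, \\
  F(d) &= P(D \le d) = 1 - \left(1 - \frac{d}{L}\right)^2, \\
  f(d) &= F'(d) = \frac{2\,(L - d)}{L^2}, \\
  \mathbb{E}[D] &= \int_0^L d \, f(d)\,\mathrm{d}d = \frac{L}{3}.
\end{align*}
```

Under this assumption the density is highest at d = 0, where it equals 2/L, and falls linearly to zero at d = L.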

However, as the multiplicity of infection of this virus increases, the distribution becomes irregular. For this reason, we think it is possible that the genome is reorganized at random after replication.
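One way to put a number on "irregular" is to compare the observed distances with an explicit background distribution. The sketch below assumes a null model in which the two match positions are independent and uniform along a genome of length L, so the distance has the CDF F(d) = 1 - (1 - d/L)^2. It then computes a one-sample Kolmogorov-Smirnov statistic with an asymptotic p-value. The genome length of 3000 bp, the function names and the simulated samples are all hypothetical placeholders, and the null model itself is an assumption that ignores read length and alignment constraints.

```python
import math
import random


def null_cdf(d, genome_length):
    """CDF of |X - Y| for X, Y independent and uniform on [0, genome_length]."""
    u = min(max(d / genome_length, 0.0), 1.0)
    return 1.0 - (1.0 - u) ** 2


def ks_statistic(distances, genome_length):
    """One-sample Kolmogorov-Smirnov statistic against the assumed null CDF."""
    xs = sorted(distances)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        f = null_cdf(x, genome_length)
        # Largest gap between the empirical step function and the null CDF.
        d_max = max(d_max, i / n - f, f - (i - 1) / n)
    return d_max


def ks_pvalue(d_stat, n, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (intended for large n)."""
    lam = math.sqrt(n) * d_stat
    if lam < 0.2:
        return 1.0  # the p-value is indistinguishable from 1 in this range
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


random.seed(0)
GENOME_LENGTH = 3000  # hypothetical length in bp, for illustration only

# Simulated distances that follow the assumed null model.
null_sample = [
    abs(random.uniform(0, GENOME_LENGTH) - random.uniform(0, GENOME_LENGTH))
    for _ in range(2000)
]
# Simulated distances that do not follow it (flat instead of linearly decreasing).
flat_sample = [random.uniform(0, GENOME_LENGTH) for _ in range(2000)]

for name, sample in [("null-like", null_sample), ("flat", flat_sample)]:
    d = ks_statistic(sample, GENOME_LENGTH)
    print(name, round(d, 3), ks_pvalue(d, len(sample)))
```

With real data, the simulated lists would be replaced by the observed distances for each multiplicity of infection. A small p-value would only say that the distances depart from this particular null model. It would not by itself say which biological mechanism is responsible.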

ADD REPLY • link 4.1 years ago by Carla • 0