Question

Checking whether sequences share the same origin

8 days ago

QX ▴ 70

Hi all,

I have a very diverse library of sequencing data. However, because of the nature of the sequencing machine, it is hard to say whether a pair of sequences come from the same original sequence (and differ only through technical error) or are actually different because of the (random) design.

One can set a threshold on the Hamming distance, so that a pair differing by less than, let's say, 2 is treated as just the result of technical error. But from my understanding, differences in some bases can also be due to mutation during culture (biological impact) or many other factors. Also, setting a threshold is subjective.
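To make the threshold idea concrete, here is a minimal sketch of what I mean. The function names (`hamming`, `close_pairs`), the cutoff `max_dist=1` (i.e. a distance smaller than 2), and the toy reads are all made up for illustration, and it assumes the sequences have equal length.

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def close_pairs(seqs, max_dist=1):
    """Return (i, j, distance) for every pair with at most max_dist mismatches."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(seqs), 2):
        d = hamming(a, b)
        if d <= max_dist:
            hits.append((i, j, d))
    return hits

# Toy reads, invented for this example only.
reads = ["ACGTACGT", "ACGTACGA", "TTGTACGG", "ACGTACGT"]
print(close_pairs(reads))
```

This compares every pair, so it only scales to small sets, and the choice of `max_dist` is exactly the subjective part I am asking about.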

Can anyone suggest some approaches, or any model that takes some parameters, to check this?

sequencing • 315 views

updated 8 days ago by GenoMax 152k • written 8 days ago by QX ▴ 70

score 0 · Answer 1 · 2025-06-12

8 days ago

GenoMax 152k

Without UMIs in the data, it would be difficult to say with certainty whether a pair of sequences derive from the same original fragment, come from two identical independent fragments in the library, or differ because the sequencer made an error (or more).
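As a rough illustration of why UMIs help, the sketch below labels a pair of reads using both the UMI and the sequence. It is only a toy under stated assumptions: each read is a hypothetical `(umi, sequence)` tuple, the labels and the `max_dist=1` cutoff are invented for the example, and errors within the UMI itself are ignored.

```python
def classify_pair(read_a, read_b, max_dist=1):
    """Label a pair of (umi, sequence) reads; labels are illustrative only."""
    (umi_a, seq_a), (umi_b, seq_b) = read_a, read_b
    if len(seq_a) != len(seq_b):
        return "different"
    mismatches = sum(x != y for x, y in zip(seq_a, seq_b))
    if mismatches > max_dist:
        return "different"
    if umi_a == umi_b:
        # Same UMI and near-identical sequence: plausibly one original
        # fragment, with any mismatch attributable to technical error.
        return "same_fragment"
    # Near-identical sequence but different UMIs: plausibly separate
    # fragments that happen to carry the same (or a similar) sequence.
    return "independent_fragments"

# Toy reads, invented for this example only.
a = ("AACCGG", "ACGTACGT")
b = ("AACCGG", "ACGTACGA")
c = ("TTGGCC", "ACGTACGT")
print(classify_pair(a, b), classify_pair(a, c))
```

Without the UMI column, the pairs `(a, b)` and `(a, c)` cannot be told apart from their sequences alone.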
