I have a few questions regarding mate-pair and paired-end sequences:
- Does one have to know the exact length of the insert for paired-end or mate-pair sequences to be useful? (I'm not sure whether "insert length" is the proper term for mate-pairs, but it seems to be the same concept as insert length in paired-end reads, unless I am mistaken.)
- From what I have read, people usually obtain the insert length by aligning the paired-end reads to a reference genome. Doesn't that defeat the purpose of paired-end sequencing, since to align to a reference genome you generally have to treat each pair as two single reads? Or do you also have some prior knowledge of the *approximate* insert length (in which case I can see the usefulness)?
- How is this done in de novo sequencing, when you don't even have a reference sequence?
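To make the second question concrete, here is a rough sketch of how I understand the reference-based estimate to work. It assumes each mate has already been mapped independently as a single read and that the coordinates are 0-based and half-open. The function name `estimate_insert_size`, the tuple layout, and the `max_span` cutoff are all made up for illustration and are not taken from any particular tool. It also measures the outer distance between the two mates; whether "insert length" should include the reads themselves seems to vary between sources.

```python
from statistics import median


def estimate_insert_size(pairs, max_span=10_000):
    """Estimate the typical outer distance between mates.

    pairs: iterable of (ref1, start1, end1, ref2, start2, end2), one tuple
    per read pair, with each mate mapped on its own as a single read.
    max_span is a hypothetical cutoff that drops pairs likely to be
    mis-mapped or chimeric.
    Returns (median span, median absolute deviation).
    """
    spans = []
    for ref1, start1, end1, ref2, start2, end2 in pairs:
        if ref1 != ref2:
            continue  # mates on different sequences carry no distance information
        span = max(end1, end2) - min(start1, start2)
        if 0 < span <= max_span:
            spans.append(span)
    if not spans:
        raise ValueError("no usable pairs")
    med = median(spans)
    mad = median(abs(s - med) for s in spans)
    return med, mad


# Toy data, invented for illustration only.
example_pairs = [
    ("chr1", 100, 200, "chr1", 400, 500),
    ("chr1", 1000, 1100, "chr1", 1310, 1410),
    ("chr1", 50, 150, "chr1", 340, 440),
    ("chr1", 0, 100, "chr2", 300, 400),      # different reference, skipped
    ("chr1", 0, 100, "chr1", 50000, 50100),  # exceeds max_span, skipped
]
print(*estimate_insert_size(example_pairs))  # prints: 400 10
```

If this picture is right, the pairing information is not used during the initial mapping at all; it only becomes useful afterwards, once the distribution has been estimated.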
Thanks a bunch!