Question: Fragment size and insert size
4 days ago by CY, United States
CY wrote:

I am aware that the fragment size depends on the strength of sonication. The insert size is the length of actual genomic sequence without adapter.

My confusion is: what are the considerations when deciding the fragment size? Say we have a target region to be sequenced. What is the difference between making the fragment size 200 and making it 400?

Besides, assuming we apply paired-end sequencing with a read length of 150bp, an insert size of 400 leaves an inner distance of 100, while an insert size of 200 makes the paired-end reads overlap each other by 100bp (I guess we don't want the insert size < read length, right?). How would these two options differ? Is there a preference or consideration regarding whether the paired-end reads should overlap? Really appreciate any comments :)

modified 18 hours ago by d-cameron • written 4 days ago by CY

Can anyone share some comments? Thanks

written 3 days ago by CY
18 hours ago by d-cameron, Australia
d-cameron wrote:

If your fragments are so short that you are sequencing adapters, then you are wasting sequencing. A non-trivial portion of your fragments will be < 150bp if you size select for 200bp fragments.
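To put a rough number on "non-trivial", here is a minimal sketch that assumes fragment lengths are normally distributed around the size-selection target. The standard deviation of 50bp and the function name `fraction_shorter_than` are assumptions for illustration only; real libraries usually have skewed size distributions, so treat the result as an order-of-magnitude guide.

```python
from statistics import NormalDist


def fraction_shorter_than(read_len, mean_fragment, sd_fragment):
    """Fraction of fragments shorter than read_len under a normal model.

    The normal model and the chosen SD are assumptions, not measured values.
    """
    return NormalDist(mu=mean_fragment, sigma=sd_fragment).cdf(read_len)


for mean in (200, 400):
    frac = fraction_shorter_than(read_len=150, mean_fragment=mean, sd_fragment=50)
    print(f"mean {mean}bp: {frac:.2%} of fragments shorter than 150bp")
```

Under these assumed numbers roughly 16% of a 200bp library would be shorter than the read length, while for a 400bp library the fraction is negligible.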

For SNV/indel calling, overlapping reads reduce your effective sequencing depth. Overlapping reads can be used for error correction when the two reads from the fragment disagree, but unless your SNV caller counts fragments instead of reads, it will double-count overlapping reads (two reads from the same fragment represent a single sampling, not two independent samplings).
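The difference between counting reads and counting fragments can be sketched as follows. This is a toy model, not how any particular caller works: the input is a hypothetical list of `(fragment_id, base)` observations at one position, and fragments whose two mates disagree are simply dropped.

```python
from collections import Counter


def count_alleles(observations, by_fragment=True):
    """Count bases at one position from (fragment_id, base) tuples.

    by_fragment=False counts every read, so overlapping mates count twice.
    by_fragment=True counts each fragment once and drops fragments whose
    mates disagree (a simplification chosen for this sketch).
    """
    if not by_fragment:
        return Counter(base for _, base in observations)

    bases_per_fragment = {}
    for fragment_id, base in observations:
        bases_per_fragment.setdefault(fragment_id, set()).add(base)

    counts = Counter()
    for bases in bases_per_fragment.values():
        if len(bases) == 1:  # mates agree, or only one mate covers the site
            counts[next(iter(bases))] += 1
    return counts


# f1: both mates say A; f2: one mate says G; f3: mates disagree
obs = [("f1", "A"), ("f1", "A"), ("f2", "G"), ("f3", "A"), ("f3", "G")]
print(count_alleles(obs, by_fragment=False))
print(count_alleles(obs, by_fragment=True))
```

Counting reads gives five observations here, while counting fragments gives two independent ones, which is the effective depth the caller should be reasoning about.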

For structural variant (SV) calling, fragment size is extremely important. Increasing the fragment size increases the likelihood that a fragment will span across a SV breakpoint. 2x150bp with a 200bp median fragment length will have no read pair signal left at all.
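A simplified way to see this: assume read-pair evidence for a breakpoint requires the two mates to lie entirely on opposite sides of it, i.e. the breakpoint falls in the unsequenced gap between them. The sketch below counts, among all fragment placements that contain the breakpoint, how many satisfy that condition. The function name and the model are assumptions for illustration; real SV callers also use split reads and need enough anchored bases to map each mate.

```python
def read_pair_spanning_fraction(fragment_len, read_len):
    """Fraction of breakpoint-containing fragments with the breakpoint
    strictly between the two mates (simplified read-pair model)."""
    # Start offsets that place the breakpoint somewhere inside the fragment
    containing = fragment_len - 1
    # Start offsets where read 1 ends before and read 2 starts after the breakpoint
    in_gap = max(0, fragment_len - 2 * read_len + 1)
    return in_gap / containing


for fragment_len in (200, 400):
    frac = read_pair_spanning_fraction(fragment_len, read_len=150)
    print(f"{fragment_len}bp fragments, 2x150bp: {frac:.1%} give read-pair signal")
```

With 200bp fragments the mates overlap, so under this model no placement leaves the breakpoint between them; with 400bp fragments about a quarter of the fragments containing the breakpoint do.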

If I had a choice between 200bp and 400bp for 2x150bp sequencing, I would choose the 400bp option unless there was a specific experimental design reason for shorter fragments.

NB: different tools/papers use different terminology for 'insert size' and 'fragment size'. Insert size can exclude the read bases (thus being negative if the fragment is shorter than twice the read length), and fragment size may or may not include the adapters in the definition.
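The arithmetic behind these definitions can be sketched in a few lines. Here `fragment_len` means the genomic sequence between the adapters (adapters excluded), which is only one of the conventions mentioned above; the function and key names are made up for this example.

```python
def pair_geometry(fragment_len, read_len):
    """Geometry of one paired-end fragment; fragment_len excludes adapters."""
    return {
        # Unsequenced bases between the mates; negative when the mates overlap
        "inner_distance": fragment_len - 2 * read_len,
        # Bases covered by both mates
        "overlap": max(0, 2 * min(read_len, fragment_len) - fragment_len),
        # Bases of each read that run past the fragment into the adapter
        "adapter_bases_per_read": max(0, read_len - fragment_len),
    }


for fragment_len in (120, 200, 400):
    print(fragment_len, pair_geometry(fragment_len, read_len=150))
```

For 2x150bp sequencing, a 400bp fragment leaves 100 unsequenced bases between the mates, a 200bp fragment gives an inner distance of -100 (a 100bp overlap), and a 120bp fragment is covered completely by both reads, each of which then runs 30 bases into the adapter.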

written 18 hours ago by d-cameron

I guess most SNV callers only count the base once even when paired reads overlap at this position, right?

written 15 hours ago by CY

Also, do you think it is always good to merge paired-end reads if they overlap? Tools like PEAR have this function. By merging them, we can call larger indels. Besides, I can't think of any advantage of not merging them.

written 15 hours ago by CY

I can't think of any advantage of not merging them

Me too, this is why I started a thread about it some time ago.

fin swimmer

written 14 hours ago by finswimmer