I have a DNA library encoding scFv antibody genes, each consisting of a VH gene (~380 bp) + a linker sequence (54 bp) + a VL gene (~380 bp). The library contains about 1M unique antibodies. We've performed several iterative rounds of selection on the library, so the final sample has an expected diversity of about 500-5,000 unique antibodies. We want to sequence every round of selection, starting with the unselected diverse library, and use the enrichment of unique sequences across rounds to inform future experiments.
Previously, our libraries consisted of VH genes only, and 2x250 bp NovaSeq runs worked really nicely, giving us the coverage and depth we needed, especially in the early rounds of selection when diversity is still high. However, this new library contains inserts of about 800-900 bp, which is longer than a 2x250 bp read pair can span, and I don't know how to go about sequencing it.
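To put rough numbers on the problem, here is a back-of-the-envelope sketch. It assumes the approximate lengths given above and ignores primers, adapters, and quality trimming; the function name is only for illustration.

```python
# Approximate scFv insert layout from the description above (in bp).
VH_LEN = 380
LINKER_LEN = 54
VL_LEN = 380


def uncovered_middle(insert_len: int, read_len: int) -> int:
    """Bases in the middle of the insert that neither paired-end read reaches.

    Each read starts at one end of the insert and extends read_len bases
    inward. A result of 0 means the two reads meet or overlap.
    """
    return max(0, insert_len - 2 * read_len)


insert_len = VH_LEN + LINKER_LEN + VL_LEN
gap = uncovered_middle(insert_len, read_len=250)
print(f"insert ~{insert_len} bp, unsequenced middle with 2x250 bp reads ~{gap} bp")
```

So with 2x250 bp reads, the forward read stays inside VH and the reverse read stays inside VL, and the two reads cannot be merged into one full-length sequence.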
Read depth is super important for calculating fold-enrichment during selection. The linker sequence between the VH and VL sequences should be invariant, so we thought about sequencing the VH and VL domains separately. The problem is that I'm not sure how to then work out which VH domain goes with which VL domain, and knowing that pairing is biologically necessary.
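For context, this is roughly the kind of fold-enrichment calculation the read depth needs to support. It is a minimal sketch only: the clone keys (ideally a paired VH+VL identity), the pseudocount, and the function name are illustrative assumptions.

```python
def fold_enrichment(before, after, pseudocount=1.0):
    """Per-clone fold change in frequency between two selection rounds.

    before, after: dicts mapping a clone identifier to its read count.
    pseudocount:   added to every count so that clones missing from one
                   round do not cause division by zero (an assumption;
                   other normalizations are possible).
    """
    clones = set(before) | set(after)
    total_before = sum(before.values()) + pseudocount * len(clones)
    total_after = sum(after.values()) + pseudocount * len(clones)

    result = {}
    for clone in clones:
        freq_before = (before.get(clone, 0) + pseudocount) / total_before
        freq_after = (after.get(clone, 0) + pseudocount) / total_after
        result[clone] = freq_after / freq_before
    return result


# Made-up counts for two hypothetical clones, purely to show the call.
round0 = {"cloneA": 10, "cloneB": 90}
round1 = {"cloneA": 50, "cloneB": 50}
for clone, fold in sorted(fold_enrichment(round0, round1).items()):
    print(clone, round(fold, 2))
```

With about 1M unique clones in the unselected library, each clone gets only a handful of reads unless the run is very deep, so the early-round frequencies (and therefore the ratios) are noisy. That is why depth matters so much to us.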
Does anyone have any suggestions for balancing full-length coverage of the insert with read depth? Thank you in advance!!