2.2 years ago
jetson
Will very high sequencing depth cause non-specific interactions to show up in a ChIA-PET dataset? Due to reasons, the sequencing facility generated a dataset worth 700x coverage of the yeast genome, and the analysis is picking up almost three times the number of interactions previously reported. We're troubleshooting possible reasons, and one of them is whether this high depth would cause a lot of noise. Any pointers would be appreciated.
You could easily down-sample your dataset and check that possibility. Use seqtk or reformat.sh from the BBMap suite.
Thank you for the quick reply @genomax. My apologies for the delayed reply. I did try that option and the number of interactions did come down considerably. However, the protocol follows the bridge linker method (https://www.nature.com/articles/nprot.2017.012), and the analysis pipeline extracts only those reads that contain the linker sequence and uses them downstream. So there is a possibility that the bridge-linker reads are under-represented in the down-sampled dataset. I don't suppose there's any sure-fire method to eliminate or minimize this possibility apart from trial and error.
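One way to check the under-representation concern directly, rather than by trial and error, is to compare the fraction of linker-containing reads before and after down-sampling. The following is only a minimal sketch on simulated reads: the LINKER string is a hypothetical placeholder (not the actual bridge-linker sequence from the protocol), and the helper names make_toy_reads, downsample and linker_fraction are made up for illustration. With uniform random sampling the linker fraction should stay roughly the same at every depth; what shrinks is the absolute number of linker reads.

```python
import random

# Hypothetical placeholder, NOT the real bridge-linker sequence from the protocol.
LINKER = "ACGTTGCAACGTTGCAAC"


def make_toy_reads(n, linker, linker_rate, read_len=60, seed=0):
    """Simulate n reads; roughly `linker_rate` of them carry the linker."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        if rng.random() < linker_rate:
            # Overwrite a random window of the read with the linker.
            pos = rng.randrange(read_len - len(linker) + 1)
            seq = seq[:pos] + linker + seq[pos + len(linker):]
        reads.append(seq)
    return reads


def downsample(reads, fraction, seed=11):
    """Keep each read independently with probability `fraction`."""
    rng = random.Random(seed)
    return [r for r in reads if rng.random() < fraction]


def linker_fraction(reads, linker):
    """Fraction of reads whose sequence contains the linker (0.0 if empty)."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(linker in r for r in reads) / len(reads)


if __name__ == "__main__":
    reads = make_toy_reads(20000, LINKER, linker_rate=0.3)
    for frac in (1.0, 0.5, 0.1, 0.01):
        sub = downsample(reads, frac)
        n_linker = sum(LINKER in r for r in sub)
        print(f"fraction={frac:<5} reads={len(sub):<6} "
              f"linker_reads={n_linker:<6} "
              f"linker_fraction={linker_fraction(sub, LINKER):.3f}")
```

On real data the same comparison could be made by counting linker-containing reads in the full and down-sampled FASTQ files. If the proportion holds but interactions still drop sharply, the lost interactions were probably supported by only a few reads each. For paired-end data, note that seqtk needs the same seed (-s) for both mate files so the pairs stay in sync.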