I'm new to bioinformatics, and I'm trying to merge my own dataset with the 1000 Genomes (1000G) dataset, using 1000G as the reference.
However, I'm unsure how to do this. From what I've read, some people suggest splitting 1000G by chromosome and then merging one chromosome at a time. I'm not sure what the point of that is.
I'm currently following this: https://martha-labbook.netlify.app/posts/extracting-data-for-variants-common-in-both-file-sets/. Can someone tell me if this is the right way to go?
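To check my understanding: as far as I can tell, the core step is keeping only the variants that are present in both datasets before merging. Here is a rough sketch of that idea, assuming both datasets are in PLINK binary format with `.bim` files (six columns: chromosome, variant ID, cM position, bp position, allele 1, allele 2). The function names and sample lines are made up for illustration, and this ignores strand flips and genome build differences.

```python
def bim_variant_keys(bim_lines):
    """Map (chrom, position, unordered allele pair) -> variant ID for .bim lines."""
    keys = {}
    for line in bim_lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        chrom, var_id, _cm, bp, a1, a2 = fields
        keys[(chrom, int(bp), frozenset((a1, a2)))] = var_id
    return keys


def common_variants(bim_a, bim_b):
    """Return (ID in A, ID in B) pairs for variants found in both inputs."""
    keys_a = bim_variant_keys(bim_a)
    keys_b = bim_variant_keys(bim_b)
    # Sort by chromosome label (as text) and position for stable output.
    shared = sorted(keys_a.keys() & keys_b.keys(), key=lambda k: (k[0], k[1]))
    return [(keys_a[k], keys_b[k]) for k in shared]


# Hypothetical sample lines standing in for my data and the 1000G reference.
my_data = [
    "1\trs123\t0\t1000\tA\tG",
    "1\trs456\t0\t2000\tC\tT",
    "2\trs789\t0\t500\tG\tA",
]
reference = [
    "1\t1:1000\t0\t1000\tG\tA",   # same site, different ID, alleles swapped
    "1\trs456\t0\t2000\tC\tG",    # same site, different alleles
    "2\trs789\t0\t500\tG\tA",
]

pairs = common_variants(my_data, reference)
for id_a, id_b in pairs:
    print(id_a, id_b)
```

My understanding is that the resulting ID lists would then be passed to something like PLINK's `--extract` on each dataset before the actual merge, but I may be wrong about that.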
Also, would I need to run QC on the 1000G data before merging?
I would love any help.