I have two assemblies of an organism: a de novo transcriptome assembly and a draft genome assembly. My aim is to confirm that specific genes from the de novo assembly do in fact appear in the draft genome. I will then design primers around the best match obtained from the CRB analysis.
I am using crb-blast: https://github.com/cboursnell/crb-blast.
If I use the transcriptome as the query, several transcripts obtain their best CRB-BLAST hit against the same gene in the genome.
If I use the genome as the query, several genome genes obtain a hit against the same transcript in the transcriptome.
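To make the pattern concrete, here is a minimal sketch of how the many-to-one hits can be counted. It assumes the CRB-BLAST output is a tab-separated file with the query ID in the first column and the target ID in the second; that column layout, the function names, and the example IDs are assumptions for illustration, not taken from my real data.

```python
import csv
import io
from collections import defaultdict


def group_queries_by_target(handle):
    """Map each target ID to the list of query IDs that hit it.

    Assumes tab-separated rows with query in column 0 and target in column 1.
    """
    hits = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        query, target = row[0], row[1]
        hits[target].append(query)
    return dict(hits)


def many_to_one(hits):
    """Keep only the targets that were hit by more than one query."""
    return {target: queries for target, queries in hits.items() if len(queries) > 1}


# Hypothetical example: two transcripts hit the same genome gene.
example = "transcript_1\tgene_A\ntranscript_2\tgene_A\ntranscript_3\tgene_B\n"
shared = many_to_one(group_queries_by_target(io.StringIO(example)))
print(shared)
```

With the made-up rows above, gene_A is reported as shared by transcript_1 and transcript_2, which is the situation I see in both directions.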
Can someone please help me understand CRB-BLAST? I was under the impression that you would end up with a one-to-one match, but instead several matches are coming up for the genes/transcripts that are queried. I am aware that a single gene can produce several transcripts; however, is the idea of CRB-BLAST not ultimately to provide a one-to-one best match?
If it is not one-to-one, how do you choose the best sequence to design the primers around? Do I keep the de novo assembly as the query and the genome as the target? Thanks.