Identifying reliable regions of reference genomes
20 months ago
pixie@bioinfo ★ 1.5k

Hello, we do a lot of work on identifying SNPs between varieties of the same plant species. We then design primers around these SNPs for variety detection. Some of the plant genomes can be very complex or poorly assembled. By complexity, I mean the ploidy can change, etc. Ultimately, some of the SNP sets turn out to be inadequate for variety detection. Are there ways we can identify confident regions of the genome?

Thanks

snp genomics
20 months ago
Dave Carlson ★ 1.7k

You might be interested in trying this tool:

https://github.com/RILAB/mop

From the README:

Simple tool for capturing alignment regions with sufficient quality for genotyping.

20 months ago
LChart 3.9k

Are the inadequacies of your SNP sets due to false SNPs themselves (i.e., potential mapping issues) or problems with SNPs underlying the primers? Do you do any sequencing of these plants yourselves?

If you have access to multiple sequences, you could adapt the Genome-In-A-Bottle methodology to define 'confident' callable regions: https://www.nature.com/articles/s41587-019-0074-6#Sec10 (the Illumina blog had a reasonable short alternative based solely on alignment: https://www.illumina.com/science/genomics-research/articles/identifying-genomic-regions-with-high-quality-single-nucleotide-.html).
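As a rough illustration of the "callable in every dataset" idea, here is a minimal Python sketch. It is not the Genome-In-A-Bottle pipeline, only a simplified stand-in. It assumes you already have, for each sequenced sample, a list of half-open `(start, end)` intervals on one chromosome that you consider callable (for example from depth and mapping-quality thresholds of your own choosing). The function names and the `flank` parameter are hypothetical. It intersects the per-sample intervals and then checks whether a SNP plus its primer flank sits entirely inside a confident region.

```python
def merge(intervals):
    """Sort and merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a, b):
    """Intersect two sorted, merged interval lists with a two-pointer sweep."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def confident_regions(per_sample):
    """Keep only positions that are callable in every sample (one chromosome)."""
    samples = [merge(s) for s in per_sample]
    result = samples[0]
    for s in samples[1:]:
        result = intersect(result, s)
    return result


def snp_is_usable(pos, flank, regions):
    """True if [pos - flank, pos + flank + 1) lies inside a single confident region."""
    return any(s <= pos - flank and pos + flank + 1 <= e for s, e in regions)


# Made-up callable intervals for three samples on one chromosome.
per_sample = [
    [(0, 100), (150, 300)],
    [(50, 200)],
    [(40, 120), (110, 280)],
]
regions = confident_regions(per_sample)
```

Requiring the whole flank to be confident, not just the SNP position, is meant to address the case where the SNP itself is real but the primer sites are not reliable.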

If you do not have access to substantial sequencing, you could go back to the original assembly graph, visualize it in something like Bandage, and regard long, contiguous, un-looped, "simple" subgraphs as being "high confidence".
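If the graph is too large to inspect by eye, a first pass can be scripted. The sketch below assumes the assembler wrote a GFA1 file (tab-separated `S` segment lines and `L` link lines) and that each link is listed once. The function name and the `min_length` threshold are hypothetical. It treats a segment as "simple" when it is long enough, has no link to itself, and has at most one link at each end. This is only a crude proxy for what you would judge visually.

```python
def simple_segments(gfa_lines, min_length=10000):
    """Return names of long, unbranched, non-self-linked segments in a GFA1 graph."""
    lengths = {}
    end_links = {}       # (segment, "L" or "R") -> number of links at that end
    self_linked = set()

    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            name, seq = fields[1], fields[2]
            length = len(seq)
            if seq == "*":
                # Sequence omitted: fall back to the optional LN:i: length tag.
                length = 0
                for tag in fields[3:]:
                    if tag.startswith("LN:i:"):
                        length = int(tag[5:])
            lengths[name] = length
        elif fields[0] == "L":
            src, src_orient, dst, dst_orient = fields[1:5]
            if src == dst:
                self_linked.add(src)
            # A '+' source is left through its right end, a '-' source through its left end.
            src_end = (src, "R" if src_orient == "+" else "L")
            # A '+' target is entered at its left end, a '-' target at its right end.
            dst_end = (dst, "L" if dst_orient == "+" else "R")
            end_links[src_end] = end_links.get(src_end, 0) + 1
            end_links[dst_end] = end_links.get(dst_end, 0) + 1

    return sorted(
        name
        for name, length in lengths.items()
        if length >= min_length
        and name not in self_linked
        and end_links.get((name, "L"), 0) <= 1
        and end_links.get((name, "R"), 0) <= 1
    )


# Tiny made-up graph: 'a' branches, 'c' is short, 'e' loops onto itself.
example_gfa = [
    "S\ta\t*\tLN:i:50000",
    "S\tb\t*\tLN:i:20000",
    "S\tc\tACGT",
    "S\td\t*\tLN:i:30000",
    "S\te\t*\tLN:i:40000",
    "L\ta\t+\tb\t+\t0M",
    "L\ta\t+\td\t+\t0M",
    "L\te\t+\te\t+\t0M",
]
simple = simple_segments(example_gfa)
```

For a real file you would pass an open file handle instead of the list, and tune `min_length` to your assembly's contig size distribution.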
