Question: NCBI vs Liftover vs ENSEMBL for Assembly Conversion for SNP data (GRCh37 to GRCh38)
10 months ago, sarvism wrote:


I'm a beginner in bioinformatics, so hopefully someone can help with my question! I am trying to convert a large number of SNPs from GRCh37 to GRCh38 because I want their new physical positions. I ran an input BED file of around 700,000 SNPs through NCBI Remap, the Ensembl Assembly Converter, and UCSC LiftOver. Each website reports a different number of SNPs that are not in the new build, with little overlap between the sets of unmapped SNPs. I know that rsIDs can change between assemblies, and I am wondering whether these websites are the best tools for this. Would the best approach be to combine all the SNPs that do not map to the new build (from all three websites) and discard them, or is there another way to do this? Thank you very much.

ensembl liftover snp assembly ncbi • 591 views
modified 8 months ago by Biostar • written 10 months ago by sarvism

I cannot speak for the Ensembl or UCSC alignments, but for NCBI, using the correct query and target assemblies is very important. Does your GRCh37 data include SNPs on the Primary Assembly only (chromosomes + unlocalized + unplaced scaffolds), or on other sequences as well? A quick way to check is to generate a unique list of all seq-ids in your input BED file. If you have data for just the 24 chromosomes, you may want to consider remapping using primary assembly alignments only.
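
A minimal sketch of that seq-id check, assuming a plain whitespace- or tab-delimited BED file in which column 1 holds the seq-id. The function name `count_seq_ids` and the file name in the usage note are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_seq_ids(bed_lines):
    """Count BED records per seq-id (column 1)."""
    counts = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional comment/header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts[line.split()[0]] += 1
    return counts


# Small in-memory example; real input would come from the BED file itself.
example = [
    "chr1\t1000\t1001\trs0000001",
    "chr1\t2000\t2001\trs0000002",
    "chrX\t500\t501\trs0000003",
]
for seq_id, n in sorted(count_seq_ids(example).items()):
    print(seq_id, n)
```

For a file on disk, the same function can be fed an open file handle, for example `count_seq_ids(open("snps_grch37.bed"))` (hypothetical file name). If the resulting list contains only the 24 chromosome names, the data sits entirely on the chromosomes of the primary assembly.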

written 10 months ago by vkkodali

Thank you for your answer. I have primary assembly data only, so in NCBI I would be remapping only the primary assembly alignments.

written 10 months ago by sarvism

In that case, you should use GRCh37.p13 :: Primary Assembly as the source assembly and GRCh38.p13 :: Primary Assembly as the target.

Are these not working for you? Do you see instances of misplacement? I expect at least some differences between UCSC, NCBI, and Ensembl, considering the different aligners that were used to generate the assembly-assembly alignments, but for the most part the results should be the same.
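
One way to look for such differences is to compare the converted coordinates SNP by SNP. The sketch below assumes each tool's output is a BED file that keeps the SNP name in column 4, and that seq-id naming has already been harmonized (the tools may label the same chromosome differently, for example with or without a `chr` prefix). The function names and the example records are hypothetical.

```python
def read_bed_positions(bed_lines):
    """Map SNP name (BED column 4) to its (seq-id, start) placement."""
    positions = {}
    for line in bed_lines:
        fields = line.split()
        # Ignore blank, header, and malformed lines with fewer than 4 columns.
        if len(fields) < 4 or fields[0].startswith(("#", "track", "browser")):
            continue
        positions[fields[3]] = (fields[0], int(fields[1]))
    return positions


def compare_tools(results):
    """Split SNPs into concordant, discordant, and missing-in-some-tool sets.

    `results` maps a tool label to the dict returned by read_bed_positions.
    """
    all_ids = set().union(*(r.keys() for r in results.values()))
    concordant, discordant, missing = set(), set(), set()
    for snp in all_ids:
        placements = {r.get(snp) for r in results.values()}
        if None in placements:
            missing.add(snp)      # at least one tool did not place this SNP
        elif len(placements) == 1:
            concordant.add(snp)   # every tool reports the same coordinate
        else:
            discordant.add(snp)   # tools disagree on the placement
    return concordant, discordant, missing


# Made-up records for illustration only.
results = {
    "ncbi": read_bed_positions(["chr1\t100\t101\trsA", "chr2\t50\t51\trsB"]),
    "ucsc": read_bed_positions(["chr1\t100\t101\trsA", "chr2\t60\t61\trsB"]),
}
concordant, discordant, missing = compare_tools(results)
print(len(concordant), len(discordant), len(missing))
```

SNPs in the discordant or missing sets are the ones worth inspecting by hand before deciding whether to discard them.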

written 10 months ago by vkkodali

Sorry for the late response. NCBI is working for me and gave very few SNPs that were not in the new build, which is good. If I have other questions I will let you know, thank you for your help!

modified 10 months ago • written 10 months ago by sarvism