Question: lift coordinates mapping to _alt chromosomes in hg38
bioguy24 (Chicago) wrote, 2.3 years ago:

I am using liftOver to convert ~100,000 hg19 coordinates to hg38. I know that there are duplicates in the hg19 BED file, but I'm not sure what's going on or what's best to do. The hg38 coordinates are very different. Maybe the Table Browser is a better option? Thank you :).

hg19

 chr19  54801916    54802239    chr19:54801916-54802239 .   LILRA3;LILRA6
 chr19  54801917    54802239    chr19:54801917-54802239 .   LILRA3
 chr19  54802472    54802789    chr19:54802472-54802789 .   LILRA3;LILRA6
 chr19  54802473    54802789    chr19:54802473-54802789 .   LILRA3
 chr19  54803901    54804020    chr19:54803901-54804020 .   LILRA3

hg38

 chr19_KI270938v1_alt:273030-273353
 chr19_KI270938v1_alt:273031-273353
 chr19_KI270938v1_alt:273586-273903
 chr19_KI270938v1_alt:273587-273903
 chr19_KI270938v1_alt:275015-275134
apa@stowers (Kansas City) wrote, 2.3 years ago:

Strangely, LILRA3 is not annotated on the reference chr19 in hg38, only on that alternate assembly. Its immediate neighbors, LILRB2 and LILRA5, are on the hg38 reference chr19, but there are no annotated genes between them. That ~1 Mb window containing LILRA3 did assemble, but did not get placed contiguously with the others, so it got spun off as sequence KI270938.

So this is a mis-assembly (all genomes have them) but whether LILRA3 really belongs where it was on hg19, or some place new, is not clear.

This is a good post just for perspective: hg19 vs hg38 in two pictures
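
Before deciding what to do with the lifted targets, it can help to see how many of them landed off the primary chromosomes. The following is a minimal sketch, not part of the original answer: it assumes UCSC-style hg38 sequence names (suffixes `_alt`, `_fix`, `_random`, and the `chrUn_` prefix), and the function names are made up for illustration. The input is the five hg38 regions from the question.

```python
import re

# Matches region strings such as "chr19_KI270938v1_alt:273030-273353".
REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(text):
    """Split 'chrom:start-end' into (chrom, start, end)."""
    m = REGION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a chrom:start-end region: {text!r}")
    return m.group("chrom"), int(m.group("start")), int(m.group("end"))


def contig_class(chrom):
    """Classify a sequence name by its UCSC-style naming (assumed convention)."""
    if chrom.endswith("_alt"):
        return "alt"
    if chrom.endswith("_fix"):
        return "fix"
    if chrom.endswith("_random") or chrom.startswith("chrUn_"):
        return "unplaced"
    return "primary"


def split_by_contig_class(regions):
    """Group parsed regions under 'primary', 'alt', 'fix' or 'unplaced'."""
    groups = {}
    for region in regions:
        chrom, start, end = parse_region(region)
        groups.setdefault(contig_class(chrom), []).append((chrom, start, end))
    return groups


# The hg38 liftOver output quoted in the question.
lifted = [
    "chr19_KI270938v1_alt:273030-273353",
    "chr19_KI270938v1_alt:273031-273353",
    "chr19_KI270938v1_alt:273586-273903",
    "chr19_KI270938v1_alt:273587-273903",
    "chr19_KI270938v1_alt:275015-275134",
]

groups = split_by_contig_class(lifted)
for cls, items in groups.items():
    print(cls, len(items))  # prints "alt 5"
```

Targets in the `alt` group are the ones worth checking by hand, since whether they are usable depends on whether the aligner's reference includes the alternate sequences.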


Thank you for the information. May I ask how you were able to determine that LILRA3 is not annotated in hg38 but LILRB2 and LILRA5 are? I guess I am trying to figure out which tools may help. Thank you very much :).

written 2.3 years ago by bioguy24

Go look at LILRA3 in hg19 and see what its neighbors' names are. Then look those up in hg38; you will find they are still together but LILRA3 has disappeared.

written 2.3 years ago by apa@stowers
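
The same neighbor check can be done programmatically once gene records for both assemblies are loaded. This is only a sketch under assumptions: the records are taken to be `(chrom, start, end, gene_name)` tuples parsed from whatever annotation you use, and `flanking_genes` and `contigs_by_gene` are hypothetical helper names, not functions from any library.

```python
def flanking_genes(genes, name, k=1):
    """Return (upstream, downstream) lists of up to k gene names flanking `name`.

    `genes` is a list of (chrom, start, end, gene_name) tuples from ONE assembly.
    Only genes on the same sequence as `name` are considered.
    """
    hit = next((g for g in genes if g[3] == name), None)
    if hit is None:
        raise KeyError(name)
    # Sort genes on the same sequence by (start, end, name).
    same = sorted(g for g in genes if g[0] == hit[0])
    i = same.index(hit)
    upstream = [g[3] for g in same[max(0, i - k):i]]
    downstream = [g[3] for g in same[i + 1:i + 1 + k]]
    return upstream, downstream


def contigs_by_gene(genes, names):
    """Map each requested gene name to the sorted sequence names it is annotated on."""
    found = {n: set() for n in names}
    for chrom, _start, _end, gene_name in genes:
        if gene_name in found:
            found[gene_name].add(chrom)
    return {n: sorted(chroms) for n, chroms in found.items()}
```

With hg19 records loaded you would call `flanking_genes(hg19_genes, "LILRA3")`, then pass those neighbor names plus `"LILRA3"` to `contigs_by_gene(hg38_genes, ...)`. A gene that is annotated only on an alternate sequence will list only that sequence name, and a gene missing from the annotation will map to an empty list.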

Interesting. So what is the best or correct thing to do in a case like this? I guess trying to figure out why it mis-mapped may help. Thank you :).

written 2.3 years ago by bioguy24

Well, what are you doing with the remapped coordinates? Why do you need them?

written 2.3 years ago by apa@stowers

The reference on our sequencer is hg38, so I am lifting over the hg19 targets to hg38 as well. After the reads are aligned, the target BED file is used for variant calling, coverage, etc. Thank you :).

written 2.3 years ago by bioguy24

Most accurate is probably to repeat the alignment, if you have access to the original data.

written 2.3 years ago by WouterDeCoster

Absolutely, repeat the alignment. And don't reinvent the wheel with gene coordinates; every genome version has its own associated gene annotations somewhere. See the knownGene files in the UCSC hg38 annotation database, or any Ensembl human GTF from release 76 or later.

written 2.3 years ago by apa@stowers
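
As an illustration of pulling coordinates straight from the native annotation instead of lifting them over, here is a minimal sketch that turns `gene` rows of a GTF into BED-style rows for a list of gene names. It assumes the standard 9-column, tab-separated GTF layout with a `gene_name "..."` attribute and rows whose feature type is `gene`. The function name and the `add_chr_prefix` option are hypothetical; the prefixing is deliberately naive and will not correctly rename every non-chromosome sequence (for example the mitochondrial genome or scaffolds).

```python
import re

# Pulls the value out of an attribute such as: gene_name "LILRA3";
GENE_NAME_RE = re.compile(r'gene_name "([^"]+)"')


def gtf_genes_to_bed(gtf_lines, wanted, add_chr_prefix=False):
    """Yield (chrom, start0, end, gene_name) for `gene` rows whose name is in `wanted`."""
    wanted = set(wanted)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header / comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # malformed row, or not a gene-level feature
        m = GENE_NAME_RE.search(fields[8])
        if m is None or m.group(1) not in wanted:
            continue
        chrom = fields[0]
        if add_chr_prefix and not chrom.startswith("chr"):
            chrom = "chr" + chrom  # naive renaming, see the note above
        # GTF coordinates are 1-based and inclusive; BED is 0-based and half-open.
        yield chrom, int(fields[3]) - 1, int(fields[4]), m.group(1)
```

Usage would look something like `rows = list(gtf_genes_to_bed(open(path), {"LILRA3", "LILRA6"}))`, where `path` is wherever you saved the (uncompressed) GTF. This gives gene-level intervals only; exon- or target-level intervals would need the `exon` rows or the original panel design instead.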