Map BED12 file exon coordinates between species using UCSC liftOver
4 months ago

I have recently been using UCSC's liftOver command-line tool to map genomic coordinates between human and mouse. Suppose this is my BED file, with one transcript composed of two exons (tab-delimited):

chr1    16857   17751   HSALNT0000005   100     -       16857   17751   0,0,0   2       198,519,        0,375,  geneID=HSALNG0000003


I converted it with this command:

liftOver -bedPlus=12 -tab -minMatch=0.8 -minBlocks=0.5 -fudgeThick -multiple input.txt hg38ToMm39.over.chain out.txt unmapped.txt

The resulting out.txt is:

chr6 121498510 121498660 HSALNT0000005 100 + 121497851 121498660 0 1 150, 0, geneID=HSALNG0000003

Only one of the two exon ranges was lifted. My question is: how can I know which exon was lifted? The unlifted one is simply dropped and does not appear in any of the output files. To work around this, I tried splitting my input into two lines, one exon per line:

chr1    16857   17055   HSALNT0000005_1 100     -       16857   17055   0,0,0   1       198,    0,      geneID=HSALNG0000003
chr1    17232   17751   HSALNT0000005_2 100     -       17232   17751   0,0,0   1       519,    0,      geneID=HSALNG0000003
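
For reference, the per-exon split above can be scripted rather than done by hand. This is only a minimal Python sketch, assuming a tab-delimited BED12+ record like the one shown earlier; `split_bed12_by_exon` is a hypothetical helper name, and setting thickStart/thickEnd to the exon bounds simply mirrors what I did manually.

```python
def split_bed12_by_exon(line):
    """Split one tab-delimited BED12+ record into one single-block line per exon."""
    f = line.rstrip("\n").split("\t")
    chrom, start, name = f[0], int(f[1]), f[3]
    # blockSizes and blockStarts are comma-separated, usually with a trailing comma
    sizes = [int(x) for x in f[10].split(",") if x]
    offsets = [int(x) for x in f[11].split(",") if x]
    extra = f[12:]  # any columns beyond the standard 12 are carried over unchanged
    out = []
    for i, (size, off) in enumerate(zip(sizes, offsets), start=1):
        ex_start = start + off  # blockStarts are relative to chromStart
        ex_end = ex_start + size
        rec = [chrom, str(ex_start), str(ex_end), f"{name}_{i}", f[4], f[5],
               str(ex_start), str(ex_end), f[8], "1", f"{size},", "0,"] + extra
        out.append("\t".join(rec))
    return out


record = "\t".join([
    "chr1", "16857", "17751", "HSALNT0000005", "100", "-",
    "16857", "17751", "0,0,0", "2", "198,519,", "0,375,",
    "geneID=HSALNG0000003",
])
exons = split_bed12_by_exon(record)
for exon in exons:
    print(exon)
```

The `_1`, `_2` suffixes follow the order of the blocks in the file (ascending coordinates), not the transcript's 5' to 3' order on the minus strand.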


Surprisingly, neither range is lifted this time. The unmapped file:

#Partially deleted in new
chr1    16857   17055   HSALNT0000005_1 100     -       16857   17055   0       1       198,    0,      geneID=HSALNG0000003
#Boundary problem: need 1, got 0, diff 1, mapped 0.0
chr1    17232   17751   HSALNT0000005_2 100     -       17232   17751   0       1       519,    0,      geneID=HSALNG0000003
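
To keep track of which per-exon records failed and why, the unmapped file can be read back into a name-to-reason table. This is a minimal Python sketch, assuming the layout shown above (a `#` reason line followed by the record it refers to); `parse_unmapped` is a hypothetical helper, not part of liftOver.

```python
def parse_unmapped(text):
    """Return {record name: reason} from the text of a liftOver unmapped file."""
    reasons = {}
    reason = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            # a comment line carries the reason for the record that follows it
            reason = line.lstrip("#").strip()
        else:
            name = line.split()[3]  # column 4 of a BED record is the name
            reasons[name] = reason
            reason = None
    return reasons


unmapped_text = "\n".join([
    "#Partially deleted in new",
    "chr1\t16857\t17055\tHSALNT0000005_1\t100\t-\t16857\t17055\t0\t1\t198,\t0,\tgeneID=HSALNG0000003",
    "#Boundary problem: need 1, got 0, diff 1, mapped 0.0",
    "chr1\t17232\t17751\tHSALNT0000005_2\t100\t-\t17232\t17751\t0\t1\t519,\t0,\tgeneID=HSALNG0000003",
])
failed = parse_unmapped(unmapped_text)
for name, why in failed.items():
    print(name, "->", why)
```

Any exon name from the split input that is missing from this table should then be present in out.txt.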


If one of the exons can be lifted as part of the original record, why can't it be lifted on its own? Or is there another way to find out which exons of a BED12+ record were lifted by liftOver? Thanks.

software problem liftOver • 153 views