Question: Create Chain File for Converting Genome Coordinates
Gabriel wrote (18 months ago):

I am looking for ways to create the "chain" file used to convert genome coordinates from one assembly to another. Malachi Griffith wrote an excellent summary, "Converting Genome Coordinates From One Genome Version To Another", but most of the tools it covers need a chain file (the file that describes the pairwise alignments between the two genomes). How can I create this file? Alternatively, is there a tool that transfers coordinates starting from just the two genomes and a file of annotated features (e.g. BED, GFF)? Thanks!

jean.elbers wrote (18 months ago):

The post you mention includes flo (https://github.com/wurmlab/flo), which I recommend; it can also lift over GFF files.

benformatics (ETH Zurich) wrote (18 months ago):

Please use the search function.

The above post (answered by Pierre Lindenbaum) will bring you to this link. Looking into it, there is an automated script that will generate a chain file for you.


This is outdated and difficult to get working, so I don't recommend it. Instead, I'd recommend flo (https://github.com/wurmlab/flo), as @jean.elbers suggested. Just beware: it can be pretty CPU- and memory-intensive for big genomes, and it takes quite a while.

Reply written 18 months ago by juliette

The UCSC option provides a script that was relevant in 2018. That is a bit too recent for me to deem outdated... furthermore it's a single script.

Reply written 18 months ago by benformatics

Sorry, I meant the first page you linked is outdated: "This page is an interesting historical discussion and well worth the read."
As for the script, I personally found it difficult to follow. I didn't manage to get it working, and a lot of stuff is hard-coded into the script. It's just honestly easier to use flo, but it's always good to give another option.

Reply written 18 months ago by juliette

Yes! Exactly because of this I was asking again! Thanks juliette!

Reply written 18 months ago by Gabriel
bsaylor23 wrote (3 months ago):

Does anyone know of a tool that makes the chain files these programs require without relying on UCSC tools? Every program I have found for building a chain file for your own genome depends on UCSC software, which requires a paid license for non-academic users.


While Liftoff (https://github.com/agshumate/Liftoff) doesn't make a chain file, it is another program for lifting over annotations, and it is released under the GPL-3.0 license. Please contact your legal department to verify whether you can use it and its dependencies.

Reply written 3 months ago by jean.elbers

bcftools consensus has a --chain option (at least in version 1.10.2):

-c, --chain <file>         write a chain file for liftover

I have no idea how well it would work for your purposes.
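
A minimal sketch of how that option might be invoked. The wrapper function `make_consensus_chain` and all file names are assumptions for illustration; `-f`, `-c`, and `-o` are existing bcftools consensus options. As far as I understand, the chain written this way maps the reference onto the consensus built from the VCF, so this route only applies when the second sequence can be expressed as variants called against the first.

```shell
# Hypothetical wrapper; the function name and file names are illustrative only.
# Usage: make_consensus_chain ref.fa calls.vcf.gz out_prefix
make_consensus_chain() {
    ref=$1; vcf=$2; prefix=$3
    # Fail early if an input is missing (the VCF should be bgzip-compressed and indexed).
    if [ ! -r "$ref" ] || [ ! -r "$vcf" ]; then
        echo "missing input: $ref or $vcf" >&2
        return 1
    fi
    if ! command -v bcftools >/dev/null 2>&1; then
        echo "bcftools not found on PATH" >&2
        return 127
    fi
    # -f reference FASTA, -c chain file to write, -o consensus FASTA to write
    bcftools consensus -f "$ref" -c "$prefix.chain" -o "$prefix.fa" "$vcf"
}
```

Called as `make_consensus_chain ref.fa calls.vcf.gz ref_to_consensus`, this would be expected to leave `ref_to_consensus.chain` next to the consensus FASTA.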

Reply written 3 months ago by jean.elbers
Charles Warden (Duarte, CA) wrote (3 months ago):

I found the post and top answer to be helpful. So, thank you very much!

For example, I downloaded the UCSC executables from here.

I then followed the minimal instructions, which I found I could modify further (for single-chromosome sequences, each shorter than 500,000 bp):

# Prepare files. Assumes $ID1 and $ID2 hold the two assembly names,
# with each FASTA stored at $ID/$ID.fa.
cd $ID1
faToTwoBit $ID1.fa $ID1.2bit
twoBitInfo $ID1.2bit chrom.sizes
cd ..
cd $ID2
faToTwoBit $ID2.fa $ID2.2bit
twoBitInfo $ID2.2bit chrom.sizes
cd ..

# Create the .chain file ($ID1 is the BLAT database/target, $ID2 is the query)
blat $ID1/$ID1.2bit $ID2/$ID2.fa ${ID1}to${ID2}.psl -tileSize=12 -minScore=100 -minIdentity=98
axtChain -linearGap=medium -psl ${ID1}to${ID2}.psl $ID1/$ID1.2bit $ID2/$ID2.2bit ${ID1}to${ID2}.chain
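
Before handing the result to a liftover tool, a quick structural check of the .chain file can catch truncated or malformed output. The sketch below is not part of the UCSC tools: `check_chain` is a hypothetical helper, and the example chain is made up. It relies only on the chain format itself, where a header line has 13 fields (`chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`) and is followed by `size dt dq` lines, ending with a single `size` line.

```shell
# Hypothetical helper (not a UCSC tool): sanity-check a .chain file.
# For each chain, block sizes plus target gaps (dt) should equal tEnd - tStart,
# and block sizes plus query gaps (dq) should equal qEnd - qStart.
check_chain() {
    awk '
        function finish() {
            if (n > 0 && (tsum != tspan || qsum != qspan)) {
                printf "chain %s: block sums do not match header\n", id
                bad++
            }
        }
        $1 == "chain" {
            finish()
            if (NF != 13) {
                printf "line %d: expected 13 header fields, got %d\n", NR, NF
                bad++
            }
            n++
            id = $13
            tspan = $7 - $6
            qspan = $12 - $11
            tsum = 0; qsum = 0
            next
        }
        NF == 3 { tsum += $1 + $2; qsum += $1 + $3; next }
        NF == 1 { tsum += $1; qsum += $1; next }
        END {
            finish()
            printf "%d chains, %d problems\n", n, bad + 0
            exit (bad > 0)
        }
    ' "$1"
}

# Made-up example: one chain with two aligned blocks and a 5 bp gap on the target.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
chain 1000 chrA 500 + 0 110 chrB 480 + 10 115 1
50 5 0
55

EOF
check_chain "$tmp"   # prints: 1 chains, 0 problems
rm -f "$tmp"
```

On real output this would be run as `check_chain ${ID1}to${ID2}.chain`; a non-zero exit status means at least one chain looked inconsistent.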

I was also able to run CrossMap (installed using pip3 install CrossMap) to confirm that the .chain file can be used without generating any error messages:

CrossMap.py gff $CHAIN.gz $GFFIN $GFFOUT

Whether or not CrossMap provided the best conversion could be up for debate, and I am not sure if you might want to change the parameters to generate the .chain file in some circumstances.
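
One rough way to gauge the conversion is to compare how many features went in and how many came out. The sketch below is only a starting point: `count_features` and `liftover_summary` are hypothetical helpers, and it assumes the output file holds one line per successfully converted feature, which should be checked against the CrossMap documentation for the version in use.

```shell
# Hypothetical helper: count feature lines in a GFF file,
# skipping blank lines and lines starting with '#'.
count_features() {
    awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Report input and output feature counts side by side.
# Assumes one output line per converted feature (an assumption, see above).
liftover_summary() {
    n_in=$(count_features "$1")
    n_out=$(count_features "$2")
    echo "input=$n_in output=$n_out"
}
```

It could then be called as `liftover_summary "$GFFIN" "$GFFOUT"`. A large gap between the two counts would suggest revisiting the blat or axtChain parameters used to build the chain.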

However, I think this is enough to show that the custom .chain file generation was successful.
