Comparing contig files and recovering similar contigs
4.8 years ago
ben.blais ▴ 10

Hi, I need help with a problem. I have two files containing contigs from two different assemblers. I would like to know whether it is possible to compare the contigs in the two files and keep only those that are at least 95% similar. I was thinking of using BLAST to align the contigs of the first file against the second, but I cannot extract the hits with 95% identity. Thank you in advance for your help, and have a great day.

genome sequence • 930 views

Depending on the size of the contigs you may be able to use CD-HIT.


You need global alignment for this. Try needle from the EMBOSS suite.
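The reason global alignment matters here is that BLAST reports local hits, so a hit at 95% identity may span only a small part of a contig. As a rough approximation (not a substitute for a true global alignment), one could filter BLAST tabular output on both identity and query coverage. The sketch below assumes the default 12-column tabular format (`-outfmt 6`) and a dictionary of contig lengths supplied by the caller; the function name `filter_blast_hits` and the 0.95 coverage threshold are illustrative assumptions. It looks at each hit on its own and does not sum several hits between the same pair of contigs.

```python
from typing import Dict, Iterable, List, Tuple


def filter_blast_hits(
    blast_lines: Iterable[str],
    query_lengths: Dict[str, int],
    min_identity: float = 95.0,   # percent identity, as reported by BLAST
    min_query_cov: float = 0.95,  # assumed threshold: fraction of the query spanned
) -> List[Tuple[str, str, float, float]]:
    """Keep hits that pass both an identity and a query-coverage threshold.

    Expects default BLAST tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        # BLAST coordinates are 1-based and inclusive, hence the +1.
        query_cov = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if pident >= min_identity and query_cov >= min_query_cov:
            kept.append((qseqid, sseqid, pident, query_cov))
    return kept


if __name__ == "__main__":
    example = [
        "c1\ts1\t97.5\t980\t20\t4\t11\t990\t1\t980\t0.0\t1700",
        "c2\ts1\t99.0\t100\t1\t0\t1\t100\t5\t104\t1e-40\t180",
    ]
    lengths = {"c1": 1000, "c2": 2000}
    for hit in filter_blast_hits(example, lengths):
        print(hit)
```

In the example, `c2` has a high-identity hit but is dropped because the hit spans only 100 of its 2,000 bases, which is exactly the case that an identity-only filter would miss.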


Hi, thank you. My contigs range from 500 to 200,000 bases in length. Thank you for your help.


Sounds like you need to use an assembly reconciliation tool. This is a recent enough review that you may want to look through.


What is the actual end goal here? A 'hybrid' assembly of best contigs or something?

4.8 years ago
h.mon 35k

You have two options:

1) You may try assembly reconciliation or assembly merging. There are several tools for that; an internet search should help you find some. I haven't done an assembly reconciliation yet, but a recent review doesn't seem encouraging about its usefulness:

A comparative evaluation of genome assembly reconciliation tools

2) The funannotate pipeline (a fungal annotation pipeline) includes wrapper scripts to remove redundant contigs from an assembly using minimap2. It has parameters to set percent overlap and identity. You could concatenate the two assemblies and run this script, though it could take a long time.
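To make the idea in option 2 concrete, here is a minimal sketch of redundancy filtering on minimap2 output. It is not funannotate's own script. It assumes the concatenated assembly has been aligned against itself with minimap2 and the result saved in PAF format, and it uses only the standard PAF columns (query name, length, start, end; target name and length; number of matching bases; alignment block length). The function name `redundant_contigs` and both default thresholds are hypothetical. Identity is approximated as matching bases divided by alignment block length.

```python
from typing import Iterable, Set


def redundant_contigs(
    paf_lines: Iterable[str],
    min_identity: float = 0.95,  # assumed threshold: matches / alignment block length
    min_coverage: float = 0.95,  # assumed threshold: fraction of the query aligned
) -> Set[str]:
    """Return names of contigs largely contained in another contig of equal or greater length."""
    redundant = set()
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record
        qname, qlen = fields[0], int(fields[1])
        qstart, qend = int(fields[2]), int(fields[3])
        tname, tlen = fields[5], int(fields[6])
        matches, block_len = int(fields[9]), int(fields[10])
        if qname == tname or qlen == 0 or block_len == 0:
            continue  # ignore self-hits and degenerate records
        # Only the shorter contig of a pair may be dropped; for equal lengths,
        # break the tie by name so that exactly one of the two is dropped.
        if qlen > tlen or (qlen == tlen and qname < tname):
            continue
        identity = matches / block_len
        coverage = (qend - qstart) / qlen
        if identity >= min_identity and coverage >= min_coverage:
            redundant.add(qname)
    return redundant


if __name__ == "__main__":
    example = [
        "short\t1000\t0\t1000\t+\tlong\t5000\t200\t1200\t980\t1000\t60",
        "long\t5000\t200\t1200\t+\tshort\t1000\t0\t1000\t980\t1000\t60",
    ]
    print(redundant_contigs(example))
```

Contigs returned by the function could then be removed from the concatenated FASTA file. Each alignment record is judged on its own, so a contig covered only by several separate alignments would not be flagged.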
