Question: Visualizing contig alignment
6 weeks ago by
arunprasanna8330 wrote:

I have two genome assemblies of the same species, created using different platforms. I am doing some post-assembly analysis to check the performance of the assemblers. For this, I have the following:

  1. Contig files (multi-fasta) for reference (910 MB) and query (510 MB)
  2. Blast hits (tabular format) for reference vs query.

I would like to draw a circular track for the reference genome and see how the hits from the query assembly are distributed along it. Please suggest possible ways to do this. I looked at the UCSC Genome Browser, but it only displays alignments for a predefined set of organisms, and my organism is not on their list.

Kindly help.

Thanks in Advance.

blast alignment assembly genome • 232 views
modified 6 weeks ago • written 6 weeks ago by arunprasanna8330

Circos might be an option, or D-genies.

written 6 weeks ago by lieven.sterck2.6k

D-genies crashes when I upload the data in .gz format.

written 6 weeks ago by arunprasanna8330

That's a pity.

I'm not on the development team myself ;-) but I know some of them, so I'll pass on the message. But don't let that hold you back from getting in touch with them (i.e. sending a bug report) yourself.

written 6 weeks ago by lieven.sterck2.6k

Thanks @lieven.sterck. I did send them an email 3 weeks ago but did not get any response, so I moved on to explore other options.

written 6 weeks ago by arunprasanna8330

There is a bigger problem here: most of these programs crash because of the large file sizes, i.e. File 1 has 44k contigs (510 MB) and File 2 has 5900 contigs (910 MB). To make it easier, I tried merging each file into a single long sequence, but the programs still crash! Are there any solutions for handling data this big?

written 6 weeks ago by arunprasanna8330

Are you sure the problem is not the hardware you have access to?

written 6 weeks ago by genomax57k

Sure! I am running it on my Linux workstation with 150 GB RAM and 40 cores.

written 6 weeks ago by arunprasanna8330
6 weeks ago by
Philipp Bayer5.7k
Australia/Perth/UWA
Philipp Bayer5.7k wrote:

Are dotplots not an option?

Minidot is very fast, but in my experience doesn't work that great with 'bad' assemblies: https://github.com/thackl/minidot

Symap is a bit older and slower, but works better with low quality assemblies (again, in my experience), and can also make a circular plot http://www.agcol.arizona.edu/software/symap/

Circos should be able to take both output files with some fiddling. You should be able to take minidot's or symap's alignment output files, write a tiny parser to turn them into the tabular format circos wants, and then run bundlelinks on that.
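
The "tiny parser" idea could look roughly like the sketch below. It is written for the BLAST tabular hits mentioned in the question (assuming the default 12-column outfmt 6 layout) rather than for minidot or symap output, whose formats differ. The function name and the length/identity cut-offs are made up for illustration, and the contig IDs are assumed to match the IDs used in your Circos karyotype file.

```python
import csv

def blast_to_circos_links(blast_lines, min_length=1000, min_identity=90.0):
    """Yield Circos link lines from BLAST tabular (outfmt 6) rows.

    Assumed columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    min_length and min_identity are illustrative thresholds only.
    """
    for row in csv.reader(blast_lines, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        qseqid, sseqid = row[0], row[1]
        pident, length = float(row[2]), int(row[3])
        qstart, qend, sstart, send = (int(x) for x in row[6:10])
        if length < min_length or pident < min_identity:
            continue
        # Reverse-strand hits have sstart > send; write start <= end.
        q_lo, q_hi = sorted((qstart, qend))
        s_lo, s_hi = sorted((sstart, send))
        # Link line: chrA startA endA chrB startB endB
        yield f"{qseqid} {q_lo} {q_hi} {sseqid} {s_lo} {s_hi}"

# Usage with hypothetical file names:
# with open("hits.tsv") as src, open("links.txt", "w") as dst:
#     for link in blast_to_circos_links(src):
#         dst.write(link + "\n")
```

Filtering out short or low-identity hits before writing the links file keeps it small, which probably matters with 44k contigs; the resulting file is what you would then pass to bundlelinks.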

written 6 weeks ago by Philipp Bayer5.7k

Thanks! Out of these, symap is doing a neat job!

written 6 weeks ago by arunprasanna8330

Symap worked for comparing the small vs. big assembly. But for the other case, the 'big vs. big' self-comparison (910 Mb, 5931 contigs), it has been running for over 5 days! Any tips?

written 5 weeks ago by arunprasanna8330

I usually remove small contigs (<10kb) because these will only add chaos to the graph anyway.
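
A minimal sketch of that filtering step in plain Python, assuming an ordinary multi-FASTA file. The function name and file names are hypothetical, and the 10,000 bp default simply mirrors the cut-off mentioned above. It streams one record at a time, so memory use stays low even for a 910 MB file.

```python
import io

def filter_fasta(in_handle, out_handle, min_length=10_000):
    """Copy only records whose sequence has at least min_length bases.

    Returns the number of records kept.
    """
    kept = 0
    header, chunks = None, []

    def flush():
        # Write the buffered record if it is long enough.
        nonlocal kept
        if header is not None and sum(len(c) for c in chunks) >= min_length:
            out_handle.write(header + "\n")
            out_handle.write("\n".join(chunks) + "\n")
            kept += 1

    for line in in_handle:
        line = line.rstrip()
        if line.startswith(">"):
            flush()                      # finish the previous record
            header, chunks = line, []
        elif line:
            chunks.append(line)
    flush()                              # finish the last record
    return kept

if __name__ == "__main__":
    # Tiny in-memory demo; for real data, open the files instead, e.g.
    # with open("contigs.fa") as src, open("contigs.min10kb.fa", "w") as dst:
    #     filter_fasta(src, dst)
    demo = io.StringIO(">a\nACGT\nACGT\n>b\nAC\n>c\nACGTACGT\n")
    result = io.StringIO()
    print(filter_fasta(demo, result, min_length=8), "records kept")
```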

written 5 weeks ago by Philipp Bayer5.7k
6 weeks ago by
harish130
harish130 wrote:

If you have mummer alignments, you can probably use Circlize or AliTV.

Alternatively, you can use Gepard, minidot, etc. to view dot-plot alignments.

written 6 weeks ago by harish130

I will try Circlize or AliTV, because I already tried both Gepard and minidot, and both failed due to the large file size!

written 6 weeks ago by arunprasanna8330

As I mentioned in the comments above, AliTV also works fine for smaller files, but for bigger files it takes forever. I guess there is no parallelization option.

written 5 weeks ago by arunprasanna8330