How can I get a rdotplot file in lastz?
1
0
Entering edit mode
4.1 years ago
cristian ▴ 290

Hi,

I have the sequence for the OP50 strain of E. coli and I would like to compare it to the reference E. coli sequence. Does anybody know how I can do this? I am trying lastz:

lastz chr1.fa ecoli.fa --notransition --step=20 --nogapped --format=maf --rdotplot=rplot.txt > AvsB.maf

However, I do not know how to read the .maf file and the rplot.txt file is empty.

Your help would be appreciated.

Thanks.

Best, C.

lastz r rdotplot dotplot genomic alignment • 2.3k views
ADD COMMENT
1
Entering edit mode
4.1 years ago
h.mon 33k

First of all, I don't think this dotplot is useful, as the OP50 draft is fragmented and contains 2939 short contigs.

I have no idea of why you aren't getting the rdotplot file. I just replicated your command with the OP50 draft and the K-12 substr. MG1655, and got an alignment and rdotplot file. I had first to replace the first | by _, otherwise the contig names would be mangled by lastz (even then I would get a rdotplot file):

sed -i.bak "s/\|/_/" op50.fasta

lastz k12.fasta op50.fasta --notransition --step=20 --nogapped \
  --format=maf --rdotplot=rplot.txt > AvsB_a.maf

head rplot.txt outputs:

U00096.3    gi_269207760
1609417 1
1611892 2476
NA  NA
U00096.3    gi_269207759
344760  1
345348  589
NA  NA
U00096.3    gi_269207758
ADD COMMENT
0
Entering edit mode

Hi h.mon,

To clarify, I am aligning a contig that I think might be OP50 but I am not sure. My fasta file anyway only contains one sequence.

grep '>' output/genome/ecoli/ref/seq/chr1.fa
>Chromosome

grep '>' output/genome/ecoli/op50/assembly/hgap/bristol/ecoli.fa 
>000001F|arrow
ADD REPLY
0
Entering edit mode

I have reprocued what you did:

head output/rplot.txt 
U00096.3    gi_269207760
1609417 1
1611892 2476
NA  NA
344760  1
345348  589
NA  NA
270620  1
271639  1020
NA  NA

Do you think it does not work when there is only one contig in each file? or maybe they are not similar enough?

C.

ADD REPLY
0
Entering edit mode

Do you think it does not work when there is only one contig in each file?

lastz works fine with single or multiple sequences per file, this is not the issue. The dot plot, on the other hand, is intended for alignments between single sequences. Maybe your op50 contig is corrupted or has something that bothers last?

The maf output is a simple text file, you can examine it with less, head, tail to see if it looks correct.

You may also try Mauve to align both genomes, it has a GUI and is quite easy to use.

ADD REPLY

Login before adding your answer.

Traffic: 1013 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6