Question: How Do I Convert 454 Ace To A Regular Ace?
Asked 8.3 years ago by Lee Katz (Atlanta, GA):

I have read on the BioPerl site that a 454 ACE file does not follow the standard ACE format because of its coordinate system. How can I convert it to a standard ACE file?

When I run the code below, iterating over either contig or assembly objects, I get the error shown after it.

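use Bio::Assembly::IO;
use Data::Dumper;

sub _newblerAceToAce {
  my($self,$args)=@_;
  # Read the Newbler ACE with the '454' variant so read positions are positive
  my $ace454=Bio::Assembly::IO->new(-file=>$$args{ace454Path},-format=>"ace",-variant=>'454');
  # Output stream for the standard ACE file (opened, but nothing is written to it yet)
  my $ace=Bio::Assembly::IO->new(-file=>">$$args{acePath}",-format=>"ace");
  #while(my $contig=$ace454->next_contig){
  while(my $scaffold=$ace454->next_assembly){
    print Dumper $scaffold;
  }
  return $$args{acePath};
}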

Can't call method "get_consensus_sequence" on an undefined value at Bio/Assembly/IO/ace.pm line 280, <GEN0> line 93349.

Further details:

From the BioPerl site: "The ACE files produced by the 454 GS Assembler (Newbler) do not conform to the reference ACE format. In 454 ACE, the consensus sequence reported covers only its clear range and the start of the clear range consensus is defined as position 1. Consequently, aligned reads in the contig can have negative positions. Be sure to use the '454' variant to have positive alignment positions. No attempt is made to construct the missing part of the consensus sequence (beyond the clear range) based on the underlying reads in the contig. Instead the ends of the consensus are simply padded with the gap character '-'."
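To make the coordinate issue concrete, here is a minimal sketch in plain Perl (no BioPerl calls) of the kind of shift the quoted passage describes. The sub name `shift_454_coords` and the data layout are hypothetical, made up for illustration; this is not how Bio::Assembly::IO implements the '454' variant, and it only handles padding at the left end of the consensus.

```perl
use strict;
use warnings;

# Hypothetical helper: shift 454-style read starts so that all are >= 1,
# and left-pad the consensus with '-' by the same amount.
#   $consensus : clear-range consensus (position 1 = its first base)
#   $starts    : hashref of read name => start relative to the consensus
sub shift_454_coords {
    my ($consensus, $starts) = @_;
    # Find the smallest start; anything below 1 lies left of the consensus.
    my $min = 1;
    for my $s (values %$starts) { $min = $s if $s < $min; }
    my $offset  = 1 - $min;    # 0 when no read starts before position 1
    my %shifted = map { $_ => $starts->{$_} + $offset } keys %$starts;
    return (('-' x $offset) . $consensus, \%shifted, $offset);
}

# Made-up example data: readA starts 4 bases before the consensus.
my ($cons, $pos, $offset) =
    shift_454_coords('ACGTACGT', { readA => -3, readB => 1, readC => 5 });
print "offset=$offset consensus=$cons\n";
print "$_ => $pos->{$_}\n" for sort keys %$pos;
```

With these made-up reads the leftmost read moves to position 1 and every other coordinate moves by the same offset, so relative alignments are unchanged.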

Answered 8.2 years ago by Lee Katz (Atlanta, GA):

I sent this to the BioPerl mailing list on November 22 and have not received a response yet. The problem now is that Bio::Assembly::IO::ace::next_contig() is very slow: it will probably take about two days. I have not gotten far enough to figure out why.

Bio/Assembly/IO/ace.pm: I changed the regular expression on line 231 because the contig object was not being initialized properly. For some reason the 454 ACE file had adopted the reference assembly's ID, so the contig name contained a GI number followed by a pipe. The pipe was not captured by \w+. I think the regex will be safe with \s(\S+)\s.

 if (/^CO\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms) # New contig starts!
#if (/^CO\s(\w+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms) # New contig starts!
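The difference between the two patterns can be checked in isolation. The sketch below uses a hypothetical CO line with a GI-style contig name (the ID is made up); the two patterns are copied from the lines above.

```perl
use strict;
use warnings;

# Hypothetical CO line whose contig name contains pipes (GI-style ID).
my $line = "CO gi|12345|ref|NC_000000.1| 1200 35 10 U";

# Original pattern: \w+ cannot cross the '|' characters in the name.
my $old = qr/^CO\s(\w+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms;
# Modified pattern: \S+ accepts any run of non-whitespace characters.
my $new = qr/^CO\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms;

my $old_matches = ($line =~ $old) ? 1 : 0;
my ($name, $bases, $reads, $segments, $orient) = $line =~ $new;

print "old pattern matches: $old_matches\n";
print "new pattern captured name=$name reads=$reads\n";
```

A plain name such as contig00001 matches both patterns, so the change should only affect names containing characters outside \w.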

Why not split the ACE file on every CO? That should be a quick operation, and if conversion is slow, at least you should be able to convert each contig in parallel.
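A rough sketch of that splitting idea in plain Perl follows. The sub name `split_ace_by_contig` is hypothetical, and the sketch assumes every record belonging to a contig sits between its CO line and the next one; any trailing file-level tags (such as WA or CT blocks) would end up attached to the last chunk. Each chunk is given its own AS header, with the read count taken from the CO line.

```perl
use strict;
use warnings;

# Hypothetical helper: split ACE text into one chunk per contig.
# Each chunk gets an "AS 1 <reads>" header; the read count is the third
# field after the tag on the CO line (CO name bases reads segments U|C).
sub split_ace_by_contig {
    my ($ace_text) = @_;
    my @chunks;
    # The zero-width lookahead keeps each CO line at the start of its chunk.
    for my $block (split /^(?=CO\s)/m, $ace_text) {
        # The leading AS header block has no CO line and is skipped here.
        next unless $block =~ /^CO\s\S+\s\d+\s(\d+)/;
        push @chunks, "AS 1 $1\n\n$block";
    }
    return @chunks;
}

# Tiny made-up input with two contigs (not a complete ACE file).
my $ace   = "AS 2 3\n\nCO c1 4 2 1 U\nACGT\n\nCO c2 4 1 1 U\nTTTT\n";
my @parts = split_ace_by_contig($ace);
print scalar(@parts), " contigs\n";
```

Each chunk could then be written to its own file and converted in a separate process.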

Reply written 7.9 years ago by Ketil

It looks like the author of the module has fixed all of this. I am looking forward to the next version of BioPerl (or I guess someone could just get the newest code from source control).

Reply written 7.9 years ago by Lee Katz