Question: BioJava, installing forester correctly.
4.6 years ago by
Bioaln310
France
Bioaln310 wrote:

Hello everyone. I've recently started with BioJava and Maven, and I decided to try out the sequence alignment options. The official BioJava page says that an additional library, forester.jar, is needed. I've downloaded it from the online Maven repository, yet my program still doesn't work: it keeps throwing an ArrayIndexOutOfBoundsException, even though I copied the entire code from the official BioJava site. I am also not sure how to install forester.jar as an additional library.

package testproj;

import java.net.URL;

import org.biojava3.alignment.Alignments;
import org.biojava3.alignment.SimpleGapPenalty;
import org.biojava3.alignment.SimpleSubstitutionMatrix;

import org.biojava3.core.sequence.ProteinSequence;
import org.biojava3.core.sequence.compound.AminoAcidCompound;
import org.biojava3.core.sequence.io.FastaReaderHelper;

 
import org.biojava3.alignment.Alignments.PairwiseSequenceAlignerType;
import org.biojava3.alignment.template.SequencePair;
import org.biojava3.alignment.template.SubstitutionMatrix;

 
public class CookbookMSA {
 
    public static void main(String[] args) {
        String[] ids = new String[] {"Q21691", "Q21495","Q21693"};
        try {
            alignPairGlobal(ids[0], ids[1]);
        } catch (Exception e){
            e.printStackTrace();
        }
    }
 
    private static void alignPairGlobal(String id1, String id2) throws Exception {
        ProteinSequence s1 = getSequenceForId(id1), s2 = getSequenceForId(id2);
        SubstitutionMatrix<AminoAcidCompound> matrix = new SimpleSubstitutionMatrix<AminoAcidCompound>();
        SequencePair<ProteinSequence, AminoAcidCompound> pair = Alignments.getPairwiseAlignment(s1, s2,
                PairwiseSequenceAlignerType.GLOBAL, new SimpleGapPenalty(), matrix);
        System.out.printf("%n%s vs %s%n%s", pair.getQuery().getAccession(), pair.getTarget().getAccession(), pair);
    }
 
    private static ProteinSequence getSequenceForId(String uniProtId) throws Exception {
        URL uniprotFasta = new URL(String.format("http://www.uniprot.org/uniprot/%s.fasta", uniProtId));
        ProteinSequence seq = FastaReaderHelper.readFastaProteinSequence(uniprotFasta.openStream()).get(uniProtId);
        System.out.printf("id : %s %s%n%s%n", uniProtId, seq, seq.getOriginalHeader());
        return seq;
    }
 
}

Output:

id : Q21691 MDLLDKVMGEMGSKPGSTAKKPATSASSTPRTNVWGTAKKPSSQQQPPKPLFTTPGSQQGSLGGRIPKREHTDRTGPDPKRKPLGGLSVPDSFNNFGTFRVQMNAWNLDISKMDERISRIMFRATLVHTDGRRFELSLGVSAFSGDVNRQQRRQAQCLLFRAWFKRNPELFKGMTDPAIAAYDAAETIYVGCSFFDVELTEHVCHLTEADFSPQEWKIVSLISRRSGSTFEIRIKTNPPIYTRGPNALTLENRSELTRIIEAITDQCLHNEKFLLYSSGTFPTKGGDIASPDEVTLIKSGFVKTTKIVDRDGVPDAIMTVDTTKSPFYKDTSLLKFFTAKMDQLTNSGGGPRGHNGGRERRDGGGNSRKYDDRRSPRDGEIDYDERTVSHYQRQFQDERISDGMLNTLKQSLKGLDCQPIHLKDSKANRSIMIDEIHTGTADSVTFEQKLPDGEMKLTSITEYYLQRYNYRLKFPHLPLVTSKRAKCYDFYPMELMSILPGQRIKQSHMTVDIQSYMTGKMSSLPDQHIKQSKLVLTEYLKLGDQPANRQMDAFRVSLKSIQPIVTNAHWLSPPDMKFANNQLYSLNPTRGVRFQTNGKFVMPARVKSVTIINYDKEFNRNVDMFAEGLAKHCSEQGMKFDSRPNSWKKVNLGSSDRRGTKVEIEEAIRNGVTIVFGIIAEKRPDMHDILKYFEEKLGQQTIQISSETADKFMRDHGGKQTIDNVIRKLNPKCGGTNFLIDVPESVGHRVVCNNSAEMRAKLYAKTQFIGFEMSHTGARTRFDIQKVMFDGDPTVVGVAYSLKHSAQLGGFSYFQESRLHKLTNLQEKMQICLNAYEQSSSYLPETVVVYRVGSGEGDYPQIVNEVNEMKLAARKKKHGYNPKFLVICTQRNSHIRVFPEHINERGKSMEQNVKSGTCVDVPGASHGYEEFILCCQTPLIGTVKPTKYTIIVNDCRWSKNEIMNVTYHLAFAHQVSYAPPAIPNVSYAAQNLAKRGHNNYKTHTKLVDMNDYSYRIKEKHEEIISSEEVDDILMRDFIETVSNDLNAMTINGRNFWA
sp|Q21691|NRDE3_CAEEL Nuclear RNAi defective-3 protein OS=Caenorhabditis elegans GN=nrde-3 PE=1 SV=1
java.lang.ArrayIndexOutOfBoundsException: 0
    at org.biojava3.core.sequence.io.GenericFastaHeaderParser.parseHeader(GenericFastaHeaderParser.java:113)
    at org.biojava3.core.sequence.io.GenericFastaHeaderParser.parseHeader(GenericFastaHeaderParser.java:60)
    at org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:182)
    at org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:108)
    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaProteinSequence(FastaReaderHelper.java:100)
    at testproj.CookbookMSA.getSequenceForId(CookbookMSA.java:42)
    at testproj.CookbookMSA.alignPairGlobal(CookbookMSA.java:33)
    at testproj.CookbookMSA.main(CookbookMSA.java:26)

 

How do I fix this? Sorry if it's a stupid question. Thanks for any help.

 

modified 4.6 years ago by Pierre Lindenbaum (120k) • written 4.6 years ago by Bioaln (310)
4.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum120k wrote:

Not a Java error.

Your second ID is Q21495. Look at this: http://www.uniprot.org/uniprot/Q21495

 

written 4.6 years ago by Pierre Lindenbaum (120k)
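One way to catch this kind of problem before the parser does is to check that the downloaded text actually looks like a FASTA record. The sketch below is plain Java and does not use BioJava; `FastaGuard` and `firstHeader` are hypothetical names, and it assumes the failure comes from a response that is empty or is not FASTA at all.

```java
import java.util.Optional;

/** Hypothetical helper (not part of BioJava): sanity-checks a downloaded FASTA body. */
class FastaGuard {

    /**
     * Returns the first header line (without the leading '>') if the text
     * has a header followed by at least one more line; otherwise empty.
     */
    static Optional<String> firstHeader(String body) {
        if (body == null) {
            return Optional.empty();
        }
        String[] lines = body.trim().split("\\R");
        // A FASTA record starts with '>' plus some text, then sequence data.
        if (lines.length < 2 || !lines[0].startsWith(">") || lines[0].length() < 2) {
            return Optional.empty();
        }
        return Optional.of(lines[0].substring(1).trim());
    }

    public static void main(String[] args) {
        String good = ">sp|Q21691|NRDE3_CAEEL Nuclear RNAi defective-3 protein\nMDLLDKVMGE\n";
        String empty = "";
        System.out.println(firstHeader(good).isPresent());
        System.out.println(firstHeader(empty).isPresent());
    }
}
```

If `firstHeader` comes back empty for an ID, the ID itself (or the URL) is the thing to inspect, not the alignment code.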

You are right. When I use only ids = new String[] {"Q21691","Q21693"}; both sequences are read, but there is a whole new list of errors:

Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.biojava3.alignment.SimpleAlignedSequence.setLocation(SimpleAlignedSequence.java:358)
    at org.biojava3.alignment.SimpleAlignedSequence.<init>(SimpleAlignedSequence.java:88)
    at org.biojava3.alignment.SimpleProfile.<init>(SimpleProfile.java:119)
    at org.biojava3.alignment.SimpleSequencePair.<init>(SimpleSequencePair.java:86)
    at org.biojava3.alignment.SimpleSequencePair.<init>(SimpleSequencePair.java:69)
    at org.biojava3.alignment.routines.AnchoredPairwiseSequenceAligner.setProfile(AnchoredPairwiseSequenceAligner.java:137)
    at org.biojava3.alignment.template.AbstractMatrixAligner.align(AbstractMatrixAligner.java:344)
    at org.biojava3.alignment.template.AbstractPairwiseSequenceAligner.getPair(AbstractPairwiseSequenceAligner.java:112)
    at org.biojava3.alignment.Alignments.getPairwiseAlignment(Alignments.java:208)
    at testproj.CookbookMSA.alignPairGlobal(CookbookMSA.java:36)
    at testproj.CookbookMSA.main(CookbookMSA.java:27)
Caused by: java.lang.NullPointerException
    at java.util.Collections$UnmodifiableCollection.<init>(Collections.java:1026)
    at java.util.Collections$UnmodifiableList.<init>(Collections.java:1302)
    at java.util.Collections.unmodifiableList(Collections.java:1287)
    at org.biojava3.core.sequence.location.template.AbstractLocation.<init>(AbstractLocation.java:111)
    at org.biojava3.core.sequence.location.template.AbstractLocation.<init>(AbstractLocation.java:85)
    at org.biojava3.core.sequence.location.SimpleLocation.<init>(SimpleLocation.java:57)
    at org.biojava3.core.sequence.location.SimpleLocation.<init>(SimpleLocation.java:53)
    at org.biojava3.core.sequence.location.template.Location.<clinit>(Location.java:48)
    ... 11 more

I am even more lost now :(

written 4.6 years ago by Bioaln (310)
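A note on reading the trace above: the "Caused by" frames end at Location.<clinit>, which is the JVM's name for a class's static initializer. That suggests the NullPointerException is raised while the Location class itself is being initialized, and the JVM then wraps it in ExceptionInInitializerError. The following self-contained sketch reproduces the same shape of failure in plain Java; `StaticInitDemo`, `Broken` and `missingData` are hypothetical names and nothing here is BioJava code.

```java
import java.util.Collections;
import java.util.List;

class StaticInitDemo {

    /** Stand-in for a class whose static field cannot be built (hypothetical). */
    static class Broken {
        // Stands in for data that was never set up.
        static List<String> missingData() { return null; }

        // Collections.unmodifiableList(null) throws NullPointerException,
        // so initialization of Broken fails on first use.
        static final List<String> ITEMS = Collections.unmodifiableList(missingData());

        static int size() { return ITEMS.size(); }
    }

    private static String cached;

    /**
     * Triggers initialization of Broken and reports the wrapped cause.
     * A class is initialized at most once, and later uses of a failed class
     * throw NoClassDefFoundError instead, so the first result is cached.
     */
    static synchronized String describeFailure() {
        if (cached == null) {
            try {
                Broken.size();
                cached = "no failure";
            } catch (ExceptionInInitializerError e) {
                cached = e.getClass().getSimpleName()
                        + " caused by " + e.getCause().getClass().getSimpleName();
            }
        }
        return cached;
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());
    }
}
```

In a trace of this shape, the frames under "Caused by" show where the null was used, so that is the place to look first.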

One of your variables is null somewhere. Don't use objects before checking that they are not null, close the streams you open, write diagnostics to System.err, etc.

    // Needs java.io.InputStream, java.net.URL, java.net.URLEncoder and java.util.LinkedHashMap.
    private static ProteinSequence getSequenceForId(String uniProtId) throws Exception {
        URL uniprotFasta = new URL("http://www.uniprot.org/uniprot/" + URLEncoder.encode(uniProtId, "UTF-8") + ".fasta");
        LinkedHashMap<String, ProteinSequence> seqs;
        // try-with-resources closes the stream even if parsing throws
        try (InputStream in = uniprotFasta.openStream()) {
            seqs = FastaReaderHelper.readFastaProteinSequence(in);
        }
        ProteinSequence seq = seqs.get(uniProtId);
        // Fail with a clear message instead of a later NullPointerException
        if (seq == null) throw new RuntimeException("not found: " + uniProtId);
        System.err.printf("id : %s %s%n%s%n", uniProtId, seq, seq.getOriginalHeader());
        return seq;
    }

 

modified 4.6 years ago • written 4.6 years ago by Pierre Lindenbaum (120k)

Alright, I tried this and the error stays the same. One last thing, if you would: could you provide the simplest working example of pairwise protein sequence alignment? You seem like you've done this a thousand times. It would really help me with debugging. Thank you very much.

written 4.6 years ago by Bioaln (310)
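For debugging, it can help to have a global pairwise alignment that runs with no libraries at all, so results can be compared against whatever BioJava produces. The sketch below is a bare-bones Needleman-Wunsch in plain Java. It is not BioJava's implementation: the class name `TinyGlobalAligner` is hypothetical, and the scores (match +1, mismatch -1, linear gap -1) are toy values assumed for illustration instead of a real substitution matrix.

```java
/** Minimal global alignment (Needleman-Wunsch) sketch; not BioJava's implementation. */
class TinyGlobalAligner {

    // Toy scores, assumed for illustration only.
    static final int MATCH = 1, MISMATCH = -1, GAP = -1;

    /** Aligned query, aligned target and the alignment score. */
    static final class Result {
        final String query, target;
        final int score;
        Result(String query, String target, int score) {
            this.query = query; this.target = target; this.score = score;
        }
    }

    static int sub(char x, char y) {
        return x == y ? MATCH : MISMATCH;
    }

    static Result align(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] s = new int[n + 1][m + 1];
        // First row and column: a prefix aligned entirely against gaps.
        for (int i = 1; i <= n; i++) s[i][0] = i * GAP;
        for (int j = 1; j <= m; j++) s[0][j] = j * GAP;
        // Fill: best of substitution, gap in target, gap in query.
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = s[i - 1][j - 1] + sub(a.charAt(i - 1), b.charAt(j - 1));
                int up = s[i - 1][j] + GAP;
                int left = s[i][j - 1] + GAP;
                s[i][j] = Math.max(diag, Math.max(up, left));
            }
        }
        // Traceback from the bottom-right corner, preferring the diagonal.
        StringBuilder qa = new StringBuilder(), ta = new StringBuilder();
        int i = n, j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0
                    && s[i][j] == s[i - 1][j - 1] + sub(a.charAt(i - 1), b.charAt(j - 1))) {
                qa.append(a.charAt(--i));
                ta.append(b.charAt(--j));
            } else if (i > 0 && s[i][j] == s[i - 1][j] + GAP) {
                qa.append(a.charAt(--i));
                ta.append('-');
            } else {
                qa.append('-');
                ta.append(b.charAt(--j));
            }
        }
        return new Result(qa.reverse().toString(), ta.reverse().toString(), s[n][m]);
    }

    public static void main(String[] args) {
        Result r = align("ACGT", "AGT");
        System.out.println(r.query);
        System.out.println(r.target);
        System.out.println("score = " + r.score);
    }
}
```

With these toy scores, aligning ACGT against AGT gives ACGT over A-GT with a score of 2. Protein alignments in practice use a substitution matrix and separate gap-open and gap-extend penalties, so scores from this sketch are not comparable to BioJava's.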