Question

Read a fasta file | Java

0

7.6 years ago

ozdavidd • 0

Hey, I'm working on my final project, which is about finding hidden repeats in a DNA sequence. I have to read a FASTA file and get only the sequence, without the genome's name (the line that starts with '>'), and save it into a string.

I hope you guys can help me.

Thanks

fasta java • 6.4k views

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 7.6 years ago by ozdavidd • 0

0

What have you tried?

ADD REPLY • link 7.6 years ago by Pierre Lindenbaum 166k

0

I know how to read a regular file, but I don't know what should tell me when to start reading the nucleotides. The question is: where does the genome name end? So I can't really write anything yet.

ADD REPLY • link 7.6 years ago by ozdavidd • 0

0

Thank you very much.

ADD REPLY • link 7.6 years ago by ozdavidd • 0

0

7.6 years ago

vmicrobio ▴ 290

Hi ozdavidd,

You may try this:

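import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Assumes the enclosing class declares String fields 'header' and 'sequence',
// and that the file holds a single record whose first line is the '>' header.
private void readFastaFile(File fastaFile) {
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(new FileReader(fastaFile))) {
        String line;
        int lineNb = 0;
        while ((line = buff.readLine()) != null) {
            if (lineNb == 0) {
                // first line: the header (sequence name), starting with '>'
                this.header = line;
            } else {
                // every following line: nucleotides, joined without line breaks
                sb.append(line.trim());
            }
            lineNb++;
        }
        this.sequence = sb.toString();
    } catch (IOException e) {
        e.printStackTrace();
    }
}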
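
The method above treats the first line as the header, which works for a file with one record. If the file may contain several records, a variant that checks each line for a leading '>' could look roughly like the sketch below. The class name `FastaRecords` and its `parse` method are made up for illustration, and the sketch assumes the lines have already been read into a list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
final class FastaRecords {
    // Maps each header (without the leading '>') to its concatenated sequence.
    // Sequence lines that appear before any header are ignored.
    static Map<String, String> parse(List<String> lines) {
        Map<String, StringBuilder> builders = new LinkedHashMap<>();
        StringBuilder current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                // A '>' line starts a new record; what follows is its sequence.
                current = new StringBuilder();
                builders.put(line.substring(1), current);
            } else if (current != null) {
                current.append(line);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : builders.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }
}
```

The lines can come from `java.nio.file.Files.readAllLines(...)` on your FASTA file; for very large genomes, reading line by line with a `BufferedReader` as in the method above is probably the better choice.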

ADD COMMENT • link 7.6 years ago by vmicrobio ▴ 290

0

Thanks for the comment. What should I put in

this.header = line;

ADD REPLY • link 7.6 years ago by ozdavidd • 0

0

You can create a class FastaSequence containing the code above, add 'getHeader' and 'getSequence' methods to it, and then use only the sequence returned by 'getSequence'.
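
A minimal sketch of such a class, to show where `this.header` and `this.sequence` live. It assumes the file's lines have already been read into a list and that the first line is the header; the constructor signature is only an illustration, not a fixed design.

```java
import java.util.List;

// Hypothetical sketch of the FastaSequence class suggested above.
final class FastaSequence {
    private final String header;
    private final String sequence;

    // Assumes a single record: line 0 is the '>' header, the rest is sequence.
    FastaSequence(List<String> lines) {
        String h = "";
        StringBuilder sb = new StringBuilder();
        for (int lineNb = 0; lineNb < lines.size(); lineNb++) {
            String line = lines.get(lineNb).trim();
            if (lineNb == 0) {
                h = line;        // header line, e.g. ">chr1"
            } else {
                sb.append(line); // nucleotide lines, joined into one string
            }
        }
        this.header = h;
        this.sequence = sb.toString();
    }

    String getHeader() { return header; }

    String getSequence() { return sequence; }
}
```

With this in place, `new FastaSequence(lines).getSequence()` gives the nucleotides as a single string, and the header stays available through `getHeader()` if it is needed later.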

ADD REPLY • link 7.6 years ago by vmicrobio ▴ 290

0

What in this code indicates the start of the nucleotides?

ADD REPLY • link 7.6 years ago by ozdavidd • 0

0

7.6 years ago

Hugo ▴ 400

You may have a look at SEDA (http://www.sing-group.org/seda/), which also provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).

ADD COMMENT • link 7.6 years ago by Hugo ▴ 400

score 2 · Accepted Answer · 2018-04-13

2

7.6 years ago

Pierre Lindenbaum 166k

a solution:

ADD COMMENT • link 7.6 years ago by Pierre Lindenbaum 166k