Hey, I'm working on my final project, which is about finding hidden repeats in DNA sequences. I have to read a FASTA file, keep only the sequence without the genome's name (the line that starts with '>'), and save it into a string.
I hope you guys can help me.
Thanks
Hi ozdavidd,
You could try this. The name is the single line that starts with '>'; every line after it is sequence:
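import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Assumes the enclosing class declares String fields named header and sequence.
private void readFastaFile(File fastaFile) {
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader buff = new BufferedReader(new FileReader(fastaFile))) {
        String line;
        while ((line = buff.readLine()) != null) {
            if (line.startsWith(">")) {
                // Header line: the name runs from '>' to the end of this line.
                // With several records in the file, the last header is kept.
                this.header = line;
            } else {
                // Every other line is sequence data
                sb.append(line.trim());
            }
        }
        this.sequence = sb.toString();
    } catch (IOException e) {
        e.printStackTrace();
    }
}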
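If you only need the sequence as a string and do not want to keep the header at all, a self-contained variant could look like the sketch below. The class name FastaSequenceReader and the method name readSequence are made up for illustration. It assumes that every line starting with '>' is a header and that all remaining non-empty lines are sequence; if the file holds several records, their sequences are concatenated into one string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of any library.
class FastaSequenceReader {

    // Concatenates all sequence lines; header lines ('>') and blank lines are skipped.
    static String readSequence(BufferedReader reader) throws IOException {
        StringBuilder sequence = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith(">")) {
                continue; // not sequence data
            }
            sequence.append(line);
        }
        return sequence.toString();
    }

    // Convenience overload that opens the file and closes it when done.
    static String readSequence(Path fastaPath) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(fastaPath, StandardCharsets.UTF_8)) {
            return readSequence(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory demo so the example runs without a file on disk.
        String fasta = ">seq1 example record\nACGT\nTTGA\n";
        System.out.println(readSequence(new BufferedReader(new StringReader(fasta)))); // prints ACGTTTGA
    }
}
```

For a real file you would call the Path overload with your own file location, for example readSequence(Path.of("genome.fasta")), where the file name is just a placeholder.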
                    
                
You may also have a look at SEDA (http://www.sing-group.org/seda/), which provides a Java API for easy manipulation of FASTA sequences (https://github.com/sing-group/seda).
What have you tried?
I know how to read a regular file, but I don't know what tells me where to start reading the nucleotides. The question is: where does the genome name end? So I can't really write anything yet.
Thank you very much.