10.6 years ago
bhonsai
Hi,
I've got a small problem. I am trying to retrieve many sequence fragments from chr21. The positions come from an ENCODE table, and I want to look them up quickly in my chromosome file using Java. I do get sequences, but not the same ones I get when I use the Genome Browser. Using BLAT I found out that, for example, my position 15,000,000 is in reality 15,049,142, but I can't explain the difference. Any help would be appreciated.
StringBuffer finalSeq = new StringBuffer();
try {
    RandomAccessFile chr21 = new RandomAccessFile("/scratch/fbh/Homo_sapiens.GRCh37.72.dna.chromosome.21.fa", "r");
    /* Skip the header line so that the pointer coordinates can be calculated
     * from the start and end positions of the sequence of interest. */
    chr21.readLine();
    // The offsets below assume 50 bases per line plus one newline byte (51 bytes per line).
    long begin = (51 * (startpos / 50) + startpos % 50 + chr21.getFilePointer());
    long end = (51 * (endpos / 50) + endpos % 50 + chr21.getFilePointer());
    chr21.seek(begin); // set the pointer to the beginning of the sequence
    char charAtPos;
    for (int i = 0; i < (end - begin) + 1; i++) {
        if ((charAtPos = (char) chr21.read()) != '\n') { // collect the sequence without newlines
            finalSeq.append(charAtPos);
        }
    }
    finalSeq.trimToSize();
    chr21.close();
} catch (IOException e2) {} // I/O errors are silently ignored here
I really did check the versions, but every version gave different results, so I was very confused. I have now checked the number of bytes per line as well as the versions, and it works. I wish I had asked earlier. It was a stupid mistake, but I'm very grateful for your help. Besides, you even gave me a more effective way to save my result. Thanks a lot.
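For anyone hitting the same problem: the hard-coded 50 and 51 are only right if the file really wraps at 50 bases per line with a one-byte newline. Below is a minimal sketch of one way to avoid that assumption by measuring the line layout from the first sequence line. The names FastaSlice, byteOffset and fetch are made up for this illustration, and the sketch assumes a single-record FASTA file, equal line widths throughout, and 0-based inclusive positions (the Genome Browser displays 1-based positions, so subtract 1 first if that is where the coordinates come from).

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class FastaSlice {

    /** Byte offset of the 0-based base index pos for a given header size and line layout. */
    static long byteOffset(long pos, long headerBytes, int basesPerLine, int bytesPerLine) {
        return headerBytes + (pos / basesPerLine) * bytesPerLine + (pos % basesPerLine);
    }

    /** Returns bases start..end (0-based, inclusive) from a single-record FASTA file. */
    static String fetch(String path, long start, long end) throws IOException {
        try (RandomAccessFile fa = new RandomAccessFile(path, "r")) {
            fa.readLine();                              // skip the ">" header line
            long headerBytes = fa.getFilePointer();

            String firstLine = fa.readLine();           // first sequence line, terminator stripped
            if (firstLine == null || firstLine.isEmpty()) {
                throw new IOException("no sequence data after the header");
            }
            int basesPerLine = firstLine.length();
            // Distance travelled by the pointer includes the line terminator ("\n" or "\r\n").
            int bytesPerLine = (int) (fa.getFilePointer() - headerBytes);

            long from = byteOffset(start, headerBytes, basesPerLine, bytesPerLine);
            long to = byteOffset(end, headerBytes, basesPerLine, bytesPerLine);

            byte[] buf = new byte[(int) (to - from + 1)];
            fa.seek(from);
            fa.readFully(buf);                          // throws EOFException if the range runs past the file

            StringBuilder seq = new StringBuilder(buf.length);
            for (byte b : buf) {
                if (b != '\n' && b != '\r') {           // drop line terminators
                    seq.append((char) b);
                }
            }
            return seq.toString();
        }
    }
}
```

A call such as FastaSlice.fetch(pathToFasta, startpos, endpos) would then replace the manual seek and read loop. Reading the whole byte range in one call and filtering afterwards also avoids one read() per character, which matters when many regions are fetched.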