No Sequencing Data at Low Positions
4.7 years ago
ccnn ▴ 20

Opening a bam file for just chr22, I was surprised to see that there were no reads aligned until around position 16,050,000. In UCSC Genome Browser, looking at a window of chr22:16,049,420-16,050,420, I can see that there's "nothing," but then different tracks start. In chr1, I also think I remember seeing that alignments started only at 10,000.

Why is there nothing earlier in the chromosome? Do those positions not correspond to DNA? I've downloaded data on the length/end position of each chromosome; can I find a list of these "start" positions?

dna sequencing ngs next-gen • 894 views

Are you looking at the right genome build in UCSC? Also, are you sure the data is aligned against a UCSC genome build (which uses a chr prefix for chromosome names, as opposed to other builds, which may use only numbers)?

The beginning of a chromosome sequence may contain only N's, since the ends of chromosomes are hard to sequence.


Ah, it does indeed look like everything before that is "N" when I zoom in to "base" in the browser.


So those N nucleotides mean "we know there are nucleotides there, we are just not sure what they are."


Got it. Thank you! So is there somewhere I can find out how many of the first bases of each chromosome are N?


Hello,

you could use your language of choice to find the first position in each reference sequence which is not an N.
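
For example, here is a minimal Python sketch of that idea. It assumes the reference is an uncompressed FASTA file; the function name `first_non_n_positions` and the tiny in-memory demo sequences are made up for illustration and are not taken from any real genome build. It reports the 1-based position of the first base that is not N (or n) in each sequence.

```python
import io


def first_non_n_positions(fasta_handle):
    """Yield (sequence_name, 1-based position of the first non-N base).

    The position is None if a sequence consists only of N's (or is empty).
    """
    name = None
    offset = 0      # number of leading N bases seen so far in this sequence
    found = False   # True once the first non-N base has been reported
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts; report the previous one if it was all N's.
            if name is not None and not found:
                yield name, None
            fields = line[1:].split()
            name = fields[0] if fields else ""
            offset, found = 0, False
        elif name is not None and not found:
            stripped = line.lstrip("Nn")
            if stripped:
                yield name, offset + (len(line) - len(stripped)) + 1
                found = True
            else:
                offset += len(line)
    if name is not None and not found:
        yield name, None


# Hypothetical toy input, only to show the behaviour.
demo = io.StringIO(">chrA\nNNNN\nNNAC\nGT\n>chrB\nNNNN\n>chrC\nACGT\n")
result = dict(first_non_n_positions(demo))
print(result)
```

For a real reference you would pass an open file handle instead of the `io.StringIO` object, e.g. `with open("genome.fa") as fh:` (the file name here is just a placeholder). The file is read line by line, so whole chromosomes never have to be held in memory.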

fin swimmer
