Refseq Number/Name In Bam File
10.3 years ago

I am using bamread in MATLAB to read a BAM file that I downloaded from 1000G. The website says all the reads are mapped to a reference sequence called "GRCh37", which makes sense. But when I try to read the file in MATLAB, the syntax is

bamread(File,RefSeq,Range)

The problem is that the RefSeq argument of bamread doesn't recognize "GRCh37" at all. I then tried passing a value such as 1 or 2 as RefSeq, and that works, but I think I'm missing something. How do I make sure that the reads are actually mapped to the GRCh37 reference when I don't see that name in the BAM file?

Thanks in advance!


The point of confusion seems to be that GRCh37 is not a single sequence but an assembly made up of multiple chromosomes, which are usually named in BAM files as 1, 2, ..., 22, X, Y. That is why passing 1 or 2 as RefSeq works while "GRCh37" does not.

10.3 years ago

Print the header of your BAM with samtools:

samtools view -H file.bam | grep '^@SQ'

The chromosomes contained in your reference will be listed there, one @SQ line per sequence.
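If the raw @SQ lines are hard to read, a small helper can reduce them to the fields that matter here. The sketch below is only an illustration: list_refs is a hypothetical function name, and it assumes the header follows the SAM convention of tab-separated TAG:value fields, where SN is the sequence name, LN its length, and AS an optional assembly identifier. The AS tag is not guaranteed to be present, so a "." in the last column only means the header does not say which assembly was used.

```shell
# Hypothetical helper: read a SAM/BAM header on stdin and print
# name, length, and assembly tag (or "." if absent) for each @SQ line.
list_refs() {
  awk -F'\t' '
    $1 == "@SQ" {
      name = "."; len = "."; asm = "."
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/)      name = substr($i, 4)   # sequence name, e.g. 1, 2, X
        else if ($i ~ /^LN:/) len  = substr($i, 4)   # sequence length in bases
        else if ($i ~ /^AS:/) asm  = substr($i, 4)   # assembly identifier (optional tag)
      }
      print name "\t" len "\t" asm
    }'
}

# Usage (requires samtools; file.bam is the file from the question):
#   samtools view -H file.bam | list_refs
```

The names in the first column are the values that can be passed as RefSeq. If the assembly column is empty, comparing the listed lengths against the published GRCh37 chromosome lengths is another way to check which reference was used.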
