How to map NCBI contig positions to chromosome positions
5.3 years ago
Jackie ▴ 70

I need to make a file containing the coordinates of all ribosomal RNAs in the human GRCh37.p13 (NCBI) annotation. I downloaded the NCBI 'ref_GRCh37.p13_scaffolds.gff3.gz' file from ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/ANNOTATION_RELEASE.105/GFF/

This gff3 file does include the coordinate information for all rRNAs. However, the coordinates are given relative to contigs. Here are a few example lines:

NT_077402.2     RefSeq  region  1       257719  .       +       .       ID=id0;Name=1;Dbxref=taxon:9606;chromosome=1;gbkey=Src;genome=genomic;mol_type=genomic DNA
NT_077402.2     BestRefSeq      gene    1874    4409    .       +       .       ID=gene0;Name=DDX11L1;Dbxref=GeneID:100287102,HGNC:37102;description=DEAD%2FH %28Asp-Glu-Ala-Asp%2FHis%29 box helicase 11 like 1;gbkey=Gene;gene=DDX11L1;part=1%2F1;pseudo=true
NT_077402.2     BestRefSeq      transcript      1874    4409    .       +       .       ID=rna0;Name=NR_046018.2;Parent=gene0;Dbxref=GeneID:100287102,Genbank:NR_046018.2,HGNC:37102;gbkey=misc_RNA;gene=DDX11L1;product=DEAD%2FH %28Asp-Glu-Ala-Asp%2FHis%29 box helicase 11 like 1;transcript_id=NR_046018.2


Instead of 'chr1', 'chr2', etc. in the 1st column, the file uses contig accessions, and I assume the start/end positions in the 4th/5th columns are also relative to the contigs rather than absolute positions on a chromosome. Can someone advise how to convert these contig positions to chromosome positions? Or is ref_GRCh37.p13_scaffolds.gff3 the wrong file to use, and is there a similar file with chromosome positions available on the NCBI FTP site?
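
To make the question concrete, here is a minimal sketch of the conversion I have in mind. It assumes I can get, for each contig, its chromosome, the chromosome position of its first base, its length, and its orientation. The PLACEMENTS table and the contig_to_chrom function are names I made up for illustration, and the offset 10001 is a placeholder I have not verified against NCBI; only the contig length 257719 comes from the region line above.

```python
# Hypothetical placement table (assumed values, NOT taken from an NCBI file):
# contig accession -> (chromosome name,
#                      1-based chromosome position of contig base 1,
#                      contig length,
#                      orientation of the contig on the chromosome)
PLACEMENTS = {
    "NT_077402.2": ("chr1", 10001, 257719, "+"),
}


def contig_to_chrom(gff_line, placements=PLACEMENTS):
    """Rewrite columns 1, 4, 5 (and 7 if needed) of one GFF3 line.

    Comment lines, malformed lines, and lines on contigs missing from
    the placement table are returned unchanged (minus the newline).
    """
    line = gff_line.rstrip("\n")
    if line.startswith("#"):
        return line
    fields = line.split("\t")
    if len(fields) != 9 or fields[0] not in placements:
        return line

    chrom, offset, length, orient = placements[fields[0]]
    start, end = int(fields[3]), int(fields[4])

    if orient == "+":
        # Contig base 1 sits at chromosome position `offset`.
        new_start = offset + start - 1
        new_end = offset + end - 1
    else:
        # Reversed contig: contig base `length` sits at `offset`,
        # so start/end swap and the feature strand flips.
        new_start = offset + length - end
        new_end = offset + length - start
        fields[6] = {"+": "-", "-": "+"}.get(fields[6], fields[6])

    fields[0] = chrom
    fields[3] = str(new_start)
    fields[4] = str(new_end)
    return "\t".join(fields)


if __name__ == "__main__":
    example = "\t".join([
        "NT_077402.2", "BestRefSeq", "gene", "1874", "4409",
        ".", "+", ".", "ID=gene0;Name=DDX11L1",
    ])
    print(contig_to_chrom(example))
```

With the placeholder offset above, the DDX11L1 gene at contig positions 1874-4409 would come out as chr1 11874-14409. What I am missing is where to get the real placement values for every contig.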

Thanks,

4.4 years ago

Hey, I'm having the same issue. Did you ever find a solution? Thanks.