I recently downloaded a list of genes from the "Table Browser" section of the UCSC Genome Browser. I used the following settings from the dropdown menus:
clade: Mammal
genome: Human
assembly: Feb. 2009 (GRCh37/hg19)
group: Genes and Gene Predictions
track: RefSeq Genes
table: refGene
region: genome
After unzipping the file, I had a table with the following columns:
- #bin
- name
- chrom
- strand
- txStart
- txEnd
- cdsStart
- cdsEnd
- exonCount
- exonStarts
- exonEnds
- score
- name2
- cdsStartStat
- cdsEndStat
- exonFrames
At some point, I noticed something weird and ran the following command:
grep "SNORD141B" All_Genes.tsv | cut -f3,7,8,13 | less
which returned:
chr5 14652491 14652491 SNORD141B
chr6 74228161 74228161 SNORD141B
chr9 135895921 135895921 SNORD141B
How could this gene exist on three different chromosomes? I've looked up "snord141b" on GeneCards and found nothing. This is not an urgent or important question for me, but I am perplexed nonetheless. Any clue how or why this gene shows up in these three different genomic regions?
A gene can be present in the genome in more than one copy. The rDNA repeat, for example, has ~400 copies on 5 chromosomes: entire human rDNA
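To see how many gene symbols in the downloaded table are annotated on more than one chromosome, something like the following could be used. This is only a minimal sketch: it assumes the tab-separated layout listed in the question (chrom in column 3, name2 in column 13, header line starting with "#"), and the function name multi_chrom_genes is made up for illustration. Unlike grep, it compares the name2 column exactly rather than matching a substring anywhere on the line.

```shell
# Hypothetical helper: print each gene symbol (name2, column 13) that is
# annotated on more than one chromosome (chrom, column 3), followed by the
# number of distinct chromosomes. Assumes a refGene-style tab-separated file.
multi_chrom_genes() {
    awk -F'\t' '
        /^#/ { next }                      # skip the "#bin ..." header line
        !seen[$13 FS $3]++ { n[$13]++ }    # count each gene/chrom pair only once
        END {
            for (g in n)
                if (n[g] > 1) print g "\t" n[g]
        }
    ' "$1" | sort
}
```

It can be called as multi_chrom_genes All_Genes.tsv | less. Note that hg19 also contains alternate haplotype contigs (for example chr6_apd_hap1), which this sketch counts as separate chromosomes, so some of the genes it reports may simply be annotated on both a chromosome and one of its alternate haplotypes.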