Question: Refseq Gene Entry With Multiple Strand Information [Refgene Table-Mm10]
Hi, can someone solve this RefSeq puzzle for me?

For the gene 0610010B08Rik, the UCSC browser says:

RefSeq Gene 0610010B08Rik

RefSeq: NM_001177543.1 Status: Validated

Description: Mus musculus RIKEN cDNA 0610010B08 gene (0610010B08Rik), mRNA.

CCDS: CCDS50826.1

Entrez Gene: 100039060

PubMed on Gene: 0610010B08Rik

PubMed on Product: KRAB box and zinc finger C2H2 type domain containing

Stanford SOURCE: NM_001177543

mRNA/Genomic Alignments

The alignment you clicked on is first in the table below.

browser |  4539  100.0%          2     - 175192005 175338212          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     - 175419391 175435777          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     + 175640391 175656769          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     + 175737942 175754328          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     - 176470369 176486749          NM_001177543     1  4539  4539
browser |  4538  100.0%          2     - 176619933 176636319          NM_001177543     1  4539  4539

This alignment information is encoded into the refGene table (mm10) when you pull it from UCSC, which means that for the gene 0610010B08Rik there are 6 entries with the same NM ID (the same RNA). I always collapse multiple entries into the single longest entry, but in this case the entries for one gene are on different strands. How is this possible?
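To make the problem concrete, here is a minimal Python sketch of the collapsing step I described. The tuples are just the six alignments from the table above (I wrote the chromosome as chr2, UCSC style), not real refGene rows, and the function names are my own. "Longest" here means the largest end minus start span, which is an assumption about how one might define it.

```python
from collections import defaultdict

# (accession, chrom, strand, start, end), copied from the alignment table above
ALIGNMENTS = [
    ("NM_001177543", "chr2", "-", 175192005, 175338212),
    ("NM_001177543", "chr2", "-", 175419391, 175435777),
    ("NM_001177543", "chr2", "+", 175640391, 175656769),
    ("NM_001177543", "chr2", "+", 175737942, 175754328),
    ("NM_001177543", "chr2", "-", 176470369, 176486749),
    ("NM_001177543", "chr2", "-", 176619933, 176636319),
]


def group_by_accession(rows):
    """Collect all rows that share the same accession (NM ID)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return groups


def mixed_strand_accessions(rows):
    """Return accessions whose rows are not all on one strand."""
    groups = group_by_accession(rows)
    return sorted(
        name for name, group in groups.items()
        if len({row[2] for row in group}) > 1
    )


def longest_per_accession(rows):
    """Collapse each accession to the row with the largest genomic span."""
    groups = group_by_accession(rows)
    return {
        name: max(group, key=lambda row: row[4] - row[3])
        for name, group in groups.items()
    }


if __name__ == "__main__":
    print("Mixed-strand accessions:", mixed_strand_accessions(ALIGNMENTS))
    for name, row in longest_per_accession(ALIGNMENTS).items():
        print(name, "->", row[1:], "span", row[4] - row[3])
```

Taking the longest span picks one locus without looking at strand at all, which is exactly why I am unsure it is the right choice here.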

The RefSeq methods page says:

RefSeq RNAs were aligned against the mouse genome using blat; those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.1% of the best and at least 96% base identity with the genomic sequence were kept.
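If I read that rule correctly, it can be sketched roughly as below. This is only my interpretation: I compute identity as matches divided by query size, which is an assumption and may not be the exact formula UCSC uses, and I leave out the 15% alignment cutoff. The match counts (4539 and 4538 out of 4539) come from the alignment table above, and the function names are hypothetical.

```python
def percent_identity(matches, query_size):
    """Identity as a percentage; an assumed formula, not UCSC's documented one."""
    return 100.0 * matches / query_size


def keep_alignments(alignments, min_identity=96.0, near_best=0.1):
    """Keep alignments with identity >= min_identity and within near_best of the best.

    alignments: list of (label, matches, query_size) tuples.
    Returns the labels of the alignments that pass both thresholds.
    """
    scored = [(label, percent_identity(m, q)) for label, m, q in alignments]
    if not scored:
        return []
    best = max(score for _, score in scored)
    return [
        label for label, score in scored
        if score >= min_identity and best - score <= near_best
    ]


# Labels are the start coordinates from the alignment table above.
SIX_HITS = [
    ("175192005", 4539, 4539),
    ("175419391", 4538, 4539),
    ("175640391", 4538, 4539),
    ("175737942", 4538, 4539),
    ("176470369", 4538, 4539),
    ("176619933", 4538, 4539),
]

if __name__ == "__main__":
    for label, matches, size in SIX_HITS:
        print(label, round(percent_identity(matches, size), 3))
    print("kept:", keep_alignments(SIX_HITS))
```

Under that assumption, 4538 of 4539 matching bases is about 99.978% identity, so all six alignments fall within 0.1% of the best one and none would be dropped by this rule.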

Secondly, for my unique list, which entry should I take?


