Inaccurate read counts when genome/annotation contains alternate loci
5.9 years ago
viktorfeketa ▴ 30

I am quantifying gene expression from RNA-Seq data and get very different read counts for one particular gene (counts for most other genes are very close) depending on which of two reference genomes/annotations I use: NCBI or Gencode. I noticed that the NCBI annotation contains two entries for this gene, one of them on an "alternate locus" scaffold. Could this be the reason for the inaccurate counts? The gene entry on the alternate locus has a different gene_id, and its final read count is 0. Is it possible that reads from this gene's transcripts align to both the main and the alternate locus, and are therefore treated as multiple alignments/ambiguous and not counted?
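
To see whether other genes are duplicated in the same way, one option is to scan the annotation for gene names that appear on more than one sequence. The following is a minimal sketch, not a definitive tool: it assumes a tab-separated GTF with `gene` feature rows, and the attribute key holding the gene name (`gene` here) is an assumption that may need to be changed for a given file. The function name and the toy records are hypothetical.

```python
import re
from collections import defaultdict

def genes_on_multiple_contigs(gtf_lines, name_key="gene"):
    """Return {gene name: sorted contigs} for names annotated on more than one contig.

    name_key is the GTF attribute assumed to hold the gene name (an assumption;
    adjust it to match the annotation file being checked).
    """
    pattern = re.compile(r'\b' + re.escape(name_key) + r' "([^"]+)"')
    contigs = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only look at gene-level records
        match = pattern.search(fields[8])
        if match:
            contigs[match.group(1)].add(fields[0])
    return {name: sorted(seqs) for name, seqs in contigs.items() if len(seqs) > 1}

# Toy records (made-up names and coordinates), not taken from a real annotation.
toy_gtf = [
    'chr1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene "Abc1";',
    'alt_scaffold_1\tsrc\tgene\t50\t850\t.\t+\t.\tgene_id "G1_1"; gene "Abc1";',
    'chr2\tsrc\tgene\t10\t500\t.\t-\t.\tgene_id "G2"; gene "Xyz2";',
]
duplicated = genes_on_multiple_contigs(toy_gtf)
print(duplicated)
```

On a real file the lines could be passed straight from `open(path)`; any gene reported here is a candidate for the same main-versus-alternate-locus effect.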

Species: mouse

Aligner: STAR

Gene quantification: HTSeq-count

NCBI: GRCm38.p6; genome and annotation downloaded from here: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/635/GCA_000001635.8_GRCm38.p6

Gencode: M17 (GRCm38.p6); genome and annotation from here: https://www.gencodegenes.org/mouse_releases/current.html
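
One way to test the multi-mapping hypothesis directly is to look at the `NH` tag (number of reported alignments for the read) on the alignments that fall inside the gene in the NCBI-based run. As far as I understand, STAR writes this tag by default, and htseq-count in its default mode does not assign reads with `NH` greater than 1 to any gene. The sketch below tallies alignments in a region by `NH` value from SAM text, for example the output of `samtools view`. The function name, the region, and the toy records are hypothetical, and for simplicity it only checks the leftmost alignment position.

```python
from collections import Counter

def tally_nh_in_region(sam_lines, contig, start, end):
    """Count alignments starting within [start, end] on contig, grouped by NH value.

    Alignments without an NH tag are counted under the key None.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        rname, pos = fields[2], int(fields[3])
        if rname != contig or not (start <= pos <= end):
            continue
        # Optional tags start at column 12; NH:i:<n> is the number of reported hits.
        nh = next((int(f[5:]) for f in fields[11:] if f.startswith("NH:i:")), None)
        counts[nh] += 1
    return counts

# Toy SAM records (made-up reads and coordinates).
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t150\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r2\t256\talt_scaffold_1\t120\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t0\tchr1\t5000\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
]
nh_counts = tally_nh_in_region(toy_sam, "chr1", 100, 900)
print(dict(nh_counts))
```

If most alignments over the gene carry `NH` of 2 or more in the NCBI run but `NH` of 1 in the Gencode run, that would be consistent with the alternate locus absorbing the counts.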

RNA-Seq STAR HTSeq-count Alternative loci • 1.3k views