Resolving ambiguity around Seqmonk's exon annotations
8.9 years ago
dantylee ▴ 40

I have been working in the following pipeline:

Map reads to hg19 with TopHat2 => read the .bam files and quantitate reads per feature (e.g. "gene") in Seqmonk (using the Seqmonk-provided GRCh37 genome) => export raw reads per feature for normalization/analysis in R.

Seqmonk's feature naming scheme has always been a bit of a mystery to me. For example, when querying the "gene" feature track, the exported spreadsheet contains a significant minority of features with identical names -- I've been able to work around this satisfactorily for gene-level analysis.

However, now I'm trying to export raw reads per exon, and ideally I would like to identify which gene each exon belongs to. It seems that exon-level data can only be exported when querying the "mRNA" track (perhaps adding to the redundancy/confusion). [Can anyone confirm whether mRNA must be selected to get exons, or whether there is a workaround to get exons per gene?] Regardless of the choices I make in "Define Probes" and "Quantitate Existing Probes", the resulting output contains a great many identically named features with distinct genomic coordinates, and it is not possible to infer which exons belong to which mRNA isoform. [Again, ideally I would like to tag these exons with their corresponding gene. To elaborate: if gene 1 has two mRNA isoforms (1 and 2) and both use gene exon #1, then I would like to return a single value for gene exon #1. Hopefully I'm expressing myself clearly.]
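
To make the desired output concrete, here is a minimal Python sketch of the collapsing I have in mind. It is not Seqmonk functionality. The field names (`gene`, `isoform`, `chrom`, `start`, `end`, `strand`, `count`) and the example values are hypothetical, and it assumes each exon row already carries a gene label, which is exactly the part I can't currently get from the export.

```python
def collapse_shared_exons(rows):
    """Collapse exon rows that share a gene and identical coordinates.

    Assumption: rows with the same coordinates were quantitated over the
    same region, so their counts are equal and only one is kept (counts
    are NOT summed, to avoid double counting shared exons).
    """
    collapsed = {}
    for row in rows:
        key = (row["gene"], row["chrom"], row["start"], row["end"], row["strand"])
        if key not in collapsed:
            collapsed[key] = {
                "gene": row["gene"],
                "chrom": row["chrom"],
                "start": row["start"],
                "end": row["end"],
                "strand": row["strand"],
                "count": row["count"],
                "isoforms": [row["isoform"]],
            }
        else:
            # Same exon used by another isoform: record the isoform only.
            collapsed[key]["isoforms"].append(row["isoform"])
    return list(collapsed.values())


# Hypothetical export: gene1 has isoforms 1 and 2, both using the first exon.
rows = [
    {"gene": "gene1", "isoform": "1", "chrom": "1", "start": 100, "end": 200, "strand": "+", "count": 57},
    {"gene": "gene1", "isoform": "2", "chrom": "1", "start": 100, "end": 200, "strand": "+", "count": 57},
    {"gene": "gene1", "isoform": "2", "chrom": "1", "start": 300, "end": 400, "strand": "+", "count": 12},
]

exons = collapse_shared_exons(rows)
for exon in exons:
    print(exon["gene"], exon["start"], exon["end"], exon["count"], exon["isoforms"])
```

The shared exon is returned once, tagged with its gene and with both isoforms that use it, which is the single value per gene exon described above.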

Further complicating the matter, when I ask for an "annotated probe report" and specify the "genes" track for annotations, a significant minority of the returned annotations don't match the gene indicated in the feature name. [I suspect this problem could arise from mapping to hg19 but annotating with GRCh37?]
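
For reference, this is roughly how the mismatches could be counted from an exported report. It is only a sketch: the field names `feature_name` and `annotated_gene` and the rows themselves are hypothetical stand-ins, not the actual Seqmonk report columns, and the comparison is a plain case-insensitive string match.

```python
def find_annotation_mismatches(rows):
    """Return rows whose feature name differs from the annotated gene name."""
    mismatches = []
    for row in rows:
        feature = row["feature_name"].strip().upper()
        annotated = row["annotated_gene"].strip().upper()
        if feature != annotated:
            mismatches.append(row)
    return mismatches


# Hypothetical annotated probe report rows.
report = [
    {"feature_name": "GENE_A", "annotated_gene": "GENE_A"},
    {"feature_name": "GENE_B", "annotated_gene": "GENE_C"},
    {"feature_name": "gene_d", "annotated_gene": "GENE_D"},
]

mismatches = find_annotation_mismatches(report)
fraction = len(mismatches) / len(report)
print(f"{len(mismatches)} of {len(report)} annotations differ from the feature name")
```

Listing the mismatched rows with their coordinates would show whether they cluster in particular regions or are spread across the genome.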

Any insight would be greatly appreciated! Thank you.

RNA-Seq seqmonk • 2.3k views