Question: Measure expression level of a single (novel) transcript in multiple fastq files
9 months ago by
United States
c_u200 wrote:


This may be a silly question, but I want to quantify the expression level of a specific transcript across many single-end fastq files. The transcript in question is a novel one, and I don't think it's in the HG38 transcriptome. Using salmon gives me the abundance of ALL transcripts in ONE fastq file. So,

  1. Is there a way to quantify the expression level of a single transcript, across many fastq files (these are all different samples), other than running salmon on each fastq file and then searching for my transcript of interest?

  2. How can this be done for a transcript which is not in the reference transcriptome?

Thank you!

rna-seq
9 months ago by
VIB, Ghent, Belgium
lieven.sterck7.2k wrote:

Not sure what you are hinting at, but simply add the novel transcript to the reference transcriptome, run salmon (yes, once per fastq file, but that should not take long), and extract the data you want.

This way you avoid introducing a bias towards your novel transcript (which you would get if you used only that one as the reference), and it is a simple, straightforward, well-established approach, so no worries there.
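The workflow described in this answer can be sketched in Python roughly as follows. This is a minimal sketch, not a tested pipeline: the file names, the transcript ID `novel_tx`, and the helper function names are all hypothetical, the reference FASTA is assumed to be uncompressed, and `-l A` (automatic library type detection) is an assumption about the data. It relies only on the `salmon index` / `salmon quant` subcommands and on the standard `quant.sf` columns (`Name`, `Length`, `EffectiveLength`, `TPM`, `NumReads`).

```python
import csv
import subprocess
from pathlib import Path


def combine_fasta(reference_fa, novel_fa, combined_fa):
    """Write the reference transcriptome followed by the novel transcript."""
    with open(combined_fa, "w") as out:
        for path in (reference_fa, novel_fa):
            with open(path) as src:
                for line in src:
                    # Guard against a missing final newline in either file.
                    out.write(line if line.endswith("\n") else line + "\n")


def build_index_command(transcriptome_fa, index_dir):
    """salmon command that indexes the combined transcriptome."""
    return ["salmon", "index", "-t", str(transcriptome_fa), "-i", str(index_dir)]


def build_quant_command(index_dir, fastq, out_dir):
    """salmon command for one single-end FASTQ file (-r = unmated reads)."""
    return ["salmon", "quant", "-i", str(index_dir), "-l", "A",
            "-r", str(fastq), "-o", str(out_dir)]


def read_transcript_row(quant_sf, transcript_id):
    """Return (TPM, NumReads) for one transcript in a quant.sf file, or None."""
    with open(quant_sf, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            if row["Name"] == transcript_id:
                return float(row["TPM"]), float(row["NumReads"])
    return None


def quantify_all(fastqs, index_dir, out_root, transcript_id, run=subprocess.run):
    """Run salmon once per FASTQ and keep only the transcript of interest."""
    results = {}
    for fastq in map(Path, fastqs):
        sample = fastq.name.split(".")[0]  # e.g. "s1.fastq.gz" -> "s1"
        out_dir = Path(out_root) / sample
        run(build_quant_command(index_dir, fastq, out_dir), check=True)
        results[sample] = read_transcript_row(out_dir / "quant.sf", transcript_id)
    return results
```

A call such as `quantify_all(["s1.fastq.gz", "s2.fastq.gz"], "combined_index", "quant", "novel_tx")` would then return one `(TPM, NumReads)` pair per sample, after the index has been built once with the command from `build_index_command`. Quantifying against the full transcriptome and filtering afterwards is what keeps reads from other transcripts from being forced onto the novel one.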


Thanks Lieven, that was quite helpful! Also, is there any advantage to using this approach to quantify the differential expression of this novel transcript, compared to genome-guided methods like Cufflinks etc.? Especially given that this is in human.

Reply written 9 months ago by c_u

Yes and no. In general I'm in favour of aligning to the genome and then using featureCounts or similar to do the gene quantification (read counts), as in that case you are less biased and more accurate (in a transcriptome some sequences, or parts of them such as UTRs, might still be missing). But given that this is human, I would think both genome and transcriptome are on a similar quality level.

I would not use Cufflinks personally, not only because it's kinda deprecated but also because you don't really need it in this case and it will likely only create confusion and/or noise in your analysis. The main advantage of the salmon approach is speed: salmon runs quite quickly compared to other (true alignment-based) approaches while still being very accurate.
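For the genome-based alternative mentioned in this reply, pulling one feature out of a featureCounts table could look something like the sketch below. It assumes the novel transcript has already been added to the annotation GTF under a hypothetical gene ID (`novel_gene` here), that the table was produced by a call along the lines of `featureCounts -a annotation.gtf -o counts.txt s1.bam s2.bam`, and that the output has the usual layout: one `#` comment line, then a header with `Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length` followed by one column per BAM file. The function name is illustrative only.

```python
import csv

# Annotation columns that precede the per-sample count columns.
ANNOTATION_COLUMNS = {"Geneid", "Chr", "Start", "End", "Strand", "Length"}


def read_gene_counts(counts_path, gene_id):
    """Return {sample_column: count} for one gene, or None if it is absent."""
    with open(counts_path, newline="") as handle:
        # Skip the leading '#' comment line that records the command used.
        rows = (line for line in handle if not line.startswith("#"))
        for row in csv.DictReader(rows, delimiter="\t"):
            if row["Geneid"] == gene_id:
                # float() also copes with fractional counts, if those were requested.
                return {name: float(value) for name, value in row.items()
                        if name not in ANNOTATION_COLUMNS}
    return None
```

Because featureCounts accepts several BAM files in one run, a single table already holds all samples side by side, so `read_gene_counts("counts.txt", "novel_gene")` returns the per-sample counts in one call.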

Reply written 9 months ago by lieven.sterck