Question: Measure expression level of a single (novel) transcript in multiple fastq files
chahat_u (United States) wrote, 4 weeks ago:


This may be a silly question, but I want to quantify the expression level of a specific transcript across many single-end fastq files. The transcript in question is a novel one, and I don't think it's in the hg38 transcriptome. Using salmon gives me the abundance of ALL transcripts in ONE fastq file. So:

  1. Is there a way to quantify the expression level of a single transcript, across many fastq files (these are all different samples), other than running salmon on each fastq file and then searching for my transcript of interest?

  2. How can this be done for a transcript which is not in the reference transcriptome?

Thank you!

lieven.sterck (VIB, Ghent, Belgium) wrote, 4 weeks ago:

Not sure what you are hinting at, but simply add the novel transcript to the reference transcriptome, run salmon (yes, once per fastq file, but that should not take long), and extract the data you want.

This way you avoid introducing a bias towards your novel transcript (which you would get if you used only that one as the reference, since reads from similar transcripts would have nowhere else to go). It is also a simple, straightforward, well-established approach, so no worries there.
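The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the reference transcriptome is an uncompressed FASTA, each sample's salmon output goes to its own directory `<out_root>/<sample>/`, and the function names, paths, and the transcript ID `novel_tx` are hypothetical. The salmon commands are only built as argument lists here, not executed; `-l A` asks salmon to infer the library type and `-r` passes single-end reads.

```python
import csv
from pathlib import Path


def build_combined_reference(reference_fa, novel_fa, combined_fa):
    """Write reference transcriptome + novel transcript into one FASTA.

    Assumes both inputs are plain-text (uncompressed) FASTA files.
    """
    with open(combined_fa, "w") as out:
        for src in (reference_fa, novel_fa):
            with open(src) as handle:
                for line in handle:
                    # Guarantee a newline so records from the two files never fuse.
                    out.write(line if line.endswith("\n") else line + "\n")


def salmon_commands(combined_fa, index_dir, fastqs, out_root):
    """Return salmon command lines as argument lists (nothing is run here).

    One 'salmon index' call, then one 'salmon quant' call per fastq file.
    """
    cmds = [["salmon", "index", "-t", str(combined_fa), "-i", str(index_dir)]]
    for fq in fastqs:
        sample = Path(fq).name.split(".")[0]  # e.g. sampleA.fastq.gz -> sampleA
        cmds.append([
            "salmon", "quant",
            "-i", str(index_dir),
            "-l", "A",          # let salmon infer the library type
            "-r", str(fq),      # single-end reads
            "-o", str(Path(out_root) / sample),
        ])
    return cmds


def collect_transcript(out_root, transcript_id):
    """Extract one transcript's TPM and NumReads from every <sample>/quant.sf."""
    rows = {}
    for quant in sorted(Path(out_root).glob("*/quant.sf")):
        with open(quant, newline="") as handle:
            # quant.sf is tab-separated: Name, Length, EffectiveLength, TPM, NumReads
            for rec in csv.DictReader(handle, delimiter="\t"):
                if rec["Name"] == transcript_id:
                    rows[quant.parent.name] = {
                        "TPM": float(rec["TPM"]),
                        "NumReads": float(rec["NumReads"]),
                    }
                    break
    return rows
```

Each list returned by `salmon_commands` could be passed to `subprocess.run(cmd, check=True)`. Once all samples are quantified, `collect_transcript("quants", "novel_tx")` returns one entry per sample, which answers the "search each output for my transcript" part of the question in a single call.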


Thanks Lieven, that was quite helpful! Also, is there any advantage to using this approach to quantify the differential expression of this novel transcript, compared to genome-guided methods like Cufflinks, especially given that this is in human?

Reply written 4 weeks ago by chahat_u

Yes and no. In general I'm in favour of aligning to the genome and then using featureCounts or similar to do the gene quantification (read counts), since that is less biased and more accurate (in a transcriptome some sequences, or parts of them such as UTRs, might still be missing). But given that this is human, I would think genome and transcriptome are on a similar quality level.

Personally I would not use Cufflinks, not only because it's kind of deprecated but also because you don't really need it in this case, and it will likely only create confusion and/or noise in your analysis. The main advantage of the salmon approach is speed: salmon runs quite quickly compared to other (true alignment-based) approaches while still being very accurate.

Reply written 4 weeks ago by lieven.sterck