Question: Abundance/frequency of RNASeq raw reads mapped to a transcript exon
mlopez wrote, 20 months ago:

We have paired-end Illumina RNASeq reads and we are working with a non-model organism with no reference genome. We have a working composite for a protein sequence that includes every exon we have found via cDNA. We have 6 muscle types with some triplicates and want to see how many times 4 specific exons that look to be alternatively spliced are present in each muscle type.

For example, muscle type a has this exon expressed 46% while muscle type b only expresses this exon 12% of the time.

I'm not looking for differential expression, only a number of how many times this exon is found within the muscle type's transcript file.

I've tried feeding HISAT2 BAM files into StringTie, and also taking the GTF files from StringTie and putting them into htseq-count, but neither worked.

I was already able to align the raw reads to the composite and visualize the alignment in IGV. However, there are thousands of raw reads aligning to the 4 exons of interest, so I was hoping that there would be a better way of quantifying the frequency than counting manually.

Do I have to annotate the composite so that it is easier to select what I am looking for, and if so, how do I do that?
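
On the annotation question, one possible approach (a sketch under assumptions, not a tested recipe) is to describe each exon of interest as an `exon` feature in a small GTF file for the composite. The sequence name `composite`, the exon names, and the coordinates below are hypothetical placeholders; the sequence name would need to match the FASTA header used for alignment. Each exon gets its own `gene_id` so that a counter that summarises by `gene_id` (the htseq-count default) would report one number per exon.

```python
def exon_gtf_lines(seqname, exons, source="manual"):
    """Build GTF lines for (name, start, end) exons.

    GTF coordinates are 1-based and inclusive on both ends.
    """
    lines = []
    for name, start, end in exons:
        if start < 1 or end < start:
            raise ValueError(f"bad coordinates for {name}: {start}-{end}")
        # One gene_id per exon, so per-gene counting yields per-exon counts.
        attributes = f'gene_id "{name}"; transcript_id "{name}";'
        # Strand "+" assumes the composite is written in the sense orientation.
        fields = [seqname, source, "exon", str(start), str(end),
                  ".", "+", ".", attributes]
        lines.append("\t".join(fields))
    return lines


# Hypothetical exon names and coordinates, for illustration only.
exons_of_interest = [("exonA", 101, 250), ("exonB", 400, 520)]
gtf_lines = exon_gtf_lines("composite", exons_of_interest)
print("\n".join(gtf_lines))
```

The printed lines can be saved to a `.gtf` file and passed to a read counter together with the BAM file from the alignment to the composite.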


I apologize for focusing on something other than the question, but did you post the same question (with slightly different wording) under different accounts?

Get an abundance/frequency of how many times within an RNASeq file a transcript maps to an exon

Both mention "For example, muscle type a has this exon expressed 46% while muscle type b only expresses this exon 12% of the time."

I very much want to encourage use of Biostars, but I think it is kind of important to have a transparent account, ideally linked to other information about yourself (such as your actual name, photo, etc.). Otherwise, it is harder to keep track of the answers in the different posts, and I think seeing the overall learning process for a project is important for the broader community.

written 20 months ago by Charles Warden

There is another person in the lab working with the same samples, and that was her account. She is focusing more on the bioinformatics side, so I originally asked her to post the question. When I found out that signing up was free, I posted the question myself. I apologize for any confusion I might have caused.

written 20 months ago by mlopez

That's OK - there are frequently similarly worded questions coming from different users. However, they usually aren't this close to being identical, and usually are posted on different days :)

written 20 months ago by Charles Warden

Do you just have the exon sequences or do you have approximate transcript isoform sequences? The latter will be easier to use going forward.

written 20 months ago by Devon Ryan

I have exon sequences, yes.

written 20 months ago by mlopez

Devon Ryan (Freiburg, Germany) wrote, 20 months ago:

While the comments in "Get an abundance/frequency of how many times within an RNASeq file a transcript maps to an exon" are quite good (for those who can't see them, basically, "Align to the exons and count reads"), I have a feeling the following will give you better estimates:

  1. Assemble the transcriptome as best you can (e.g., using Trinity).
  2. Determine which transcripts contain your exons of interest. Most likely there will just be one each.
  3. Use salmon or kallisto with your reads and the results from 1.
  4. Extract the values associated with the transcripts in 2.
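
Step 2 could be sketched roughly as follows. This is a minimal illustration, not part of any tool: it looks for an exact match of the exon (or its reverse complement) in each assembled transcript, and the function names and toy sequences are made up for the example. Real assemblies will contain mismatches and partial exons, so a tolerant search such as BLAST is likely the better choice in practice.

```python
def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}; name is the first word of the header."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def transcripts_containing(exon_seq, transcripts):
    """Names of transcripts containing the exon exactly, on either strand."""
    exon_seq = exon_seq.upper()
    rc = reverse_complement(exon_seq)
    return sorted(name for name, seq in transcripts.items()
                  if exon_seq in seq or rc in seq)


# Toy data: tx1 carries the exon, tx2 skips it, tx3 carries it on the other strand.
toy_fasta = """>tx1 toy isoform with exon
ACGTACGTGGATTCAAATTT
>tx2 toy isoform without exon
ACGTACGTAAATTT
>tx3 toy isoform, reverse strand
AAATTTGAATCCACGTACGT
""".splitlines()

transcripts = read_fasta(toy_fasta)
hits = transcripts_containing("GGATTC", transcripts)
print(hits)
```

With a real assembly, `toy_fasta` would be replaced by an open file handle on the assembled transcripts FASTA.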

You might have to add a couple of transcripts together. The nice thing about this, as opposed to the more straightforward "align to the exons and count", is that it will better handle cases where the exons have different GC content or are differentially affected by 3' bias.
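
Adding transcripts together and turning the result into a percentage could look something like the sketch below. It assumes salmon's `quant.sf` output (tab-separated, with `Name`, `Length`, `EffectiveLength`, `TPM` and `NumReads` columns) and assumes you already know which assembled transcripts are isoforms of your gene and which of those contain the exon. The transcript names and TPM values are invented for illustration.

```python
import csv
import io


def read_tpm(quant_lines):
    """Map transcript name to TPM from the lines of a salmon quant.sf file."""
    reader = csv.DictReader(quant_lines, delimiter="\t")
    return {row["Name"]: float(row["TPM"]) for row in reader}


def exon_inclusion_percent(tpm, with_exon, all_isoforms):
    """Percent of the gene's total TPM carried by isoforms that contain the exon."""
    total = sum(tpm[t] for t in all_isoforms)
    if total == 0:
        return None  # gene not detected in this sample
    included = sum(tpm[t] for t in with_exon)
    return 100.0 * included / total


# Invented values standing in for one sample's quant.sf.
toy_quant = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1200\t1000.0\t30.0\t300\n"
    "tx2\t1100\t900.0\t50.0\t450\n"
    "tx3\t1200\t1000.0\t20.0\t200\n"
    "other\t800\t600.0\t900.0\t5400\n"
)

tpm = read_tpm(toy_quant)
percent = exon_inclusion_percent(tpm, ["tx1", "tx3"], ["tx1", "tx2", "tx3"])
print(f"exon included in {percent:.1f}% of the gene's expression")
```

Running this once per sample gives one percentage per muscle type and replicate, which is the kind of "46% versus 12%" number asked about in the question.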
