Question: Adding annotation into a transcript in order to use TopHat2
24 months ago, mlopez wrote:

Hello everyone,

I am thinking about using TopHat2 to calculate the frequency/abundance of RNA-Seq reads mapping to 4 possible alternatively spliced exons. I have a cDNA composite that contains all the exons from each isoform, but I don't know how to annotate the 4 I want to focus on so that I only get reads aligning to those 4.

Thank you

rna-seq • 466 views

We also don't have a reference genome, would this cause any issues?

Thank you

written 24 months ago by mlopez

How do you manage to map your reads without a reference genome?

written 24 months ago by Bastien Hervé

What is your species, and where are your exons located in it?

written 24 months ago by Bastien Hervé

We are working with Doryteuthis pealeii. We are examining the isoform properties of one protein in 6 different muscles. A published paper contained the Loligo brachial heart muscle sequence, and we used this as our first template to design primers for our muscles of interest until we were able to sequence the gene. Using RNA-Seq, we were able to see 4 differential splice sites by aligning the reads to our cDNA composite. Now I want to annotate just those 4 sites and determine how frequently they are alternatively spliced in each muscle type, to see if there is a difference.

written 24 months ago by mlopez

If I understand correctly, you already did the alignment step using your Loligo brachial heart muscle sequence, so you now have BAM files from the alignment. The next step is to count the number of reads falling in your exons. For this you need a GTF or GFF3 file for your species, or at least for the genes you are interested in.

As I presume you don't have one, I suggest creating one by hand.

Download an existing GTF/GFF3 file and read the GTF/GFF3 documentation (first answer in this post) to understand how the format is structured.

Once you have created it, you can run the counting step with featureCounts. Do not forget to supply your GTF/GFF3 file.
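
As a rough illustration of the hand-made annotation described above, here is a minimal Python sketch that writes four exon records in GTF format and assembles a featureCounts command. Everything specific in it is an assumption for illustration only: the reference name `cDNA_composite`, the exon labels, the coordinates, and the BAM file names are hypothetical and must be replaced with your own. The reference name has to match the sequence name in your BAM header, and GTF coordinates are 1-based and inclusive. Giving each exon its own `gene_id` is one possible design choice so that each exon is counted separately.

```python
import sys

# Hypothetical values: replace with the name and coordinates of your own composite.
REFERENCE_NAME = "cDNA_composite"  # must match the sequence name in the BAM header

# (exon label, start, end): 1-based, inclusive positions on the composite (made up here)
EXONS = [
    ("alt_exon_1", 301, 390),
    ("alt_exon_2", 811, 870),
    ("alt_exon_3", 1502, 1621),
    ("alt_exon_4", 2240, 2299),
]


def gtf_line(seqname, label, start, end, strand="+"):
    """Format one exon as a 9-column, tab-separated GTF record."""
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates for {label}: {start}-{end}")
    # Each exon gets its own gene_id so that it is reported as a separate row.
    attributes = f'gene_id "{label}"; transcript_id "{label}";'
    columns = [seqname, "manual", "exon", str(start), str(end), ".", strand, ".", attributes]
    return "\t".join(columns)


def build_gtf(seqname, exons):
    """Return the full GTF text, one record per exon."""
    return "\n".join(gtf_line(seqname, *exon) for exon in exons) + "\n"


def featurecounts_command(gtf_path, bam_paths, out_path="counts.txt"):
    """Assemble (but do not run) a featureCounts call that counts reads per exon."""
    return ["featureCounts", "-a", gtf_path, "-o", out_path,
            "-t", "exon", "-g", "gene_id", *bam_paths]


if __name__ == "__main__":
    # Print the GTF to standard output; redirect it to a file such as alt_exons.gtf.
    sys.stdout.write(build_gtf(REFERENCE_NAME, EXONS))
```

Redirecting the output to a file (for example `alt_exons.gtf`) and passing that path to `featurecounts_command` together with your BAM files gives the command line to run. Add `-p` if your reads are paired-end.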

modified 24 months ago • written 24 months ago by Bastien Hervé

FYI, TopHat2 is deprecated (here is a tweet from one of the developers).


Current alternatives are HISAT2 or STAR if you want to align, or kallisto and salmon for pseudoalignments.

written 24 months ago by ATpoint

