Question: Align RNA-seq data to a custom list of exons?
Cumol wrote (6 months ago):

A recent publication used SMRT sequencing to investigate splice variants of a gene I am interested in, and it describes exons that differ from, or are additional to, those annotated in NCBI or Ensembl.

I want to analyze splice variants and exon counts for this gene in my RNA-seq data, using the exons described in that publication. I have a file listing each exon's number and sequence.

How do I align my RNA-seq data to this list of exons? I thought about taking the standard .gtf file from Ensembl and editing it to accommodate the exon changes. Is that the recommended approach? And if so, how do I do it?

Thank you for your help!

Tags: exon, rna-seq, alignment
Nicolas Rosewick wrote (6 months ago):

Yes, you can edit the GTF file directly to add the exons of interest. Then use featureCounts to count the number of reads per exon, using -t exon -g exon_id (provided the GTF file contains an exon_id attribute).
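
As a rough illustration of the counting step, the sketch below only assembles the featureCounts command line from Python; it does not run anything. The file names `custom.gtf`, `sample1.bam` and `exon_counts.txt` and the helper `featurecounts_cmd` are hypothetical placeholders, and the flags are the ones mentioned above plus `-a` (annotation) and `-o` (output).

```python
from typing import List


def featurecounts_cmd(gtf: str, bams: List[str], out: str = "exon_counts.txt") -> List[str]:
    """Build (but do not run) a featureCounts call that groups reads by exon_id."""
    return [
        "featureCounts",
        "-a", gtf,        # annotation file, here the edited GTF
        "-o", out,        # output count table
        "-t", "exon",     # use only records whose feature type is "exon"
        "-g", "exon_id",  # group counts by the exon_id attribute
        *bams,            # one or more aligned BAM files
    ]


# Hypothetical file names, for illustration only.
cmd = featurecounts_cmd("custom.gtf", ["sample1.bam"])
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once featureCounts is installed and the files exist.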


What is the best way to edit the GTF file?

Should I remove the lines associated with the gene I am interested in and then add my custom lines?

I will probably need the exact start and end coordinates of each exon on the genome I am using for indexing, right?

(comment by Cumol, 6 months ago)

awk?

(reply by cpad0112, 6 months ago)
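
The edit discussed in this thread (drop the gene's existing records, then append custom exon records) could also be sketched in Python instead of awk. This is a minimal sketch under assumptions, not a tested pipeline: the gene and transcript IDs, the coordinates, the `custom` source label and the helper names `exon_record` and `replace_gene_records` are all made up for illustration. GTF coordinates are 1-based and inclusive.

```python
import re
from typing import Iterable, List

GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def exon_record(chrom: str, start: int, end: int, strand: str,
                gene_id: str, transcript_id: str, exon_number: int) -> str:
    """Format one GTF exon line (9 tab-separated columns)."""
    attrs = (f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; '
             f'exon_number "{exon_number}"; exon_id "{gene_id}.e{exon_number}";')
    return "\t".join([chrom, "custom", "exon", str(start), str(end),
                      ".", strand, ".", attrs])


def replace_gene_records(gtf_lines: Iterable[str], gene_id: str,
                         custom_lines: Iterable[str]) -> List[str]:
    """Drop every record of gene_id, keep everything else, append custom_lines."""
    kept = []
    for line in gtf_lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        if not line.startswith("#") and len(fields) >= 9:
            match = GENE_ID_RE.search(fields[8])
            if match and match.group(1) == gene_id:
                continue  # skip the old annotation of the gene of interest
        kept.append(line)
    kept.extend(rec.rstrip("\n") for rec in custom_lines)
    return kept


# Toy input: one header line and two genes (all values are invented).
original = [
    "#!genome-build example",
    'chr1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1"; exon_id "A.e1";',
    'chr1\tensembl\texon\t500\t600\t.\t+\t.\tgene_id "GENE_B"; transcript_id "TX_B1"; exon_id "B.e1";',
]
custom = [
    exon_record("chr1", 100, 180, "+", "GENE_A", "TX_A_custom", 1),
    exon_record("chr1", 250, 300, "+", "GENE_A", "TX_A_custom", 2),
]
edited = replace_gene_records(original, "GENE_A", custom)
print("\n".join(edited))
```

The custom records are appended at the end of the file; some downstream tools may expect the GTF to be sorted by coordinate, so a sort step might be needed afterwards.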

I agree with Nicolas. Adding custom transcript annotations to the GTF is the correct way forward. One thing to remember is to add the FASTA sequences of the custom exons at their specific start positions on the chromosome/scaffold of interest in the genome .fa file.

(reply by Praneet Chaturvedi, 6 months ago)
Cumol wrote (6 months ago):

How do I turn my exon sequences into FASTA format? The only idea I had was to BLAST them against the target genome (using Ensembl) and then convert the output into GTF. But it doesn't seem to be so straightforward.

Is there a better option?
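
One possible shape for this, sketched in Python under strong assumptions: the exon sequences are written out as FASTA, and each exon is then located on a chromosome sequence by exact string matching on either strand. Exact matching only works when the exon is identical to the reference; with mismatches or a different assembly, an aligner such as BLAST or BLAT would be needed instead. The sequences and the helper names `to_fasta`, `revcomp` and `locate_exact` are invented for illustration.

```python
from typing import Dict, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def to_fasta(exons: Dict[str, str], width: int = 60) -> str:
    """Write {name: sequence} as FASTA text, wrapping sequences at `width`."""
    records = []
    for name, seq in exons.items():
        seq = seq.upper()
        wrapped = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
        records.append(f">{name}\n{wrapped}")
    return "\n".join(records) + "\n"


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def locate_exact(exon_seq: str, chrom_seq: str) -> Optional[Tuple[int, int, str]]:
    """Return 1-based inclusive (start, end, strand) of the first exact match, or None."""
    chrom_seq = chrom_seq.upper()
    for strand, query in (("+", exon_seq.upper()), ("-", revcomp(exon_seq))):
        pos = chrom_seq.find(query)
        if pos != -1:
            return pos + 1, pos + len(query), strand
    return None


# Toy data: a 20 bp "chromosome" and two made-up exons.
chrom = "TTTTACGTACGGATCCAAAA"
exons = {"exon_1": "ACGTACG", "exon_2": "TTTGGATCC"}

print(to_fasta(exons), end="")
for name, seq in exons.items():
    print(name, locate_exact(seq, chrom))
```

The returned start, end and strand map directly onto columns 4, 5 and 7 of a GTF exon record.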
