RNA-Seq Transcript Aligns On The Wrong Strand
11.3 years ago
disco ▴ 30

Hello,

I'm analysing the ENCODE CSHL long RNA-seq data to look at differential expression between two genes that share a chromosomal locus. I am really not familiar with bioinformatics at all; I'm a wet-bench researcher through and through. Somehow I managed to get going on a Linux platform and started with a single sample, which I analysed with cufflinks and then viewed against the reference genome in IGV. What I see is that the cufflinks transcripts for the two genes are on the same strand in IGV, whereas in reality they are on different strands, transcribed away from each other. I'm pretty convinced that it's a technical mistake on my part, since I'm not well versed in these analyses. But if anybody could point out how this is done properly, or what could have gone wrong, I would be really grateful.

Many thanks, Vaish

strand • 3.2k views

I know you've probably thought of this, but I'd suggest finding a local resource to go through this with you. There are MANY details in an analysis that you will want to learn, I'm sure, and having someone you can run ideas past can be the most effective way to do that.


Yeah, I tried my best but couldn't find anyone who would sit down and go through the whole thing with me. Some people were kind enough to offer suggestions and point me in the right direction, but we don't really have a bioinformatician in my group.


Also see older question: Transcript Specific Expression Data


Hello Josh, do you have an idea what could be wrong with this analysis method?


Can you clarify a bit? I'm not sure exactly what you mean about seeing "the transcripts from cufflinks in IGV" ... are you somehow loading a gtf (gff) generated by cufflinks in IGV? Or are you looking at the read alignments (the accepted_hits.bam) in IGV and something looks weird to you?


I loaded the GTF file generated by cufflinks and am viewing it in IGV. I could post a screenshot if that would be helpful.

11.3 years ago
Michael 54k

That is most likely not a mistake. Most RNA-seq protocols are not strand specific. I would check with the sequencing lab, and until it is stated explicitly, assume that there is no valid strand information in the data.
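
One quick way to see whether the assembled transcripts carry any strand information is to tally the strand column (the 7th field) of the GTF that cufflinks wrote. The sketch below is only an illustration, not part of any cufflinks tooling; the function name `count_strands` is made up for this example, and it assumes a plain tab-separated GTF.

```python
from collections import Counter

def count_strands(gtf_lines, feature="transcript"):
    """Tally the strand column (7th field) of GTF records of one feature type."""
    counts = Counter()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated fields; ignore anything shorter.
        if len(fields) < 9:
            continue
        if fields[2] == feature:
            counts[fields[6]] += 1  # '+', '-' or '.' (no strand assigned)
    return counts
```

For example, `count_strands(open("transcripts.gtf"))` (the file name here is an assumption, use whatever your run produced) returns the number of transcripts on `+`, on `-`, and with `.`, where `.` means no strand was assigned.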


Thanks for the response. All I'm doing now is to see differential expression of these two genes across different samples. So, this shouldn't be a problem, right?


I would check the aligned reads for uniqueness, and to be on the safe side when drawing conclusions, discard reads which have multiple matches to the reference (on whatever strand) from the data for DE analysis.
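
A rough sketch of that filtering step in Python, working on SAM text (for instance the output of `samtools view -h accepted_hits.bam`). It assumes the aligner writes the optional `NH:i:` tag (number of reported hits for the read); if your alignments lack that tag, this approach will not work as written. The function names are invented for this example.

```python
def is_unique_hit(sam_line):
    """Return True if a SAM alignment line reports exactly one hit (NH:i:1)."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return int(tag[5:]) == 1
    # No NH tag present: uniqueness cannot be confirmed, so drop the read.
    return False

def keep_unique(sam_lines):
    """Yield header lines unchanged and only uniquely aligned records."""
    for line in sam_lines:
        if line.startswith("@") or is_unique_hit(line):
            yield line
```

Reads without an `NH` tag are discarded here, which is the cautious choice when the aim is to keep only reads known to map once.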


That makes a lot of sense, thanks a lot! I'm going to try that.
