Question

Gene expression level with RNA-seq

9.0 years ago

bharata1803 ▴ 560

Hello all,

I have a basic question, as this is my first RNA-seq analysis. My task is to quantify the expression level of several genes. I have 8 RNA-seq samples aligned with TopHat, in 2 groups: normal (samples 34, 35, 36, and 37) and disease (samples 41, 42, 43, and 44). I already have the aligned BAM files from TopHat, and I followed a workflow that uses cufflinks to generate a GTF file for each sample. I then merged all 8 assemblies with cuffmerge. Basically, I am following the workflow diagram for cufflinks >=2.2 from here.

I downloaded the human GTF file from: http://genome.ucsc.edu/cgi-bin/hgTables?command=start

I'm a bit confused about the next step. cuffmerge gives me a single GTF file, and the step after that is cuffquant. What is the input to cuffquant? The diagram says "final transcriptome assembly" and "mapped reads", but which mapped reads? Are they the original accepted_bam files from TopHat?

Thank you for your answer.

RNA-Seq tophat cufflinks • 3.3k views

ADD COMMENT • link updated 22 months ago by Ram 43k • written 9.0 years ago by bharata1803 ▴ 560

Ram · Answer 1 · 2015-04-07

9.0 years ago

Devon Ryan 104k

The earlier expression metrics weren't based on the merged annotation, so the values aren't comparable. That's why there's a requantification step upstream of cuffdiff. Yes, this step would require the merged gtf and the BAM files.

ADD COMMENT • link 9.0 years ago by Devon Ryan 104k

Thank you for your answer. I just ran it with all of the BAM files in one command, which produced a single cxb file. Is that right? Or do I need to run cuffquant separately for each accepted BAM file against the merged GTF file from cuffmerge?

ADD REPLY • link updated 22 months ago by Ram 43k • written 9.0 years ago by bharata1803 ▴ 560

My understanding is that this needs to be run per BAM file, so you'd then end up with multiple cxb files. Cuffdiff, in turn, can accept multiple cxb files.
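
As a rough sketch of that layout, the Python snippet below only builds and prints the command lines; it does not run anything. The directory names (`tophat_34`, `cuffquant_34`, `cuffdiff_out`) are assumptions made for illustration, and `merged_asm/merged.gtf` assumes cuffmerge was run with its default output directory. The sample numbers and groups are the ones from the question.

```python
# Sample numbering from the question: 34-37 are normal, 41-44 are disease.
GROUPS = {"normal": [34, 35, 36, 37], "disease": [41, 42, 43, 44]}

# Assumed location of the cuffmerge output; adjust to your own layout.
MERGED_GTF = "merged_asm/merged.gtf"


def bam_path(sample):
    """Assumed location of the TopHat alignment for one sample."""
    return f"tophat_{sample}/accepted_hits.bam"


def cxb_dir(sample):
    """Assumed per-sample output directory for cuffquant."""
    return f"cuffquant_{sample}"


def cxb_path(sample):
    """cuffquant writes a file named abundances.cxb into its output directory."""
    return f"{cxb_dir(sample)}/abundances.cxb"


def cuffquant_commands():
    """One cuffquant invocation per BAM file, all against the merged GTF."""
    return [
        ["cuffquant", "-o", cxb_dir(s), MERGED_GTF, bam_path(s)]
        for samples in GROUPS.values()
        for s in samples
    ]


def cuffdiff_command(out_dir="cuffdiff_out"):
    """One cuffdiff call: replicates comma-separated, conditions as separate arguments."""
    per_condition = [",".join(cxb_path(s) for s in samples) for samples in GROUPS.values()]
    labels = ",".join(GROUPS)  # "normal,disease"
    return ["cuffdiff", "-o", out_dir, "-L", labels, MERGED_GTF, *per_condition]


if __name__ == "__main__":
    for cmd in cuffquant_commands() + [cuffdiff_command()]:
        print(" ".join(cmd))
```

Each cuffquant run gets its own output directory so that the `abundances.cxb` files do not overwrite each other.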

ADD REPLY • link 9.0 years ago by Devon Ryan 104k

Thank you. I will try it.

ADD REPLY • link 9.0 years ago by bharata1803 ▴ 560

Hello,

I have a question about cuffmerge. Two of its parameters confuse me: --ref-sequence and --ref-gtf. For --ref-sequence I used hg38, which I downloaded from Ensembl as one file per chromosome, and for --ref-gtf I used the human gene GTF that I downloaded from UCSC. Do you think that is right? A reply in another Biostar post says that --ref-gtf should be the GTF file from one of the samples, for example the GTF file from sample 34 in my case. Which one is it? Right now I'm running cuffquant with the merged GTF from the latter approach (cuffmerge with --ref-gtf set to the sample 34 GTF) and am still waiting for the result. Thank you.

ADD REPLY • link updated 22 months ago by Ram 43k • written 9.0 years ago by bharata1803 ▴ 560

There are two issues here, actually. Firstly, the --ref-gtf parameter should take the reference annotation from Ensembl/UCSC/etc., not a GTF file from one of your samples. The second issue is that you should avoid mixing Ensembl and UCSC files. These two sources use slightly different names for each chromosome (UCSC will use things like "chr1" and Ensembl would instead use "1"). These differences mean that cuffmerge will be unable to tell that "1" in one of your samples and "chr1" in the GTF file are the same, meaning that the GTF file will likely get ignored. As a rule of thumb, always use only Ensembl or only UCSC files; that'll prevent a lot of issues. Personally, I prefer the files from Ensembl, as they tend to be better managed.
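
A quick way to catch this kind of mismatch before running cuffmerge is to compare the sequence names in the two files. The sketch below is only an illustration: the function names are made up for this example, and the records are toy data, not real annotation. For real files you would pass open file handles instead of the inline lists.

```python
def gtf_seqnames(lines):
    """Collect column 1 (the sequence name) from GTF lines, skipping comments and blanks."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def fasta_seqnames(lines):
    """Collect names from FASTA headers: the text after '>' up to the first whitespace."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def shared_seqnames(gtf_lines, fasta_lines):
    """Names present in both files; an empty set means the two sources do not match."""
    return gtf_seqnames(gtf_lines) & fasta_seqnames(fasta_lines)


# Toy records (hypothetical): Ensembl-style names are "1", "2"; UCSC-style are "chr1", "chr2".
ensembl_gtf = [
    "#!toy header line",
    '1\ttoy\tgene\t100\t200\t.\t+\t.\tgene_id "toy_gene_a";',
    '2\ttoy\tgene\t300\t400\t.\t-\t.\tgene_id "toy_gene_b";',
]
ucsc_fasta = [">chr1", "ACGT", ">chr2", "TTGA"]
ensembl_fasta = [">1 toy description", "ACGT", ">2 toy description", "TTGA"]

if __name__ == "__main__":
    print("Ensembl GTF vs UCSC FASTA:", sorted(shared_seqnames(ensembl_gtf, ucsc_fasta)))
    print("Ensembl GTF vs Ensembl FASTA:", sorted(shared_seqnames(ensembl_gtf, ensembl_fasta)))
```

An empty overlap between the annotation and the genome is a strong hint that files from different sources were mixed.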

ADD REPLY • link updated 22 months ago by Ram 43k • written 9.0 years ago by Devon Ryan 104k

Hello,

I redid all of my work with the GTF from Ensembl, from here: ftp://ftp.ensembl.org/pub/release-79/gtf/homo_sapiens

I ran into a problem at the cuffnorm step. The error said: (8 transcripts) does not match GTF (7 transcripts).

I did all of the work with the same GTF, so I find this strange. Earlier, when I used the GTF file from UCSC, cuffnorm ran successfully. Where do you think I made a mistake? Thank you.

ADD REPLY • link updated 22 months ago by Ram 43k • written 9.0 years ago by bharata1803 ▴ 560

I've never seen that error before. Perhaps you can ask the authors of the tool.

ADD REPLY • link 9.0 years ago by Devon Ryan 104k

It seems that I used both the genome FASTA and the gene GTF from UCSC before; after I changed the GTF file to the one from Ensembl, the error appeared. Thank you for your help.

ADD REPLY • link 9.0 years ago by bharata1803 ▴ 560

Did you ever discover a solution to this? I'm getting "reconstituted expression bundle (1 transcripts) does not match ( 2 transcripts):....."

ADD REPLY • link 5.2 years ago by Bioinformatics-man • 0