Question: RNA_seq analysis cuffdiff output
0
gravatar for sophialovechan
19 months ago by
sophialovechan30 wrote:

Hi everyone, I am running cufflinks and cuffdiff to get differential expressed genes for two cells. But after cuffdiff, the output from gene.exp.diff is weird. There are several gene ids for the same gene and they have the same locus. I tried to use two different annotation genome, both hg19 and the one from Ensembl. Here is an example of my output.

XLOC_023143 XLOC_023143 NANOG   12:7940389-7949298  H9_0    H9_2    OK  4.17005 6.76886
XLOC_026123 XLOC_026123 NANOG   12:7940389-7949298  H9_0    H9_2    OK  1.30734 1.4116
XLOC_026124 XLOC_026124 NANOG   12:7940389-7949298  H9_0    H9_2    OK  43.3278 17.506
XLOC_026125 XLOC_026125 NANOG   12:7940389-7949298  H9_0    H9_2    OK  22.7351 14.8569

Here is my scripts. cufflinks

module load anaconda/2.2.0 boost/1.47.0/gcc447 samtools/0.1.19 cufflinks/2.2.1
time cufflinks -p 8 --library-type fr-firststrand -o H9_0_1_clout ./H9_0_1/accepted_hits.bam

cuffdiff

module load anaconda/2.2.0 boost/1.47.0/gcc447 samtools/0.1.19 cufflinks/2.2.1
time cuffdiff -o diff_D0_D2 -p 4 \
-b /lustre1/tw30282/annotation/GRCh37_iGenome/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome.fa \
-L H9_0,H9_2 -u ./merged_asm/merged.gtf \
./H9_0_rep1_en/accepted_hits.bam,./H9_0_rep2_en/accepted_hits.bam \
./H9_2_rep1_en/accepted_hits.bam,./H9_0_rep2_en/accepted_hits.bam

Can anyone give me a hint about the problems? Thanks.

rna-seq • 746 views
ADD COMMENTlink modified 19 months ago by andrew.j.skelton735.5k • written 19 months ago by sophialovechan30
1
gravatar for andrew.j.skelton73
19 months ago by
London
andrew.j.skelton735.5k wrote:

The XLOC ID is the arbitrary window that Cuffdiff spans when performing the differential expression test, hence the two different XLOC codes are either overlapping or close to NANOG. As far as I know Cuffdiff has been deprecated / is in low maintenance. You should be using the replacement protocol with Stringtie and Ballgown, or alternative methods such as Salmon, tximport and DESeq2.

ADD COMMENTlink written 19 months ago by andrew.j.skelton735.5k

Thank you for your quick response. I'd like to try it out. Do you know how I can extract a small set of sequencing data from my whole RNA-sequencing dataset? I want to try it with s subset of my data first.

ADD REPLYlink written 19 months ago by sophialovechan30

See: C: How can I create a dummy dataset from my NGS whole-exome data? Use the method described and take a random sample(s) to prevent any bias.

ADD REPLYlink modified 19 months ago • written 19 months ago by genomax63k

Hi I runed my data with Stringtie and Ballgown again but I still got the same problem. That's weird. Before I move to DESeq2, do you think there might be something possibly wrong with the previous steps?

ADD REPLYlink written 19 months ago by sophialovechan30
1

It's not a "problem", as much as a feature of the software. This only occurs when performing novel discovery, so it might be worth just running Stringtie with novel discovery turned off.

ADD REPLYlink written 19 months ago by andrew.j.skelton735.5k

I used -e argument when I runned stringtie which should only map the transcripts which have references, right?

ADD REPLYlink written 19 months ago by sophialovechan30
1

Ok, in that case maybe it's a feature of Ballgown. When -e is used, it should only quantify against the reference provided with -G. Have you tried the DESeq2 approach from Stringtie described here? - I've used the DESeq2 approach recently and it provided the annotation I gave it with -G as the output from prepDE.py.

ADD REPLYlink written 19 months ago by andrew.j.skelton735.5k

Thank you. So prepDE.py is a stringtie built-in package. I should run it after I generated ballgown files. Should I write my scripts like this?

module load stringtie/1.3.3b prepDE.py -l 101 -c -i sample_lst.txt -g annotation.gff

ADD REPLYlink modified 19 months ago • written 19 months ago by sophialovechan30
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1128 users visited in the last hour