Question: Zero expression in cuffdiff
bioinfo_ga0 wrote:

I aligned 150*2 reads to a reference using the TopHat pipeline. I am using Cuffdiff to calculate expression, but I am getting 0 expression (as shown below) for both the control and the treated samples.

test_id: TCONS_00000001
gene_id: XLOC_000001
gene: -
locus: 1:5256-11783
sample_1: 1A
sample_2: 1B
status: NOTEST
value_1: 0
value_2: 0
log2(fold_change): 0
test_stat: 0
p_value: 1
q_value: 1
significant: no
rna-seq cuffdiff • 755 views
modified 22 months ago by genomax67k • written 22 months ago by bioinfo_ga0

This is really hard to read. Could you reformat this data frame?

written 22 months ago by firatuyulur280

Is this the case for all genes?

written 22 months ago by WouterDeCoster39k

Yes, this is the case for all genes.

written 22 months ago by bioinfo_ga0
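
One way to confirm that every row is affected is to tally the status column of the Cuffdiff table. This is a minimal sketch, assuming the file is tab-separated with a header row containing a `status` field, as in the record posted above; `status_counts` is a hypothetical helper name, not part of Cufflinks.

```shell
# Tally the values of the "status" column in a tab-separated Cuffdiff table.
# status_counts is a hypothetical helper; the column is located by its header
# name rather than by a fixed position.
status_counts() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
        col     { n[$col]++ }                      # count each status value
        END     { for (s in n) print s "\t" n[s] } # print "status<TAB>count"
    ' "$1" | sort
}
```

For example, `status_counts Cuffdiff/1A_Vs_1B/isoform_exp.diff` (path assumed from the `-o` directory in the command posted in this thread) prints one line per status value with its count. If NOTEST is the only value reported, the problem is global rather than specific to a few transcripts.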

I would recommend you post the command-lines used to help troubleshoot your issue.

written 22 months ago by h.mon25k

HISAT was used for alignment, and the sorted SAM file was then used for Cufflinks.

Cufflinks:

cufflinks -G Reference.gff -o Cufflinks/Set2 -p 25 --library-type fr-firststrand --upper-quartile-norm  1A_s.sam

Cuffdiff:

cuffdiff -u -p 25 --library-type fr-firststrand --total-hits-norm -o Cuffdiff/1A_Vs_1B Cuffmerge/merged.gtf  1A_s.sam.gz 1B_s.sam.gz
modified 22 months ago by genomax67k • written 22 months ago by bioinfo_ga0

1) What about cufflinks? Did you also get zero expression in the cufflinks output file?

2) Why did you use Reference.gff for cufflinks and Cuffmerge/merged.gtf for cuffdiff?

3) I suspect that you mapped against one reference and ran cuffdiff with a GFF built on a different one. Can you post the first five lines of the SAM file you used?

modified 22 months ago • written 22 months ago by Fabio Marroni2.2k

1) Cufflinks gives me non-zero expression values.

2) I have tried both Reference.gff and a copy of it converted to GTF format; both gave the same result.

3) The reference and the GFF are the same. The first five lines of the SAM file are as follows:

NS500223:206:HJ7TTBGXY:3:11605:17662:2653       73      10      27880   60      4S146M  =       27880   0       CTCCTCTTATAATATCT
NS500223:206:HJ7TTBGXY:3:11605:17662:2653       133     10      27880   0       *       =       27880   0       ATAGCGATTGCATTTTT
NS500223:206:HJ7TTBGXY:4:22502:21548:10964      163     10      33791   60      150M    =       34065   424     CTGGAAGTCATCGAACC
NS500223:206:HJ7TTBGXY:2:12301:7064:10743       419     10      33901   1       148M2S  =       34022   271     CTCCTTTCTTGAAGAAC
NS500223:206:HJ7TTBGXY:2:12301:7064:10743       339     10      34022   1       150M    =       33901   -271    CAGAAGCTCTGATGTGA
modified 22 months ago by genomax67k • written 22 months ago by bioinfo_ga0

Cool... moving forward! The first lines of the .sam file show reads aligned against a sequence named "10". Do you have any entry named "10" in the Cuffmerge/merged.gtf file? If not, this is the problem. You could extract the third column of the SAM/BAM file (or read the BAM header, which could be faster) to list the names of the sequences you aligned against, then check that exactly the same names are present in the Cuffmerge/merged.gtf file.

BTW: why did you use -G in cufflinks (i.e. rely completely on the existing annotation) and then run cuffdiff on a merged GTF?

written 22 months ago by Fabio Marroni2.2k
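
The name check suggested above can be sketched with standard shell tools. This is only an illustration, assuming an uncompressed, tab-separated SAM file and a tab-separated GTF/GFF; the function names are hypothetical helpers, not part of any toolkit.

```shell
# Unique reference names (column 3, RNAME) of the alignments in a SAM file.
# Header lines start with "@"; reads with no reference have "*" as RNAME.
sam_refnames() {
    awk -F'\t' '$0 !~ /^@/ && $3 != "*" { print $3 }' "$1" | sort -u
}

# Unique sequence names (column 1) of a GTF/GFF file, skipping "#" comments.
gtf_seqnames() {
    awk -F'\t' '$0 !~ /^#/ && NF > 1 { print $1 }' "$1" | sort -u
}

# Names used in the SAM file ($1) that never occur in the annotation ($2).
names_missing_from_gtf() {
    awk -F'\t' '
        # First operand (the annotation): remember every sequence name.
        FILENAME == ARGV[1] { if ($0 !~ /^#/ && NF > 1) in_gtf[$1] = 1; next }
        # Second operand (the SAM file): report each unknown name once.
        $0 !~ /^@/ && $3 != "*" && !($3 in in_gtf) && !seen[$3]++ { print $3 }
    ' "$2" "$1"
}
```

For example, `names_missing_from_gtf 1A_s.sam Cuffmerge/merged.gtf` prints nothing when every name in the alignments is also present in the annotation. A gzipped file such as 1A_s.sam.gz would need to be decompressed first (for instance with `gzip -cd`), and for BAM input the sequence names can be read from the `@SQ` lines shown by `samtools view -H`.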