Zero expression in cuffdiff
4.3 years ago
bioinfo_ga ▴ 20

I performed a reference-based alignment with the TopHat pipeline on 2x150 reads. I am using Cuffdiff to calculate expression, but I am getting zero expression (as shown below) for both the control and the treated samples.

test_id: TCONS_00000001
gene_id: XLOC_000001
gene: -
locus: 1:5256-11783
sample_1: 1A
sample_2: 1B
status: NOTEST
value_1: 0
value_2: 0
log2(fold_change): 0
test_stat: 0
p_value: 1
q_value: 1
significant: no

RNA-Seq cuffdiff • 1.5k views

It is really hard to read this. Could you edit this dataframe?

Is this the case for all genes?
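
One quick way to answer this is to tally the `status` column and count the rows where both values are zero. The following is a minimal sketch, assuming the Cuffdiff output is tab-separated with a header row that uses the column names shown in the question; the function name `summarize_diff` and the example rows are illustrative only.

```python
import csv
from collections import Counter


def summarize_diff(lines):
    """Tally test statuses and count rows where both sample values are zero.

    `lines` is any iterable of tab-separated text lines whose first line is
    a header containing 'status', 'value_1' and 'value_2' (assumed layout).
    """
    reader = csv.DictReader(lines, delimiter="\t")
    statuses = Counter()
    total = 0
    zero_rows = 0
    for row in reader:
        total += 1
        statuses[row["status"]] += 1
        if float(row["value_1"]) == 0 and float(row["value_2"]) == 0:
            zero_rows += 1
    return total, zero_rows, statuses


# Example input built from the single record posted in the question.
header = "\t".join([
    "test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
    "value_1", "value_2", "log2(fold_change)", "test_stat", "p_value",
    "q_value", "significant",
])
record = "\t".join([
    "TCONS_00000001", "XLOC_000001", "-", "1:5256-11783", "1A", "1B",
    "NOTEST", "0", "0", "0", "0", "1", "1", "no",
])

total, zero_rows, statuses = summarize_diff([header, record])
print(f"{zero_rows} of {total} rows have zero expression in both samples")
print(dict(statuses))
```

For a real run, pass an open file handle for the Cuffdiff output instead of the example list.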

Yes, this is the case for all the genes.

I would recommend you post the command-lines used to help troubleshoot your issue.

HISAT was used for alignment, and the sorted SAM file was then used for Cufflinks:

cufflinks -G Reference.gff -o Cufflinks/Set2 -p 25 --library-type fr-firststrand --upper-quartile-norm  1A_s.sam


Cuffdiff:

cuffdiff -u -p 25 --library-type fr-firststrand --total-hits-norm -o Cuffdiff/1A_Vs_1B Cuffmerge/merged.gtf  1A_s.sam.gz 1B_s.sam.gz

2) Why did you use Reference.gff for cufflinks and Cuffmerge/merged.gtf for cuffdiff?

3) I suspect that you mapped against one reference and ran cuffdiff with a GFF from a different one. Can you send us the first five lines of the SAM file you used?

1) Cufflinks is giving me non-zero expression values. 2) I have tried Reference.gff as well as a version converted to GTF format; both gave the same result. 3) The reference and the GFF are the same. The first five lines of the SAM file are as follows:

NS500223:206:HJ7TTBGXY:3:11605:17662:2653       73      10      27880   60      4S146M  =       27880   0       CTCCTCTTATAATATCT
NS500223:206:HJ7TTBGXY:3:11605:17662:2653       133     10      27880   0       *       =       27880   0       ATAGCGATTGCATTTTT
NS500223:206:HJ7TTBGXY:4:22502:21548:10964      163     10      33791   60      150M    =       34065   424     CTGGAAGTCATCGAACC
NS500223:206:HJ7TTBGXY:2:12301:7064:10743       419     10      33901   1       148M2S  =       34022   271     CTCCTTTCTTGAAGAAC
NS500223:206:HJ7TTBGXY:2:12301:7064:10743       339     10      34022   1       150M    =       33901   -271    CAGAAGCTCTGATGTGA

Cool... moving forward! I can see that in the first lines of the .sam file the reads are aligned against a sequence named "10". Do you have any entry named "10" in the Cuffmerge/merged.gtf file? If not, this is the problem. What you could do is extract the third column of the SAM/BAM file (or read its header, which could actually be faster) to list the names of the sequences you aligned against, and then check whether exactly the same names are present in the Cuffmerge/merged.gtf file.

BTW: why did you use -G in cufflinks (i.e. rely completely on the existing annotation) and then run cuffdiff on a merged GTF?
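
The name check described above can be sketched in Python as follows. This is only an illustration, not part of Cufflinks: the helper names `sam_reference_names` and `gtf_sequence_names` are made up here, the SAM lines are shortened versions of the ones posted (tab-separated, as in a real SAM file), and the GTF line with `chr10` is a hypothetical example of a naming mismatch.

```python
def sam_reference_names(lines):
    """Collect reference names from SAM text: @SQ SN: tags and column 3."""
    names = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            # Header line: only @SQ records carry a sequence name (SN:).
            if fields[0] == "@SQ":
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        names.add(field[3:])
            continue
        # Alignment line: column 3 is RNAME, '*' means unmapped.
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def gtf_sequence_names(lines):
    """Collect the first column (seqname) of every GTF/GFF feature line."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


# Shortened alignment lines from the thread.
sam_lines = [
    "NS500223:206:HJ7TTBGXY:3:11605:17662:2653\t73\t10\t27880\t60\t4S146M\t=\t27880\t0\tCTCC",
    "NS500223:206:HJ7TTBGXY:4:22502:21548:10964\t163\t10\t33791\t60\t150M\t=\t34065\t424\tCTGG",
]
# Hypothetical GTF line, used only to show what a mismatch looks like.
gtf_lines = [
    'chr10\tCufflinks\texon\t27880\t28025\t.\t+\t.\tgene_id "XLOC_000001";',
]

sam_names = sam_reference_names(sam_lines)
gtf_names = gtf_sequence_names(gtf_lines)
print("only in SAM:", sorted(sam_names - gtf_names))
print("only in GTF:", sorted(gtf_names - sam_names))
print("shared:", sorted(sam_names & gtf_names))
```

With real data, pass open file handles instead of the example lists; for a gzipped SAM such as 1A_s.sam.gz, `gzip.open(path, "rt")` from the standard library returns a text handle that works the same way. If the "shared" list is empty, the alignments and the annotation use different sequence names.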