Off topic: Problems with cuffdiff output
9.4 years ago
512788522 ▴ 20

After running TopHat, I ran the following command directly (I have no replicates for either condition, i.e. normal versus mutant):

cuffdiff -p 8 \
  -o ./cuffdiff_out \
  -L normal,mut \
  -u ./genes.gtf \
  ./tophat/accepted_hits.bam \
  ./tophat_mut/accepted_hits.bam

I also performed the differential expression analysis in the following way:

cuffmerge -g ./genes.gtf \
  -p 8 \
  -o ./cuffmerge_out \
  -s ./ucsc.hg19.fa assembly.txt
cuffdiff -p 8 \
  -o ./diff_out \
  -L normal,mut \
  -u ./cuffmerge_out/merged.gtf \
  ./tophat/accepted_hits.bam \
  ./tophat_mut/accepted_hits.bam

The two outputs differed greatly. With the first approach, gene_exp.diff looked like this:

test_id   gene_id   gene      locus                    sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value  q_value   significant
A1BG      A1BG      A1BG      chr19:58858171-58874214  normal    mut       OK      33.5662  14.8501  -1.17654           -0.611634  0.50845  0.867476  no
A1BG-AS1  A1BG-AS1  A1BG-AS1  chr19:58858171-58874214  normal    mut       OK      1.17516  1.66153  0.499655           0.071299   0.84635  0.94951   no
A1CF      A1CF      A1CF      chr10:52559168-52645435  normal    mut       NOTEST  0        0        0                  0          1        1         no

but with the second approach, it looked like this:

test_id      gene_id      gene                       locus               sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value  q_value  significant
XLOC_000001  XLOC_000001  DDX11L1                    chr1:11873-29370    normal    mut       NOTEST  0        0        0                  0          1        1        no
XLOC_000002  XLOC_000002  OR4F5                      chr1:69090-70008    normal    mut       NOTEST  0        0        0                  0          1        1        no
XLOC_000003  XLOC_000003  LOC100132062,LOC100133331  chr1:323891-328581  normal    mut       NOTEST  0        0        0                  0          1        1        no
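
Since all three rows shown from the second run are NOTEST, it may help to tally the status column of each gene_exp.diff before comparing anything else. The following is only a rough sketch: it assumes the files are tab-separated with the header shown above, and `status_counts` is a helper name made up for illustration, not part of Cufflinks.

```python
import csv
from collections import Counter


def status_counts(lines):
    """Tally the `status` column (OK, NOTEST, ...) of a cuffdiff *.diff table.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumes tab-separated columns and a header row containing `status`.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter(row["status"] for row in reader)


# Usage, with the output directories from the two commands above:
#
# with open("./cuffdiff_out/gene_exp.diff", newline="") as handle:
#     print(status_counts(handle))
# with open("./diff_out/gene_exp.diff", newline="") as handle:
#     print(status_counts(handle))
```

If the two runs have very different proportions of OK and NOTEST rows, that alone would shift the q-values, because fewer or more genes enter the multiple-testing correction.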

In the second output the test_id and gene_id values look strange, and the numbers of differentially expressed genes detected were 16 and 51, respectively. Why is there such a large difference? What caused the discrepancy in the final results?
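
To see which genes account for the 16 versus 51, the two result files can be compared by the gene column, since that column carries gene names in both outputs even where test_id is an XLOC_ identifier. This is a minimal sketch under several assumptions: the files are tab-separated, differentially expressed rows are marked `yes` in the significant column, and a comma-separated gene entry is split into individual names. `significant_genes` and `compare_runs` are hypothetical helper names, not Cufflinks functions.

```python
import csv


def significant_genes(lines):
    """Return the set of gene names on rows where `significant` is "yes".

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    genes = set()
    for row in csv.DictReader(lines, delimiter="\t"):
        if row["significant"] == "yes":
            # One locus may list several names, e.g. "LOC100132062,LOC100133331".
            # Assumption: "-" means the locus has no gene name, so it is skipped.
            genes.update(
                name for name in row["gene"].split(",") if name and name != "-"
            )
    return genes


def compare_runs(lines_first, lines_second):
    """Split the significant gene names into shared and run-specific sets."""
    first = significant_genes(lines_first)
    second = significant_genes(lines_second)
    return {
        "both": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


# Usage, with the output directories from the two commands above:
#
# with open("./cuffdiff_out/gene_exp.diff", newline="") as a, \
#         open("./diff_out/gene_exp.diff", newline="") as b:
#     for label, names in compare_runs(a, b).items():
#         print(label, len(names), sorted(names))
```

Because one XLOC_ row can cover several gene names, the counts from this comparison will not necessarily add up to 16 and 51; it is meant to show the overlap, not to reproduce the totals.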

Thanks!

RNA-Seq • 2.5k views