11.9 years ago
bio monkey
I'm trying to run cuffmerge on the transcripts.gtf files that cufflinks produced for two samples.
I ran
cuffmerge assemblies.txt
where assemblies.txt contained
./A/transcripts.gtf
./B/transcripts.gtf
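For reference, a manifest like this can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the original run: `write_manifest` is a hypothetical helper name, and the temporary directory stands in for the real working directory. It only writes the file and builds the command shown above; it does not execute cuffmerge.

```python
import tempfile
from pathlib import Path


def write_manifest(gtf_paths, manifest_path):
    """Write one GTF path per line and return the cuffmerge command as a list."""
    manifest_path = Path(manifest_path)
    manifest_path.write_text("".join(f"{p}\n" for p in gtf_paths))
    # Same invocation as in the post: cuffmerge <manifest>
    return ["cuffmerge", str(manifest_path)]


# Assumption: a temporary directory is used here purely for illustration.
workdir = Path(tempfile.mkdtemp())
command = write_manifest(
    ["./A/transcripts.gtf", "./B/transcripts.gtf"],
    workdir / "assemblies.txt",
)
manifest_lines = (workdir / "assemblies.txt").read_text().splitlines()
```

The returned list could be passed to `subprocess.run` once the paths have been checked.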
The isoform FPKM tracking output that cufflinks produced for these two samples looks like this:
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
CUFF.1.1 - - CUFF.1 - - ERCC-00003:3-1058 1055 401.337 16651.1 15842.2 17460 OK
CUFF.2.1 - - CUFF.2 - - ERCC-00004:0-597 597 2845.28 118048 114864 121231 OK
CUFF.3.1 - - CUFF.3 - - ERCC-00009:0-999 999 267.03 11078.8 10396 11761.6 OK
CUFF.4.1 - - CUFF.4 - - ERCC-00019:4-600 596 19.7057 817.567 552.313 1082.82 OK
CUFF.5.1 - - CUFF.5 - - ERCC-00022:6-751 745 77.8041 3228.01 2780.37 3675.66 OK
CUFF.6.1 - - CUFF.6 - - ERCC-00025:3-1948 1945 7.95538 330.06 250.591 409.529 OK
CUFF.7.1 - - CUFF.7 - - ERCC-00034:8-980 972 9.71542 403.082 270.55 535.615 OK
CUFF.8.1 - - CUFF.8 - - ERCC-00035:2-1069 1067 30.8168 1278.56 1055.99 1501.12 OK
CUFF.9.1 - - CUFF.9 - - ERCC-00042:1-1045 1044 198.638 8241.3 7668.49 8814.11 OK
and
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
CUFF.1.1 - - CUFF.1 - - ERCC-00003:3-1018 1015 478.931 17910.6 17093.9 18727.2 OK
CUFF.2.1 - - CUFF.2 - - ERCC-00004:0-598 598 3221.58 120477 117424 123531 OK
CUFF.3.1 - - CUFF.3 - - ERCC-00009:3-1005 1002 298.562 11165.3 10515.2 11815.4 OK
CUFF.4.1 - - CUFF.4 - - ERCC-00019:26-614 588 16.4694 615.906 394.666 837.145 OK
CUFF.5.1 - - CUFF.5 - - ERCC-00022:4-743 739 99.3447 3715.2 3256.15 4174.25 OK
CUFF.6.1 - - CUFF.6 - - ERCC-00025:470-1946 1476 10.2811 384.482 289.104 479.861 OK
CUFF.7.1 - - CUFF.7 - - ERCC-00034:10-980 970 8.96558 335.286 220.284 450.288 OK
CUFF.8.1 - - CUFF.8 - - ERCC-00035:12-1099 1087 28.0996 1050.84 861.34 1240.34 OK
CUFF.9.1 - - CUFF.9 - - ERCC-00042:0-1015 1015 217.56 8136.1 7585.68 8686.51 OK
But the isoforms.fpkm_tracking file in the cuffmerge output looks like this:
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
CUFF.1.1 - - CUFF.1 - - ERCC-00002:1-1132 1131 0.000108688 1 0 0 OK
CUFF.2.1 - - CUFF.2 - - ERCC-00003:3-1058 1055 0.000100395 1 0 0 OK
CUFF.3.1 - - CUFF.3 - - ERCC-00004:0-598 598 5.79575e-05 1 0 0 OK
CUFF.4.1 - - CUFF.4 - - ERCC-00009:0-1005 1005 9.70485e-05 1 0 0 OK
CUFF.5.1 - - CUFF.5 - - ERCC-00019:4-614 610 5.7424e-05 1 0 0 OK
CUFF.6.1 - - CUFF.6 - - ERCC-00022:4-751 747 7.1974e-05 1 0 0 OK
CUFF.7.1 - - CUFF.7 - - ERCC-00025:3-1948 1945 0.000165918 1 0 0 OK
Notice that every FPKM value is 1 and the coverage values are tiny. I was under the impression that the merged isoforms would have FPKM values similar to the originals, something like the average across the original samples.
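To make that expectation concrete, here is a rough Python sketch of a per-contig average across the two cufflinks tracking outputs. It illustrates only the averaging described above, not anything cuffmerge itself computes. The function names `fpkm_by_contig` and `mean_fpkm` are hypothetical, and matching transcripts by contig name (the locus text before the colon) is an assumption made because the locus coordinates differ slightly between samples. The rows are the first two from each table above.

```python
from collections import defaultdict

HEADER = (
    "tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id "
    "locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status"
)


def fpkm_by_contig(tracking_text):
    """Map contig name (locus text before ':') to the FPKM values found on it."""
    lines = tracking_text.strip().splitlines()
    header = lines[0].split()
    locus_col = header.index("locus")
    fpkm_col = header.index("FPKM")
    values = defaultdict(list)
    for line in lines[1:]:
        fields = line.split()  # works for tab- or space-separated rows
        contig = fields[locus_col].rsplit(":", 1)[0]
        values[contig].append(float(fields[fpkm_col]))
    return values


def mean_fpkm(*tracking_texts):
    """Average FPKM per contig across samples, keeping contigs seen in all of them."""
    per_sample = [
        {contig: sum(vals) for contig, vals in fpkm_by_contig(text).items()}
        for text in tracking_texts
    ]
    shared = set.intersection(*(set(d) for d in per_sample))
    return {c: sum(d[c] for d in per_sample) / len(per_sample) for c in sorted(shared)}


sample_a = "\n".join([
    HEADER,
    "CUFF.1.1 - - CUFF.1 - - ERCC-00003:3-1058 1055 401.337 16651.1 15842.2 17460 OK",
    "CUFF.2.1 - - CUFF.2 - - ERCC-00004:0-597 597 2845.28 118048 114864 121231 OK",
])
sample_b = "\n".join([
    HEADER,
    "CUFF.1.1 - - CUFF.1 - - ERCC-00003:3-1018 1015 478.931 17910.6 17093.9 18727.2 OK",
    "CUFF.2.1 - - CUFF.2 - - ERCC-00004:0-598 598 3221.58 120477 117424 123531 OK",
])

averages = mean_fpkm(sample_a, sample_b)
for contig, value in averages.items():
    print(contig, round(value, 2))
```

For ERCC-00003 this gives roughly 17280.85, which is the order of magnitude I expected to see in the merged output instead of 1.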
Does anyone know how to fix this?