Problem In Cuffdiff Output
10.9 years ago

I have a problem with my cuffdiff output. I used this command for replicates:

cuffdiff -o diff_out -b genome/genome.fa -p 10 -L CT,SS --total-hits-norm -u cuffmerge_experiment/merged.gtf tophat_exp/tophat_exp1/accepted_hits.bam,tophat_exp/tophat_exp2/accepted_hits.bam tophat_exp/tophat_exp3/accepted_hits.bam,tophat_exp/tophat_exp4/accepted_hits.bam

In my output file, the gene column is empty:

TCONS_00000001  XLOC_000001     -       C11044824:403-1007      CT      SS      OK      4.25634 4.05292 -0.0706522      0.148488        0.881958        0.999985        no
TCONS_00000002  XLOC_000002     -       C11047824:1-1042        CT      SS      OK      4.96333 5.54938 0.161018        -0.40853        0.682884        0.999985        no
TCONS_00000003  XLOC_000003     -       C11048020:0-588 CT      SS      OK      3.66925 4.25115 0.212368        -0.422853       0.672403        0.999985        no
TCONS_00000004  XLOC_000004     -       C11048338:216-588       CT      SS      OK      0.795992        0.604691        -0.396557       0.258201        0.796252        0.999985        no
TCONS_00000005  XLOC_000005     -       C11049822:426-1035      CT      SS      OK      0.359691        0.790309        1.13566 -0.675737       0.499208        0.999985        no

I can't understand how this is possible. If anyone has a suggestion, please guide me.

Up to the cuffmerge step, I can see the gene in my file, like this:

C11111420       Cufflinks       exon    726     1304    .       -       .       gene_id "XLOC_000033"; transcript_id "TCONS_00000034"; exon_number "1"; oId "Ca_28192"; nearest_ref "Ca_28192"; class_code "="; tss_id "TSS33"; p_id "P11";
C11112120       Cufflinks       exon    391     1218    .       +       .       gene_id "XLOC_000034"; transcript_id "TCONS_00000035"; exon_number "1"; oId "Ca_27742"; nearest_ref "Ca_27742"; class_code "="; tss_id "TSS34"; p_id "P12";
C11112120       Cufflinks       exon    1577    1657    .       +       .       gene_id "XLOC_000034"; transcript_id "TCONS_00000035"; exon_number "2"; oId "Ca_27742"; nearest_ref "Ca_27742"; class_code "="; tss_id "TSS34"; p_id "P12";
C11112120       Cufflinks       exon    1929    2126    .       -       .       gene_id "XLOC_000035"; transcript_id "TCONS_00000036"; exon_number "1"; oId "Ca_27743"; nearest_ref "Ca_27743"; class_code "="; tss_id "TSS35"; p_id "P13";
C11115550       Cufflinks       exon    6       533     .       +       .       gene_id "XLOC_000036"; transcript_id "TCONS_00000037"; exon_number "1"; oId "Ca_27676"; nearest_ref "Ca_27676"; class_code "="; tss_id "TSS36"; p_id "P14";
C11115550       Cufflinks       exon    2561    2800    .       -       .       gene_id "XLOC_000037"; transcript_id "TCONS_00000038"; exon_number "1"; oId "Ca_27677"; nearest_ref "Ca_2767

So how is it possible that I can't find it in the cuffdiff output? Please contact me at my email address: thakkar.bjl@gmail.com

cuffdiff cuffmerge • 4.3k views
10.9 years ago
Fred ▴ 790

It seems that you are not using a reference GFF/GTF file containing known transcripts and gene names.

You should use a GFF/GTF file (for example, the genes.gtf file in one of the archives here: http://tophat.cbcb.umd.edu/igenomes.shtml) in the cuffcompare step:

cuffcompare -r genes.gtf <input1.gtf> [<input2.gtf> .. <inputN.gtf>]

It produces a cuffcmp.combined.gtf file that you can use in the cuffdiff step.
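Before running cuffdiff, it may help to check whether the GTF you pass to it actually carries gene names. As far as I understand, the gene column is filled from a `gene_name` attribute, so this is a minimal sketch under that assumption. The helper names `gtf_attributes` and `count_gene_names` are made up for illustration, and the code assumes a tab-delimited GTF. The sample record is copied from the merged.gtf excerpt in the question.

```python
import re

# Matches GTF attribute pairs such as: gene_id "XLOC_000033";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def gtf_attributes(line):
    """Return the attribute column (9th field) of one GTF line as a dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return {}
    return dict(_ATTR_RE.findall(fields[8]))


def count_gene_names(lines):
    """Return (records seen, records that carry a gene_name attribute)."""
    total = with_name = 0
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        total += 1
        if "gene_name" in gtf_attributes(line):
            with_name += 1
    return total, with_name


# One record copied from the merged.gtf excerpt in the question.
sample = "\t".join([
    "C11111420", "Cufflinks", "exon", "726", "1304", ".", "-", ".",
    'gene_id "XLOC_000033"; transcript_id "TCONS_00000034"; exon_number "1"; '
    'oId "Ca_28192"; nearest_ref "Ca_28192"; class_code "="; tss_id "TSS33"; '
    'p_id "P11";',
])

print(count_gene_names([sample]))  # (1, 0): one record, no gene_name
```

To scan a whole file, pass an open file handle instead of the list, for example `count_gene_names(open("cuffcmp.combined.gtf"))`. If the second number is zero, an empty gene column in the cuffdiff output would be expected.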


I also used the cuffcmp.combined.gtf file for that. I ran it with the .gff file, the reference FASTA file, and my Cufflinks output transcripts.gtf file, but it gives the same result: the gene column still shows "-". I have tried a lot. Thanks for your reply.


My GFF file looks like this:

Ca8     GLEAN   mRNA    37786   42033   0.760266        -       .       ID=Ca_11937;evid_id=GAR_10012294;
Ca8     GLEAN   CDS     41914   42033   .       -       0       Parent=Ca_11937;
Ca8     GLEAN   CDS     41781   41843   .       -       0       Parent=Ca_11937;
Ca8     GLEAN   CDS     41556   41690   .       -       0       Parent=Ca_11937;
Ca8     GLEAN   CDS     41300   41473   .       -       0       Parent=Ca_11937;
Ca8     GLEAN   CDS     41118   41222   .       -       0       Parent=Ca_11937;
Ca8     GLEAN   CDS     40879   40912   .       -       0       Parent=Ca_11937;

Your GFF file seems to have been generated automatically by a gene prediction program, so it does not contain any gene annotation (such as gene name, gene ID, etc.).
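If no annotation with gene names is available, one possible workaround is to label the rows yourself with the reference IDs (the `Ca_...` values in `nearest_ref`) that cuffmerge already wrote into the merged GTF. This is only a sketch, not something cuffdiff does for you. The function names are hypothetical, and the code assumes tab-delimited files with `gene_id` in the second column and `gene` in the third column of the cuffdiff table, as in the output pasted in the question.

```python
import re

# Matches GTF attribute pairs such as: nearest_ref "Ca_28192";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def xloc_to_nearest_ref(gtf_lines):
    """Map each gene_id (XLOC_...) to the set of nearest_ref IDs seen for it."""
    mapping = {}
    for line in gtf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(_ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs and "nearest_ref" in attrs:
            mapping.setdefault(attrs["gene_id"], set()).add(attrs["nearest_ref"])
    return mapping


def fill_gene_column(diff_lines, mapping):
    """Yield cuffdiff rows, replacing a "-" gene column with reference IDs."""
    for line in diff_lines:
        cols = line.rstrip("\n").split("\t")
        # Assumed layout: cols[1] is gene_id, cols[2] is gene.
        if len(cols) > 2 and cols[2] == "-" and cols[1] in mapping:
            cols[2] = ",".join(sorted(mapping[cols[1]]))
        yield "\t".join(cols)
```

In practice you would build the mapping from `cuffmerge_experiment/merged.gtf` and then pass the lines of the cuffdiff table through `fill_gene_column`, writing the result to a new file. Rows whose XLOC ID has no `nearest_ref` are left unchanged, and the header row is untouched because its gene column is not "-".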
