Question: Cufflinks output FPKM values
13 months ago, blooming.daisy333 wrote:

I ran Cufflinks on MapSplice alignment output and got three output files. Why do I get FPKM values of 0 for some entries but not for others, while the last column (FPKM_status) shows OK? Also, why are some fields (tracking_id, class_code, etc.) missing in isoforms.fpkm_tracking and genes.fpkm_tracking? The command line used and excerpts of the output files are given below:

./cufflinks -o /data/memona/cufflinks-2.2.1.Linux_x86_64/result/ -p 64 -G /data/memona/annotation/arboreum.gff3 -b /data/memona/reference/chromosome.fa -u /data/memona/cufflinks-2.2.1.Linux_x86_64/alignmentMap_sorted.sam

isoforms.fpkm_tracking

Cotton_A_36275_BGI-A2_v1.0      -       -               -       -       chr9:95929518-95929903  243     0       0       0       0       OK
Cotton_A_40148_BGI-A2_v1.0      -       -               -       -       chr9:96215509-96219630  783     0       0       0       0       OK
Cotton_A_36277_BGI-A2_v1.0      -       -               -       -       chr9:95673000-95685403  627     12.724  4.98702 3.29366 6.68038 OK
Cotton_A_11823_BGI-A2_v1.0      -       -               -       -       chr9:94356880-94359955  3075    0.471196        0.173075        0       0.346234        OK
Cotton_A_36278_BGI-A2_v1.0      -       -               -       -       chr9:95593880-95595053  1173    0.203524        0.0745688       0       0.223706        OK

genes.fpkm_tracking

tracking_id     class_code      nearest_ref_id  gene_id gene_short_name tss_id  locus   length  coverage        FPKM    FPKM_conf_lo    FPKM_conf_hi    FPKM_status
        -       -               -       -       chr1:84155-84764        -       -       0       0       0       OK
        -       -               -       -       chr1:94826-95120        -       -       0       0       0       OK
        -       -               -       -       chr1:303007-304950      -       -       0       0       0       OK
        -       -               -       -       chr1:334913-336019      -       -       0       0       0       OK
        -       -               -       -       chr1:569413-577498      -       -       0       0       0       OK
        -       -               -       -       chr1:545579-546643      -       -       0       0       0       OK
        -       -               -       -       chr1:328908-331176      -       -       34.7035 30.4456 38.9615 OK

transcripts.gtf

chr1    Cufflinks       transcript      84156   84764   1       -       .       gene_id ""; transcript_id "Cotton_A_10375_BGI-A2_v1.0"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
chr1    Cufflinks       exon    84156   84764   1       -       .       gene_id ""; transcript_id "Cotton_A_10375_BGI-A2_v1.0"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
chr1    Cufflinks       transcript      94827   95120   1       +       .       gene_id ""; transcript_id "Cotton_A_10374_BGI-A2_v1.0"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
chr1    Cufflinks       exon    94827   95120   1       +       .       gene_id ""; transcript_id "Cotton_A_10374_BGI-A2_v1.0"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
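To quantify how widespread the zero FPKM values and empty IDs are, a tracking file can be tallied with a few lines of Python. This is a minimal sketch, assuming the file is tab-separated with the header shown above for genes.fpkm_tracking; the function name `summarize_tracking` and the in-memory example rows are hypothetical and only mirror the excerpt.

```python
import csv
import io

def summarize_tracking(handle):
    """Count rows, zero-FPKM rows, and rows whose tracking_id is empty or '-'."""
    reader = csv.DictReader(handle, delimiter="\t")
    total = zero = missing_id = 0
    for row in reader:
        total += 1
        if float(row["FPKM"]) == 0.0:
            zero += 1
        if row["tracking_id"].strip() in ("", "-"):
            missing_id += 1
    return {"rows": total, "zero_fpkm": zero, "missing_tracking_id": missing_id}

# Hypothetical two-row example modeled on the genes.fpkm_tracking excerpt.
example = (
    "tracking_id\tclass_code\tnearest_ref_id\tgene_id\tgene_short_name\ttss_id\t"
    "locus\tlength\tcoverage\tFPKM\tFPKM_conf_lo\tFPKM_conf_hi\tFPKM_status\n"
    "\t-\t-\t\t-\t-\tchr1:84155-84764\t-\t-\t0\t0\t0\tOK\n"
    "\t-\t-\t\t-\t-\tchr1:328908-331176\t-\t-\t34.7035\t30.4456\t38.9615\tOK\n"
)
summary = summarize_tracking(io.StringIO(example))
print(summary)
```

For a real file, pass `open("genes.fpkm_tracking")` instead of the `io.StringIO` object.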
13 months ago, michael.ante (Austria/Vienna) wrote:

Hi blooming.daisy333,

First, people will tell you not to use Cufflinks any more but StringTie instead. You might want to hear them out.

AFAIK, Cufflinks needs a GTF file, but you are supplying a GFF3 file. Could you get the annotation in GTF format as well? If not, you could convert it with the UCSC tools: first convert the GFF3 into a genePred file, and then convert that into a GTF.
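To illustrate what such a conversion has to produce, here is a minimal Python sketch that turns mRNA/CDS records from a GFF3 file into GTF exon lines carrying both `gene_id` and `transcript_id`. It is not the UCSC workflow and not a replacement for a dedicated converter. It rests on loudly stated assumptions: the `Name` attribute of each mRNA is used as the gene ID, and CDS records are emitted as exons because the annotation shown in this thread has no exon records. The function names are hypothetical.

```python
def parse_attrs(field):
    """Parse a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs

def gff3_to_gtf_lines(gff3_lines):
    """Emit GTF exon lines for CDS records, attaching gene and transcript IDs."""
    gene_of = {}  # transcript ID -> gene ID (assumption: taken from mRNA Name)
    out = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed records
        attrs = parse_attrs(cols[8])
        if cols[2] == "mRNA":
            gene_of[attrs["ID"]] = attrs.get("Name", attrs["ID"])
        elif cols[2] == "CDS":
            tid = attrs["Parent"]
            gid = gene_of.get(tid, tid)  # fall back to the transcript ID
            gtf_attr = f'gene_id "{gid}"; transcript_id "{tid}";'
            # Assumption: CDS is written as an exon; the frame column becomes "."
            out.append("\t".join(cols[:2] + ["exon"] + cols[3:7] + ["."] + [gtf_attr]))
    return out

# Hypothetical input modeled on the kind of records discussed in this thread.
sample = [
    "##gff-version 3\n",
    "chr4\tGLEAN\tmRNA\t123284514\t123288477\t0.999991\t-\t.\t"
    "ID=Cotton_A_18927_BGI-A2_v1.0;Name=Cotton_A_18927\n",
    "chr4\tGLEAN\tCDS\t123288376\t123288477\t.\t-\t0\t"
    "Parent=Cotton_A_18927_BGI-A2_v1.0\n",
]
gtf = gff3_to_gtf_lines(sample)
print("\n".join(gtf))
```

The point of the sketch is the attribute column: every GTF line names its gene and its transcript explicitly, which is the link a quantification tool needs to group isoforms under a gene.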

Cheers,

Michael

[EDIT] Could you please show the first lines of your annotation file?


Hi Michael, thanks for the prompt response. Could you please suggest which software is currently recommended for aligning RNA-seq data to determine splice junctions, and what replacements for Cufflinks exist besides StringTie?

Further, I have checked the Cufflinks manual, and it lists the option as:

-G/--GTF <reference_annotation.(gtf/gff)>

To me, this means that it accepts both GTF and GFF file formats.

A few lines of the annotation file are as follows:

##gff-version null
chr4    GLEAN   mRNA    123284514       123288477       0.999991        -       .       ID=Cotton_A_18927_BGI-A2_v1.0;Name=Cotton_A_18927;source_id=CottonA_GLEAN_10022228;identical_support_id=CUFF67.1103.1;evid_id=Cot030308.1
chr4    GLEAN   CDS     123288376       123288477       .       -       0       Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS     123287662       123287826       .       -       0       Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS     123287427       123287536       .       -       0       Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS     123287129       123287237       .       -       1       Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS     123286939       123287051       .       -       0       Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS     123286180       123286330       .       -       1       Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS     123284514       123285671       .       -       0       Parent=Cotton_A_18927_BGI-A2_v1.0
chr9    GLEAN   mRNA    17802711        17803334        1       +       .       ID=Cotton_A_16149_BGI-A2_v1.0;Name=Cotton_A_16149;source_id=CottonA_GLEAN_10030787;evid_id=Cot023903.1
written 13 months ago by blooming.daisy333

Hi,

There is a difference between the GFF that Cufflinks mentions (which should be GFF2) and a GFF3 file. Since you named your file .gff3, I was suggesting using GTF instead. In your GFF file, there is no aggregation at the gene level. Thus, Cufflinks cannot make a connection between an isoform and a gene. You may try the "-g" (--GTF-guide) option rather than "-G" (--GTF). This will create a guided assembly of your known isoforms into gene loci.
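Whether an annotation has gene-level records can be checked by counting the feature types in its third column. The following is a minimal Python sketch, assuming a tab-separated GFF file; the function name `feature_counts` and the sample records are hypothetical and only imitate an annotation that contains mRNA and CDS features but no gene features.

```python
from collections import Counter

def feature_counts(gff_lines):
    """Count how often each feature type (column 3) occurs in a GFF file."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts

# Hypothetical records: one mRNA with two CDS children and no gene record.
sample = [
    "##gff-version 3\n",
    "chr4\tGLEAN\tmRNA\t100\t900\t.\t-\t.\tID=tx1\n",
    "chr4\tGLEAN\tCDS\t100\t300\t.\t-\t0\tParent=tx1\n",
    "chr4\tGLEAN\tCDS\t500\t900\t.\t-\t0\tParent=tx1\n",
]
counts = feature_counts(sample)
has_gene_level = "gene" in counts
print(dict(counts), "gene records present:", has_gene_level)
```

If `has_gene_level` is false for the real annotation, the mRNA records have no parent gene to be grouped under, which matches the empty gene fields in the output.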

Cheers,

Michael

written 13 months ago by michael.ante