I am using a GFF file with featureCounts to produce counts for an RNA-Seq analysis of a non-model organism. I am unable to get proper counts, as the assembly is not good, and this is the GFF:
  #!genome-build RproC3                                                           
  #!genome-version RproC3                                                         
  #!genome-date 2015-04                                                           
  #!genome-build-accession GCA_000181055.3                                                                
KQ034291        VectorBase      gene    36335   45838   0       +       0       gene_id "RPRC000679";"
KQ034291        VectorBase      transcript      36335   45838   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA";"
KQ034291        VectorBase      exon    36335   36356   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "1";"
KQ034291        VectorBase      CDS     36335   36356   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "1";"
KQ034291        VectorBase      exon    40565   40684   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "2";"
KQ034291        VectorBase      CDS     40565   40684   0       +       2       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "2";"
KQ034291        VectorBase      exon    40763   40941   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "3";"
KQ034291        VectorBase      CDS     40763   40941   0       +       2       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "3";"
KQ034291        VectorBase      exon    45833   45838   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "4";"
KQ034291        VectorBase      CDS     45833   45835   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "4";"
KQ034291        VectorBase      stop_codon      45836   45838   0       +       0       gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "4";"
KQ034291        VectorBase      gene    48738   55400   0       -       0       gene_id "RPRC003242";"
KQ034291        VectorBase      transcript      48738   55400   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA";"
KQ034291        VectorBase      exon    55216   55400   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "1";"
KQ034291        VectorBase      CDS     55216   55289   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "1";"
KQ034291        VectorBase      start_codon     55287   55289   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "1";"
KQ034291        VectorBase      exon    53297   53592   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "2";"
KQ034291        VectorBase      CDS     53297   53592   0       -       1       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "2";"
KQ034291        VectorBase      exon    52421   52605   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "3";"
KQ034291        VectorBase      CDS     52421   52605   0       -       2       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "3";"
KQ034291        VectorBase      exon    51858   51907   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "4";"
KQ034291        VectorBase      CDS     51858   51907   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "4";"
KQ034291        VectorBase      exon    51146   51248   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "5";"
KQ034291        VectorBase      CDS     51146   51248   0       -       1       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "5";"
KQ034291        VectorBase      exon    50189   50352   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "6";"
KQ034291        VectorBase      CDS     50189   50352   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "6";"
KQ034291        VectorBase      exon    48738   48965   0       -       0       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "7";"
KQ034291        VectorBase      CDS     48884   48965   0       -       1       gene_id "RPRC003242"; transcript_id "RPRC003242-RA"; exon_number "7";"
The ID in the first column is the same for all the genes, so the count file contains the ID "KQ034291" repeatedly and nothing else. However, I want a GTF/GFF file with gene names like RPRC000679, RPRC003242 and so on, so that I get unique gene counts. Is there a way to do this?
The first column should refer to the chromosome name, which in your case seems to be KQ034291. I am not sure why you have (line numbers?) before that name. Where did you acquire this file from?
I am also not sure, but it was downloaded from a database. However, I can get rid of it. But can I have the gene name instead of the scaffold ID in the first column?
You can, but then the file will not be in GTF/GFF format.
featureCounts should understand the gene_id attribute in the file you posted.
Yes, it will recognise it, as the sequences used for alignment will have the same gene_id. So I want to know how to do that?
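To illustrate the point about the gene_id attribute: in the excerpt above the gene identifier sits in the ninth column, while the first column only names the scaffold. The following is a minimal Python sketch, assuming tab-separated fields and attributes written as key "value"; pairs. The helper name parse_attributes is hypothetical and is not part of featureCounts.

```python
import re

# Matches key "value" pairs in a GTF attribute column (assumed layout).
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')

def parse_attributes(attr_field):
    """Return a dict of the key "value" pairs found in the 9th GTF column."""
    return dict(ATTR_RE.findall(attr_field))

# One exon line from the excerpt, rebuilt with tabs (assumed separator)
# and without the trailing stray double quote.
line = "\t".join([
    "KQ034291", "VectorBase", "exon", "36335", "36356", "0", "+", "0",
    'gene_id "RPRC000679"; transcript_id "RPRC000679-RA"; exon_number "1";',
])

fields = line.split("\t")
seqname = fields[0]                  # scaffold name, shared by many genes
attrs = parse_attributes(fields[8])  # per-feature attributes, including gene_id
print(seqname, attrs["gene_id"])
```

The scaffold name stays the same across both genes in the excerpt, whereas gene_id changes from RPRC000679 to RPRC003242, which is why the gene name does not need to be moved into the first column.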
Only after you fix the first column (chromosome names need to match your alignment file). Have you looked at the manual/in-line help for featureCounts? The two options you want to pay attention to are
I am aware of the two options you have mentioned. I have edited the GTF file mentioned above, and I am getting the following warning while running featureCounts, with no output file:
According to the warning, the 9th column has some problem, which is not actually the case. I also ran cut -f 9 *.gtf, and here is the output:
So I have no clue what is going wrong here. Any idea?
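One way to narrow down a complaint about the ninth column is to check each attribute field mechanically rather than by eye. The sketch below is only an illustration: the function name check_attribute_field is hypothetical, and the two rules it applies (an even number of double quotes, and every semicolon-separated item shaped like key "value") are assumptions about a well-formed GTF attribute column, not a description of what featureCounts itself checks. In the excerpt at the top of the thread, every attribute field as displayed ends with an extra double quote after the final semicolon; whether that quote is in the actual file or is a posting artifact is not known.

```python
import re

# One attribute item, e.g. gene_id "RPRC000679" (assumed well-formed shape).
ITEM_RE = re.compile(r'^\w+\s+"[^"]*"$')

def check_attribute_field(attr_field):
    """Return a list of problems found in one GTF attribute column."""
    problems = []
    if attr_field.count('"') % 2 != 0:
        problems.append("odd number of double quotes")
    # Splitting on ';' is a simplification: it breaks if a value contains ';'.
    for item in attr_field.split(";"):
        item = item.strip()
        if item and not ITEM_RE.match(item):
            problems.append("malformed item: " + repr(item))
    return problems

# Attribute field as displayed in the thread, and the same field
# with the trailing double quote removed.
as_posted = 'gene_id "RPRC000679"; transcript_id "RPRC000679-RA";"'
cleaned = 'gene_id "RPRC000679"; transcript_id "RPRC000679-RA";'

print(check_attribute_field(as_posted))
print(check_attribute_field(cleaned))
```

To apply this to a whole file, one could loop over the lines that do not start with '#', split each on tabs, and pass the ninth field to check_attribute_field.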
Closing a post is not an appropriate action when a question has been answered (generally mods use that action to close posts deemed inappropriate/duplicate etc.). You should accept an answer (green check mark) to indicate that this question has been answered (I moved @Devon's post to an answer).