Question: FeatureCounts Not Reading GTF File Correctly
dec986 (United States) wrote, 2.8 years ago:

Hello,

I am writing my own GTF file, and featureCounts processes only some of its lines and not others. I have no idea why: as far as I can tell, the lines are formatted identically.

Why does featureCounts recognize lines like

chrM    ENSEMBL gene    15356   15422   .   -   .   gene_id "ENSMUSG00000064372.1"; gene_type "Mt_tRNA"; gene_status "KNOWN"; gene_name "mt-Tp"; level 3;
chrM    ENSEMBL transcript  15356   15422   .   -   .   gene_id "ENSMUSG00000064372.1"; transcript_id "ENSMUST00000082423.1"; gene_type "Mt_tRNA"; gene_status "KNOWN"; gene_name "mt-Tp"; transcript_type "Mt_tRNA"; transcript_status "KNOWN"; transcript_name "mt-Tp-201"; level 3; tag "basic"; transcript_support_level "NA";
chrM    ENSEMBL exon    15356   15422   .   -   .   gene_id "ENSMUSG00000064372.1"; transcript_id "ENSMUST00000082423.1"; gene_type "Mt_tRNA"; gene_status "KNOWN"; gene_name "mt-Tp"; transcript_type "Mt_tRNA"; transcript_status "KNOWN"; transcript_name "mt-Tp-201"; exon_number 1; exon_id "ENSMUSE00000521550.1"; level 3; tag "basic"; transcript_support_level "NA";

but not my added lines, which are written like this?

chr1    ENSEMBL transcript  13139159    13142763    .   -   .   gene_id "Unknown7"; transcript_id "Unknown7"; gene_type "TEC"; gene_status "PUTATIVE"; gene_name "Unknown7"; transcript_type "TEC"; transcript_status "PUTATIVE"; transcript_name "Unknown7"; exon_number 1; exon_id "Unknown7"; level 3;
chr1    ENSEMBL transcript  13139159    13142763    .   -   .   gene_id "Unknown8"; transcript_id "Unknown8"; gene_type "TEC"; gene_status "PUTATIVE"; gene_name "Unknown8"; transcript_type "TEC"; transcript_status "PUTATIVE"; transcript_name "Unknown8"; exon_number 1; exon_id "Unknown8"; level 3;
chr1    ENSEMBL transcript  13139159    13142763    .   -   .   gene_id "Unknown9"; transcript_id "Unknown9"; gene_type "TEC"; gene_status "PUTATIVE"; gene_name "Unknown9"; transcript_type "TEC"; transcript_status "PUTATIVE"; transcript_name "Unknown9"; exon_number 1; exon_id "Unknown9"; level 3;

I realize this is a tedious question... but I've spent hours on this and I can't see the problem :(


Could you add the command you use to run featureCounts with these GTF files?

(written 2.8 years ago by WouterDeCoster)

The command I use is:

featureCounts -g transcript_id -a ~/GENE_DATA/mm10/embryo_novel_transcripts_only.gtf -o transcript_id_featureCount.tsv sorted.bam
(written 2.8 years ago by dec986)

By default, featureCounts counts only lines whose feature type (3rd column) is 'exon'. To count a different feature type you have to specify it with the -t flag, in your case:

-t transcript
(written 2.8 years ago by WouterDeCoster)
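
One quick way to see why the default fails is to tally the feature types present in the annotation. The sketch below is only an illustration: the function name count_feature_types and the two-line temporary GTF are hypothetical, modelled on the custom lines quoted in the question, and the file is assumed to be tab-separated. The featureCounts call is left as a comment because it needs the original BAM and GTF files.

```shell
# Tally the feature types (3rd tab-separated column) of a GTF file,
# skipping comment lines that start with '#'.
count_feature_types() {
    awk -F'\t' '!/^#/ { n[$3]++ } END { for (t in n) print t "\t" n[t] }' "$1" | sort
}

# Hypothetical two-line annotation modelled on the custom lines in the question.
demo_gtf=$(mktemp)
printf 'chr1\tENSEMBL\ttranscript\t13139159\t13142763\t.\t-\t.\tgene_id "Unknown7"; transcript_id "Unknown7";\n' > "$demo_gtf"
printf 'chr1\tENSEMBL\ttranscript\t13139159\t13142763\t.\t-\t.\tgene_id "Unknown8"; transcript_id "Unknown8";\n' >> "$demo_gtf"

# Only 'transcript' shows up, so the default feature type 'exon' matches nothing.
feature_summary=$(count_feature_types "$demo_gtf")
echo "$feature_summary"

# With only 'transcript' features present, the command from this thread would become:
# featureCounts -t transcript -g transcript_id -a ~/GENE_DATA/mm10/embryo_novel_transcripts_only.gtf -o transcript_id_featureCount.tsv sorted.bam
```

If the tally shows no 'exon' entries, -t has to name one of the feature types that is actually listed.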

Yes! Using -t, instead of -g, solves my problem.

(written 2.8 years ago by dec986)

dec986 wrote, 2.8 years ago:

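The key here is that -t refers to the feature type, i.e. the 3rd column of a GENCODE GTF file.

Also, -g refers to an attribute in the 9th column (gene_id by default), which is used to group the counted features. I think the -t option works better.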
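
To make the two columns concrete, here is a rough sketch that mimics the selection: it keeps the lines whose 3rd column matches a feature type and prints the value of one attribute from the 9th column. The helper name list_attribute and the temporary file are hypothetical, the attributes are shortened from the lines quoted in the question, and this is not how featureCounts is implemented internally.

```shell
# Print the value of one attribute (9th column) for every line whose
# feature type (3rd column) matches. Arguments: file, feature type, attribute name.
list_attribute() {
    awk -F'\t' -v ftype="$2" -v attr="$3" '
        $3 == ftype {
            n = split($9, fields, ";")
            for (i = 1; i <= n; i++) {
                f = fields[i]
                sub(/^ +/, "", f)                  # trim leading spaces
                if (index(f, attr " ") == 1) {     # field starts with the attribute name
                    v = substr(f, length(attr) + 2)
                    gsub(/"/, "", v)               # drop the surrounding quotes
                    print v
                }
            }
        }' "$1"
}

# Hypothetical annotation: one transcript line and one exon line.
attr_gtf=$(mktemp)
printf 'chr1\tENSEMBL\ttranscript\t13139159\t13142763\t.\t-\t.\tgene_id "Unknown7"; transcript_id "Unknown7";\n' > "$attr_gtf"
printf 'chrM\tENSEMBL\texon\t15356\t15422\t.\t-\t.\tgene_id "ENSMUSG00000064372.1"; transcript_id "ENSMUST00000082423.1";\n' >> "$attr_gtf"

# Roughly what "-t transcript -g transcript_id" would select from this file.
list_attribute "$attr_gtf" transcript transcript_id
```

Swapping the second argument to exon shows what the default feature type would pick up instead.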
