HTSeq-count: differential gene expression using a GFF whose exon annotations lack a gene ID reference

4.4 years ago

spen ▴ 40

Hi Everyone,

I searched and couldn't find a good answer, so I'm hoping someone can help.

I have a GFF file in this format:

1       gramene gene    199345  205715  .       -       .       ID=gene:Zm00001d027240;biotype=protein_coding;gene_id=Zm00001d027240;logic_name=maker_gene
1       gramene mRNA    199345  205715  .       -       .       ID=transcript:Zm00001d027240_T001;Parent=gene:Zm00001d027240;biotype=protein_coding;transcript_id=Zm00001d027240_T001
1       gramene three_prime_UTR 199345  199763  .       -       .       Parent=transcript:Zm00001d027240_T001
1       gramene exon    199345  199771  .       -       .       Parent=transcript:Zm00001d027240_T001;Name=Zm00001d027240_T002.exon8;constitutive=0;ensembl_end_phase=-1;ensembl_phase=1;exon_id=Zm00001d027240_T002.exon8;rank=9

I'm running htseq-count to collect gene-level counts for downstream differential gene expression analysis. Exons are the recommended feature type to count, but the exon lines in this GFF do not refer to the gene ID directly; they only point to their parent transcript. How do I aggregate exon counts by gene ID? Do I need to reformat the GFF file, or is there a value I can pass to the --idattr option of htseq-count even though the exon lines carry no gene ID? Many thanks for your help!

RNA-Seq sequencing gene • 785 views


score 0 · Answer 1 · 2019-11-27

4.4 years ago

hafiz.talhamalik ▴ 350

Can you post the command you are using? The tool has an option for gene-based counts.
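
If the tool cannot resolve the gene from the exon lines as they stand, one possible workaround is to reformat the GFF so that every exon line carries the gene ID itself. Below is a minimal Python sketch of that idea. It assumes a tab-separated GFF3 in which transcript lines have ID=transcript:... and Parent=gene:..., as in the excerpt in the question. The function names are made up for illustration and are not part of HTSeq.

```python
def parse_attributes(field):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def add_gene_id_to_exons(lines):
    """Return the GFF3 lines with gene_id appended to each exon's attributes.

    Assumption: transcripts look like 'ID=transcript:X;Parent=gene:Y' and
    exons look like 'Parent=transcript:X', as in the posted excerpt.
    """
    lines = [line.rstrip("\n") for line in lines]

    # Pass 1: map every transcript ID to the ID of its parent gene.
    transcript_to_gene = {}
    for line in lines:
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9:
            continue
        attrs = parse_attributes(cols[8])
        ident = attrs.get("ID", "")
        parent = attrs.get("Parent", "")
        if ident.startswith("transcript:") and parent.startswith("gene:"):
            transcript_to_gene[ident] = parent.split(":", 1)[1]

    # Pass 2: append gene_id to exon lines whose parent transcript is known.
    out = []
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9 and cols[2] == "exon":
            attrs = parse_attributes(cols[8])
            # Parent may list several transcripts; the first one is used here.
            first_parent = attrs.get("Parent", "").split(",")[0]
            gene = transcript_to_gene.get(first_parent)
            if gene and "gene_id" not in attrs:
                cols[8] = cols[8].rstrip(";") + ";gene_id=" + gene
                line = "\t".join(cols)
        out.append(line)
    return out
```

After writing the returned lines to a new file, a call such as `htseq-count -f bam -t exon -i gene_id alignments.bam annotated.gff` (file names hypothetical) should group exon counts by gene. Exons whose parent transcript cannot be resolved are left unchanged by this sketch, so it is worth checking that none remain before counting.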
