I am using DEXSeq for differential exon usage tests. In the vignette of producing ExonCountSet object: http://bioconductor.org/packages/devel/data/experiment/vignettes/pasilla/inst/doc/create_objects.pdf, the author used dexseq_prepare_annotation.py to convert the GTF file to GFF file. In the GFF output, I see that (since the GTF is downloaded from Ensembl) the gene_id's start with "ENSG".
I know that the next step is to use dexseq_count.py on the GFF and SAM files to generate counts. However, because currently we have the count data file (which we prefer to use), we are hoping to use our own counts (i.e. the treated2fb.txt as in the vignette example) for the analysis. The issue is that, our count files contain EntrezGene ID's, NOT Ensembl IDs, and the conversion between the two is not bijective (i.e. 1-1). Therefore, we I run the read.HTSeqCounts() function in R, the error message "Count files do not correspond to the flattened annotation file" appears.
(1) is Ensembl GTF the only input for dexseq_prepare_annotation.py? It seems the resultant GFF file contains only Ensembl gene IDs, accordingly...
(2) in my case of non-Ensembl gene IDs, how can I instruct or manipulate the codes to generate an ExonCountSet object?