Question

dexseq_count.py gives zero counts for any exon

0

Entering edit mode

8.1 years ago

laura.fancello • 0

Hi, I run dexseq_count.py on my SAM file using a GFF file generated (from ensembl Homo_sapiens.GRCh38.83.gtf) with dexseq_prepare_annotation.py

The output file contains zero counts for any exon. I also tried using a modified gff file where instead of 1,2,3,... for chromosomes I had chr1,chr2,chr3,... Can you please help me to fix this?

I used:

DEXSeq 1.6.3
HTSeq 0.6.1p1
python 2.7.6

head gff file:

1   dexseq_prepare_annotation.py    aggregate_gene  11869   14409   .   +   .   gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 11869   12009   .   +   .   transcripts "ENST00000456328"; exonic_part_number "001"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 12010   12057   .   +   .   transcripts "ENST00000456328+ENST00000450305"; exonic_part_number "002"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 12058   12178   .   +   .   transcripts "ENST00000456328"; exonic_part_number "003"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 12179   12227   .   +   .   transcripts "ENST00000456328+ENST00000450305"; exonic_part_number "004"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 12613   12697   .   +   .   transcripts "ENST00000456328+ENST00000450305"; exonic_part_number "005"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 12698   12721   .   +   .   transcripts "ENST00000456328"; exonic_part_number "006"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 12975   13052   .   +   .   transcripts "ENST00000450305"; exonic_part_number "007"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 13221   13374   .   +   .   transcripts "ENST00000456328+ENST00000450305"; exonic_part_number "008"; gene_id "ENSG00000223972"
1   dexseq_prepare_annotation.py    exonic_part 13375   13452   .   +   .   transcripts "ENST00000456328"; exonic_part_number "009"; gene_id "ENSG00000223972"

head sam file:

@HD VN:1.0  SO:coordinate
@SQ SN:chr1 LN:248956422
@SQ SN:chr10    LN:133797422
@SQ SN:chr11    LN:135086622
@SQ SN:chr12    LN:133275309
@SQ SN:chr13    LN:114364328
@SQ SN:chr14    LN:107043718
@SQ SN:chr15    LN:101991189
@SQ SN:chr16    LN:90338345
@SQ SN:chr17    LN:83257441
@SQ SN:chr18    LN:80373285
@SQ SN:chr19    LN:58617616
@SQ SN:chr2 LN:242193529
@SQ SN:chr20    LN:64444167
@SQ SN:chr21    LN:46709983
@SQ SN:chr22    LN:50818468
@SQ SN:chr3 LN:198295559
@SQ SN:chr4 LN:190214555
@SQ SN:chr5 LN:181538259
@SQ SN:chr6 LN:170805979
@SQ SN:chr7 LN:159345973
@SQ SN:chr8 LN:145138636
@SQ SN:chr9 LN:138394717
@SQ SN:chrMT    LN:16569
@SQ SN:chrX LN:156040895
@SQ SN:chrY LN:57227415
@PG ID:TopHat   VN:2.0.11   CL:/apps/leuven/thinking/2014a/software/TopHat/2.0.11-intel-2014a/bin/tophat --b2-sensitive --no-novel-juncs --output-dir /staging/leuven/stg_00003/users/Laura/RPL5_OtherGenes_CooperativeMutations_Isabel18062015/Mapping/2355_007_merged_vs_hg20 --transcriptome-index=/staging/leuven/stg_00003/resources/human/hg20/transcriptome_Ensembl_data/hg20_Ensembl_transcriptome /staging/leuven/stg_00003/resources/human/hg20/hg20 /staging/leuven/stg_00003/users/Laura/RPL5_OtherGenes_CooperativeMutations_Isabel18062015/CleanSequences/2355_007_merged_decontam.fastq
NS500200:118:HFFF3BGXX:3:11605:13657:15043  0   chr1    16184   50  76M *   0   0   ACACCCTGCAGAGCTGGACCCCTGAGCTAGCCATGCTCTGACAGTCTCAGTTGCACACACGAGCCAGCAGAGGCGT    AA6AAEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEE<EAE/EEEEEEEEEEEEEEEEE    AS:i:-10    XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:2  MD:Z:0C72G2 YT:Z:UU NH:i:1
NS500200:118:HFFF3BGXX:4:22511:16316:6491   0   chr1    17464   50  76M *   0   0   GGTCCCAGGCCTCCCGAGCCGAGCCACCCGTCACCCCCTGGCTCCTGGCCTATGTGCTGTACCTGTGTCTGATGAC    AAAAAEEEEEEEEEEEEEEEEEEEEAEEEEEE/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE6E    AS:i:-4 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:74C1   YT:Z:UU NH:i:1
NS500200:118:HFFF3BGXX:1:22104:5581:5386    0   chr1    19195   50  76M *   0   0   TGTAGCTCCCCTACCTCCAAGAGCCCAGCCCTTGCCCACAGGGCCACACTCCACGTGCAGAGCAGCCTCAGCACTC    AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEAEEE    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:76 YT:Z:UU NH:i:1
NS500200:118:HFFF3BGXX:1:21211:14990:10490  0   chr1    19721   50  76M *   0   0   CCACAATTCAGAAAGAAAAAAGAAGAGCACCATCTCCTTCCAGTGAGGAAGCGGGACCACCACCCAGCGTGTGCTC    AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE/EEAEEEEEE<EEEEAEEAEE    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:76 YT:Z:UU NH:i:1

tail output

ENSG00000283122+ENSG00000118495:055 0
ENSG00000283122+ENSG00000118495:056 0
ENSG00000283122+ENSG00000118495:057 0
ENSG00000283123:001 0
ENSG00000283125:001 0
_ambiguous  2613
_ambiguous_readpair_position    0
_empty  15124232
_lowaqual   0
_notaligned 0

dexseq dexseq_count.py exon • 3.2k views

ADD COMMENT • link updated 8.1 years ago by GouthamAtla 12k • written 8.1 years ago by laura.fancello • 0

0

Entering edit mode

Edit: Changed the format. Can you post the command ?

ADD REPLY • link 8.1 years ago by GouthamAtla 12k

score 0 · Answer 1 · 2016-03-04

0

Entering edit mode

8.1 years ago

michael.ante ★ 3.8k

Hi Laura,

The gff has no "chr" included in the chromosomes' name, whilst the mapping has it included. Just use sed -i 's/^/chr/g' my.gff in order to fix this file.

Cheers,

Michael

ADD COMMENT • link 8.1 years ago by michael.ante ★ 3.8k

0

Entering edit mode

Apparently the user already tried with "chr", as mentioned in the question.

ADD REPLY • link 8.1 years ago by GouthamAtla 12k

0

Entering edit mode

I'm sorry, I haven't read that.

If it's not that easy, let me ask some questions: Is the Hg20 annotation you are aligning your reads to synchronous with the one of GRCH38.83? Is the library a stranded one? (If you are not sure, you may use RSeQC: infer_experiment.py) Have you checked your alignment visually with the IGV/IGB browser?

ADD REPLY • link 8.1 years ago by michael.ante ★ 3.8k

0

Entering edit mode

Thanks the problem was indeed the reference annotation file used. I thought it was the same I did the mapping with but it was indeed not the case. Also it was a strandedness problem: adding -s no it works fine

ADD REPLY • link 8.1 years ago by laura.fancello • 0