Question: Some problems with DEXSEQ analyse
gravatar for darbinator
12 weeks ago by
darbinator90 wrote:

Hello everyone,

After perform a DEG (differential expression analysis) with DESEQ2 (Alignement with STAR, counts reads with featureCounts), I wanted to perform this time the same analysis but at the scale of isoforms with DEXSEQ.

I have 150bp paired-ends reads from two conditions (3 controls and 3 Arabidopsis mutants for one gene) that I previously aligned with STAR.

The first python script use by DEXSEQ to prepare a GFF file from my GTF run successfully

But the second script doesn't work well

When I give my bam files like this:

python2 -p yes -s yes -r pos -f bam Reference/reference.DEXSeq.gff alignement.sorted.bam > counts.tsv

I have the following error message:

"If you are using alignment files in a bam format, please update your HTSeq to 0.5.4p4 or higher"

I don't understand why I have this error message because when I import htseq in python to see the version, it's indicate version 0.11.2 ..

I have partly solved the problem by converting bam files to sam files on the fly and indicate the option -f sam, the script perform well but now I have strange results , I have many eons with 0 counts, for exemple for the first gene:

AT1G01010:001   0
AT1G01010:002   0
AT1G01010:003   0
AT1G01010:004   0
AT1G01010:005   0
AT1G01010:006   0

While for the gene level I have many count:

AT1G01010   204

Did someone have an explication to my problem ?

Thank's in advance

ADD COMMENTlink written 12 weeks ago by darbinator90

To your first question, by any chance HTSeq doesn't meet the version requirement in python2, but it does in python(3)?

ADD REPLYlink written 12 weeks ago by Eric Lim1.4k

Hum my pipeline run with python2.7 so I haven't test it, I will test with python3

ADD REPLYlink written 12 weeks ago by darbinator90

I would first check that the reads mapping to AT1G01010 overlap with those exon coordinates. DESeq2 and HTSeq use reads mapping anywhere across the gene, so it's possible those are intronic reads.

I would generate a bed file with the coordinates for all six exons then intersect the bed with the bam and see what the output looks like:

intersectBed -a at1g01010.bed -b alignement.sorted.bam -c > outfile.bed

This should tell you what you should expect the output from to look like.

ADD REPLYlink written 12 weeks ago by shawn.w.foley580

I have perform what you did and I have this result:

1   3631    3913    50
1   3996    4276    76
1   4486    4605    46
1   4706    5095    79
1   5174    5326    37

I have 0 for all the exons, with all of my samples, I can't figure why ..

Even if my my data are paired end, I have launch the script with -p no instead of -p yes, iwithout really believing in it

But surprisingly I have results more convenant:

 AT1G01010:001  32
AT1G01010:002   34
AT1G01010:003   24
AT1G01010:004   37
AT1G01010:005   15

Even if is a bit far than the output of the bed file

ADD REPLYlink written 12 weeks ago by darbinator90

You are saying that your samples are paired-end but that you only obtain expected results when specifying single-end in DEXSeq? How did you align the data with STAR? - please show the commands.

ADD REPLYlink written 11 weeks ago by Kevin Blighe42k

with STAR, I use this command line:

STAR --runThreadN {threads} --sjdbGTFfile {input.gtf} \       # GTF of A.thaliana
        --sjdbOverhang '+str(READ_LENGHT-1)+' --genomeDir {input.starref} \      # --sjdbOverhang 149 in this case
        --outFileNamePrefix Mapping/{wildcards.sample} --readFilesIn {input.r1} {input.r2} \. # outFileName for my 8 samples
        --readFilesCommand "gunzip -c" \    # My trimming data are in gz format, so I gunzip them
        --outSAMtype BAM SortedByCoordinate; \   # Sorting by coordinate, for feature counts
        mv Mapping/{wildcards.sample}Aligned.sortedByCoord.out.bam {output};\
        mv Mapping/{wildcards.sample}*out* Mapping/Out'
ADD REPLYlink modified 11 weeks ago • written 11 weeks ago by darbinator90
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 808 users visited in the last hour