I'm analysing results from an RNA-seq experiment. My organism is an annotated yeast that has some effect on Vitis. I have around 15 million reads per sample, obtained with Illumina paired-end sequencing, 90 bp read length.
I have two strains, c13 and c24, each cultivated under a control condition and one treatment. The experiment was performed with three replicates.
In summary I have:
C13_control (R1,R2,R3), C13_treat (R1,R2,R3)
C24_control (R1,R2,R3), C24_treat (R1,R2,R3)
I want to check the effect of my treatment on the yeast with the Tuxedo protocol. At this point I have run the TopHat alignment on my 12 samples, with my FASTA genome indexed as DA and the following command:
tophat -G annotation.gff3 -o out_folder DA c*_control_R*.fastq c*_treatment_R*.fastq
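TopHat's usage takes the index followed by the left and right mate files as positional arguments, so with paired-end data it is usually run once per sample rather than with one glob over all files. A minimal sketch of generating the 12 per-sample commands is below. The mate suffixes `_1.fastq` / `_2.fastq` and the `out_<sample>` folder names are hypothetical and only for illustration; the index `DA`, the annotation file, and the `-G` / `-o` options are the ones used above.

```python
from itertools import product

STRAINS = ["c13", "c24"]
CONDITIONS = ["control", "treatment"]
REPLICATES = ["R1", "R2", "R3"]


def tophat_commands(index="DA", annotation="annotation.gff3"):
    """Build one TopHat argument list per sample (2 x 2 x 3 = 12 samples)."""
    commands = []
    for strain, condition, rep in product(STRAINS, CONDITIONS, REPLICATES):
        sample = f"{strain}_{condition}_{rep}"
        commands.append([
            "tophat",
            "-G", annotation,            # annotation file, as in the post
            "-o", f"out_{sample}",       # hypothetical per-sample output folder
            index,
            f"{sample}_1.fastq",         # hypothetical left-mate file name
            f"{sample}_2.fastq",         # hypothetical right-mate file name
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for inspection instead of executing them.
    for cmd in tophat_commands():
        print(" ".join(cmd))
```

Keeping one output folder per sample also keeps the mapping statistics separate, which makes it easier to see whether the low rate affects every c13 control replicate or only one of them.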
Checking my results, I found something curious. Only 4% of the reads from my c13_control sample aligned, while 60% of the reads from the same strain under treatment mapped. On the other hand, reads from my c24 strain map at 60% in both conditions.
If I ignore this result and continue the protocol, the differential expression in c13 shows huge fold changes (up to 30-fold) and almost all genes are significant, obviously because it is comparing high expression in the treatment against the poorly mapped control, so I have the feeling that my control is not really acting as a control. c24, in contrast, shows fewer differentially expressed genes, with fold changes below 8-fold.
I'm trying to find an explanation for this behaviour, or another approach to tackle this problem, so I would appreciate any comments!
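To make the comparison explicit, here is a minimal sketch that flags any group whose mapping rate falls far below the others. The percentages are the approximate group-level values quoted above; the cutoff of half the median is an arbitrary assumption, and in practice there would be one entry per replicate rather than per group.

```python
from statistics import median


def flag_low_mapping(rates, fraction_of_median=0.5):
    """Return sample names whose mapping rate is below a fraction of the median rate."""
    cutoff = median(rates.values()) * fraction_of_median
    return sorted(name for name, rate in rates.items() if rate < cutoff)


# Approximate mapping rates (%) as reported in this post.
rates = {
    "c13_control": 4.0,
    "c13_treatment": 60.0,
    "c24_control": 60.0,
    "c24_treatment": 60.0,
}

print(flag_low_mapping(rates))  # ['c13_control']
```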
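Since differential expression tables are commonly read on a log2 scale, a small sketch converting the fold changes mentioned above may help when comparing them with such output. Only the 30-fold and 8-fold values come from this post; the helper name is hypothetical.

```python
import math


def to_log2_fold_change(fold_change):
    """Convert a linear fold change (treatment / control) to log2 scale."""
    return math.log2(fold_change)


for fc in (30, 8):
    print(f"{fc}-fold = log2 fold change of {to_log2_fold_change(fc):.2f}")
```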