Tophat - poor mapping in control vs treament
1
0
Entering edit mode
7.3 years ago
pveraw • 0

Hi everyone!

I'm analysing results from a RNA-seq experiment. My organism is an annotated yeast that has some effect on vitis. I have around 15 millions of reads per sample obtained with illumina pair-end, 90 bp length.

I have 2 strains, c13 and c24 which were cultivated in a control condition and one treatment, the experiment was performed with three replicates.

In summary I have:

C13_control (R1,R2,R3), C13_treat (R1,R2,R3)

C24_control (R1,R2,R3), C24_treat (R1,R2,R3)

I want to check the effect of my treatment over the yeast with Tuxedo protocol. At this point I have done Tophat alignment with my 12 samples. with my fasta genome indexed as DA and the following pipeline:

tophat -G annotation.gff3 -o out_folder DA c*_control_R*.fastq c*_treatment_R*.fastq


Checking my results I get something curious. Reads from my c13_control sample aligned only in a 4%, while the same strain, but on treatment map a 60% of the reads. In the other hand, read from my c24 strain map in both cases a 60%.

If I ignore this result and I continue the protocol, when I observe my differential expression in c13 the fold change is huge (even 30 times) and almost all the genes are significant, because obviously is comparing high expression in the treatment over my poor mapping in control, so I have the feeling that my control is not being a control. c24 instead, show a fewer number of genes with fold changes lower that 8 times.

I'm trying to find an explanation for this behaviour or find another approach to raise this problem, so I would appreciate any comment!

Thanks guys!

rna-seq • 2.2k views
0
Entering edit mode

I've updated your summary slightly, since I think you had the C13 samples listed twice originally.

0
Entering edit mode

Can you do wildcarding like that with tophat? What do your align_summary.txt tophat.log and run.log files look like?

0
Entering edit mode

Yes Devon, I trimmed adapters and QC.

Sorry if I confused you mbio.kyle, I just used * to show that I did the same pipeline for every sample. For example for my control/treatment samples c13 replicate1 I did:

tophat -G annotation.gff3 -o out_folder DA c13_control_R1.1.fastq c13_control_R1.2.fastq
tophat -G annotation.gff3 -o out_folder DA c13_treatment_R1.1.fastq c13_treatment_R1.2.fastq


So far my tophat and run log files does not display any errors or warnings, my align summary are for example:

#### C13_control_R1:

Left reads:
Input     :  13349110
Mapped   :    468836 ( 3.5% of input)
of these:       664 ( 0.1%) have multiple alignments (2 have >20)
Input     :  13349110
Mapped   :    468082 ( 3.5% of input)
of these:       587 ( 0.1%) have multiple alignments (2 have >20)

Aligned pairs:    165921
of these:       193 ( 0.1%) have multiple alignments
141 ( 0.1%) are discordant alignments
1.2% concordant pair alignment rate.


#### C13_treatment_R1:

Left reads:
Input     :  14380800
Mapped   :   8700067 (60.5% of input)
of these:     15961 ( 0.2%) have multiple alignments (318 have >20)
Input     :  14380800
Mapped   :   8684246 (60.4% of input)
of these:     15953 ( 0.2%) have multiple alignments (318 have >20)

Aligned pairs:   6724170
of these:     12113 ( 0.2%) have multiple alignments
12362 ( 0.2%) are discordant alignments
46.7% concordant pair alignment rate.


Thanks!

2
Entering edit mode

You should blast a few of the unmapped reads and see if you received the wrong samples.

0
Entering edit mode

Thanks Devon, I patiently did a BLAST against nr database with the 12 million reads that didn't map and I only got ~95 reads mappig a bacteria, which seems to be contation, but I still guessing what is the other huge percentage lost.

2
Entering edit mode

Maybe throw them at https://onecodex.com/ and see if that reveals any more contamination

0
Entering edit mode

:D, you just save me!

0
Entering edit mode

Have you run FastQC on your raw data? Do the read qualities look OK?

0
Entering edit mode

Hey! Yes, I did, actually the person that did this sequencing removed adapters and the quality is very good. But thank you!