Question: Process split samples
2.3 years ago, iti.gupta wrote:

Hey, I am doing RNA-seq data analysis for a particular complex disorder. I have paired samples for each patient (i.e. both the tumor and the non-tumor sample are taken from the same patient). The files I received look like this:

For Patient1

Patient1_R1_L1_01.fq.gz, Patient1_R1_L1_02.fq.gz, Patient1_R1_L1_03.fq.gz
Patient1_R2_L1_01.fq.gz, Patient1_R2_L1_02.fq.gz, Patient1_R2_L1_03.fq.gz

For Patient2

Patient2_R1_L1_01.fq.gz, Patient2_R1_L1_02.fq.gz, Patient2_R1_L1_03.fq.gz, Patient2_R1_L1_04.fq.gz, Patient2_R1_L1_05.fq.gz
Patient2_R2_L1_01.fq.gz, Patient2_R2_L1_02.fq.gz, Patient2_R2_L1_03.fq.gz, Patient2_R2_L1_04.fq.gz, Patient2_R2_L1_05.fq.gz

The problem is that when I merge these files using cat, the size of the concatenated file varies between patients (from ~2 GB for one patient to ~9 GB for another). I assume that the difference in file size also reflects a difference in the number of reads per sample.
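
A minimal sketch of that merge step, assuming the naming scheme shown above (`<sample>_<R1|R2>_<lane>_<chunk>.fq.gz`). The helper names `group_chunks`, `merge_chunks`, and `count_reads` are hypothetical. The point is to concatenate the R1 and R2 chunks in the same order, so that read pairs stay in sync, and then to count reads directly instead of inferring depth from file size.

```python
import gzip
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme, taken from the file names in the question:
# <sample>_<R1|R2>_<lane>_<chunk>.fq.gz
CHUNK_RE = re.compile(
    r"^(?P<sample>.+)_(?P<read>R[12])_(?P<lane>L\d+)_(?P<chunk>\d+)\.fq\.gz$"
)


def group_chunks(paths):
    """Map (sample, read) -> chunk files sorted by lane and chunk number."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        match = CHUNK_RE.match(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        key = (match["sample"], match["read"])
        groups[key].append((match["lane"], int(match["chunk"]), path))
    return {key: [p for _, _, p in sorted(items)] for key, items in groups.items()}


def merge_chunks(chunks, out_path):
    """Byte-wise concatenation, equivalent to `cat a.fq.gz b.fq.gz > out.fq.gz`."""
    with open(out_path, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, out)


def count_reads(fastq_gz):
    """Count reads in a gzipped FASTQ file (4 lines per record)."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

Because the chunks are sorted the same way for R1 and R2, the merged files keep mates at matching positions. The R1 and R2 read counts for a sample should be identical; if they are not, a chunk is probably missing or truncated.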

So my question is: how should I proceed in this scenario? At what step should normalization be performed, and what kind, to deal with this?

P.S.: Please bear with me if this is a silly question. I am a newbie in this field.


I suggest carrying on with the sample processing for now. Other things must be done first, and they will influence the normalization (which is many steps down the line):

1) Install and run FastQC

2) Install and run MultiQC

3) Post the results of MultiQC

You will then have a better idea of what is going on in the samples and can go from there.
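
The steps above can be sketched as a small wrapper, assuming `fastqc` and `multiqc` are already installed and on the PATH. The function names `qc_commands` and `run_qc` and the output directories are hypothetical; only the `--outdir` and `--threads` options of the two tools are used.

```python
import shlex
import subprocess
from pathlib import Path


def qc_commands(fastq_files, fastqc_dir="qc/fastqc", multiqc_dir="qc/multiqc", threads=4):
    """Build the FastQC and MultiQC command lines without running them."""
    fastqc = ["fastqc", "--outdir", str(fastqc_dir), "--threads", str(threads)]
    fastqc += [str(f) for f in fastq_files]
    # MultiQC scans the FastQC output directory and writes one combined report.
    multiqc = ["multiqc", str(fastqc_dir), "--outdir", str(multiqc_dir)]
    return [fastqc, multiqc]


def run_qc(fastq_files, fastqc_dir="qc/fastqc", multiqc_dir="qc/multiqc", threads=4):
    """Run FastQC on every file, then summarise the reports with MultiQC."""
    # FastQC expects its output directory to exist already.
    Path(fastqc_dir).mkdir(parents=True, exist_ok=True)
    for cmd in qc_commands(fastq_files, fastqc_dir, multiqc_dir, threads):
        print("+", shlex.join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_qc(sorted(Path(".").glob("*.fq.gz")))` would process every FASTQ file in the current directory. Running it on the individual chunks as well as the merged files makes it easier to spot a single bad chunk.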

written 2.3 years ago by YaGalbi
2.3 years ago, Devon Ryan wrote:

The size difference will be handled by the library-size normalization that DESeq2, edgeR, and similar programs apply once you import the counts. A 4.5x size difference is still within a reasonable range.
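
To illustrate why sequencing depth alone is not a problem, here is a simplified pure-Python sketch of the median-of-ratios idea that DESeq2 uses by default for its size factors. It is not the package's implementation, and the function name `size_factors` and the toy counts are made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw gene counts,
            with genes in the same order for every sample.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Use only genes with non-zero counts in every sample, so the log is defined.
    usable = [g for g in range(n_genes) if all(counts[s][g] > 0 for s in samples)]
    # Log geometric mean of each usable gene across samples (the pseudo-reference).
    log_ref = {
        g: sum(math.log(counts[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    # A sample's size factor is the median ratio of its counts to the reference.
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_ref[g] for g in usable))
        for s in samples
    }


# Toy example: "large" was sequenced 4.5x deeper than "small".
toy = {"small": [10, 20, 30, 0], "large": [45, 90, 135, 7]}
factors = size_factors(toy)
normalized = {s: [c / factors[s] for c in toy[s]] for s in toy}
```

Dividing each sample's raw counts by its size factor puts the two libraries on a common scale, so the first three toy genes end up with equal normalized counts in both samples despite the 4.5x depth difference.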

As pointed out by kennethcondon2007 though, do ensure that the quality is comparable across the smaller files.
