Question: unequal fastq files
9 months ago by
IBAB, Bengaluru, India
sagardesai9150 wrote:

Hello everyone, I have in-house sequenced raw transcriptome data. The issue is that the number of reads varies by a huge margin between samples. E.g., one FASTQ file has approx. 10 million reads whereas another has approx. 70 million reads. Is there a way to remove reads from the second file to bring it down to approximately the same size as the first one, without introducing any biases, before going ahead with differential expression analysis?

modified 9 months ago by ATpoint30k • written 9 months ago by sagardesai9150


Selecting Random Pairs From Fastq?

Extracting randomly subset of fastq reads from a huge file

Quickest way to extract subset of reads from huge fastq file

Extracting a subset of reads and subsetting a FASTQ file to equal depth are the same task, which the threads linked above cover. That said, any of the established differential analysis tools will normalize for library size automatically, so one typically does not downsample files manually. Please read the documentation of e.g. limma, edgeR or DESeq2 first to get a good background in RNA-seq analysis.
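If you do want to downsample anyway, here is a minimal sketch of unbiased random subsampling using reservoir sampling in plain Python. It assumes uncompressed, four-line FASTQ records. The function names `read_fastq` and `reservoir_sample` and the synthetic demo reads are made up for illustration and are not taken from any of the tools mentioned here.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator ("+"), quality string.
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records from an iterable of text lines.

    Assumption: every record spans exactly four lines (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines, e.g. at the end of the file
        seq = next(it, "")
        plus = next(it, "")
        qual = next(it, "")
        yield (header.rstrip("\n"), seq.rstrip("\n"),
               plus.rstrip("\n"), qual.rstrip("\n"))


def reservoir_sample(records: Iterable[Record], k: int, seed: int = 0) -> List[Record]:
    """Return k records chosen uniformly at random in a single pass.

    Every read has the same probability of being kept, so the subset is
    not biased towards the start or end of the file.
    """
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)      # fill the reservoir first
        else:
            j = rng.randint(0, i)      # uniform over 0..i, both inclusive
            if j < k:
                reservoir[j] = rec     # replace a random earlier pick
    return reservoir


# Small in-memory demo with synthetic reads (not real data).
demo_lines: List[str] = []
for n in range(1000):
    demo_lines += [f"@read{n}\n", "ACGT\n", "+\n", "IIII\n"]

subset = reservoir_sample(read_fastq(demo_lines), k=100, seed=42)
```

For a real file, pass an open file handle to `read_fastq` instead of `demo_lines`. For paired-end data, running the function with the same `seed` and `k` on both mate files keeps the pairs in sync, provided both files list the reads in the same order and contain the same number of records. Note that the `k` sampled records are held in memory.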

modified 9 months ago • written 9 months ago by ATpoint30k

I don't have an issue with extracting a subset of reads. My issue is with unequal library sizes leading to a huge difference in the number of raw reads, which needs to be normalised before I start processing the fastq files.

written 9 months ago by sagardesai9150
9 months ago by
Benn7.9k wrote:

You don't need to do that. When library sizes are quite variable between samples, try limma voom instead.
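To see why unequal depth is handled at the count level rather than by trimming FASTQ files, here is a rough Python sketch of the log2 counts-per-million transform that voom starts from, log2((count + 0.5) / (library size + 1) × 10^6). The function name `log2_cpm` and the toy counts are made up for illustration. The sketch leaves out the rest of what voom does (the precision weights from the mean-variance trend) and any scaling factors such as TMM, so it is not a replacement for running limma itself.

```python
import math
from typing import List, Optional, Sequence


def log2_cpm(counts: Sequence[int], lib_size: Optional[float] = None) -> List[float]:
    """Convert raw gene counts for one sample to log2 counts per million.

    The 0.5 and 1.0 offsets avoid taking the log of zero.
    If lib_size is not given, the sum of the counts is used.
    """
    if lib_size is None:
        lib_size = float(sum(counts))
    return [math.log2((c + 0.5) / (lib_size + 1.0) * 1e6) for c in counts]


# Toy example (synthetic counts): same expression profile, 7x deeper sequencing.
shallow = [100, 300, 600, 9_999_000]   # roughly 10 million reads in total
deep = [c * 7 for c in shallow]        # roughly 70 million reads in total

shallow_cpm = log2_cpm(shallow)
deep_cpm = log2_cpm(deep)

# Largest per-gene difference after the transform; the raw counts differ 7-fold.
max_diff = max(abs(a - b) for a, b in zip(shallow_cpm, deep_cpm))
```

After the transform, the two samples have nearly identical values for every gene even though the raw counts differ 7-fold, which is the sense in which the library size difference is normalised away without discarding any reads.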

written 9 months ago by Benn7.9k

Okay, thank you! I will look into it.

written 9 months ago by sagardesai9150