Fastq reads to CSV - How can I make a CSV with the read count of several fastq files in the same folder?
alx.alo

How can I make a CSV file with the read count of several fastq files in the same folder?

I have received my fastq files from an Illumina sequencing run, and I need to make a CSV file listing the sample ID and read count (it's a test run to balance our sample pool of 500+ samples). I've looked for ways to do this but can't figure out how to automate the fastq read count. I tried counting the lines and dividing by 4, as described in other posts, but I can't make it work for all the fastq files. Also, I have a different fastq for each read (R1, R2), and I suppose I should count both as a single file (is that correct?).

I'd really appreciate any pointers to solving this issue.

illumina csv read fastq count

I need to make a CSV file listing the sample ID and read count (it's a test run to balance our sample pool of 500+ samples)

While you can easily get this information using the answer below (or with seqkit stats: https://bioinf.shenwei.me/seqkit/usage/#stats), also ask your sequencing provider for it; it is available in a CSV file in the run reports.
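
As a rough sketch of the seqkit route (assuming seqkit is installed and the R1 files end in .R1.fastq.gz, as in the answer below):

# seqkit stats -T prints tab-separated stats per file; the num_seqs column is the read count.
# Converting tabs to commas gives a CSV.
seqkit stats -T *.R1.fastq.gz | tr '\t' ',' > read_counts.csv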

I have a different fastq for each read (R1, R2) and I suppose I should count both as a single file

Yes, the two matching paired-end reads come from one unique library fragment, and you should count unique library fragments. Sometimes you will see Illumina stats counting both reads to get a total number (double dipping).
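
As a quick sanity check of that pairing (hypothetical file names; each FASTQ record is 4 lines, so line count divided by 4 gives reads):

r1=$(gunzip -c sample01_R1.fastq.gz | wc -l)
r2=$(gunzip -c sample01_R2.fastq.gz | wc -l)
echo "R1 reads: $((r1 / 4))  R2 reads: $((r2 / 4))"   # the two numbers should match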


assuming the files are gzipped and end with R1.fastq.gz:

find /path/to/dir -type f -name "*.R1.fastq.gz" | while read -r F
do
     # print "filename,read_count": paste joins every 4 FASTQ lines into one, so wc -l counts reads
     echo -n "${F}," && gunzip -c "${F}" | paste - - - - | wc -l
done

or just use FastQC.
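
To turn that output into the CSV asked for in the question (sample ID plus read count, with a header line), here is a variation on the same loop; it assumes, as above, that file names look like SAMPLEID.R1.fastq.gz, so adjust the pattern and suffix to your naming scheme:

echo "sample_id,reads" > read_counts.csv
find /path/to/dir -type f -name "*.R1.fastq.gz" | while read -r F
do
     sample=$(basename "$F" .R1.fastq.gz)              # strip path and suffix to get the sample ID
     reads=$(gunzip -c "$F" | paste - - - - | wc -l)   # paste joins every 4 FASTQ lines, so wc -l counts reads
     echo "${sample},${reads}" >> read_counts.csv
done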
