How can I make a CSV file with the read number of several fastq files in the same folder?
I have received my FASTQ files from an Illumina sequencing run, and I need to make a CSV file listing the sample ID and read number (it's a test run to balance our sample pool of 500+ samples). I've looked for ways to achieve this, but I can't figure out how to automate the FASTQ read count. I tried counting the lines and dividing by 4 as described in other posts, but I can't make it work for all the FASTQ files. Besides, I have a separate FASTQ file for each read (R1, R2), and I suppose I should count both as a single file (is that correct?).
I'd really appreciate any pointers to solving this issue.
While you can easily get this information by using the answer below (or use `seqkit stats`: https://bioinf.shenwei.me/seqkit/usage/#stats), ask your sequencing provider for this information. It is available in a CSV file in the run reports.
Yes, the two matching paired-end reads come from one unique library fragment. You should count unique library fragments. Sometimes you will see Illumina stats counting both reads to get a total number (double dipping).
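If the run report is not available, the counting can also be scripted. The following is only a minimal sketch: it counts lines in each R1 file and divides by 4, so each fragment is counted once and R2 is ignored. It assumes four-line FASTQ records (no wrapped sequence lines) and Illumina-style file names containing `_R1_` or `_R1.`. The function names and the `sample_id`/`read_count` column headers are made up for illustration.

```python
import csv
import gzip
import re
from pathlib import Path


def count_reads(fastq_path):
    """Count records in a FASTQ file, assuming exactly four lines per record."""
    path = Path(fastq_path)
    # Plain and gzip-compressed files are both handled.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def sample_id(fastq_path):
    """Derive a sample ID by cutting the file name at the R1 tag (assumed naming)."""
    name = Path(fastq_path).name
    return re.split(r"_R1[_.]", name, maxsplit=1)[0]


def write_read_counts(folder, out_csv):
    """Write one CSV row per R1 file: sample ID and read (fragment) count."""
    folder = Path(folder)
    # Only R1 files are counted, so each paired-end fragment is counted once.
    r1_files = sorted(
        p for p in folder.iterdir()
        if re.search(r"_R1[_.].*f(ast)?q(\.gz)?$", p.name)
    )
    rows = [(sample_id(p), count_reads(p)) for p in r1_files]
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample_id", "read_count"])
        writer.writerows(rows)
    return rows
```

Calling `write_read_counts(".", "read_counts.csv")` from the folder holding the FASTQ files would write the table. Samples split across several lanes get one row per lane file, so those rows would still need to be summed per sample.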