Hi all, I am new to bioinformatics and need your help with a simple question. I have two FASTQ files (forward.fastq and reverse.fastq) from RNA-seq of one sample. I ran FastQC on the two files separately, so I can get the number of raw reads, Q30, and GC content for each file. How do I calculate the number of raw reads, Q30, and GC content for the sample as a whole? Should I sum the raw read counts of the two files to get the sample's raw read count, and take the mean of the two files' Q30 or GC content as the sample's Q30 or GC content?
I am quite confused about this. Thank you for your help.
Yes, I got it. But the Q30 and GC content of the two FASTQ files are different, so which one is the Q30 or GC content of this sample?
I'd report both forward and reverse Q30 and GC from FastQC, so four numbers. This is for quality control, so having all values available will avoid "hiding" bad data.
Can you provide the numbers for the two files? As @Sean suggested, provide all the numbers if you are giving them to someone.
Thank you both. The GC content and Q30 of the forward FASTQ file are 50% and 92.54%, respectively, and those of the reverse FASTQ file are 50% and 90.19%, respectively. Is this information enough? Is there a way to get a single set of values for the sample? Should I merge the two FASTQ files and then run FastQC on the merged file?
These numbers are functionally the same. No, I would not merge the fastq files. I would report two Q30 numbers and two GC numbers. In all cases where the experiment "worked" as expected, the numbers for read 1 and read 2 will be quite similar. If they are not, you NEED to know that.
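If you want to compute the per-file numbers yourself rather than read them off the FastQC report, here is a minimal Python sketch. It assumes plain 4-line FASTQ records and Phred+33 quality encoding; the function names `fastq_stats` and `open_fastq` are made up for this example, and the values are not guaranteed to match FastQC's to the last digit. Run it once per file and report read 1 and read 2 separately.

```python
import gzip
from itertools import islice


def fastq_stats(lines, phred_offset=33):
    """Summarise an iterable of FASTQ lines (4 lines per record).

    Returns the read count, the percentage of bases with Q >= 30,
    and the percentage of G/C bases. Assumes Phred+33 by default.
    """
    reads = bases = q30 = gc = 0
    it = iter(lines)
    while True:
        record = list(islice(it, 4))  # header, sequence, '+', qualities
        if len(record) < 4:
            break
        seq = record[1].strip().upper()
        qual = record[3].strip()
        reads += 1
        bases += len(seq)
        gc += seq.count("G") + seq.count("C")
        q30 += sum(1 for ch in qual if ord(ch) - phred_offset >= 30)
    return {
        "reads": reads,
        "q30_pct": 100.0 * q30 / bases if bases else 0.0,
        "gc_pct": 100.0 * gc / bases if bases else 0.0,
    }


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Tiny in-memory record: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
demo = ["@read1\n", "GGCCAATT\n", "+\n", "IIII!!!!\n"]
print(fastq_stats(demo))

# For real data, one call per file, reported side by side:
# for name in ("forward.fastq", "reverse.fastq"):
#     with open_fastq(name) as handle:
#         print(name, fastq_stats(handle))
```

For a properly paired run the two files normally contain the same number of records, so a mismatch in the `reads` value is itself worth checking before anything else.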
Thanks again @Sean, I got it.