I'm new to the field of bioinformatics and have been reading a lot about 16S and RNA-seq analysis, as well as FastQC interpretation. I understand that FastQC is used for DNA analysis, and that the plots are interpreted differently for DNA and RNA. My specific question is about how to interpret the per base sequence content plot and whether I should be concerned about the one I got. I'm also unsure when a Q-score should be considered good or bad in the per-base quality plot. Lastly, how can I determine whether my sample has low nucleotide diversity if my %GC falls within the expected range?
I did some research and found this tutorial: https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/quality-control/tutorial.html but I'm still unsure.
Thanks a lot!
Basic statistics - Forward

| Measure | Value |
|---|---|
| Filename | SETO7-16S_S7_L001_R1_001.fastq.gz |
| File type | Conventional base calls |
| Encoding | Sanger / Illumina 1.9 |
| Total Sequences | 79999 |
| Total Bases | 20 Mbp |
| Sequences flagged as poor quality | 0 |
| Sequence length | 64-251 |
| %GC | 58 |

Basic statistics - Reverse

| Measure | Value |
|---|---|
| Filename | SETO7-16S_S7_L001_R2_001.fastq.gz |
| File type | Conventional base calls |
| Encoding | Sanger / Illumina 1.9 |
| Total Sequences | 79999 |
| Total Bases | 20 Mbp |
| Sequences flagged as poor quality | 0 |
| Sequence length | 64-251 |
| %GC | 56 |
Phred Q-score - Forward
Phred Q-score - Reverse
Per base - Forward
Per base - Reverse
Per base - Galaxy Training
GenoMax, thank you so much for replying. I was quite unsure about the quality scores. One more question: judging from the per base sequence content plots, there is no problem with nucleotide diversity, right?
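Since a specific region is being amplified, many of the library fragments are expected to have identical or nearly identical sequences. That is why the nucleotide diversity is low: it is an expected property of an amplicon library such as 16S, not a sign of a problem with the run.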
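If you want to put a number on "low nucleotide diversity" rather than eyeballing the plot, one option is to compute the Shannon entropy of the base composition at each read position. This is only a rough sketch, not something FastQC itself reports: the function names `read_sequences` and `per_position_entropy` are made up for illustration, and the reader assumes standard 4-line FASTQ records. An entropy near 2 bits means all four bases are about equally frequent at that position; a value near 0 means almost every read has the same base there, which is what you would expect over conserved stretches of a 16S amplicon.

```python
import gzip
import math
from collections import Counter


def read_sequences(path):
    """Yield the sequence line of each record in a FASTQ file.

    Assumes plain 4-line records (header, sequence, '+', qualities);
    files ending in .gz are opened with gzip.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every 4-line record
                yield line.strip().upper()


def per_position_entropy(sequences):
    """Return the Shannon entropy (in bits) of A/C/G/T at each read position.

    Bases other than A, C, G, T (for example N) are ignored.
    """
    counts = []  # one Counter per read position
    for seq in sequences:
        for pos, base in enumerate(seq):
            if pos == len(counts):
                counts.append(Counter())
            if base in "ACGT":
                counts[pos][base] += 1

    entropies = []
    for counter in counts:
        total = sum(counter.values())
        h = 0.0
        for n in counter.values():
            p = n / total
            h -= p * math.log2(p)
        entropies.append(h)
    return entropies


# Toy example with in-memory reads: the first three positions are identical
# in every read (zero diversity), the last position varies.
toy_reads = ["ACGA", "ACGC", "ACGG", "ACGT"]
print(per_position_entropy(toy_reads))

# On real data (hypothetical call, using the filename from the post):
# entropies = per_position_entropy(read_sequences("SETO7-16S_S7_L001_R1_001.fastq.gz"))
```

Note that %GC alone cannot tell you this: a library where every read is identical can still have a perfectly "normal" overall %GC, which is why the per-position view is the more informative one here.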