Question: What does a 'normal' FASTQC report look like for scRNA-seq data?
6 weeks ago, MutationalMeltdown (20) wrote:

I'm specifically interested in the 'failures' I got on R2 (R1 being just the barcode + UMI). Note that this is 10x single-cell RNA-sequencing (scRNA-seq) data.

Test 'failures':

  • Per tile sequence quality (quality appears slightly worse at about 54bp (the middle of the read), with a corresponding increase in Ns; I'm not sure how to interpret this, having read this)
  • Per base sequence content (at start of read only, for this reason I think)
  • Per sequence GC content (there is a shifted peak to the left, i.e. AT-rich)
  • Sequence Duplication Levels (I guess due to highly expressed genes)
  • Overrepresented sequences (no hits; up to 1%)
  • Kmer Content (again at start of read only)
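
For anyone wanting to collect the same list of failed modules across many samples, here is a minimal sketch that reads the `summary.txt` file FastQC writes inside each report archive. It assumes the usual tab-separated layout (status, module name, file name); the function names and the example content are hypothetical and are not taken from the report discussed here.

```python
def parse_fastqc_summary(text):
    """Return {module_name: status} from the text of a FastQC summary.txt.

    Assumes each line is: STATUS <tab> module name <tab> file name.
    """
    statuses = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2:
            status, module = fields[0], fields[1]
            statuses[module] = status
    return statuses


def failed_modules(statuses):
    """List the modules whose status is FAIL, in file order."""
    return [module for module, status in statuses.items() if status == "FAIL"]


# Illustrative content only, not the poster's actual report.
example = (
    "PASS\tBasic Statistics\tsample_R2.fastq.gz\n"
    "PASS\tPer base sequence quality\tsample_R2.fastq.gz\n"
    "FAIL\tPer tile sequence quality\tsample_R2.fastq.gz\n"
    "FAIL\tPer base sequence content\tsample_R2.fastq.gz\n"
)

print(failed_modules(parse_fastqc_summary(example)))
```

Running this over the reports for your own samples and for a public 10x dataset would show whether the same modules fail in both.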

Thank you for your help in interpreting this!

fastqc rna-seq fastq 10x scrna-seq • 165 views

I have generally not bothered to run FastQC on 10x data. Once you do the analysis using the 10x software, you can start paying attention to the metrics (number of cells, reads per cell, etc.).

You could run FastQC on data 10x makes available on their site to see how your data compares.

written 6 weeks ago by genomax (71k)

Thanks for the suggestion! I will do that.

Perhaps my question is too broad, but what concerns me most is that 1-2 bp have slightly lower quality, although not bad enough to fail the "Per base sequence quality" test. These positions correspond to red blocks in "Per tile sequence quality" and a small spike in "Per base N content". Is this something to be concerned about?
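
To put a number on the size of such a spike, the fraction of N calls at each read position can be computed directly from the FASTQ. The sketch below is only an illustration: it assumes standard four-line FASTQ records, the function names are hypothetical, and the example reads are made up.

```python
import gzip
from collections import Counter


def read_fastq_sequences(path):
    """Yield the sequence line of each record in a (optionally gzipped) FASTQ.

    Assumes strict four-line records, as produced by standard Illumina pipelines.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 1:
                yield line.strip()


def per_position_n_fraction(seqs):
    """Return a list where entry i is the fraction of reads with N at position i."""
    n_counts = Counter()
    totals = Counter()
    for seq in seqs:
        for i, base in enumerate(seq):
            totals[i] += 1
            if base in "Nn":
                n_counts[i] += 1
    return [n_counts[i] / totals[i] for i in range(len(totals))]


# Made-up reads for illustration only.
toy_reads = ["ACGT", "ANGT", "ACNT", "ANGT"]
print(per_position_n_fraction(toy_reads))
```

On real data one would call `per_position_n_fraction(read_fastq_sequences("sample_R2.fastq.gz"))`, where the file name is a placeholder, and look at the values around the affected positions.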

I've run CellRanger, and the downstream metrics you mention, such as reads per cell, seem okay.

written 6 weeks ago by MutationalMeltdown (20)

If you had a few bad tiles or Ns in places, those sequences should have been taken care of by STAR, which CellRanger uses to do the alignments. If your run metrics look okay in CellRanger, then you are probably fine to proceed.

modified 6 weeks ago • written 6 weeks ago by genomax (71k)

Thanks! I think the run metrics are okay, but "Reads Mapped to [the mouse] Genome" is 90% for some samples and 80% for others, which is lower than in the 10x example data you linked to, where it's over 95% (e.g. here). Would that concern you?
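
For comparing this metric across many samples, a small sketch along these lines could pull the value out of each sample's CellRanger `metrics_summary.csv`. It assumes the file has a header row with a column named "Reads Mapped to Genome" whose value is a percentage string; the column name, the function name, and the numbers below are assumptions for illustration and should be checked against your own files.

```python
import csv
import io


def mapped_fraction(csv_text, column="Reads Mapped to Genome"):
    """Return the mapping rate as a fraction (0-1) from metrics_summary.csv text.

    Assumes a single data row and a value formatted like "90.0%".
    """
    row = next(csv.DictReader(io.StringIO(csv_text)))
    return float(row[column].rstrip("%")) / 100.0


# Made-up file content for illustration only.
example_csv = (
    "Estimated Number of Cells,Reads Mapped to Genome\n"
    '"5,000",90.0%\n'
)

print(mapped_fraction(example_csv))
```

Collecting this value for every sample next to the one from a public 10x dataset makes it easier to see whether the lower rates cluster in particular samples or batches.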

written 6 weeks ago by MutationalMeltdown (20)

The test data provided by 10x is likely ideal, so it is not terribly surprising that yours does not look as good. A 90% alignment rate is not bad, but this is something you will need to judge yourself. I will see if one of the other mods with more experience with user data wants to chime in.

modified 6 weeks ago • written 6 weeks ago by genomax (71k)

It's a bit easier to judge the results if you provide the actual figures. From what you've been describing, neither FastQC nor CellRanger summaries are raising any red flags.

written 6 weeks ago by Friederike (5.1k)