Question

How to do preprocessing based on a FastQC report

15 months ago

Lulu • 0

Could someone provide a detailed guide on downstream preprocessing steps based on my FastQC report? I ran FastQC on a paired-end sample; the results below are for one read of the pair, and the other read looked similar. While I am familiar with interpreting FastQC reports, I am unsure whether additional steps are necessary before running STAR.

I plan to use Cutadapt to remove adapters, since they were detected and flagged with a yellow warning in the report. If the adapter content is green in the FastQC report, is trimming still necessary?

Additionally, many of my samples show high duplication levels. Should I use Picard to deduplicate before running a differential expression analysis?

Lastly, how should the red flag for Per Base Sequence Content be interpreted, and is there a recommended way to address it?

[Image: FastQC report]

fastqc scRNA-seq single-cell • 830 views

updated 15 months ago by Ram 45k • written 15 months ago by Lulu • 0

Please read the following blog posts from the authors of FastQC:

https://sequencing.qcfail.com/articles/libraries-can-contain-technical-duplication/
https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/

"I plan to use Cut Adapt to remove adaptors since they were detected"

That would be fine, though it is not strictly necessary if you are aligning with STAR. STAR will "soft-clip" the parts of reads that do not align, which will include adapter sequences.
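If you want to see how much of each read was soft-clipped after alignment, you can look at the CIGAR strings in the SAM/BAM output, where soft-clipped bases are recorded with the `S` operation. Below is a minimal sketch, assuming plain CIGAR strings as defined in the SAM specification. The function name and the example CIGAR strings are made up for illustration and are not taken from real STAR output.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM specification).
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar: str) -> int:
    """Return the total number of soft-clipped (S) bases in a CIGAR string."""
    return sum(int(length) for length, op in _CIGAR_ELEMENT.findall(cigar) if op == "S")


# Hypothetical read: 85 bases aligned, 15 bases at the 3' end soft-clipped
# (for example, because they are adapter sequence).
print(soft_clipped_bases("85M15S"))         # 15

# Hypothetical spliced read: 10 bases clipped at the 5' end, 200 bp intron (N).
print(soft_clipped_bases("10S60M200N30M"))  # 10

# Hypothetical read that aligned end to end: nothing clipped.
print(soft_clipped_bases("100M"))           # 0
```

If most reads carry long `S` stretches at the 3' end, that would be consistent with adapter read-through, and trimming beforehand may still be worth considering.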

Note: If this is single-cell data (as the included tag suggests), FastQC is going to be of limited use.

15 months ago by GenoMax 154k

Thank you for your help! This is single-cell data. Why is FastQC of limited use here?

15 months ago by Lulu • 0

Perhaps I should qualify my comment: if this is 10x single-cell data, then FastQC is of limited use. What kind of single-cell data is this? If the kit you used comes with specific data-processing instructions, you should follow those.

15 months ago by GenoMax 154k