FastQC sequence duplication and per base sequence content failed
13 months ago

I have 150 bp paired-end RNA-seq data for 321 samples, with around 30 million reads per sample, and I want to quantify expression for a transcriptome-wide association study (TWAS). I ran QC with FastQC. Most samples fail the per base sequence content and sequence duplication level modules.

MultiQC

For per base sequence content, the MultiQC report shows that the base composition is skewed at the start of the reads, especially over the first 10 bases. The FastQC tutorial says "Whilst this is a true technical bias, it isn't something which can be corrected by trimming and in most cases doesn't seem to adversely affect the downstream analysis". Should I leave it as it is, given that it should not affect the downstream analysis, or is there a way to deal with it?
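To make the first-10-bases pattern concrete, here is a minimal Python sketch of how per-position base composition can be computed from read sequences. It is not FastQC's implementation; the function name `per_base_content` and the toy reads are invented for illustration.

```python
from collections import Counter


def per_base_content(reads, n_positions=10):
    """Return, for each of the first n_positions, the fraction of A/C/G/T.

    Hypothetical helper for illustration only; not FastQC's own code.
    """
    counts = [Counter() for _ in range(n_positions)]
    for seq in reads:
        # Only look at the start of each read, where the bias shows up.
        for i, base in enumerate(seq[:n_positions]):
            counts[i][base] += 1

    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return fractions


# Toy reads (invented): position 0 is always G, position 1 is balanced.
toy_reads = ["GA", "GC", "GG", "GT"]
content = per_base_content(toy_reads, n_positions=2)
print(content[0])
print(content[1])
```

In an unbiased library each base would sit near its overall frequency at every position; a skew confined to the first few positions is the pattern described in the FastQC tutorial quote above.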

Per_base_seq_content

For the sequence duplication failure: we sequenced to a depth of around 30M reads per sample to capture lowly expressed transcripts, so a high sequence duplication level is plausible. Do I need to do anything to control or resolve it?
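As a rough illustration of why deep RNA-seq can show high duplication, here is a minimal Python sketch that computes the fraction of reads that are exact copies of an already-seen sequence. This is a simplified stand-in, not FastQC's actual duplication estimate; the function name `duplication_fraction` and the toy reads are invented.

```python
from collections import Counter


def duplication_fraction(reads):
    """Fraction of reads that are extra copies of an already-seen sequence.

    Simplified illustration; FastQC's own estimate works differently.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every distinct sequence contributes one "original" read;
    # everything beyond that is counted as a duplicate.
    return 1.0 - len(counts) / total


# Toy data (invented): one highly expressed fragment sampled 8 times,
# plus two fragments from lowly expressed transcripts seen once each.
toy_reads = ["ACGT"] * 8 + ["TTTT", "GGGG"]
dup = duplication_fraction(toy_reads)
print(round(dup, 2))
```

In this toy case most of the duplication comes from the single abundant fragment, which is the situation one would expect when highly expressed transcripts are oversampled in order to reach the rare ones.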

Se

Am I good to proceed with the quantification using Salmon or do I need to perform some sort of action to improve the per-base sequence content and duplication level? I have checked

Suggestions will be highly appreciated

RNA-seq fastqc • 549 views
13 months ago
GenoMax 141k

Yes, you are good to proceed. If you notice any issues later in the analysis, you can backtrack and check on these.

Please see the following informative blog posts by the authors of FastQC, which should address your concerns:

https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/
https://sequencing.qcfail.com/articles/libraries-can-contain-technical-duplication/
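For the Salmon step mentioned in the question, here is a minimal sketch of how a per-sample command might be assembled in Python. The index directory, file names, and the helper `salmon_quant_cmd` are placeholders, not taken from the thread. `--seqBias` is Salmon's option for sequence-specific bias correction; whether to enable it is an assumption of this sketch and not something the answer above prescribes.

```python
import shlex


def salmon_quant_cmd(index_dir, r1, r2, out_dir, threads=8, seq_bias=True):
    """Build a `salmon quant` command for one paired-end sample.

    Hypothetical helper; all paths passed in are placeholders.
    """
    cmd = [
        "salmon", "quant",
        "-i", index_dir,   # pre-built Salmon index
        "-l", "A",         # let Salmon infer the library type
        "-1", r1,          # mate 1 FASTQ
        "-2", r2,          # mate 2 FASTQ
        "-p", str(threads),
        "-o", out_dir,
    ]
    if seq_bias:
        # Optional: model sequence-specific bias at read starts.
        cmd.append("--seqBias")
    return cmd


cmd = salmon_quant_cmd(
    "salmon_index",
    "sample1_R1.fastq.gz",
    "sample1_R2.fastq.gz",
    "quant/sample1",
)
print(shlex.join(cmd))
```

The sketch only prints the command so it can be inspected before anything is run; it could be passed to `subprocess.run` once the paths are real.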
