Question: FastQC report about WGBS.
3.4 years ago, hxlei613 (Bangladesh) wrote:

Hi~ I'm working on some WGBS data now.

After quality and adapter trimming, the Sequence Duplication Levels and Per sequence GC content modules still fail. In Per sequence GC content, the peak of the read distribution is higher than the theoretical distribution. Is this OK?

Thank you very much for any help!

Here are some pictures from FastQC after trimming.

Per base sequence content: there is a small fluctuation in the first few bases. Should I trim them? The sharp decrease of A at the last position is a result of removing the adapter sequence very stringently, i.e. even a single trailing A at the end is removed.

Per Sequence GC content

Sequence duplication levels: should I deduplicate reads during quality control (before mapping), or filter duplicate reads after alignment using deduplicate_bismark?

3.4 years ago, igor (United States) wrote:

Don't worry too much about FastQC reports. The pass/fail thresholds are very conservative, and it's nearly impossible to have every module pass. You should look at the report manually and see if each section makes sense for your data.

See the previous discussion about FastQC here: "FastQ quality check: what can we correct?"

And you can use deduplicate_bismark to remove duplicates, which is convenient if you are using Bismark for everything else.


Thank you very much! Do you mean duplicates can be kept before mapping?

(written 3.4 years ago by hxlei613)

Correct, there's no reason to bother deduplicating before alignment.

(written 3.4 years ago by Devon Ryan)

Yes. Usually, two reads (or read pairs) are identified as duplicates if they align to the exact same spot in the genome.

(written 3.4 years ago by igor)

For paired-end alignments, does Bismark consider a pair a duplicate only if both partner reads start and end at the exact same positions, or also if just one of the partner reads does?

(written 3.4 years ago by hxlei613)

Oh, I figured it out. A pair is a duplicate when both partner reads start and end at the exact same positions. Thank you very much.

(written 3.4 years ago by hxlei613)
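
For readers who want to see the idea concretely, here is a minimal Python sketch of position-based deduplication of read pairs as described in the comments above. It is not Bismark's implementation: the `AlignedPair` record, the `deduplicate_pairs` function, and the choice to include the strand in the key are all assumptions made for illustration. A pair is dropped only when its chromosome, strand, start and end all match a pair seen earlier.

```python
from collections import namedtuple

# Hypothetical, simplified record for one aligned read pair.
# start = leftmost aligned position of the pair, end = rightmost position.
AlignedPair = namedtuple("AlignedPair", "name chrom strand start end")


def deduplicate_pairs(pairs):
    """Keep the first pair seen for each (chrom, strand, start, end) key."""
    seen = set()
    kept = []
    for pair in pairs:
        # Assumption: strand is part of the key, so pairs from opposite
        # strands are never collapsed into each other.
        key = (pair.chrom, pair.strand, pair.start, pair.end)
        if key in seen:
            continue  # both start and end match an earlier pair: duplicate
        seen.add(key)
        kept.append(pair)
    return kept


pairs = [
    AlignedPair("pair1", "chr1", "+", 100, 350),
    AlignedPair("pair2", "chr1", "+", 100, 350),  # same start and end: dropped
    AlignedPair("pair3", "chr1", "+", 100, 360),  # same start, different end: kept
    AlignedPair("pair4", "chr1", "-", 100, 350),  # different strand: kept
]

unique = deduplicate_pairs(pairs)
print([p.name for p in unique])  # ['pair1', 'pair3', 'pair4']
```

Note that `pair3` survives even though it starts at the same position as `pair1`, because only one end of the fragment matches. This also shows why deduplication is done after alignment: the key is built from mapped coordinates, which do not exist before mapping.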