PacBio - reads with Q<20
2.1 years ago
pingu77 ▴ 20

Hi all,

I am fairly new to PacBio data analysis and I have a question:

Why do I need to extract the HiFi reads? What do the other reads, the ones with a mean quality below Q20, represent? Should I simply ignore those reads? If I include them in the analysis, are the results still reliable?

Thank you for your time!

pacbio hifi quality • 797 views
2.1 years ago

This really depends on what you plan to do with the data downstream. Most current downstream applications expect HiFi (>=Q20) data. For instance, if you're going to generate a _de novo_ assembly with hifiasm or call small variants with DeepVariant, including the <Q20 reads will cause problems with accuracy, memory usage, and runtime. For detecting structural variation with pbsv, you might get some added value from the <Q20 reads if you use the correct parameters.
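To make the Q20 cutoff concrete, here is a minimal sketch of the arithmetic behind it: a Phred score Q corresponds to a predicted accuracy of 1 - 10^(-Q/10), so Q20 means 99% accuracy. The sketch assumes each read comes with a predicted accuracy between 0 and 1 (as far as I know, that is what the `rq` tag in a PacBio CCS BAM holds). The function names and the example read names are hypothetical, and the reads are plain tuples rather than records parsed from a real file.

```python
def phred_to_accuracy(q):
    """Convert a Phred quality score to a predicted accuracy (Q20 -> 0.99)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def split_reads(reads, min_q=20):
    """Split (name, accuracy) pairs into reads at or above min_q and reads below it.

    `reads` is an iterable of (read_name, predicted_accuracy) tuples; this is an
    illustrative stand-in for values you would pull from your own data.
    """
    min_accuracy = phred_to_accuracy(min_q)
    hifi, low = [], []
    for name, accuracy in reads:
        if accuracy >= min_accuracy:
            hifi.append(name)
        else:
            low.append(name)
    return hifi, low


# Hypothetical example reads, not real data.
example_reads = [("read_a", 0.9995), ("read_b", 0.95), ("read_c", 0.999)]
hifi_names, low_names = split_reads(example_reads)
print("HiFi (>=Q20):", hifi_names)
print("below Q20:", low_names)
```

Comparing in accuracy space rather than converting every read to a Phred score avoids floating-point surprises for reads sitting right at the threshold. In practice you would more likely do this filtering with the tools that produced the reads than with hand-written code; the sketch only shows what the threshold means.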


Thanks for your answer! But why are there so many reads with Q<20? I wasn't able to find this information. Also, I would like to look at repeat regions, and I think that if I include reads with Q<20 I will run into memory usage problems.
