Do I need to go back and filter my long-reads?
9 weeks ago
eebloom ▴ 80

A while back I asked about filtering basecalls from dorado. I was tied up with other projects, so I am only returning to this particular dataset now.

I took people's advice: I did not filter my data, and I aligned the reads to the human reference genome GRCh38. My most recent samples have higher basecall quality and better alignment, but the older samples have more of a tail (see violin plot below).

Is this good enough, or should I filter my data, e.g. for basecall quality > Q9?
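
For reference, here is a minimal sketch of what a "mean quality > Q9" read filter could look like, assuming standard Phred+33 FASTQ quality strings. The helper names `mean_qscore` and `keep_read` are made up for illustration, and this is not dorado's own implementation. It averages in error-probability space rather than averaging the Q values directly, which, as far as I know, is how a read's mean Q-score is usually reported for nanopore data.

```python
import math


def mean_qscore(quality: str, offset: int = 33) -> float:
    """Mean Phred score of one read, averaged in error-probability space."""
    if not quality:
        raise ValueError("empty quality string")
    # Convert each Phred+33 character to its per-base error probability.
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    mean_error = sum(error_probs) / len(error_probs)
    # Convert the mean error probability back to a Phred-scaled score.
    return -10 * math.log10(mean_error)


def keep_read(quality: str, min_q: float = 9.0) -> bool:
    """True if the read's mean Q-score is above the threshold (Q9 by default)."""
    return mean_qscore(quality) > min_q
```

A read that is half Q40 and half Q0 has an arithmetic mean of Q20, but its mean error rate is about 50%, so `keep_read` rejects it. That difference matters most for exactly the low-quality tail being discussed.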

Plot of percent identity of long-read alignments
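
The post does not say how percent identity was computed for the plot. One common definition, sketched here as an assumption, is matching bases divided by alignment columns, derived from a SAM/BAM record's CIGAR string and `NM` (edit distance) tag. The function name `percent_identity` is hypothetical.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar: str, nm: int) -> float:
    """BLAST-like identity: (alignment columns - edit distance) / columns.

    Alignment columns are M, =, X, I and D operations; clipped bases (S, H)
    and reference skips (N) are ignored. NM counts mismatches plus inserted
    and deleted bases, so columns - NM is the number of matching bases.
    """
    columns = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=XID")
    if columns == 0:
        raise ValueError("no aligned columns in CIGAR")
    return 100.0 * (columns - nm) / columns
```

For example, a read with CIGAR `5S90M5I5D` and `NM=12` has 100 alignment columns and 88 matching bases. Other definitions (for instance, gap-compressed identity) give different numbers, so it is worth stating which one a plot uses.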

alignment nanopore filtering QC ONT • 499 views

Is this good enough

You need to make the call. We don't know what you are planning to do with this data downstream.

older samples have more of a tail

That is a characteristic of the sample (assuming the same experimental procedure, flowcell/pore, and dorado model were used). You could track it as a batch in case it has any effect on the analysis later.
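
One way to track this, sketched under assumptions: label each read with a batch (for example "old" or "new") and summarise how heavy the tail is per batch, so the number can be carried into downstream analysis. The function name `tail_summary` and the 90% identity cut-off are arbitrary choices for illustration, not recommendations.

```python
from collections import defaultdict
from statistics import median


def tail_summary(reads, threshold=90.0):
    """Per-batch read count, median identity, and fraction below threshold.

    reads: iterable of (batch_label, percent_identity) pairs.
    threshold: hypothetical cut-off used only to quantify the tail.
    """
    by_batch = defaultdict(list)
    for batch, identity in reads:
        by_batch[batch].append(identity)

    summary = {}
    for batch, values in by_batch.items():
        below = sum(1 for v in values if v < threshold)
        summary[batch] = {
            "n": len(values),
            "median": median(values),
            "frac_below": below / len(values),
        }
    return summary
```

Comparing `frac_below` between batches puts a number on "more of a tail", which is easier to relate to downstream results than a violin plot alone.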


Thanks, I think it would be a good idea to track the results downstream to look for batch effects. I think I will filter the reads ultimately. I had previously been advised to consider keeping all reads from basecalling, but for variant calling on human data it is probably best to use reads with higher quality and mappability.


I think I will filter the reads ultimately.

If your samples were not all run and basecalled with exactly the same pore and software versions, then you would be making this decision based solely on the plot you are seeing above. If everything was done exactly the same way, then you can decide on the best course of action. You should apply exactly the same filter to all samples to keep processing consistent.
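
A minimal sketch of keeping the filter identical across samples: define the thresholds once in an immutable object and apply that same object to every sample, recording how much each sample loses. `ReadFilter`, `retained_fraction`, and the threshold values are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadFilter:
    """One set of thresholds, defined once and reused for every sample."""
    min_qscore: float = 9.0    # hypothetical, echoes the Q9 in the question
    min_identity: float = 0.0  # hypothetical, 0 disables this criterion

    def keep(self, qscore: float, identity: float) -> bool:
        return qscore > self.min_qscore and identity >= self.min_identity


def retained_fraction(samples, read_filter):
    """Apply the same filter to each sample; return the fraction of reads kept.

    samples: dict mapping sample name -> list of (qscore, identity) pairs.
    """
    report = {}
    for name, reads in samples.items():
        kept = sum(1 for q, ident in reads if read_filter.keep(q, ident))
        report[name] = kept / len(reads) if reads else 0.0
    return report
```

Because the dataclass is frozen, the thresholds cannot drift between samples within a run, and the per-sample retained fraction shows whether the filter hits the older samples harder than the newer ones.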

Note: If this data is for patient care then you must use criteria that you can justify and feel confident about. We are only offering general advice based on features you show without having access to the entire dataset.


Do not delete posts that have received feedback. Engage with the user providing you feedback.


Apologies, I deleted the question because I wasn't sure it would be helpful to others and it didn't seem to have a clear answer, not to snub the feedback from @GenoMax, whose feedback I really value!

But I'm not sure I agree - I think you should have the right to delete a question you yourself have submitted. But that's just my opinion.


I think you should have the right to delete a question you yourself have submitted

If you were to talk to someone, can you make them forget that you talked to them? Similarly, if it has been a while and your post is unanswered/hasn't gotten any feedback, you can delete it because it's kind of like you said something out loud that no one responded to. Once you get some feedback though, it becomes a conversation that is useful not just to you but to others facing a similar problem - it might be the exact thing someone needs to get to their next step, and since this is an open science forum, the knowledge needs to remain open.
