Question: (Closed) Custom filtration of FASTQ file
5 weeks ago by
vivekr10 wrote:

I need to apply custom filtering to a FASTQ file as follows:

  1. Remove reads having at most 2 bases under quality score 20.

  2. Remove reads with unique sequence having read count less than 10.

I looked for tools to do this, but found none that supports these kinds of filters. I also wrote a Python script that uses the HTSeq package to read and process the FASTQ file, but it is extremely slow: it takes a day to process one file, and I have 30 samples. Is there a faster way to apply this kind of custom filtering to a FASTQ file?
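The two filters above can be sketched with the Python standard library alone, using two passes over the file: one to count sequences and one to write the reads that survive. This is only a minimal sketch under several assumptions that are not stated in the post: qualities are Phred+33 encoded, every record spans exactly four lines, criterion 1 is read as "discard reads with more than 2 bases below Q20", and sequence counts are taken after the quality filter. The names `read_fastq`, `count_low_quality`, `passes_quality`, `open_fastq`, and `filter_fastq` are hypothetical helpers, not part of HTSeq or any other package.

```python
import gzip
from collections import Counter
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples; assumes 4 lines per record."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        header, seq, _plus, qual = (line.rstrip("\n") for line in record)
        yield header, seq, qual


def count_low_quality(qual, min_q=20, offset=33):
    """Count bases whose Phred score is below min_q (assumes Phred+offset)."""
    cutoff = chr(min_q + offset)
    return sum(1 for c in qual if c < cutoff)


def passes_quality(qual, min_q=20, max_low=2, offset=33):
    """Assumed reading of criterion 1: allow at most max_low bases below min_q."""
    return count_low_quality(qual, min_q, offset) <= max_low


def filter_fastq(in_path, out_path, min_q=20, max_low=2, min_count=10, offset=33):
    """Write reads passing both filters to out_path; return how many were kept."""
    # Pass 1: count each unique sequence among quality-passing reads.
    counts = Counter()
    with open_fastq(in_path) as fh:
        for _header, seq, qual in read_fastq(fh):
            if passes_quality(qual, min_q, max_low, offset):
                counts[seq] += 1

    # Pass 2: keep quality-passing reads whose sequence occurs >= min_count times.
    kept = 0
    with open_fastq(in_path) as fh, open(out_path, "w") as out:
        for header, seq, qual in read_fastq(fh):
            if passes_quality(qual, min_q, max_low, offset) and counts[seq] >= min_count:
                out.write(f"{header}\n{seq}\n+\n{qual}\n")
                kept += 1
    return kept
```

Calling `filter_fastq("sample.fastq.gz", "sample.filtered.fastq")` (file names are placeholders) would apply both filters to one sample. If the paper counts sequences before quality filtering, or uses a different quality encoding, the order of the steps and the `offset` parameter would need to change accordingly.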


ngs fastq • 114 views

Remove reads with unique sequence having read count less than 10.

Could you please explain why you want to do this?

written 5 weeks ago by finswimmer11k

I am not sure of the exact reason, and this is new to me as well. I am trying to reproduce a Nature paper that describes a pipeline for small RNA-seq data to identify mature miRNAs and isomiRs. To reproduce the exact results, I am doing exactly what is written in the paper.

written 5 weeks ago by vivekr10

Hello vivekr!

We believe that this post does not fit the main topic of this site.

Please close this post, as I am not getting any response.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.


written 25 days ago by vivekr10
The thread is closed. No new answers may be added.

