Question: filter_fasta.py not removing sequences from fastq based on read IDs
15 months ago, fjs5035 wrote:

I'm attempting to use filter_fasta.py in macqiime to remove sequences from a fastq based on a .txt file of read IDs.

filter_fasta.py -f my_reads.fastq -o filtered_reads.fq -s read_ids.txt -n

The input file has 366000 reads, and the output file also has 366000 reads, so nothing is being removed. I used grep to confirm that the read IDs are actually present in the fastq. My read ID .txt file has only one ID per line. Any ideas what could be wrong?

qiime sequence fastq sequencing • 517 views
modified 15 months ago by lakhujanivijay • written 15 months ago by fjs5035

Hi fjs5035

I just added a hyperlink to the script you are referring to.

Additionally, could you add the output of the following commands? It will help you get quick answers.

  1. Output a few lines from your ID file

    head -n 10 read_ids.txt

  2. Output a few read headers

    grep "^@" my_reads.fastq | head -n 10

modified 15 months ago • written 15 months ago by lakhujanivijay
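
One caveat with grep "^@": FASTQ quality strings can also begin with "@", so the grep may return some quality lines mixed in with the headers. A minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence lines), prints only the first line of each record instead. The file demo_reads.fastq is a hypothetical two-read example, not data from this thread.

```shell
# Hypothetical two-record demo file; the second record's quality line starts with "@".
printf '@read1 1:N:0\nACGT\n+\nIIII\n@read2 1:N:0\nTTGA\n+\n@III\n' > demo_reads.fastq

# Print only the header line of each 4-line record (lines 1, 5, 9, ...).
awk 'NR % 4 == 1' demo_reads.fastq | head -n 10
```

On this demo file, grep "^@" would match three lines (two headers plus one quality line), while the awk filter returns only the two headers.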

There is also a very useful tool in BBMap for this. If your issue persists, try filterbyname.sh from the BBMap suite of tools, available at link.

written 15 months ago by Jeffin Rockey

Just a hunch, does your read_ids.txt contain the "@" at the start of the fastq identifiers?

written 15 months ago by cschu181

It does. I take it from your comment that that shouldn't happen.

written 15 months ago by fjs5035
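
If the hunch above is right and the IDs are matched without the leading "@", one way to test it is to strip that character from the ID list and rerun the filter. The thread does not confirm that this resolved the problem, so treat the following as a sketch under that assumption. The demo_* file names are hypothetical stand-ins for read_ids.txt.

```shell
# Hypothetical demo ID list with a leading "@" on each line, as described in the thread.
printf '@read1\n@read2\n' > demo_read_ids.txt

# Strip a single leading "@" from every line and write a new ID list.
sed 's/^@//' demo_read_ids.txt > demo_read_ids_clean.txt

cat demo_read_ids_clean.txt
```

The same sed command applied to read_ids.txt would give a cleaned list that can be passed to -s in the original filter_fasta.py command, leaving the other options unchanged.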

Look at this: "Remove Reads from fastq file based on read IDs"

written 15 months ago by lakhujanivijay