Question: removal of a specific read from a fastq file
4 months ago, blooming.daisy333 wrote:

Hello, can someone kindly let me know how to remove a specific read from a paired-end fastq file using awk or any other command?

next-gen • 219 views
4 months ago, cpad0112 (India) wrote:

Some example data would help. If you know the read by its ID, try seqkit, which can also write the output to a fastq file:

$ seqkit grep -r -p <read_id> -v input.fastq

example:

$ seqkit grep -v -rp 'K00193:38:H3MYFBBXX:4:2119:24527:21657/1' hcc1395_normal_rep1_r1.fastq.gz

Thank you so much for the kind support. I'm a newbie to NGS and Linux, and I would like to know whether it is possible to remove specific reads using Linux commands only, rather than using any toolkit. Thanks.

written 4 months ago by blooming.daisy333
Assuming that your fastq file is gzipped:

$ zgrep "^@" input.fastq.gz | grep -v -F "<readname to be excluded>" | while read -r line; do zgrep -F -A 3 "$line" input.fastq.gz; done
  1. The first command (zgrep) matches lines that start with @. This prints all the headers (a quality line that happens to start with @ would match as well).
  2. The second command (grep -v) keeps only the headers that do not match the provided read name.
  3. The third step prints each remaining header together with the three lines that follow it. A single zgrep could be used here, but zgrep seems to have some limitations, hence the while loop. If you do not like the loop, you can use parallel (GNU parallel, available in most distros).

Please redirect the output to a file of your choice.

$ zgrep "^@" input.fastq.gz | grep -v -F "<readname to be excluded>" | parallel -k zgrep -F -A 3 {} input.fastq.gz
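
Both commands above re-read the whole file once per header, which can get slow on large files. A possible single-pass alternative with awk is sketched below. It is only a sketch: it assumes strict 4-line fastq records, and the file names and demo reads are made up for illustration.

```shell
# Sketch: drop one read from a gzipped FASTQ in a single pass.
# Assumes strict 4-line records; file names and demo data are hypothetical.
id='K00193:38:H3MYFBBXX:4:2119:24527:21657/1'

# Tiny demo input so the commands can be tried as-is.
# The first record has a quality line starting with @ on purpose.
printf '%s\n' \
  '@K00193:38:H3MYFBBXX:4:2119:24527:21657/1' 'ACGT' '+' '@III' \
  '@K00193:38:H3MYFBBXX:4:2119:24527:21658/1' 'TTGA' '+' 'IIII' \
  | gzip > demo.fastq.gz

# Line 1 of every 4-line record is the header. Compare its first word
# (without the leading @) to the ID and skip the whole record on a match.
gzip -dc demo.fastq.gz \
  | awk -v id="$id" '
      NR % 4 == 1 { skip = (substr($1, 2) == id) }
      !skip' \
  | gzip > demo.filtered.fastq.gz
```

Because the header is found by its position in the record rather than by a leading @, a quality line that starts with @ is not mistaken for a header.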

If your fastq file is uncompressed, try this:

$ sed -n '/^@/!d; /<read name>/!p' test.fastq | grep -F -A 3 --no-group-separator -f - test.fastq

Note: if the read ID contains strand information (/1 or /2), make sure that the slash is escaped, because / is also the sed pattern delimiter. For example, if the read ID is K00193:38:H3MYFBBXX:4:2119:24527:21657/1, the sed command would be:

$ sed -n '/^@/!d; /K00193:38:H3MYFBBXX:4:2119:24527:21657\/1/!p' test.fastq | grep -F -A 3 --no-group-separator -f - test.fastq
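
Since the question is about paired-end data, the read usually has to be removed from both files so that the pairs stay in sync. A minimal awk sketch of that is below. It assumes strict 4-line records and mates named with /1 and /2; the function name remove_read, the file names, and the demo reads are hypothetical and not part of any toolkit.

```shell
# Sketch: remove one read pair from uncompressed paired-end FASTQ files.
# Assumes strict 4-line records; remove_read and all file names are hypothetical.
remove_read() {
  # $1 = read ID without the leading @ and without /1 or /2
  # $2 = input FASTQ, $3 = output FASTQ
  awk -v id="$1" '
    NR % 4 == 1 {
      name = substr($1, 2)        # first word of the header, minus @
      sub(/\/[12]$/, "", name)    # drop a trailing /1 or /2 if present
      skip = (name == id)
    }
    !skip' "$2" > "$3"
}

# Tiny demo inputs so the commands can be tried as-is.
printf '%s\n' \
  '@K00193:38:H3MYFBBXX:4:2119:24527:21657/1' 'ACGT' '+' 'IIII' \
  '@K00193:38:H3MYFBBXX:4:2119:24527:21658/1' 'TTGA' '+' 'IIII' > reads_R1.fastq
printf '%s\n' \
  '@K00193:38:H3MYFBBXX:4:2119:24527:21657/2' 'CCGT' '+' 'IIII' \
  '@K00193:38:H3MYFBBXX:4:2119:24527:21658/2' 'GGTA' '+' 'IIII' > reads_R2.fastq

# Apply the same ID to both mates.
remove_read 'K00193:38:H3MYFBBXX:4:2119:24527:21657' reads_R1.fastq reads_R1.filtered.fastq
remove_read 'K00193:38:H3MYFBBXX:4:2119:24527:21657' reads_R2.fastq reads_R2.filtered.fastq
```

The ID is compared as a plain string, so no escaping of / is needed here.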
modified 4 months ago • written 4 months ago by cpad0112

Thank you so much for the kind help. I will try it and will discuss the output. Many thanks again.

written 4 months ago by blooming.daisy333