Question: removal of a specific read from a fastq file
6 months ago, blooming.daisy333 wrote:

Hello, can someone kindly tell me how to remove a specific read from a paired-end fastq file using awk or any other command?

6 months ago, cpad0112 (India) wrote:

Some example data would help. If you know the read by its ID, try seqkit (you can write the output to a fastq file):

$ seqkit grep -r -p <read_id> -v input.fastq

example:

$ seqkit grep -v -rp 'K00193:38:H3MYFBBXX:4:2119:24527:21657/1' hcc1395_normal_rep1_r1.fastq.gz

Thank you so much for your kind support. I'm a newbie to NGS and Linux, and I would like to know whether it is possible to remove specific reads using Linux commands only, rather than using any toolkit. Thanks.

Reply written 6 months ago by blooming.daisy333

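Assuming that your fastq file is gzipped:

$ zgrep "^@" input.fastq.gz | grep -v "<readname to be excluded>" | while read -r line; do zgrep -F -x -A 3 "$line" input.fastq.gz; done
  1. The first command (zgrep "^@") prints every line that starts with @, i.e. all the headers. Be aware that a quality line can also start with @, in which case it is picked up as well.
  2. The second command (grep -v) drops the header that matches the read name to be excluded.
  3. Each remaining header is passed to zgrep -A 3, which prints the header and the three lines after it (sequence, +, quality). -F -x make zgrep match the header as a literal, whole line, and the quotes around $line keep headers containing spaces intact. Ideally a single zgrep call would do this, but zgrep seems to have some limitations, hence the while loop. If you do not like the loop, you can use parallel (GNU Parallel, available in most distros); -k keeps the output in input order.

Please redirect the output to a file of your choice.

$ zgrep "^@" input.fastq.gz | grep -v "<readname to be excluded>" | parallel -k zgrep -F -x -A 3 {} input.fastq.gz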
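Since the question asked about awk: the loop above re-reads the whole file once per header, so a single-pass alternative may be worth having. The following is a minimal sketch, not part of the original answer. It assumes strict 4-line fastq records (no wrapped sequence lines); remove_read is a hypothetical helper name, and the file names in the usage comment are placeholders.

```shell
# remove_read ID
# Reads fastq on stdin, writes fastq on stdout, dropping the record whose
# header's first word is exactly "@ID". Assumes 4 lines per record.
remove_read() {
    awk -v id="$1" '
        NR % 4 == 1 {
            # Header line: compare only the first whitespace-separated word,
            # so trailing descriptions such as "1:N:0:ATCACG" are ignored.
            split($0, f, " ")
            skip = (f[1] == ("@" id))
        }
        !skip    # print all four lines of every record that is not skipped
    '
}

# Usage (placeholder file names):
#   gzip -dc input.fastq.gz | remove_read 'K00193:38:H3MYFBBXX:4:2119:24527:21657/1' | gzip > filtered.fastq.gz
#   remove_read 'K00193:38:H3MYFBBXX:4:2119:24527:21657/1' < test.fastq > filtered.fastq
```

Because the header is recognised by its position in the record rather than by a leading @, a quality line that happens to start with @ is not mistaken for a header.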

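If your fastq file is not compressed, try this:

$ sed -n '/^@/!d; /<read name>/!p' test.fastq | grep --no-group-separator -F -x -A 3 -f - test.fastq

Note: if the read ID carries a read-pair suffix (/1 or /2), make sure the slash is escaped, because / is also the sed pattern delimiter. For example, if the read ID is K00193:38:H3MYFBBXX:4:2119:24527:21657/1, the sed command would be:

$ sed -n '/^@/!d; /K00193:38:H3MYFBBXX:4:2119:24527:21657\/1/!p' test.fastq | grep --no-group-separator -F -x -A 3 -f - test.fastq

--no-group-separator is a GNU grep option; without it, grep -A prints a "--" line at the position of the removed read, which breaks the fastq format.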
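For paired-end data, removing a read from only one file leaves the two files out of sync, and most downstream tools expect the mates in the same order in both files. A minimal awk sketch that drops the same pair from both files is given below. It is an illustration, not part of the original answer: it assumes strict 4-line records and that mates share a base ID, optionally followed by /1 or /2; drop_mate and drop_pair are hypothetical helper names.

```shell
# drop_mate BASE_ID
# Filters fastq from stdin to stdout, dropping the record whose ID, with
# any trailing /1 or /2 removed, equals BASE_ID (given without the "@").
drop_mate() {
    awk -v id="$1" '
        NR % 4 == 1 {
            split($0, f, " ")          # first word of the header line
            name = substr(f[1], 2)     # strip the leading "@"
            sub(/\/[12]$/, "", name)   # strip a trailing /1 or /2, if present
            skip = (name == id)
        }
        !skip
    '
}

# drop_pair BASE_ID R1_IN R2_IN R1_OUT R2_OUT
# Applies the same filter to both mate files so they stay in sync.
drop_pair() {
    drop_mate "$1" < "$2" > "$4" && drop_mate "$1" < "$3" > "$5"
}

# Usage (placeholder file names):
#   drop_pair 'K00193:38:H3MYFBBXX:4:2119:24527:21657' r1.fastq r2.fastq r1.filtered.fastq r2.filtered.fastq
```

Comparing the ID as a plain string avoids the escaping issue with / described above.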
Reply modified 6 months ago • written 6 months ago by cpad0112

Thank you so much for your kind help. I will try it and will discuss the output. Many thanks again.

Reply written 6 months ago by blooming.daisy333