I have a fastq file from a sequencing run and a list of read names that I wish to exclude from it. I am having some trouble removing the whole entry for each read based on its name only. I have pasted my attempts and rationale below, and any assistance would be much appreciated!
cat names_pc3.txt
a308dbd5-df59-47f2-a92a-7068ffd0ce8d
353002fa-3d36-4e03-a8c4-b6e983464bf9
697f8ebb-9fa7-487e-ae91-8095bfb9968b
05887f55-8028-4524-8000-71e72de789d0
I can exclude the line containing the read name with grep; however, I wish to remove the whole 4-line entry for each read.
To remove the line containing the read name, this works:
grep --no-group-separator -v -f names_pc3.txt PAQ11486_pass_barcode01_52e0f3ff_4295e9c8_0.fastq
To view the entry for the reads I want to exclude, this works:
grep --no-group-separator -A 3 -f names_pc3.txt PAQ11486_pass_barcode01_52e0f3ff_4295e9c8_0.fastq
However, when I try to do both at the same time using the command below, it just prints all entries in the fastq:
grep --no-group-separator -A 3 -v -f names_pc3.txt PAQ11486_pass_barcode01_52e0f3ff_4295e9c8_0.fastq
Would love some help with a different approach or correction to my current approach if possible! Thanks.
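For context on the last command: with -v, grep selects every line that does not contain a listed name, and -A 3 then prints three lines of trailing context after each selected line, so the excluded header lines come back as context and the whole file is printed. One possible alternative is sketched below. It is only an illustration under stated assumptions and not necessarily the approach suggested in the replies: every record is assumed to be exactly 4 lines, and the read name is assumed to be the first whitespace-delimited token of the header line after the leading "@". The function name filter_fastq is hypothetical.

```shell
# Hypothetical helper: filter_fastq NAMES_FILE FASTQ_FILE
# Prints every 4-line record whose read name is NOT listed in NAMES_FILE.
# Assumes strict 4-line FASTQ records (no wrapped sequence lines).
filter_fastq() {
    awk '
        NR == FNR { skip[$1] = 1; next }   # first file: remember names to exclude
        FNR % 4 == 1 {                     # header line of each record
            id = substr($1, 2)             # strip the leading "@"
            drop = (id in skip)            # decide once per record
        }
        !drop                              # print all four lines of kept records
    ' "$1" "$2"
}
```

With the files from the question, this could be called as filter_fastq names_pc3.txt PAQ11486_pass_barcode01_52e0f3ff_4295e9c8_0.fastq > filtered.fastq, where filtered.fastq is just a placeholder output name. Because the names are compared as exact strings rather than as grep patterns, a name cannot accidentally match part of a sequence or quality line.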
Thank you so much Pierre, that worked perfectly! Much appreciated.