I need to get rid of all the reads with a 3' A in a fastq file and get a new fastq without them. How could you achieve this?
See if this does the trick:
paste - - - - < your.fastq | awk -F '\t' '$2 !~ /A$/' | tr '\t' '\n' > filtered.fastq
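If it helps to see the same idea without flattening each record onto one line, here is a minimal sketch that buffers the four lines of a record in awk and prints them only when the sequence does not end in A. The function name `filter_3prime_a` is made up for illustration, and it assumes strict 4-line FASTQ records (no wrapped sequences) and uppercase bases:

```shell
# Hypothetical helper: drop FASTQ records whose sequence ends in A.
# Assumes strict 4-line records and uppercase bases (the match is case-sensitive).
filter_3prime_a() {
    awk '
        NR % 4 == 1 { header = $0; next }   # @read name line
        NR % 4 == 2 { seq = $0; next }      # sequence line
        NR % 4 == 3 { plus = $0; next }     # "+" separator line
        {                                   # quality line closes the record
            if (seq !~ /A$/) {
                print header; print seq; print plus; print $0
            }
        }
    '
}

# Example: two reads go in; only the read that does not end in A is printed.
printf '@r1\nAATTTATATGGGAGCCAC\n+\nIIIIIIIIIIIIIIIIII\n@r2\nGATTAGGGCCGCGGGATA\n+\nIIIIIIIIIIIIIIIIII\n' | filter_3prime_a
```

On a real file this would be run as `filter_3prime_a < your.fastq > filtered.fastq`.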
Just to be clear, you are only talking about discarding reads where the last base is A? Can you modify the answers from your question yesterday to start thinking about how you can do this: Command to count reads in fastq file with last bases. Curious as to why you want to do this.
Why do you want that? I am really curious!
By that, do you mean you don't want reads like these?
AAAGTACGATCACTACTACATC
AAGTACGATTAACTACTACATC
AGTGTACGGGGATCACTACTAC
But these will be okay?
I want reads like: AATTTATATGGGAGCCAC But not: GATTAGGGCCGCGGGATA
I need to analyze small RNA structures, so I need to do this with reads not ending with A.