Dear BioStars users,
I would like to find out how many times each read (sequence) occurs in my paired-end FASTQ files.
The output could look like this (read sequence - number of times it was found in the FASTQ file):
CCGGCTCGC - 140x
CTTCGCGCC - 2x
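To make the goal concrete, here is a rough Python sketch of the counting I have in mind. It assumes standard 4-line FASTQ records (header, sequence, `+`, quality) and counts only exact matches. The function name `count_reads` and the three example records are made up for illustration, and I am not sure this is the right approach for large paired-end files.

```python
from collections import Counter
from itertools import islice
import io


def count_reads(handle):
    """Count identical sequences in a FASTQ stream.

    Assumption: every record spans exactly 4 lines, so the sequence
    is the 2nd line of each record (0-based indices 1, 5, 9, ...).
    """
    seqs = (line.strip() for line in islice(handle, 1, None, 4))
    return Counter(seqs)


# Tiny made-up example; a real file would be opened with open("reads.fastq")
example = io.StringIO(
    "@r1\nCCGGCTCGC\n+\nIIIIIIIII\n"
    "@r2\nCTTCGCGCC\n+\nIIIIIIIII\n"
    "@r3\nCCGGCTCGC\n+\nIIIIIIIII\n"
)

counts = count_reads(example)

# Print sequences from most to least frequent
for seq, n in counts.most_common():
    print(f"{seq} - {n}x")
```

For paired-end data this would presumably be run on each of the two files separately, since the sketch knows nothing about read pairs.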
I tried using awk to compare all reads to each other, but it does not work very well :-(
Is there a tool, or any idea how, to compare all reads to each other and extract how many times each read occurs in the FASTQ file?
Thank you so much for any ideas and help! I hope my question is clear.