2.7 years ago
17318598206
Hi, I want to extract reads that contain my sequences of interest from paired-end FASTQ files. At least 10 bases should match, with a maximum of 2 mismatches allowed. How can I extract these reads?
My sequences of interest look like this:
GTTTAATTGAGTTGTCATATGTTAATAACGGTAT
CAAATTAACTCAACAGTATACAATTATTGCCATA
Thank you!
For paired-end data, try cutadapt. Your sequences of interest are long (34 nt each). If only 10 bases have to match, the remaining 24 bases may all be mismatched, which is far more than the 2 mismatches you allow, so the two requirements conflict as stated. Are these two sequences read-specific (one for R1 and the other for R2), or do they apply to both forward and reverse reads?
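If it helps to pin the requirement down, here is a minimal Python sketch of one possible reading: a pair is kept when either mate contains any 10-base window of a query sequence with at most 2 substitutions (Hamming distance, no indels). This is not how cutadapt works internally. The function names (`read_matches`, `extract_pairs`, and so on) are made up for illustration, and the sketch assumes uncompressed 4-line FASTQ records with R1 and R2 in the same order. Searching the reverse complement of each query is also an assumption. Note that 2 mismatches in 10 bases is very permissive, so expect many chance hits.

```python
from itertools import islice

def hamming_within(a, b, max_mm):
    """True if equal-length strings a and b differ at no more than max_mm positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mm:
                return False
    return True

def read_matches(read, query, min_len=10, max_mm=2):
    """True if any min_len-base window of query occurs in read with <= max_mm mismatches."""
    windows = {query[i:i + min_len] for i in range(len(query) - min_len + 1)}
    for j in range(len(read) - min_len + 1):
        segment = read[j:j + min_len]
        if any(hamming_within(segment, w, max_mm) for w in windows):
            return True
    return False

def revcomp(seq):
    """Reverse complement of a DNA string (upper-case A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes no wrapped sequence lines)."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record

def extract_pairs(r1_path, r2_path, out1_path, out2_path, queries,
                  min_len=10, max_mm=2):
    """Write pairs where either mate matches a query; return the number of pairs kept."""
    # Assumption: hits on either strand count, so reverse complements are searched too.
    patterns = list(queries) + [revcomp(q) for q in queries]
    kept = 0
    with open(r1_path) as r1, open(r2_path) as r2, \
         open(out1_path, "w") as o1, open(out2_path, "w") as o2:
        for rec1, rec2 in zip(fastq_records(r1), fastq_records(r2)):
            seqs = (rec1[1].strip(), rec2[1].strip())
            if any(read_matches(s, p, min_len, max_mm)
                   for s in seqs for p in patterns):
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept
```

A call would look like `extract_pairs("R1.fastq", "R2.fastq", "hits_R1.fastq", "hits_R2.fastq", ["GTTTAATTGAGTTGTCATATGTTAATAACGGTAT", "CAAATTAACTCAACAGTATACAATTATTGCCATA"])`, where the file names are placeholders. For gzipped input, `gzip.open(path, "rt")` can stand in for `open`. The pure-Python loop is slow on large files, so a dedicated tool is the better choice for real data once the matching rule is settled.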