Hi,
I have paired-end reads and want to extract the reads that contain the insert TGTATGTAAACTTCCGACTTCAACTGTA.
I tried
grep -A2 -B1 "TGTATGTAAACTTCCGACTTCAACTGTA" input.fq | grep -v "^\-\-$" > 1.fq
(and likewise for 2.fq).
But the reads don't align with Bowtie2 anymore, because the reads have differing headers.
I even tried using bbduk.sh in1=input_1.fq in2=input_2.fq out1=matched_1.fq out2=matched_2.fq k=28 literal=TGTATGTAAACTTCCGACTTCAACTGTA rcomp=f
but it is of no avail.
Can someone help?
Regards.
Question: Read detection with pattern in paired end FASTQ file
amitpande74 • 10 wrote:
Pierre Lindenbaum ♦ 133k wrote:
paste <(cat fq1 | paste - - - - ) <(cat fq2 | paste - - - - ) | grep TGTATGTAAACTTCCGACTTCAACTGTA | tr "\t" "\n" > interleaved.fastq
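The one-liner above writes a single interleaved file. If two separate, still-synchronized files are preferred (for example for Bowtie2's -1/-2 inputs), a variation along these lines might work. This is only a sketch: extract_pairs is a hypothetical helper name, it assumes plain 4-line FASTQ records with no tabs in the headers, both files in the same read order, and it searches the forward strand only.

```shell
# extract_pairs PATTERN R1.fq R2.fq OUT1.fq OUT2.fq   (hypothetical helper)
# Keeps a pair when EITHER mate's sequence contains PATTERN, so both
# output files stay in the same order with matching headers.
extract_pairs() {
    pattern=$1; r1=$2; r2=$3; out1=$4; out2=$5

    # Make sure both outputs exist even when nothing matches.
    : > "$out1"
    : > "$out2"

    # paste R1 R2      -> each line becomes "R1 line <TAB> R2 line"
    # paste - - - -    -> every 4 such lines become one 8-column row:
    #   1=header1 2=header2 3=seq1 4=seq2 5=plus1 6=plus2 7=qual1 8=qual2
    paste "$r1" "$r2" | paste - - - - |
    awk -F'\t' -v pat="$pattern" -v o1="$out1" -v o2="$out2" '
        # index() is a literal substring search (no regex surprises).
        index($3, pat) || index($4, pat) {
            printf "%s\n%s\n%s\n%s\n", $1, $3, $5, $7 > o1
            printf "%s\n%s\n%s\n%s\n", $2, $4, $6, $8 > o2
        }'
}
```

Usage would be something like: extract_pairs TGTATGTAAACTTCCGACTTCAACTGTA input_1.fq input_2.fq matched_1.fq matched_2.fq. To also catch the insert on the reverse strand, the reverse-complemented sequence would have to be searched in a second pass.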
GenoMax ♦ 94k wrote:
"but it is of no avail."
You should set the value of k= to something less than 1/2 of the length of the string you are trying to search. Unless you do that, the initial seed matches may not be found. I would try k=9 with your bbduk.sh command.
"because the reads have differing headers."
That is a different issue. Are your reads out of sync in the R1/R2 files? If so, you need to run repair.sh on them.
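To see whether the R1/R2 files really are out of sync before repairing them, a quick header comparison can help. The following is a rough sketch, not part of BBTools: check_sync is a hypothetical helper name, and it assumes 4-line FASTQ records whose read names end at the first space or "/" (as in "@name/1" or "@name 1:N:0:...").

```shell
# check_sync R1.fq R2.fq   (hypothetical helper)
# Prints how many record positions have different read names in the two
# files; 0 suggests the files are in sync.
check_sync() {
    paste "$1" "$2" |
    awk -F'\t' '
        NR % 4 == 1 {                     # header line of each record
            a = $1; b = $2
            sub("[ /].*$", "", a)         # strip "/1" or " 1:N:0:..." suffix
            sub("[ /].*$", "", b)
            if (a != b) bad++
        }
        END { print bad + 0 }'
}
```

For example, check_sync 1.fq 2.fq printing a non-zero count would indicate that the mates no longer line up, which is the situation repair.sh is meant for.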