Question: Search Motif In Raw Reads
Nicolas Rosewick (Belgium, Brussels) wrote, 5.4 years ago:

Hi,

How can I find a specific pattern (~20 nt long) and its reverse complement in my raw reads (paired-end data, so 2 FASTQ files), and then extract the matching reads into 2 new FASTQ files?

Thanks a lot,

N.

Tags: motif, search, read
Comment by Martin A Hansen, 5.4 years ago:

And what does this pattern look like? How specific should the match be, e.g. no mismatches or indels?

Martin A Hansen (Denmark) wrote, 5.4 years ago:

Use Biopieces (www.biopieces.org) and try something like this:

read_fastq -i file1.fq,file2.fq | patscan_seq -i -c -p acgtactagctagctactagc[2,1,1] | grab -p PATTERN -K | write_fastq -o matched.fq -x

This allows for 2 mismatches, 1 insertion and 1 deletion (and ambiguity codes).
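
To make the mismatch allowance concrete, here is a minimal sketch of substitution-tolerant matching in plain shell and awk. It is not how patscan_seq works internally, and it ignores insertions and deletions entirely. The function name has_approx_match and its argument order are made up for illustration.

```shell
# Hypothetical helper: exit 0 when SEQ contains MOTIF with at most MAXMM
# substitutions, exit 1 otherwise. Insertions and deletions are not handled.
# Usage: has_approx_match SEQ MOTIF MAXMM
has_approx_match() {
  awk -v s="$1" -v m="$2" -v k="$3" 'BEGIN {
    n = length(m)
    for (i = 1; i + n - 1 <= length(s); ++i) {    # slide a motif-sized window over SEQ
      mm = 0
      for (j = 1; j <= n && mm <= k; ++j)          # count mismatching positions, stop early
        if (substr(s, i + j - 1, 1) != substr(m, j, 1)) ++mm
      if (mm <= k) exit 0                          # this window is close enough
    }
    exit 1                                         # no window within the allowed mismatches
  }'
}
```

For example, `has_approx_match "$read" ACGTACTAGC 2 && echo hit` would report reads that contain the motif with up to two substituted bases. Comparison is case-sensitive, so both strings should use the same case.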

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 5.4 years ago:

Gunzip and paste both FASTQ files on the fly. With awk, group the lines four at a time and test whether the pair matches your needs (here, the first read starts with the regex "^ATGGG[GC]AAAA*"), then split the output into two files using a second awk.

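# paste joins each line of mate 1 and mate 2 with a tab; every 4 lines form one read pair.
# i is set to 0 explicitly: an uninitialized awk variable used as an array index is "", not 0.
paste <(gunzip -c input_1.fastq.gz) <(gunzip -c input_2.fastq.gz)  |\
awk 'BEGIN{i=0} {a[i]=$0;++i;if(i==4){split(a[1],S,"\t"); if(S[1] ~ "^ATGGG[GC]AAAA*") {for(j=0;j<4;++j) printf("%s\n",a[j]);}; i=0;}}' |\
awk -F '\t' '{print $1 >> "select_1.fq"; print $2 >> "select_2.fq";}'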
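
Note that the one-liner above only tests whether the first read starts with the regex. A rough sketch of a variant closer to the original question (the motif or its reverse complement anywhere in either mate) could look like the following. The helper names revcomp and filter_pairs are hypothetical. The sketch assumes uncompressed, tab-free, 4-line-per-record FASTQ files with mates in the same order, and it matches exactly and case-sensitively.

```shell
# Reverse-complement a DNA string (assumes only A/C/G/T/N, upper or lower case).
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTNacgtn' 'TGCANtgcan'
}

# Hypothetical helper: keep a read pair when either mate contains MOTIF or its
# reverse complement. Usage: filter_pairs MOTIF in_1.fq in_2.fq out_1.fq out_2.fq
filter_pairs() {
  paste "$2" "$3" |
  awk -F '\t' -v fwd="$1" -v rc="$(revcomp "$1")" -v o1="$4" -v o2="$5" '
    { l1[NR % 4] = $1; l2[NR % 4] = $2 }     # buffer lines: 1=header 2=sequence 3=plus 0=quality
    NR % 4 == 0 {                             # a full 4-line record has been read
      s = l1[2] "\t" l2[2]                    # tab keeps a hit from spanning both mates
      if (index(s, fwd) || index(s, rc))
        for (k = 1; k <= 4; ++k) { print l1[k % 4] > o1; print l2[k % 4] > o2 }
    }'
}
```

It could be called as `filter_pairs ACGTACTAGCTAGCTACTAGC input_1.fastq input_2.fastq select_1.fq select_2.fq`. For gzipped input, the same `<(gunzip -c ...)` construct as in the one-liner can be passed in place of the file names when running under bash. No output file is created if nothing matches.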

k.nirmalraman (Germany) wrote, 5.4 years ago:

Well, MEME might not be the right choice for this, but FIMO could be used. It scans sequences for occurrences of a known motif described by a position weight matrix (PWM).
