Question: trim leading T or A from fastq file
meishengxiao86 (Germany) wrote 4.7 years ago:

Hi, I am working with RNA-seq data. I want to trim the leading T or A bases from the reads in my FASTQ file and then perform mapping. Does anyone know how to do that? Thanks a lot!

Meisheng

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 4.7 years ago:

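The following awk script should do the job. It measures the length of the leading A/T run in the sequence line (the 2nd line of each FASTQ record), removes that prefix, and then trims the same number of characters from the quality line (the 4th line):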

 

 awk '{if(NR%4==2) {L=0;if(match($0,/^[ATat]+/)>0) { L=RLENGTH;} $0=substr($0,L+1);} else if(NR%4==0) {$0=substr($0,L+1);} print;}' in.fastq > out.fastq

 
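As a quick sanity check, the same logic can be run on a tiny FASTQ record. This is only a sketch: the read name, sequence, and quality string below are invented test data, and the script is spread over several lines purely for readability.

```shell
# Build a one-record FASTQ file (contents are invented test data).
printf '@read1\nTTAACGTACGT\n+\nIIIIHHHHGGG\n' > in.fastq

# Same logic as the one-liner above, one rule per line type.
awk '
NR % 4 == 2 {                                    # sequence line
    L = 0
    if (match($0, /^[ATat]+/) > 0) L = RLENGTH   # length of the leading A/T run
    $0 = substr($0, L + 1)
}
NR % 4 == 0 {                                    # quality line
    $0 = substr($0, L + 1)                       # drop the same number of characters
}
{ print }
' in.fastq > out.fastq

cat out.fastq
```

For this input the leading run is `TTAA`, so the sequence line becomes `CGTACGT` and the quality line becomes `HHHHGGG`. Note that a read consisting only of A and T would end up empty, which some aligners may not accept.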


Hi Pierre, thanks very much for your reply! I tried the code you supplied, but it does not seem to work. Here is the error output:

awk: {if(NR%4==2) {L=0;if(match(-bash,/^[Tt]+/)>0) { L=RLENGTH;} -bash=substr(-bash,L+1);} else if(NR%4==0) {-bash=substr(-bash,L+1);} print;}
awk:                                                                  ^ syntax error
awk: {if(NR%4==2) {L=0;if(match(-bash,/^[Tt]+/)>0) { L=RLENGTH;} -bash=substr(-bash,L+1);} else if(NR%4==0) {-bash=substr(-bash,L+1);} print;}
awk:                                                                                                              ^ syntax error

Do you have any idea what is going wrong? Thanks!

 

Meisheng

 

written 4.7 years ago by meishengxiao86

Make sure to copy and paste the code correctly; it does work.

written 4.7 years ago by Istvan Albert
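The `-bash` that shows up in the error message where `$0` should be suggests that the shell expanded `$0` before awk ever saw the script. That is what happens when the awk program is wrapped in double quotes instead of single quotes. A minimal sketch of the difference, assuming a POSIX-like shell (the variable names `double` and `single` are only for illustration):

```shell
# Double quotes: the shell replaces $0 with its own name (for example "-bash")
# before the string is handed to awk.
double="match($0,/^[ATat]+/)"

# Single quotes: nothing is expanded, so awk would receive a literal $0.
single='match($0,/^[ATat]+/)'

printf '%s\n' "$double"
printf '%s\n' "$single"
```

The second line printed is the text awk needs to see, so the awk program in the answer should be kept inside plain single quotes when it is pasted into the terminal.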