Question: conditional expression in sed ...?
2.9 years ago, CAnna wrote:

Hi,

I have a file looking like this

@HISEQ:229:C81CCANXX:1:1101:10157:17161/1
AAAAAAAAAAAAAAAAAAAA
+
CCCCCGGGG/1GGGGGGCEEG
@HISEQ:229:C81CCANXX:1:1101:10741:22239/1
GCCTTGCTATTGACTCTACT
+
BBBB@EEGGGGGDGGEGGGG
@HISEQ:229:C81CCANXX:1:1101:10901:88419/1
GCTTAGGGATTTTATTGGTA

I would like to remove this /1 at the end of the lines (read names).

I did

sed -i -e 's/\/1//g' MyFile.txt

But the problem is that it also removes the /1 occurring in the middle of the 4th line (sequence quality).

Is there a way to substitute the /1 only on lines starting with @HISEQ (a sort of conditional expression)?

I also tried:

awk -F "/" '/^@HISEQ/{ $2 = "" ; print $0 }' essai.change.name> file.txt

The problem then is that I get only the @HISEQ lines and I lose the lines in between.
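
A note on why the awk attempt drops lines: print sits inside the /^@HISEQ/ action, so only matching lines are printed, and assigning to $2 makes awk rebuild the line with spaces in place of the slashes. A minimal sketch of a variant that edits the header lines and still prints every line follows. The sample record is copied from the question, and the function name strip_with_awk is hypothetical.

```shell
# strip_with_awk is a hypothetical helper name used only for this sketch.
strip_with_awk() {
    # First rule: on header lines, delete a trailing "/1" in place.
    # Second rule: print every line, edited or not.
    awk '/^@HISEQ/ { sub(/\/1$/, "") } { print }' "$@"
}

printf '%s\n' \
    '@HISEQ:229:C81CCANXX:1:1101:10157:17161/1' \
    'AAAAAAAAAAAAAAAAAAAA' \
    '+' \
    'CCCCCGGGG/1GGGGGGCEEG' | strip_with_awk
```

Because sub() edits the line directly, no field separator is needed and the rest of the header is left untouched.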

Thank you!

C. Anna


Avoid the -i flag on sed if you are not sure what you are doing. It is better to pipe the result to e.g. head first, to check that the command did what you expected.

written 2.9 years ago by WouterDeCoster
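
The advice above can be sketched as follows: run the substitution without -i and look at the first few lines before touching the file. This is only an illustration. The scratch directory and the file name sample.fq are assumptions, and the record is copied from the question.

```shell
# Scratch directory and file name are hypothetical, used only for this sketch.
tmp=$(mktemp -d)
printf '%s\n' \
    '@HISEQ:229:C81CCANXX:1:1101:10157:17161/1' \
    'AAAAAAAAAAAAAAAAAAAA' \
    '+' \
    'CCCCCGGGG/1GGGGGGCEEG' > "$tmp/sample.fq"

# Preview: without -i the edited text goes to stdout and the file is not modified.
preview=$(sed -e 's/\/1//g' "$tmp/sample.fq" | head -n 4)
printf '%s\n' "$preview"

# The 4th line of the preview has lost its "/1", which exposes the problem
# before any data is overwritten. The file on disk is still intact:
sed -n 4p "$tmp/sample.fq"
```

When an in-place edit is eventually wanted, both GNU and BSD sed accept a suffix attached to the flag, as in -i.bak, which keeps a backup copy of the original file.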
2.9 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Use '$' to anchor the match at the end of the line:

sed 's%/1$%%'

"Is there a way to substitute the /1 only on lines starting with @HISEQ?" Yes, put an address in front of the substitution:

sed '/^@HISEQ/s%/1$%%'
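
As a small worked example of the command above, here is a sketch that runs it on two records. The first record and the second header are taken from the question; the last quality line is made up so that it ends in /1, and the function name strip_read_suffix is hypothetical.

```shell
# strip_read_suffix is a hypothetical wrapper name used only for this sketch.
strip_read_suffix() {
    # /^@HISEQ/ : address, the command applies only to lines starting with @HISEQ
    # s%/1$%%   : replace "/1" at the end of the line with nothing
    sed '/^@HISEQ/s%/1$%%' "$@"
}

printf '%s\n' \
    '@HISEQ:229:C81CCANXX:1:1101:10157:17161/1' \
    'AAAAAAAAAAAAAAAAAAAA' \
    '+' \
    'CCCCCGGGG/1GGGGGGCEEG' \
    '@HISEQ:229:C81CCANXX:1:1101:10741:22239/1' \
    'GCCTTGCTATTGACTCTACT' \
    '+' \
    'BBBB@EEGGGGGDGGEGG/1' | strip_read_suffix
```

The two header lines lose their trailing /1. The quality line with /1 in the middle is protected by the $ anchor, and the made-up quality line ending in /1 is protected by the address, since it does not start with @HISEQ.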

Does it work with GNU sed? (I saw no change in the result when I tried.)

I ran the following:

sed '/^@HISEQ/s/\/1//' in.fq

written 2.9 years ago by venu

$ echo -e "@HISEQa\1\nx\1" | sed '/^@HISEQ/s/\\1//'
@HISEQa
x\1

$ sed --version
sed (GNU sed) 4.2.2

written 2.9 years ago by Pierre Lindenbaum

This one

sed '/^@HISEQ/s%/1$%%'

worked perfectly! Thank you

I would like to make sure I understand the syntax, though.

It says: for the lines starting with @HISEQ (^@HISEQ), delete (s%) the character "1" (1$%%). Is this right?

I'm not sure I understand the function of the %%.

written 2.9 years ago by CAnna

sed 's%a%b%' is the same as 's/a/b/'. Using '%' instead of '/' as the delimiter makes it simpler because I don't need to escape the slashes, e.g. 's%http://google.com/%url%'.

written 2.9 years ago by Pierre Lindenbaum
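
To make the delimiter point concrete, here is a short sketch showing three spellings of the same substitution. In each, the pattern is /1$ (a slash and a 1 at the end of the line) and the replacement is empty, which is why the last two delimiters sit next to each other as %%. The variable names are hypothetical; the header line is copied from the question.

```shell
line='@HISEQ:229:C81CCANXX:1:1101:10157:17161/1'

# Default delimiter "/": the slash in the pattern must be escaped.
with_slash=$(printf '%s\n' "$line" | sed 's/\/1$//')

# Delimiter "%": no escaping needed; "%%" is the empty replacement.
with_percent=$(printf '%s\n' "$line" | sed 's%/1$%%')

# Any other character that does not occur in the pattern works too, e.g. "|".
with_pipe=$(printf '%s\n' "$line" | sed 's|/1$||')

printf '%s\n' "$with_slash" "$with_percent" "$with_pipe"
```

All three commands print the header without its trailing /1.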