5.5 years ago

Hi all, I have a FASTQ file and I need to rename its reads using sed. Details below:

The read names in my files are

@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12_1
@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12_2


And I want to transform them into the format:

@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12/1
@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12/2


reads fastq sed lunix sequencing • 3.9k views

What have you tried so far?


Based on a forum post, I tried sed -i 's/_///g' myfile, but I'm not a Linux expert and I don't know how to proceed.


Avoid sed -i when you are not sure that your command will be the right one. If you do something incorrectly, it will corrupt your original file. When trying different sed commands, you may want to run

sed 's/from/to/g' <input> | head (to only look at the first lines)

or

sed 's/from/to/g' <input> | head | less -S ( in the case of long lines)

5.5 years ago
iraun ★ 3.8k

Well, you're very close to the solution. You only need to escape the '/' character with a backslash: sed -i 's/_/\//g' myfile should work.

One piece of advice: try calling sed in the following way:

cat file.fq | sed 's/_/\//g' > reformat.fq


This way you can go back to the original input file if something goes wrong. In my opinion it is good practice.
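A minimal sketch of that workflow, assuming a hypothetical one-record file.fq (the sequence and quality lines are made up for illustration; only the read name comes from the question). It uses | as the sed delimiter, which is an alternative to escaping the slash, and writes to a new file so the input stays untouched.

```shell
# Create a one-record sample FASTQ (hypothetical data, for illustration only).
printf '%s\n' \
  '@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12_1' \
  'ACGTACGT' \
  '+' \
  'IIIIHHHH' > file.fq

# Same substitution as the command above; with | as the delimiter, / needs no escaping.
sed 's|_|/|g' file.fq > reformat.fq

# Inspect the result; file.fq itself is unchanged.
head -n 4 reformat.fq
```

Note that this still replaces every underscore on every line, not only in the read names.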


Just as an alternative: tr '_' '/' < file.fq > new_file.fq


Thank you, I'll try this.

5.5 years ago
michael.ante ★ 3.7k

Check your FASTQ format. If you have Phred+64 (Illumina 1.3 or 1.5), you can run into an encoding problem: in Phred+64, '_' is a valid quality score character, whereas '/' is not. Thus, you'll need to check whether you are on a header line or not, e.g. using awk: awk '{if(NR%4==1){gsub(/_/,"/")}; print}' file.fq > new_file.fq (this assumes four lines per record).
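To illustrate the header-only approach, here is a small sketch on a hypothetical record whose quality string contains an underscore. The file names sample.fq and sample_renamed.fq and the sequence and quality lines are invented for the example; it assumes strict four-line FASTQ records with no wrapped sequence lines.

```shell
# Sample record with '_' in the quality line (hypothetical data, for illustration only).
printf '%s\n' \
  '@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12_1' \
  'ACGTACGT' \
  '+' \
  'hhhh_ggg' > sample.fq

# Replace '_' with '/' only on header lines (lines 1, 5, 9, ...); print every line.
awk 'NR % 4 == 1 { gsub(/_/, "/") } { print }' sample.fq > sample_renamed.fq

# The header is rewritten; the quality line keeps its underscore.
cat sample_renamed.fq
```

If the read names could contain other underscores that must be kept, the pattern would need to be narrowed; that case is not covered here.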


I agree that you should verify the absence of _ in your quality strings before simply going for sed 's/_/\//g', because if any are present, every quality score encoded as _ will be changed to /.
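One way to run that check, as a sketch: count the quality lines that contain an underscore. The file check.fq and its two records are hypothetical, and the command assumes four lines per record so that every fourth line is a quality line.

```shell
# Two sample records; only the first has '_' in its quality line (hypothetical data).
printf '%s\n' \
  '@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12_1' \
  'ACGTACGT' \
  '+' \
  'hhhh_ggg' \
  '@HWI-ST365:251:D0RP0ACXX:5:1101:4471:2213#12_2' \
  'TTGCATGC' \
  '+' \
  'hhhhgggg' > check.fq

# Count quality lines (every 4th line) that contain an underscore.
# "c + 0" makes awk print 0 instead of an empty string when nothing matches.
n=$(awk 'NR % 4 == 0 && /_/ { c++ } END { print c + 0 }' check.fq)
echo "quality lines containing _: $n"
```

A count of zero suggests a global substitution would only touch the read names; any other value means the header-only approach is the safer choice.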