rename the contents of a file
3.9 years ago
AbdelAbdel ▴ 30

I extracted genes from a Prokka .ffn file. I found copies of the gene sequences, but their IDs differ. I would like to know how I could change the ID of each of the ???

Here is an example of what I tried, to replace (.fasta_*_) with (_):

>2549870-Q2398_S6.fasta_00234_Adenylate_kinase

sed -i 's:.fasta_*******_:_:g'

>2549870-Q2398_S6_Adenylate_kinase

My question: which character should I replace the stars with so that sed recognizes any number?

Thank you in advance.

sequencing prokka • 613 views
3.9 years ago
Joe 21k

You can replace each * with [0-9], which is a character class that matches a single digit only.
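As a minimal sketch of that substitution, assuming the tag is always five digits as in the header from the question, each star becomes one [0-9]. The dot is also escaped here (my addition, not part of the original command) so that it matches a literal "." rather than any character. The header is piped in through printf so nothing is edited in place.

```shell
# Header line copied from the question.
header='>2549870-Q2398_S6.fasta_00234_Adenylate_kinase'

# One [0-9] per digit; "\." matches a literal dot.
renamed=$(printf '%s\n' "$header" | sed 's:\.fasta_[0-9][0-9][0-9][0-9][0-9]_:_:g')

printf '%s\n' "$renamed"
```

Once the output looks right, the same expression can be used with sed -i on the actual file.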

Depending on whether there are always 5 digits or not, you might want to use [0-9]{5}, or perhaps even just any number of digits between those underscores ([0-9]+). We don't know how consistent your files are going to be, though, so it's impossible to say which you need as the right balance of strictness versus 'sensitivity'.
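The two options can be compared side by side in a rough sketch like the one below. One caveat: sed uses basic regular expressions by default, where an unescaped {5} or + is taken literally, so the -E flag (extended regular expressions) is used here. The second header, with a 3-digit tag, is a made-up example to show where the strict and lenient patterns differ.

```shell
# Header from the question, plus a hypothetical header with only 3 digits.
header='>2549870-Q2398_S6.fasta_00234_Adenylate_kinase'
short='>sample.fasta_123_Some_gene'

# Strict: exactly five digits between the underscores.
strict=$(printf '%s\n' "$header" | sed -E 's:\.fasta_[0-9]{5}_:_:g')
strict_short=$(printf '%s\n' "$short" | sed -E 's:\.fasta_[0-9]{5}_:_:g')

# Lenient: one or more digits between the underscores.
lenient=$(printf '%s\n' "$header" | sed -E 's:\.fasta_[0-9]+_:_:g')
lenient_short=$(printf '%s\n' "$short" | sed -E 's:\.fasta_[0-9]+_:_:g')

printf '%s\n' "$strict" "$strict_short" "$lenient" "$lenient_short"
```

The strict pattern leaves the 3-digit header untouched, while the lenient one rewrites both, which is the strictness versus sensitivity trade-off in practice.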
