Removing a DOT after each sequence
3.9 years ago
Ric ▴ 440

Hi, I have a FASTA file which contains a . after each sequence. How is it possible to remove the dot?

>gene39576 gene=rps16
MVKLRLKRCGRKQRAVYRIVAIDVRSRREGKDLRKVGFYDPIKNQTYLNVPAILYFLEKG
AQPTGTVQDILKKAEVFKELRPNQS.
>gene39578 gene=psbK
MLNTFSLIGICLNSTLFSSSFFFGKLPEAYAFLNPIVDIMPVIPLFFFLLAFVWQAAVSF
R.
>gene39579 gene=psbI
MLTLKLFVYTVVIFFVSLFIFGFLSNDPGRNPGREE.
>gene39580 gene=NitoCp007
MIPDVILDVKNKIKRGFPCLIFKFSYDLVYSTHLTKNKNKGFRNLKKKNQVINGKRGIRT
LGTINSYNGLAIRRFSPLSHLSQLKKIITT.
>gene39584 gene=atpA
MVTIRADEISNIIRERIEQYNREVKVVNTGTVLQVGDGIARIHGLDEVMAGELVEFEEGT
IGIALNLESNNVGVVLMGDGLLIQEGSSVKATGRIAQIPVSEAYLGRVINALAKPIDGRG
EISASEFRLIESAAPGIISRRSVYEPLQTGLIAIDSMIPIGRGQRELIIGDRQTGKTAVA
TDTILNQQGQNVICVYVAIGQKASSVAQVVTTLQERGAMEYTIVVAETADSPATLQYLAP
YTGAALAEYFMYRERHTLIIYDDPSKQAQAYRQMSLLLRRPPGREAYPGDVFYLHSRLLE
RAAKLSSSLGEGSMTALPIVETQSGDVSAYIPTNVISITDGQIFLSADLFNSGIRPAINV
GISVSRVGSAAQIKAMKQVAGKLKLELAQFAELEAFAQFASDLDKATQNQLARGQRLREL
LKQSQSAPLTVEEQIMTIYTGTNGYLDSLEVGQVRKFLVELRTYLKTNKPQFQEIISSTK
TFTEEAEALLKEAIQEQTDRFILQEQA.

Thank you in advance,

sequence bioawk • 2.2k views

I have tried both of the commands suggested above:

seqkit replace -sp "\." -r "" test.fa

and

grep -v "\." test.fa

But I still have the dots. I am adding a sequence here for your convenience (the dot is the 9th character of the string).

Can I please get any suggestions? @pierre lindenbaum cpad0112

>asap_1791
LTICQSKS.LCLGLGINLHLMYLGEIAPKRMRGILTLTCAVYLSIGKLLAQVIGLKELMGTEDMWPYLLA

You can use the following:

sed 's/\.//g' test.fa > test2.fa
3.9 years ago

With sed:

 sed '/^[^>]/s/\.$//'
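A minimal sketch of how this command behaves, assuming a small hypothetical file named demo.fa (the file names and records here are made up for illustration). The address /^[^>]/ limits the substitution to lines that do not start with >, so header lines are left alone, and \.$ matches only a dot at the very end of a line.

```shell
# demo.fa is a hypothetical two-record file built only for this illustration
printf '>gene1 v1.0\nMVKL\nNQS.\n>gene2\nR.\n' > demo.fa

# Strip a trailing dot from sequence lines only; lines starting with > are skipped
sed '/^[^>]/s/\.$//' demo.fa > demo_clean.fa

cat demo_clean.fa
```

The dot inside the header (v1.0) survives because the address never selects header lines.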

It turns out a dot also occurs in the middle of each sequence. How can the sed command be extended to handle that?


Did you try sed '/^[^>]/s/\.//g' (note the g flag)? @Ric
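The g flag matters when a sequence line holds more than one dot: without it, sed replaces only the first match on each line. A small sketch, assuming a hypothetical file demo_mid.fa with one internal and one trailing dot:

```shell
# Hypothetical record with a dot in the middle and a dot at the end of the sequence
printf '>asap_demo\nLTICQSKS.LCLGLG.\n' > demo_mid.fa

# Without g: only the first dot on each sequence line is removed
sed '/^[^>]/s/\.//' demo_mid.fa > first_only.fa

# With g: every dot on each sequence line is removed; header lines are still skipped
sed '/^[^>]/s/\.//g' demo_mid.fa > all_dots.fa

cat first_only.fa all_dots.fa
```

In first_only.fa the trailing dot is still present, while all_dots.fa has none left in the sequence.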

3.9 years ago

Try seqkit replace -sp "\." -r "" test.fa and grep -v "\." test.fa on the OP's FASTA sequences.
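One caveat worth checking before relying on the grep variant: grep -v prints only the lines that do not match, so a sequence line containing a dot is dropped as a whole rather than cleaned. The sketch below, on a hypothetical file demo_check.fa, shows that effect and adds a simple awk count of sequence lines that still contain a dot, which can be used to confirm whether a cleaning command worked.

```shell
# Hypothetical record whose last sequence line ends with a dot
printf '>rec1\nMVKL\nNQS.\n' > demo_check.fa

# grep -v keeps only lines WITHOUT a dot, so the whole line "NQS." disappears
grep -v '\.' demo_check.fa > grep_v.fa
cat grep_v.fa

# Count non-header lines that still contain a dot (0 would mean the file is clean)
awk '!/^>/ && /\./ {n++} END {print n+0}' demo_check.fa
```

Running the same awk count on the output of a sed or seqkit command is a quick way to see whether any dots remain in the sequences.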
