Script to print a FASTA file
4.1 years ago
ravi.eshwari ▴ 10

Hi,

I have FASTA files that look like this:

>CATGTAGTGATTGATAGTGATA(1)
CATGTAGTGATTGATAGTGATA
>ATCCGTGAGCTTGAAGGATCCGCC(1)
ATCCGTGAGCTTGAAGGATCCGCC
>AAAACTACATATACATTCGGATT(2)
AAAACTACATATACATTCGGATT
>CCTGCATAGAGGATTCCGAAC(1)
CCTGCATAGAGGATTCCGAAC
>CATGAACAAGATGTTTGAGAACT(1)
CATGAACAAGATGTTTGAGAACT

I need to edit the header of each sequence and also print the sequences according to their abundance, each with a unique header.

Can someone kindly help me with this?

How do I do this, either in Linux or with shell scripting?

Thank you!

sequence fasta • 897 views

And what have you tried so far?

I used the following command:

awk '/^>/{print ">seq1" ++i; next}{print}' < SRR1266859.fa > SRR1266859.new.fa

It did change the headers, but I now also need to print the sequences with an abundance greater than one, each with a unique header.
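
For reference, here is a small sketch of what the command above does when run on the first two records from the question. The file names `example.fa` and `example.new.fa` are placeholders and not part of the original post. Because the literal prefix `seq1` is followed directly by the counter, the new headers come out as `>seq11`, `>seq12`, and so on.

```shell
# Recreate the first two records from the question in a scratch file
# (example.fa is a placeholder name, not from the original post).
printf '%s\n' \
  '>CATGTAGTGATTGATAGTGATA(1)' \
  'CATGTAGTGATTGATAGTGATA' \
  '>ATCCGTGAGCTTGAAGGATCCGCC(1)' \
  'ATCCGTGAGCTTGAAGGATCCGCC' > example.fa

# Header lines (those starting with ">") are replaced by ">seq1" followed by
# a running counter; every other line (the sequence) is printed unchanged.
awk '/^>/{print ">seq1" ++i; next}{print}' < example.fa > example.new.fa

cat example.new.fa
```

The abundance value in parentheses is discarded by this renaming, so any filtering on abundance would have to happen before or during the header rewrite.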

Please provide an example of the output you would expect - your question is unanswerable at the moment.

Edit it how? What abundance?

In addition: technically the posted example qualifies as FASTA format, but having the sequence repeated in the header is not going to make this easy to decipher.
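
Since the expected output was never specified, the following is only one possible reading of the request, sketched under explicit assumptions: the number in parentheses at the end of each header is taken to be the abundance, "more than one abundance" is taken to mean a count greater than 1, and the new header format `>seqN_xCOUNT` is made up for illustration. The file names `example.fa` and `abundant.fa` are placeholders as well.

```shell
# Two records from the question, one with abundance 1 and one with abundance 2
# (example.fa is a placeholder name).
printf '%s\n' \
  '>CATGTAGTGATTGATAGTGATA(1)' \
  'CATGTAGTGATTGATAGTGATA' \
  '>AAAACTACATATACATTCGGATT(2)' \
  'AAAACTACATATACATTCGGATT' > example.fa

awk '
  /^>/ {
    # Assumption: the abundance is the number inside the last "(...)" of the header.
    count = $0
    sub(/^.*\(/, "", count)   # drop everything up to and including the last "("
    sub(/\).*$/, "", count)   # drop the ")" and anything after it
    keep = (count + 0 > 1)    # keep only records with abundance greater than 1
    # Hypothetical header format: running number plus the original abundance.
    if (keep) printf ">seq%d_x%d\n", ++n, count
    next
  }
  keep { print }              # print sequence lines of kept records only
' example.fa > abundant.fa

cat abundant.fa
```

With this input, only the record with abundance 2 is written out, under the header `>seq1_x2`. Headers without a parenthesised number evaluate to an abundance of 0 here and are dropped, which may or may not be the desired behaviour.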
