Question: script to print fasta file
19 days ago, ravi.eshwari (10) wrote:

Hi,

I have FASTA files that look like this:

>CATGTAGTGATTGATAGTGATA(1)
CATGTAGTGATTGATAGTGATA
>ATCCGTGAGCTTGAAGGATCCGCC(1)
ATCCGTGAGCTTGAAGGATCCGCC
>AAAACTACATATACATTCGGATT(2)
AAAACTACATATACATTCGGATT
>CCTGCATAGAGGATTCCGAAC(1)
CCTGCATAGAGGATTCCGAAC
>CATGAACAAGATGTTTGAGAACT(1)
CATGAACAAGATGTTTGAGAACT

I need to edit the header of each sequence and, along with that, print the sequences based on their abundance, each with a unique header.

Can someone kindly help me with this?

How do I do this on Linux, either on the command line or with a shell script?

Thank you!


And what have you tried so far?

written 19 days ago by lakhujanivijay (4.8k)

I used the following command:

awk '/^>/{print ">seq1" ++i; next}{print}' < SRR1266859.fa > SRR1266859.new.fa

It did change the headers, but I now also need to print the sequences with an abundance of more than one, each with a unique header.

modified 19 days ago by lakhujanivijay (4.8k) • written 19 days ago by ravi.eshwari (10)

Please provide an example of the output you would expect - your question is unanswerable at the moment.

Edit it how? What abundance?

written 19 days ago by Joe (16k)

In addition: technically the posted example qualifies as FASTA format, but having the sequence repeated in the header is not going to make this easy to decipher.

written 19 days ago by genomax (80k)
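
Since the expected output was never posted, the following is only a sketch under one possible reading of the question. It assumes that the number in parentheses at the end of each header is the abundance, that "more than one abundance" means a count of at least 2, and that a unique header can be any running identifier. The file names sample.fa and sample.abundant.fa, the min variable, and the >seqN_xCOUNT naming scheme are made up for illustration and do not come from the thread.

```shell
# Hypothetical sample input, copied from the records shown in the question.
cat > sample.fa <<'EOF'
>CATGTAGTGATTGATAGTGATA(1)
CATGTAGTGATTGATAGTGATA
>ATCCGTGAGCTTGAAGGATCCGCC(1)
ATCCGTGAGCTTGAAGGATCCGCC
>AAAACTACATATACATTCGGATT(2)
AAAACTACATATACATTCGGATT
>CCTGCATAGAGGATTCCGAAC(1)
CCTGCATAGAGGATTCCGAAC
>CATGAACAAGATGTTTGAGAACT(1)
CATGAACAAGATGTTTGAGAACT
EOF

# Assumption: the trailing "(N)" in each header is the abundance.
# Keep only records with abundance >= min and rename each kept header
# to >seqN_xCOUNT (an invented naming scheme).
awk -v min=2 '
  /^>/ {
    count = $0
    sub(/^.*\(/, "", count)    # drop everything up to the last "("
    sub(/\).*$/, "", count)    # drop the closing ")" and anything after it
    keep = (count + 0 >= min)  # remember the decision for the sequence lines
    if (keep) printf ">seq%d_x%d\n", ++n, count
    next
  }
  keep { print }               # sequence lines of a kept record
' sample.fa > sample.abundant.fa

cat sample.abundant.fa
```

On the five records above, only the AAAACTACATATACATTCGGATT record has a count above one, so the output file contains a single record with the header >seq1_x2. If the abundance is meant to be something else (for example, how many times a sequence occurs in the file), the filter would have to change accordingly.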