Question: Creating Unique Fasta Headers
2.7 years ago by
peerlesshm0
peerlesshm0 wrote:

hi,

I have a FASTA file similar to the one below. I have already run sed -n and done a regular-expression replace in TextWrangler to remove the headers, leaving just a ">" followed by the sequence. The file is pretty massive, and I want to know how to make each identifier unique, either with a simple Python script or some other command (not including awk, due to assignment restrictions). I've considered using linenumber = linenumber + 1 for a count, but I'm unsure.

I want each header to be something like s1234, with the next being s1235, s1236, etc., or something similar.

Thanks!

>
AGCTCAGATGCTGATCGATAGACTAG
>
GATGCTAGCTAGCTAGATCGATCGAT
>
ACGACTACAGATAGTAGATGATAGAC

python shell bioinformatics fasta • 1.2k views
2.7 years ago by
igor7.1k
United States
igor7.1k wrote:

I think fastx_renamer from the FASTX-Toolkit will do that: http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_renamer_usage
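
If a plain Python route is preferred, the counter idea from the question could be sketched roughly as below. This is only a minimal sketch, not a tested tool: it assumes each record starts with a line beginning with ">" and that the sequence sits on the following line(s). The names rename_headers and rename_file, the "s" prefix, and the starting number 1234 are all just illustrative choices taken from the example in the question.

```python
def rename_headers(lines, prefix="s", start=1234):
    """Yield FASTA lines, replacing each header with prefix + a running counter.

    Assumption: a header is any line starting with '>'; everything else is
    treated as sequence and passed through unchanged.
    """
    count = start
    for line in lines:
        if line.startswith(">"):
            # Header line: drop whatever followed '>' and write a new unique ID.
            yield f">{prefix}{count}\n"
            count += 1
        else:
            # Sequence line: keep as-is.
            yield line


def rename_file(in_path, out_path, prefix="s", start=1234):
    """Stream in_path to out_path line by line, so a large file is never fully in memory."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(rename_headers(src, prefix=prefix, start=start))
```

Calling something like rename_file("input.fasta", "renamed.fasta") (hypothetical file names) would then write headers s1234, s1235, s1236 and so on. Because the file is read one line at a time, the size of the input should not matter much. If the ">" and the sequence actually share a single line, the header branch would need to keep the rest of the line instead of discarding it.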
