Question: (Closed) Trim the ID line of a fasta
0
17 months ago by
SaltedPork100
SaltedPork100 wrote:

I have a FASTA file with some very silly IDs (straight from NCBI) that look like this:

>KF859747.1 HIV-1 isolate DEURF09UG005 from Uganda gag protein (gag) gene, complete cds; pol protein (pol) gene, partial cds; vif protein (vif), vpr protein (vpr), tat protein (tat), rev protein (rev), vpu protein (vpu), and envelope glycoprotein (env) genes, complete cds; and nonfunctional nef protein (nef) gene, complete sequence

If all I want in my FASTA is the >KF859747 part, are there any nifty one-liners to do this on the command line?

regex fasta • 459 views
modified 17 months ago by lieven.sterck5.5k • written 17 months ago by SaltedPork100

In case you are interested in obtaining such headers using a desktop GUI application, you may have a look at our SEDA software (http://www.sing-group.org/seda/). It has different options for processing headers as well as obtaining them using the "Statistics" option. Regards.

written 17 months ago by Hugo150

Hello SaltedPork!

Commonly asked question. Please search Biostars first in future.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree, please tell us why in a reply below; we'll be happy to talk about it.

Cheers!

modified 17 months ago • written 17 months ago by genomax70k
2
17 months ago by
apeltzer140
Tuebingen, Germany
apeltzer140 wrote:

Something like

cat blabla.fasta | cut -f 1 -d " " > output.fasta

That should keep your FASTA sequences intact, as it simply returns the first field (-f 1) of each line, i.e. everything before the first space (-d " "). Sequence lines contain no spaces, so they pass through unchanged.
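
A quick way to check this is to run the command on a tiny test file first. The sketch below is only an illustration: the file names demo.fasta and demo_trimmed.fasta and the shortened header are made up for the example and are not from the original post.

```shell
# Build a two-line test file (hypothetical file name, shortened header).
printf '>KF859747.1 HIV-1 isolate DEURF09UG005 from Uganda\nATGGGTGCGAGAGCGTCA\n' > demo.fasta

# Keep only the first space-delimited field of every line.
cut -d ' ' -f 1 demo.fasta > demo_trimmed.fasta

# Inspect the result.
cat demo_trimmed.fasta
```

Note that the header comes out as >KF859747.1, so the ".1" version suffix is still there; cut only removes what follows the first space.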

written 17 months ago by apeltzer140
2
17 months ago by
lieven.sterck5.5k
VIB, Ghent, Belgium
lieven.sterck5.5k wrote:

Same principle, but with sed:

cat file.fasta | sed 's/ .*//g' > out.fasta
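
If you would rather touch only the header lines, or also want to lose the ".1" version suffix (the question asks for >KF859747), a variation along these lines might work. This is a sketch rather than a tested recipe for every FASTA: the demo file names are invented for the example, and the second command assumes the accession itself contains no dot before the version number.

```shell
# Build a two-line test file (hypothetical file name, shortened header).
printf '>KF859747.1 HIV-1 isolate DEURF09UG005 from Uganda\nATGGGTGCGAGAGCGTCA\n' > demo.fasta

# Only on lines starting with ">": delete everything from the first space onwards.
sed '/^>/ s/ .*//' demo.fasta > demo_ids.fasta

# Variant: cut at the first dot or space instead, which also drops the version suffix.
sed '/^>/ s/[. ].*//' demo.fasta > demo_noversion.fasta
```

The /^>/ address restricts the substitution to header lines, so sequence lines are never modified even if they happen to contain a space or dot.
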
written 17 months ago by lieven.sterck5.5k