Remove part of a header in a fasta file
18 months ago
kcl58759 • 0

Hi, I need help writing a command to remove part of each header in my scaffold FASTA file. My headers look like this:

>scaffold3247|size3454
TTATATAACTAATTAGATAAAATAGCTAATAATAAAAGCTTCTATATAACTAGCCTTCTTTTAATCTATATAATAAGCTTAGCTAATAAAAAGGCCCACT
TTTTTTTCCA

>scaffold11172|size823
GCTCAGCATGCCGTTGCCAACGCCGCGGGCGCTCATTTGCTGCAATCCAGCCGCCTTATTCCTGCTGCTGTCCTTGAGAGCCACGAGCCGGCCACCGTTG
ACAAACGTCTGGAACCGTAACCCAGACTCAGGCCCTTTGTAAGGCAGAGGCAGGAGCATGTTGACACTCCCGGCTGCGAAAAGATCACCACCAACAGCGT
CTTGACCATCGTGAGGCCCCAGC

I need to get rid of the |size part (everything from the | to the end of the header line), so that the result is:

>scaffold3247
TTATATAACTAATTAGATAAAATAGCTAATAATAAAAGCTTCTATATAACTAGCCTTCTTTTAATCTATATAATAAGCTTAGCTAATAAAAAGGCCCACT
TTTTTTTCCA

>scaffold11172
GCTCAGCATGCCGTTGCCAACGCCGCGGGCGCTCATTTGCTGCAATCCAGCCGCCTTATTCCTGCTGCTGTCCTTGAGAGCCACGAGCCGGCCACCGTTG
ACAAACGTCTGGAACCGTAACCCAGACTCAGGCCCTTTGTAAGGCAGAGGCAGGAGCATGTTGACACTCCCGGCTGCGAAAAGATCACCACCAACAGCGT
CTTGACCATCGTGAGGCCCCAGC

I am a novice at this. I am sure there is a way to do it with awk or sed, but I am quite lost. Any help would be greatly appreciated!

fasta • 704 views
18 months ago
iraun 6.2k

A simple cut command could do it:

cut -d'|' -f1 input.fa > output.fa
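
To see why this is safe for the sequence lines too, here is a small sketch on made-up input (the shortened sequence and the variable name demo_output are only for illustration). Header lines are cut at the first '|'. Sequence lines contain no '|', and cut prints lines without the delimiter unchanged unless -s is given.

```shell
# Run the same cut command on a two-line sample instead of a file.
# Line 1 is a header with a '|'; line 2 is a sequence line without one.
demo_output=$(printf '>scaffold3247|size3454\nTTATATAACT\n' | cut -d'|' -f1)

# Print the result: the header is trimmed, the sequence line is untouched.
printf '%s\n' "$demo_output"
```

This assumes no sequence line ever contains a '|', which should hold for a normal nucleotide FASTA file.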
18 months ago
liorglic ★ 1.4k

Or you could use sed:

sed 's/|.*//' input.fa > output.fa
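
Since awk was mentioned in the question, here is a rough awk sketch of the same idea. It only edits lines starting with '>', so sequence lines are never modified even if one happened to contain a '|'. The function name strip_fasta_header_suffix is made up for this example, and input.fa / output.fa are placeholder file names.

```shell
# Hypothetical helper: on header lines (starting with '>'), delete
# everything from the first '|' to the end of the line; print every line.
strip_fasta_header_suffix() {
    awk '/^>/ { sub(/[|].*/, "") } { print }' "$@"
}

# Usage (placeholder file names):
# strip_fasta_header_suffix input.fa > output.fa
```

With no file argument the function reads from standard input, so it can also be used in a pipe.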
