How to create a fasta file from a list of sequences
13 months ago
Alex S ▴ 20

I have a txt file with more than a thousand DNA sequences as follows:

seq-name1 DNA-sequence1
seq-name2 DNA-sequence2
seq-name3 DNA-sequence3

Does anyone know a command or script to convert this file into a FASTA file like the following?

>seq-name1
DNA-sequence1
>seq-name2
DNA-sequence2
>seq-name3
DNA-sequence3
fasta sequences DNA • 865 views
13 months ago
Mensur Dlakic ★ 27k

This command prints > followed by the contents of the first column, then a newline character (\n) followed by the second column. It is a fairly simple operation, and you should be able to find many similar solutions by searching this site or the wider internet.

awk '{print ">"$1"\n"$2}' input.txt > output.fas
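If you would rather do this in a script, the same idea can be sketched in Python. This is only a minimal sketch, assuming each line holds a name and a sequence separated by whitespace; the function names table_to_fasta and convert_file are made up for illustration, and lines with fewer than two fields are skipped.

```python
def table_to_fasta(lines):
    """Yield one FASTA record per 'name sequence' line (hypothetical helper)."""
    for line in lines:
        fields = line.split()  # split on any run of spaces or tabs
        if len(fields) < 2:
            continue  # assumption: blank or incomplete lines are ignored
        name, seq = fields[0], fields[1]
        yield f">{name}\n{seq}\n"


def convert_file(in_path, out_path):
    """Read the two-column text file and write the FASTA file (hypothetical helper)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(table_to_fasta(src))


# Example call, using the same file names as the awk command:
# convert_file("input.txt", "output.fas")
```

Like the awk command, this uses only the first two columns, so anything after the sequence on a line is dropped.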
13 months ago
tothepoint ▴ 800

You can try (with GNU sed): sed -E 's/^(\S+)\s+(\S+)/>\1\n\2/' input_file > output_file

13 months ago
size_t ▴ 120

perl: perl -ane 'print ">$F[0]\n$F[1]\n";' in > out.fa
