tab to fasta file conversion
6.3 years ago

Can someone help me convert a tab-delimited protein file (ID in the first column, sequence in the second column) into a FASTA file? Any simple solution would be appreciated.

Thanks.

rna-seq • 10k views

Try something; there are multiple solutions. If you get stuck, post what you tried and the errors you got.


Could you post some example data?


Something like

protein1      AGCHCGCGAC
protein2      GAGCSFATHCK

This is easy to do using, for example, the Galaxy interface, which provides this function.

6.3 years ago
drkennetz ▴ 560

I think you could do this with awk. Give this a try:

awk '{print ">"$1"\n"$2}' tab.tsv > seqs.fa

Let me know if that works for you!

Dennis

Edit: $1 is your name column and $2 is your sequence column, so swap them if the order is sequence, then name.
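
The one-liner above splits on any whitespace, so an ID that contains a space would be cut short. A minimal sketch of a stricter variant follows, assuming the file really is tab-delimited with the ID in column 1 and the sequence in column 2. The function name `tab_to_fasta` is hypothetical, and the carriage-return cleanup is only an assumption about files that may have been saved on Windows.

```shell
# Hypothetical helper: convert a two-column, tab-delimited file to FASTA.
# Assumes column 1 is the ID and column 2 is the sequence.
tab_to_fasta() {
    awk -F'\t' '
        { sub(/\r$/, "") }        # drop a trailing carriage return, if any
        NF >= 2 && $1 != "" {     # skip blank or incomplete lines
            print ">" $1          # header line
            print $2              # sequence line
        }
    ' "$1"
}
```

It could be used as `tab_to_fasta tab.tsv > seqs.fa`. Splitting only on tabs keeps IDs such as "protein 2" intact.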

6.3 years ago

If the first column doesn't already start with >, any of these should work:

awk -v OFS="\n" '{print ">"$1,$2}' test.txt
sed -e 's/^/>/;s/\t/\n/g' test.txt 
parallel  --colsep '\t'  echo -e '\>{1}\\n{2}'  :::: test.txt
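
Whichever command is used, a quick sanity check is to compare the number of non-empty input lines with the number of header lines in the output. This is only a sketch: the function name `check_fasta_counts` is hypothetical, and it assumes one record per input line.

```shell
# Hypothetical check: compare non-empty input lines with FASTA header lines.
# Usage: check_fasta_counts input.tsv output.fa
check_fasta_counts() {
    n_in=$(grep -c . "$1" || true)       # non-empty lines in the tab-delimited input
    n_out=$(grep -c '^>' "$2" || true)   # header lines in the FASTA output
    if [ "$n_in" -eq "$n_out" ]; then
        echo "OK: $n_out records"
    else
        echo "MISMATCH: $n_in input lines, $n_out FASTA headers"
        return 1
    fi
}
```

For example, after redirecting one of the commands above to `seqs.fa`, run `check_fasta_counts test.txt seqs.fa`.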