Question: How to add some text to a sequence name
2.2 years ago, alexandrapuertolas wrote:

Hi all!

I have a fasta file with the following sequence names:

>ITSPF14_7020;size=17;
TTTCCGTAGGTGAACCTGCGGAAGGATCATTACCACACCTTCGACGGCTGCTGCTGCGTGGCGGGCCCTATCACTGGCGAGCGT
TTGGGTCCCTCTCGGGGGAACTGAGCTAGTAGCCTCTCTTTTAAACCCATTCTGT..............

>ITSPF14_733;size=110;
TCGGAGTAAAATCTCGACGGCTGCTGCTGCGTGGCGGGCCCTATCACTGGCGAGCGT
TTGGGTCCCTCTCGGGGGAACTGAGCTAGTAGCCTCTCTTTTAAACCCATTCTGT...............

I would like to add a label after the existing sequence name, without removing the name itself:

>ITSPF14_7020;size=17; DQ071354_1_Phytophthora_cactorum_strain_Shakuyaku1_1_beta_tubulin_gene_partial_cds

I have an Excel file that maps each sequence ID (ITSPF14_7020) to its corresponding label

(DQ071354_1_Phytophthora_cactorum_strain_Shakuyaku1_1_beta_tubulin_gene_partial_cds)

Does anyone know how I can do this directly with a script? (It would save me a lot of time compared with doing it manually!)

Thanks a lot for your help!

Alexandra

next-gen sequence • 524 views
modified 2.2 years ago by Pierre Lindenbaum • written 2.2 years ago by alexandrapuertolas

Renaming fasta headers according to a matching name list

written 2.2 years ago by genomax

  • linearize the fasta
  • use awk to create a new column containing the accession number:

awk -F '[>;]' '{printf("%s\t%s\n",$2,$0);}'

  • sort on the first column
  • extract the accession number from the 2nd file and sort
  • join both files using linux 'join'
  • use awk to convert back to fasta.
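
The steps above can be sketched end to end as follows. This is only a minimal illustration, not a tested production script. It assumes the Excel sheet has been exported as tab-separated text with the ID in column 1 and the label in column 2. The file names (seqs.fasta, labels.tsv, renamed.fasta) and the two sample records with labels label_A and label_B are made up so that the sketch runs on its own.

```shell
#!/bin/sh
# Hypothetical inputs: seqs.fasta (the sequences) and labels.tsv
# (the Excel sheet exported as tab-separated text: ID<TAB>label).
# Tiny made-up sample data so the sketch runs on its own.
printf '>ITSPF14_7020;size=17;\nTTTCC\nGTAGG\n>ITSPF14_733;size=110;\nTCGGA\n' > seqs.fasta
printf 'ITSPF14_7020\tlabel_A\nITSPF14_733\tlabel_B\n' > labels.tsv

TAB=$(printf '\t')

# 1. Linearize: one line per record, "header<TAB>sequence".
awk '/^>/ {if (NR > 1) printf("\n"); printf("%s\t", $0); next}
     {printf("%s", $0)}
     END {printf("\n")}' seqs.fasta |
# 2. Prepend the ID (text between ">" and the first ";") as a key column.
awk -F '[>;]' '{printf("%s\t%s\n", $2, $0)}' |
# 3. Sort on the key column.
LC_ALL=C sort -t "$TAB" -k1,1 > seqs.sorted.tsv

# 4. Sort the label table on the same key, with the same collation.
LC_ALL=C sort -t "$TAB" -k1,1 labels.tsv > labels.sorted.tsv

# 5. Join on the key (fields: key, header, sequence, label), then
# 6. rebuild FASTA with the label appended to the original header.
LC_ALL=C join -t "$TAB" -1 1 -2 1 seqs.sorted.tsv labels.sorted.tsv |
awk -F '\t' '{printf("%s %s\n%s\n", $2, $4, $3)}' > renamed.fasta
```

Two caveats with this sketch: the records come out in sorted-ID order rather than the original order, and plain 'join' drops any sequence whose ID has no entry in the label table (adding -a 1 to 'join' would keep those records, but the final awk step would then need to handle the missing label).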

A quick search on biostars.org will turn up numerous examples for each step.

written 2.2 years ago by Pierre Lindenbaum

What genomax2 said: Renaming fasta headers according to a matching name list

written 2.2 years ago by Pierre Lindenbaum