Question: change headers from fasta files
ulises.rodriguez wrote (2.5 years ago):

I have multifasta files containing sequences like these:

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc

These sequences are stored in FASTA files named as follows:

Bacteria_especie_strain.fasta

I would like each header to have the name of its multifasta file appended:

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta 
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta 
attgcaggtcagcatactgcagtgaattcgttcc

cschu181 wrote (2.5 years ago):

awk -v fn="Bacteria_especie_strain.fasta"  '/^>/ { print $0"_"fn; next; } { print $0; }' Bacteria_especie_strain.fasta > Bacteria_especie_strain.fasta.modified_headers
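
Since the question mentions several multifasta files, the same awk command can be wrapped in a loop. The following is only a sketch under some assumptions: the function name `append_filename_to_headers` and the `.renamed` output suffix are made up for this example, and `basename` is used so that only the file name, not the directory path, ends up in the header.

```shell
# Hypothetical helper (not from the thread): append each file's name
# to every header line of that file.
# Assumption: output goes to FILE.renamed next to each input file.
append_filename_to_headers() {
    for f in "$@"; do
        # basename strips any leading directory from the path
        awk -v fn="$(basename "$f")" \
            '/^>/ { print $0 "_" fn; next } { print }' \
            "$f" > "$f.renamed"
    done
}
```

It could then be called as `append_filename_to_headers *.fasta` in the directory holding the files. The originals are left untouched, so the results can be checked before anything is overwritten.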

h.mon wrote (2.5 years ago):

See the answers to the question "Changing names of Fasta headers".


cpad0112 wrote (2.5 years ago):

Output from awk:

$ awk '/^>/ { print $0"_"FILENAME; next}1' Bacteria_especie_strain.fasta

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc

Output from sed and GNU parallel:

$ parallel  'sed "/^>/ s/.*/&_{}/g"' {} ::: Bacteria_especie_strain.fasta 

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc

Input:

$ cat Bacteria_especie_strain.fasta 

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc