Question: change headers from fasta files
24 months ago, ulises.rodriguez wrote:

I have multifasta files containing sequences like these:

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc

These sequences are stored in FASTA files named as follows:

Bacteria_especie_strain.fasta

I would like each header to also carry the name of the multi-FASTA file it comes from:

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta 
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta 
attgcaggtcagcatactgcagtgaattcgttcc
24 months ago, cschu181 wrote:
awk -v fn="Bacteria_especie_strain.fasta"  '/^>/ { print $0"_"fn; next; } { print $0; }' Bacteria_especie_strain.fasta > Bacteria_especie_strain.fasta.modified_headers
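
Since the question mentions several multi-FASTA files, the same awk call can be wrapped in a loop. This is only a minimal sketch: the function name `add_filename_to_headers` is made up for illustration, the `.modified_headers` suffix simply follows the command above, and it assumes file names without backslashes (awk's `-v` interprets escape sequences).

```shell
# Hypothetical helper: append each file's own name to all of its header lines.
# Output goes to <input>.modified_headers next to the input file.
add_filename_to_headers() {
    for f in "$@"; do
        # Skip arguments that are not existing files (e.g. an unmatched glob).
        [ -e "$f" ] || continue
        # Use only the file name, not the directory part, in the header.
        fn=$(basename "$f")
        awk -v fn="$fn" '/^>/ { print $0 "_" fn; next } { print }' "$f" \
            > "$f.modified_headers"
    done
}
```

It could then be called as `add_filename_to_headers *.fasta` in the directory holding the files.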
24 months ago, h.mon (Brazil) wrote:

See the answers to the question "Changing names of Fasta headers".

24 months ago, cpad0112 (India) wrote:

Output from awk:

$ awk '/^>/ { print $0"_"FILENAME; next } 1' Bacteria_especie_strain.fasta

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc

Output from sed and parallel:

$ parallel 'sed "/^>/ s/.*/&_{}/g"' {} ::: Bacteria_especie_strain.fasta

>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA_Bacteria_especie_strain.fasta
attgcaggtcagcatactgcagtgaattcgttcc

Input:

$ cat Bacteria_especie_strain.fasta 

>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc
>16S_ribosomal_RNA
attgcaggtcagcatactgcagtgaattcgttcc
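
If the `.fasta` extension is not wanted in the header (the question keeps it, so this is an optional variation, not what was asked), awk's `FILENAME` can be trimmed before it is appended. A rough sketch follows; the function name `append_basename_to_headers` is hypothetical.

```shell
# Hypothetical helper: append the file name, without directories or a trailing
# .fasta, to every header line. Accepts several files in one call because awk
# updates FILENAME for each input file.
append_basename_to_headers() {
    awk '
        FNR == 1 {
            name = FILENAME
            sub(/.*\//, "", name)      # drop leading directories
            sub(/\.fasta$/, "", name)  # drop the extension
        }
        /^>/ { print $0 "_" name; next }
        { print }
    ' "$@"
}
```

Called as `append_basename_to_headers Bacteria_especie_strain.fasta`, it writes the modified records to standard output, like the awk one-liner above.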