11,648 results • Page 1 of 233
Hi all, I'm looking for a simple solution for renaming fasta headers. I have this fasta header >trpE___AA_HMM___6fa05435949258489b608db9e58e5ba38821f2f26fffe5755daff43abin_id...gene_callers_id:113772|start:215745|stop:217260|length:1515 And I would like to rename it only like this >MALBOS1_000000117228 That means, remove everything before the pattern "contig:" and after…
updated 17 months ago • Diego
Hi All, Can someone help me to rename the following fasta headers >M42.PS.NODE_36229_length_665_cov_0.0565371:i:367638.gene_3804 >M43.PS.NODE_26456_length_662_cov_0.0603908...>M42_gene_3804 >M43_gene_16048 >M52_gene_14948 >M53_gene_3089 The fasta file has 80,0000 such headers! I am totally new :-) so I would really appreciate a way using…
updated 6.7 years ago • davkam30
I have about 100 multiple fasta files (e.g., file.faa), which I have to rename with the species name mentioned in the fasta header. The fasta headers of these...format: >XP_003072227.1 aminopeptidase N [Encephalitozoon intestinalis ATCC 50506] How do I rename 'file.faa' to 'Encephalitozoon intestinalis.faa'? I saw that people have used awk and sed for a similar purpose but could
updated 3.9 years ago • KG
Hi all, I would like to rename the headers of my fasta with a list of IDs. For each sequence I have this type of header . >range=chr16:946803-947997 In...txt file I have a list of IDs, in the same order than the sequences, that I would like to use as headers. I guess a simple approach based on bash/awk/sed should work, but I couldn't manage to do so. Cheers
updated 16 months ago • Bertrand
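One possible approach to the question above, sketched in Python rather than bash/awk/sed. It assumes the ID list has exactly one ID per line, in the same order as the records; the function name `rename_headers_in_order` is made up for illustration.

```python
from typing import Iterable, Iterator


def rename_headers_in_order(fasta_lines: Iterable[str],
                            new_ids: Iterable[str]) -> Iterator[str]:
    """Replace the n-th FASTA header with the n-th ID from new_ids.

    Assumption: new_ids holds one ID per record, in record order.
    """
    ids = iter(new_ids)
    for line in fasta_lines:
        if line.startswith(">"):
            try:
                # Take the next ID and drop its trailing newline.
                yield ">" + next(ids).strip()
            except StopIteration:
                raise ValueError("fewer IDs than FASTA records")
        else:
            # Sequence lines pass through unchanged.
            yield line.rstrip("\n")
```

With real files, both arguments can be open file handles, and the yielded lines can be written out joined by newlines.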
Hello Community! I know several similar questions have been asked but they all seem to want to rename their fasta headers entirely using a new variable name, or they have a separate text file with names that they would...
updated 4.6 years ago • FreshBio
Hi, I have multiple fasta files and I want to change the headers. For this I am using: awk '/^>/{print ">C1_" ++i; next}{print}' C1_pandaseq.fasta > C1_pandaseq_new.fasta...input fasta- >M03419:60:656544:1:1101:25150:3877:1 CCTACGGGTGGCTGCAGTGGGGAATTTTGGACAA >M03419:60:656544:1:1101:8498:4267:1...60:656544:1:1101:7884:4445:1 CCTACGGGTGGCAGCA…
updated 7.2 years ago • bioinformaticssrm2011
Hello guys, can you please help me. I want to rename headers of different sequences in large FASTA file. I want to rename each individual sequence with different name
updated 3.6 years ago • lukhanyomakhabane
Hello! I have a FASTA file and I want to change its headers to new names. Through searching here on this platform I have found some relevant...but not saved. Also a huge portion of sequences is removed...and I do the first sequences their headers are not named..any idea what could be the problem? I am using the Linux console. My input sequences > LTR-12 ATTGGAAAACAAACTATCCTACCTTC…
updated 3.6 years ago • lukhanyomakhabane
I have multiple fasta files like: >TestSample1/India/Jan/2021 CAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTC ACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGA...ACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGA I want to rename the header like this: >ABCD_0001 CAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTC ACTCGG…
updated 22 months ago • sunilthorat
Hi everyone, I'm encountering a problem with too long fasta headers. They get truncated at the 20th position by a program (TargetP) I'm using. Example: ``` >ConsensusfromContig10000...thousands of entries named "ConsensusfromContig1". Is there any software or any script I can use to rename the headers in a way that they are 20 characters long and still able to get identified? I have on…
updated 16 months ago • branokdrung
Hi I have around 85 gene sequences in individual fasta files. I'd like to rename each file with their header name containing the gene name in [gene=]. For each header, I only want what...is in-between the brackets. I'm trying to do this through linux commands. in fasta file input ``` >lcl|NC_018552.1_cds_YP_006666009.1_1 [gene=rps12] [locus_tag=C329_pgp044] [db_xref=GeneID:13540299...gb…
updated 5 months ago • sebabiokr
Hi all, Can anyone advise me on how to rename the header in my fasta file? Say I have a **seq.fa** file with transcript sequences: >TR1|c0_g1_i1 GTCGAGCATGGTCTTGGTCATCTTCCTTTCAAAGAA...TR6|c0_g1_i1 scaf0159424_10142.072_wgs How do I add contig names to the fasta file headers so that the "scaf0..." identifier comes before the "TR..."? Desired output: >scaf0432344_5…
updated 3.2 years ago • san.san
Hi, I have a fasta file **input.fa** which has duplicate fasta headers. >UID584_Org_name_strain JGTTQEWEILYNMCCDASSHHGTQEFGRKKQLAADBNGTAVQEGFN...LTTREKKQLAADBNGTAVQEGFNMENDLALFVTYBNGTAVQVIIPPTREWWQQLGTGTSEKKQLAADBNGTAVQEGF I want to rename all the headers with numbers before the first underscore '_' like this: >UID584.1_Org_name_strain JGTTQEWEILY…
updated 5.2 years ago • MB
Hi, I have thousands of fasta sequences. How do I rename/replace the fasta subsequence headers, >TRINITY IDs with >species names (given below)? 1. >
updated 3.1 years ago • sunnykevin97
Hello, I want to change the fasta headers of this input file: >M04631:312:000000000-C6V6K:1:2107:11495:1734 1:N:0:ACTGAGCG+TTATGCGA >M04631:312:000000000...C6V6K:1:2107:13059:1785 1:N:0:ACTGAGCG+TTATGCGA into these fasta headers: >adjH001 >adjH002 >adjH.... >adjH099 >adjH100 What script do I have…
updated 5.3 years ago • kari_vo3
Hi, I have different fasta files. I want to keep some part of the headers and add a name to simplify the downstream analysis and since the ids in files...are not in continuation, so simply renaming in series using awk won't help. Some of my fasta headers are like this (augustus output file) >g1134t1 geneg1134 I want...to keep the header and just add the species_genus name after `&…
updated 13 months ago • mirza
Hi I had around 10,0000 gene sequences in individual fasta files. I'd like to rename each file with their header name containing the **gene name**. **Original file** head 1.fasta ==> 1.fasta
updated 2.0 years ago • sunnykevin97
I have large fasta files containing all the sequences of some large families of receptors, each sequence is currently indicated by the...ensembl ID. I would like to change each ensembl ID header to be the gene ID. I have a list of all the corresponding IDs (ensembl - gene), is there a way to change these headers? For example...I want to change this type of ID header: >ENSMUSG000000…
updated 7.7 years ago • pendragon
NCBI, the filename will be for example GCF_006351845.1_ASM635184v1_genomic.fna and the corresponding fasta header >NZ_CP040904.1 Enterococcus faecium strain N56454 chromosome, complete genome After some formatting, all...my fasta headers are like this, for example: >NZ_CP040904.1_Ef I would like to rename my filename like this Ef_GCF_006351845.1_ASM635184v1_genomic.fna..…
updated 3.7 years ago • genomes_and_MGEs
Hi, I have very little experience with scripts. I want to change my FASTA sequence headers (I have 100's of FASTA sequences per file) from very long headers to headers with the sample name (CM1) and
updated 6.4 years ago • mollysil
Here, you can see that each sequence has an ID number (such as **EPI_ISL_1034736**) in the header. I want to keep only the ID number in the header. The resulting file should look like this: >EPI_ISL_1034736 CCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGG...Can any of you help me to achieve this? I can use the `seqkit replace` tool to rename with my own …
updated 2.9 years ago • arriyaz.nstu
I have about ~150 sub directories in a directory. Each of those subdirectories contains multiple fasta files. I want to rename all the fasta headers with adding the name of the file. I tried using this command: for i in ./*mg/*.faa; do...inplace '/>/{gsub(">","&"FILENAME"_");gsub(/\.faa/,x)}1' $i; done This command however renames the fasta headers with sub-fol…
updated 18 months ago • nitinra
Hi, I would like to create a new fasta file from the original genome fasta and a vcf file. The fasta file will only have full gene sequences included. I can use...o sample_SNV.fasta -V sample_SNV_selected.vcf -L ref_gene.bed But I would like the output fasta to have the gene names as the header. For instance the current fasta output from gatk is: >1 chr01:2350 AGAAAGGACAGAAAAA…
I'm having some problems trying to change headers of hundreds of fasta files. Each fasta file is a gene sequence for several species, but for some species header is different...Zootermopsis_sp >EOG7B67Q7|Ectop_sp|C265168_a_16_0_l_886|.|.|Zoractes_sp The structure of fasta files, for the first gene is something like >sp1 >sp2 >sp3 >EOG7B67Q7…
updated 2.2 years ago • Oscar
Hey guys, I have a multi-fasta file containing several extracted regions, such as >NZ_KI973281.1_1234..56789 atattgagctaaaaaaatcagttttccca...32476 tgcagaagtaagggggtaacaccatgcct... ... I would like to include strain name on fasta header, such as >Enterobacter_sp._MGH_6_NZ_KI973281.1_1234..56789 atattgagctaaaaaaatcagttttccca... >Enterobacter_horm…
updated 5.2 years ago • genomes_and_MGEs
Hi FASTA header looks like: >1570-13.segment.flu1_PB2 >1570-13.segment.flu2_PB1 >1570-13.segment.flu3_PA etc Filenames...looks like: 201301234.fasta I want to have FASTA headers that looks like: >201301234_PB2 >201301234_PB1 >201301234_PA I have seen this answer: https://www.biostars.org
updated 5.0 years ago • SaltedPork
Hey guys, I have tons of protein multi-fasta files and I would like to append the name of the file to the fasta-headers. For example, for a input file one.txt with the...headers >1 ATGC... >2 ATGCAT... I would like to have the output >one_1 ATGC... >one_2 ATGCAT... I use bbrename for DNA sequences, but
updated 4.1 years ago • genomes_and_MGEs
correct script below and thanks for the help! # this script is used to convert fastq files to fasta files # then to rename the fasta ID with the sample ID from the lab from Bio import SeqIO import sys # grabbing the file and...seq_file = sys.argv[1] labels = seq_file.split(".") # converting the file from fastq to fasta SeqIO.convert(seq_file,"fast…
updated 8.2 years ago • skbrimer
A/name1/175/2012 201200287,A/name2/287/2012 201200845,A/name3/845/2012 Currently my fasta headers look like: >201200175_AA >201200175_AB >201200175_BB and I want to change it to: >A/name1/175/2012_AA >...175/2012_AB >A/name1/175/2012_BB I want to preserve the suffix (_AA etc..). I have multiple fasta files, and t…
updated 5.0 years ago • SaltedPork
In a typical FASTA file, how can the header be used as its filename (i.e., replace the current file name with header ID) ? I have multiple such FASTA
updated 6.3 years ago • cerulean
Dear all, I got a large fasta file with two types of header, the header name for some sequences is `contig` and for some of them is `cx_gx_ix`. I want to rename...all headers named `contig` to another name say `c_g_i`. Please help me how I should do it
updated 19 months ago • seta
Hi all, I usually use some command to rename the header, but they changed the whole header. I want to re-name just part of the header in fasta file. In fact, the header
updated 13 months ago • seta
I have assembled transcript files with thousands of sequences with headers like: ``` >TRINITY_DN50_c0_g2_i1 len=1961 path=[0:0-1960] >TRINITY_DN59_c0_g1_i2 len=1961 path=[0:0-1960] ``` But, I want...to rename them like this: ``` >TRINITY_1 >TRINITY_2 ``` All sequences should keep the TRINITY prefix followed by a sequential number
updated 10 months ago • Riyad
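A minimal Python sketch of the sequential renumbering asked for above. It assumes the whole original header (including `len=` and `path=`) can be discarded; the function name `number_headers` and its `prefix` parameter are illustrative.

```python
from typing import Iterable, Iterator


def number_headers(fasta_lines: Iterable[str],
                   prefix: str = "TRINITY") -> Iterator[str]:
    """Replace every header with '>prefix_N', N counting records from 1."""
    count = 0
    for line in fasta_lines:
        if line.startswith(">"):
            count += 1
            # The original header text is dropped entirely.
            yield f">{prefix}_{count}"
        else:
            yield line.rstrip("\n")
```

If the old names are needed later, it may be worth writing an old-to-new mapping to a separate file at the same time.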
Hi all, I was wondering if anyone could help me with renaming the header lines on my FASTA files? I am using a simulated paired end metagenome. The data currently looks like: >r1.1...Where >r1.1 and >r1.2 are a set of paired end reads. In order to run MIRA, I need to rename the header lines into a more conventional format, i.e. >r1/1 and >r1/2. Does anyone k…
updated 7.9 years ago • mn1154
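As a sketch of the `.1`/`.2` to `/1`/`/2` conversion described above, assuming the mate number is the very last character of the header line (nothing follows it); `mira_style_header` is a made-up name.

```python
import re

# Matches a trailing ".1" or ".2" at the very end of a header line.
_PAIR_SUFFIX = re.compile(r"\.([12])$")


def mira_style_header(line: str) -> str:
    """Turn '>r1.1' into '>r1/1'; non-header lines are returned unchanged."""
    line = line.rstrip("\n")
    if line.startswith(">"):
        return _PAIR_SUFFIX.sub(r"/\1", line)
    return line
```

Headers that do not end in `.1` or `.2` are left as they are, so the function is safe to run over a mixed file.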
5.27 and 7.89 Gb) FASTQ files of transcriptome data from a collaborator. What they sent me has no header information for any of the sequences in the files. Here's the (truncated) first two sequences of the first file: @No name...etc., for the millions of sequences... but I need to make sure all the sequence data has individual header names that correspond to that particular sequence (so when I…
updated 4.7 years ago • Josh Herr
inside the file have titles that are actually the reference used to map data against. I want to rename the file, and all 8 sequences, with the same name, but the 8 segments also need to say which gene they are. I have started...script i started Not sure how this will display #!/bin/bash set -e #script to copy, rename, and edit fasta files #Needs an input file fastafile…
updated 5.7 years ago • James
Hi community, I know this is a common type of question since there are lots of posts about FASTA headers, but since I don't have a good basis in informatics or with sed's syntax I don't really know how to make the right command...So, I have a FASTA file that looks like this: >CDS CDS FIG00947432: hypothetical protein 2322:2624 forward MW:11399 MSSKYPVAYAVGQKIKSLRKSQGYTVFQLAKEIDIS…
updated 6.3 years ago • Alec Watanabe
My fastas look like: >123456789.1 AGCT >123456789.2 AGCT >222221122.1 AGCT The fasta does not have end of line characters...so the sequence is all on one line. The fasta headers have a .1 .2 .3 and so on up to .8 at the end of them. cat ids.txt 123456789 123456789.25/12/2019 222221122 222221122.03...03/2020 Desired Fasta: &…
updated 3.8 years ago • SaltedPork
Hello everyone, I would like to put my gene name in the header of a fasta file. Could anyone help me create a Python script, please? Original name from gene prediction tool : >...and I would like to convert name or rename like this : >XY000_2FG00001_00007 MCIECVRVLHGFGLNAYIGGARETKRGLGCATVKPATSFFNKGKKEKGLREVFSERYYYYTPWLATPYIKL
updated 7.1 years ago • nut_B
Hi! I have folder with multiple fasta files, each file has few sequences like this: ``` >KLTH0E08624g KLTH0E08624g MAREITDIKEFLELARRADVKTATVKINKKLNKSGKAFRQTKFKVRGSRYLYTLIVNDAG...I need to make a bash script which parse that files to get new headers in each fasta (first 4 letters): ``` >KLTH MAREITDIKEFLELARRADVKTATVKINKKLNKSGKAFRQTKFKVRGSRYLYTLIVNDAG ``` and save...bash and and…
updated 17 months ago • Jimpix
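The question above asks for bash; the following is only a Python sketch of the same idea (keep the first four characters after `>`). The name `shorten_header` and the `width` parameter are assumptions for illustration.

```python
def shorten_header(line: str, width: int = 4) -> str:
    """Keep only the first `width` characters of a FASTA header.

    Non-header lines are returned unchanged (minus the trailing newline).
    """
    line = line.rstrip("\n")
    if line.startswith(">"):
        return ">" + line[1:1 + width]
    return line
```

Note that truncating to four letters can make several records in one file share the same header, which some downstream tools reject.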
I have multifasta sequence and i am trying to rename the header and print the first id only >gene_1|id=abc|protein=capA|ABD5672.4 ATGCGTCGACGTCGTACGGGTTTT CGTACGGGTTATGCGTCGACGTCG
updated 7.2 years ago • ####
I have a .fasta file that I'm trying to use `bedtools getfasta` on. When I run it I get the error > WARNING. chromosome (N) was not found in the...FASTA file. Skipping. which I think is because my .fasta headers look like this >HWI-D00270:252:CB1D5ANXX:8:1303:19141:48584...TTTGTTTTCAGGTGTGGAATGTGCTTTCTACCACGGCTACAAATACTACAAAGGATGTAGTA Is there a way I can edit the header to…
updated 5.3 years ago • namea38
Hello! I'm building a database of a certain gene family. I downloaded the fastas from uniprot , concatenated the resulting fastas using `cat` and the fasta headers of each sequence have the following...that the gene information (the `GN=` part) is the first string after the first pipe sign (|) on each fasta header. is there a way to do that using awk or R string manipulation? I want that all my…
updated 24 months ago • v.berriosfarias
I need to rename a fasta file according to a blast output file where my fasta was a query. Not all sequences had a hit. **my.fasta:** >ASSI-1_0...AGGACACCAGAAACTTTCTCCAAAGCTGAATTTGTGTATT The way I went about this is this: 1. Pull out fasta headers and get rid of ">": `grep "^>" my.fasta | sed -e 's/>//g' > headers.fasta` 2. Cut first two columns ou…
updated 7.6 years ago • san.san
Hi I would like to change the long header of the contigs to be shorter. How can I conduct that using one of Galaxy's tools? Regards
updated 11 months ago • adnan.lahuf
I did was check almost one by one that errors and I did a table, and it looks like: Variants Rename Hypothetical protein Hypothetical protein Hypothetical protein, putative Hypothetical protein hypothetical...but with same database, so if I don't do anything I will have th…
updated 7.8 years ago • Buffo
I have many bacterial RefSeq fasta files and want to parse the headers in the fasta files to see if the word 'chromosome' appears in the headers, as a side note...there are some sequences in FASTA files started with '>' so I want to parse all the lines starting with '>' . I know I have files that do not have a word like 'chromosome...I would like to separate the files with header '…
updated 5.7 years ago • Shelle
Hi all, I have fasta files containing sequences from different loci (one fasta file per individual) and would like to change the headers...to then merge everything into one big fasta file. This is the format of the first two lines of one my fasta files (called individual1-allele1.fa): >lcl|contig_13517...contig_22604 AAGGATTAAAAATGAAAACTATGCAAAACTATGAGGAATAAAACTTCTTACATCTGAACT …
updated 13 months ago • Zoe
I have a fasta file. How do I rename the sequences according to the name of the file? For example, the sequence names in the gene.fasta file...are >1, >2, renamed to >gene1, >gene2. Thank you very much for telling me
updated 2.2 years ago • 1774815377
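A small Python sketch for prefixing headers with the file name, as asked above. It assumes the prefix is the file name without its final extension; `prefix_with_file_stem` and the `sep` parameter are illustrative names.

```python
from pathlib import Path
from typing import Iterable, Iterator


def prefix_with_file_stem(file_name: str,
                          fasta_lines: Iterable[str],
                          sep: str = "") -> Iterator[str]:
    """Prepend the file's stem (e.g. 'gene' for gene.fasta) to each header."""
    stem = Path(file_name).stem
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # '>1' in gene.fasta becomes '>gene1' (or '>gene_1' with sep='_').
            yield f">{stem}{sep}{line[1:]}"
        else:
            yield line
```

Passing `sep="_"` gives the `>one_1` style that some of the other questions on this page ask for.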
Hello I have a fasta file with amino acids sequences, which was translated from nucleotide sequence by transeq. >CE99543_15407_1 MAV..... >CE99641_51257_1...MSQ...... I want to delete `_1` at the end of fasta header because it was somehow added by transeq. >CE99543_15407 MAV..... >CE99641_51257 MSQ...... Just deleting `_1` would not work...because there are hea…
updated 4.7 years ago • ysas
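One way to drop only a trailing `_1` from the sequence ID, sketched in Python. The question is cut off, so this assumes the concern is `_1` also occurring elsewhere in the ID; the function name `strip_frame_suffix` is hypothetical.

```python
import re

# '_1' counts only when it ends the ID, i.e. the first whitespace-delimited
# token of the header; any description after the ID is left untouched.
_TRAILING_FRAME = re.compile(r"^(>\S+?)_1(?=\s|$)")


def strip_frame_suffix(line: str) -> str:
    """Remove a trailing '_1' from the ID of a FASTA header line."""
    line = line.rstrip("\n")
    if line.startswith(">"):
        return _TRAILING_FRAME.sub(r"\1", line)
    return line
```

Because the pattern is anchored to the end of the ID, an `_1` that is immediately followed by more ID characters (as in `CE99543_15407`) is left alone.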