11,355 results • Page 1 of 228
Hello! I'm building a database of a certain gene family. I downloaded the fastas from uniprot , concatenated the resulting fastas using `cat` and the fasta headers of each sequence have the following...that the gene information (the `GN=` part) is the first string after the first pipe sign (|) on each fasta header. is there a way to do that using awk or R string manipulation? I want that all my…
updated 23 months ago • v.berriosfarias
I have many bacterial RefSeq fasta files and want to parse the headers in the fasta files to see whether the word 'chromosome' appears in them. As a side note...there are some sequences in the FASTA files that start with '>', so I want to parse all the lines starting with '>'. I know I have files that do not contain a word like 'chromosome...I would like to separate the files with header '…
updated 5.6 years ago • Shelle
Hi all, I'm looking for a simple solution for renaming fasta headers. I have this fasta header >trpE___AA_HMM___6fa05435949258489b608db9e58e5ba38821f2f26fffe5755daff43abin_id
updated 17 months ago • Diego
Hi all, I have fasta files containing sequences from different loci (one fasta file per individual) and would like to change the headers...to then merge everything into one big fasta file. This is the format of the first two lines of one of my fasta files (called individual1-allele1.fa): >lcl|contig_13517...contig_22604 AAGGATTAAAAATGAAAACTATGCAAAACTATGAGGAATAAAACTTCTTACATCTGAACT …
updated 13 months ago • Zoe
Hello I have a fasta file with amino acids sequences, which was translated from nucleotide sequence by transeq. >CE99543_15407_1 MAV..... >CE99641_51257_1...MSQ...... I want to delete `_1` at the end of fasta header because it was somehow added by transeq. >CE99543_15407 MAV..... >CE99641_51257 MSQ...... Just deleting `_1` would not work...because there are hea…
updated 4.7 years ago • ysas
I am trying to open the fasta file with sequences. My bio perl script opens the sequence but not with fasta header. How can I get fasta header with bio...perl -w use Bio::SeqIO; $seqio_obj = Bio::SeqIO->new(-file=> "no_plasmid.fasta", -format=>"fasta"); $seq_obj = $seqio_obj->next_seq; $acc = $so->accession_number; while($seq_obj = $seqio_obj->next_s…
updated 7.7 years ago • bandanaschapagain
Hi all, can someone help me rename the following fasta headers >M42.PS.NODE_36229_length_665_cov_0.0565371:i:367638.gene_3804 >M43.PS.NODE_26456_length_662_cov_0.0603908...>M42_gene_3804 >M43_gene_16048 >M52_gene_14948 >M53_gene_3089 The fasta file has 80,0000 such headers! I am totally new :-) so I would really appreciate a way using…
updated 6.7 years ago • davkam30
Hi, I have a fasta file with 300 protein sequences. I intend to construct a phylogenetic tree with it. I would want only the accession number...and the organism name in the fasta header and remove the rest of the information. Can anybody suggest how to do this? I have a linux based system with perl...and python installed. For example, i want to convert a header like this: >gi|685204…
updated 8.0 years ago • bkvijay.jayaraman
I have a fasta file with headers that I want to compare with two other text files and if it's present in the first file, put it first on the...to compare the text files to the fasta header and if there's a match, organize/reorder the fasta header file so that that match name is the first entry on the header...look like this. (for the first one since fly is present in file2, place it as the first …
updated 17 months ago • jnora0625
Hi, I have 10 fasta files (each file with 20 gene sequences from each of the 10 samples). I would like to create 20 files, specific to each gene...from 10 samples. I proceeded as follows to extract genes with the file_name in header: pyfasta extract --header --fasta test.fasta gene_name1 | awk '/^>/ {$0=$0 "_sample1"}1' > gene_name1.fasta Output: >gene_na…
updated 6.7 years ago • bioinfo8
I have fasta file namely `119XCA.fasta` as shown below, >cellulase ATGCTA >gyrase TGATGCT >16s TAGTATG I need to remove all the...fasta headers, keep the sequences one by one and need to write file name as a fasta header. The expected outcome is shown below...TAGTATG I have used the following script `sed '/^>/d' foo.fa > out.fa` which re…
I have strange fasta headers like this for some good number of sequences, >gi|61221638|sp|P0A366.1| >gi|61221640|sp|P0A368.1|CR1AA_BACTE...I would like to replace the other (`>gi`) in the fasta header to blank or `;`. Can anyone suggest how to do it. I have many such sequences in a big fasta file
updated 4.7 years ago • empyrean999
Hi, pls, let me know how can i edit the fasta file header. >LR99555.1 Avo, chromosome: 1 I want this header like this. >LR99555.1
updated 2.0 years ago • p
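A minimal Python sketch for the question above, assuming the goal is to keep only the first whitespace-delimited token of each header (so `>LR99555.1 Avo, chromosome: 1` becomes `>LR99555.1`). The function name `keep_id_only` is hypothetical, and the sequence line in the example is toy data, not from the post.

```python
def keep_id_only(lines):
    """Yield FASTA lines with each header cut down to its first token."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # split() with no arguments splits on any run of whitespace
            parts = line.split()
            yield parts[0] if parts else line
        else:
            # sequence lines pass through unchanged
            yield line


# Toy record: the header is from the post, the sequence is made up.
example = [">LR99555.1 Avo, chromosome: 1", "ACGT"]
print("\n".join(keep_id_only(example)))
```

To process a real file, the same generator could be fed an open file handle and its output written line by line to a new file.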
if this has been asked before, but I have a genome assembly file that I just converted from .bam to fasta format in order to start annotation. I would like to run CEGMA on this assembly, because I have concerns about the quality...but the problem is that the default header format when the fasta was created is not acceptable. This is because in the current format there are 5237924 sequences...with …
updated 22 months ago • zgayk
I have a large fasta file. As you can see below, there is a > sign present inside some fasta headers, like >exon2_ENST00000218032|>exon2_ENST00000218032...>exon17_ENST00000253024|>exon17_ENST00000253024 I want to remove the > sign from inside the header; after removal the header should look like this >exon2_ENST00000218032|exon2_ENST00000218032 &…
updated 3.1 years ago • harry
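One possible way to approach the question above, sketched in Python: keep the leading `>` of each header line and delete every other `>` in it. The function name `drop_inner_gt` is hypothetical, and the sketch assumes sequence lines never contain `>`.

```python
def drop_inner_gt(line):
    """Remove every '>' from a header line except the leading one."""
    if line.startswith(">"):
        # keep the first character, strip '>' from the remainder
        return ">" + line[1:].replace(">", "")
    # non-header lines are returned untouched
    return line


header = ">exon2_ENST00000218032|>exon2_ENST00000218032"
print(drop_inner_gt(header))
```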
Hello, i want to change the fasta header of this input file : >M04631:312:000000000-C6V6K:1:2107:11495:1734 1:N:0:ACTGAGCG+TTATGCGA >M04631:312:000000000...C6V6K:1:2107:13059:1785 1:N:0:ACTGAGCG+TTATGCGA In this fasta header: >adjH001 >adjH002 >adjH.... >adjH099 >adjH100 what script do I have…
updated 5.2 years ago • kari_vo3
I have downloaded a reference UniProtKB FASTA file. How can I extract only the FASTA headers of each gene (row-wise) into a CSV file using R?
updated 13 months ago • WUSCHEL
Hi community, I am not an expert with sed but i want to edit the headers of each sequence in a fasta file. I want to let only the gene id **>NODE_39_length_59461_cov_85.505003_1** The header
updated 2.2 years ago • Candela
Hi, I would like to modify the fasta headers from a file. I would like to change: >A0A0F2M4U6|A0A0F2M4U6_SPOSC Endoplasmic reticulum chaperone BiP OS
updated 2.4 years ago • marcus.teixeira
NCBI, the filename will be, for example, GCF_006351845.1_ASM635184v1_genomic.fna and the corresponding fasta header >NZ_CP040904.1 Enterococcus faecium strain N56454 chromosome, complete genome After some formatting, all...my fasta headers are like this, for example: >NZ_CP040904.1_Ef I would like to rename my filename like this Ef_GCF_006351845.1_ASM635184v1_genomic.fna..…
updated 3.7 years ago • genomes_and_MGEs
I'm having some problems trying to change the headers of hundreds of fasta files. Each fasta file is a gene sequence for several species, but for some species the header is different for each gene, for example: >EOG7B0…
updated 2.2 years ago • Oscar
Hello Community! I know several similar questions have been asked but they all seem to want to rename their fasta headers entirely using a new variable name, or they have a separate text file with names that they would like to use to replace their sequence headers. In my case, I just want to rename the fasta headers in my very large fasta file using a chunk from the header already present. Her…
updated 4.6 years ago • FreshBio
Hello, I have a lot of sequences in a FASTA file, and I want to extract a specific sequence knowing the header ID. For example, the header of a sequence is: NODE_19_length_5758_cluster_19_candidate_1...I know that with `grep` I can extract the header, but I also want the sequence lines below it to appear on stdout. How can I do this in bash?
updated 3.3 years ago • v.berriosfarias
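The post above asks for a bash solution. As an alternative, here is a hedged Python sketch of the same idea: switch output on when the wanted header is seen and off at the next header. The function name `extract_record` is hypothetical, the ID is assumed to be the first whitespace-delimited token after `>`, and all sequences and neighbouring headers in the example are toy data.

```python
def extract_record(lines, wanted_id):
    """Return the header and sequence lines of the record whose ID is wanted_id."""
    keep = False
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Assumption: the ID is the first token after ">".
            tokens = line[1:].split()
            keep = bool(tokens) and tokens[0] == wanted_id
        if keep:
            out.append(line)
    return out


# Toy data: only the NODE_19 header comes from the post.
fasta = [
    ">NODE_18_toy", "AAAA",
    ">NODE_19_length_5758_cluster_19_candidate_1", "ACGT", "TTGA",
    ">NODE_20_toy", "CCCC",
]
print("\n".join(extract_record(fasta, "NODE_19_length_5758_cluster_19_candidate_1")))
```

Because the comparison is on the whole ID rather than a substring, a shorter ID that is a prefix of a longer one would not match it by accident.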
In this example fasta file, some fasta sequences are repeated many times. For example: exon19_ENST00000194900|exon21_ENST00000194900...exon18_ENST00000194900|exon21_ENST00000194900 So I want to remove all fasta sequences that have the same header in the fasta file and keep only one fasta sequence. I want to remove fasta sequences on...the basis of the header, not the sequence. Thanks in…
updated 3.1 years ago • harry
Hey guys, I have a multi-fasta protein file like this >SF_hydrolase MKG... >LH_reductase MKI... >SM_hydrolase MSN... Basically, I would like to extract...only the fasta headers that have the other "reductase". I know how to extract headers that have the same headers as the ones present on...a list, but I don't know how to extract fasta-headers solely based on o…
updated 5.1 years ago • genomes_and_MGEs
Hello everyone, can anyone guide me in editing the headers of a fasta file? My fasta headers are shown below: >NP_006556.1 transcriptional repressor CTCF isoform 1 [Homo sapiens
updated 3.6 years ago • bioinformatics.queries
I have about 100 multiple fasta files (e.g., file.faa), which I have to rename with the species name mentioned in the fasta header. The fasta headers of these
updated 3.9 years ago • KG
I have large fasta files containing all the sequences of some large families of receptors, each sequence is currently indicated by the...ensembl ID. I would like to change each ensembl ID header to be the gene ID. I have a list of all the corresponding IDs (ensembl - gene), is there a way to change these headers? For example...I want to change this type of ID header: >ENSMUSG000000…
updated 7.7 years ago • pendragon
So I have a directory full of fasta files and I want to replace the fasta header in each one with the name of its corresponding fasta file. For example: HC1993.fa...> X58834 CCTGCATCTGCAA HC1993.fa > HC1993 CCTGCATCTGCAA I have about 50 fasta files like that in a directory that I want to do the same thing to. I've been using this sed command for one file that works...sed '…
updated 3.8 years ago • tpaisie
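A sketch of one way to do what the post above describes, in Python rather than sed: take the file name without its extension and use it as the header. The function name `header_from_filename` is hypothetical. The sketch assumes one sequence per file, as in the HC1993.fa example. A multi-record file would end up with identical headers.

```python
from pathlib import Path


def header_from_filename(path, lines):
    """Yield FASTA lines with every header replaced by the file's stem."""
    stem = Path(path).stem  # "HC1993.fa" -> "HC1993"
    for line in lines:
        line = line.rstrip("\n")
        # header lines are replaced wholesale; sequence lines pass through
        yield f">{stem}" if line.startswith(">") else line


print("\n".join(header_from_filename("HC1993.fa", ["> X58834", "CCTGCATCTGCAA"])))
```

For a whole directory, one could loop over `Path(".").glob("*.fa")`, read each file, and write the transformed lines to a new file rather than overwriting the input.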
Hello everybody! I have a fasta file I'm looking to work with in qiime. Unfortunately, it doesn't currently meet their formatting requirements. I need...to change headers like this: >3180275|DCO_MAC_Bv6--LI09_3|40099 XXXXXXXXXXXXXXXXXXXXXX >13488354|DCO_MAC_Bv6--LD09_2_3|2 XXXXXXXXXXXXXXXXXXXXXX...>333430241|DCO_MAC_Bv6--LO13_8|1 XXXXXXXXXXXXXXXXXXXXXX To…
updated 11 months ago • Dani
I have 5000 FASTA sequences with Uniprot ids. Now, I want to add a unique identifier at the beginning of each FASTA header. An example will...And so on I want to add ABC0001 to ABC5000 at the beginning of the fasta header. And the corresponding gene name from my txt file. gopA ABC0001 A12345 gopD ABC0002 B57384 ........................ fotR ABC5000 C12345...And so on As I understand, I …
updated 10.4 years ago • bioinfo
Hi all, I have a large fasta file in which most sequences have an identical header (they differ in length). I usually extracted the sequences of interest...requires the Biopython library" sys.exit(0) try: fasta_file = sys.argv[1] # Input fasta file wanted_file = sys.argv[2] # Input wanted file, one gene name per line result_file = sys.argv[3] # Output fasta file …
updated 7.1 years ago • seta
Hello; I need to process fasta header by matching fasta description (not fasta id) with a first column in a another file with two columns and print second...column in file on to fasta header. Here are examples and what i have till now. file1.txt (list file) group_1 gene 1 group_2 gene 2 group_3 gene 3 group_4...my $input; close $infile; { local $/ = undef; …
updated 7.8 years ago • empyrean999
Hello everyone, I am trying to replace the headers A of a FASTA file (file.fasta) with headers B. For this, I have a list which matches the header names. >A_1 >B_1 >A_2 >...B_2 >A_3 >B_3 etc... I am using this loop to replace the headers: cat list | while read f ; do echo $f > temp_file A=$(awk '{print $1}…
updated 3.5 years ago • Begonia_pavonina
Hi everyone! I need help with something. I am very new to bioinformatics. I have a fasta file with 32K reference sequences for an X gene. The headers are the accession numbers, but I need to change them for the...So I think I already did the hardest part) but now I need to combine this information and change the headers of my fasta to the GI of each sequence. I've tried with this script: ``` …
updated 20 months ago • marcelavillegasp
I have a fasta file with the following format: >BNY.1.2.t17987.mrna1 CDS=1-1065 seq... How can I remove everything after ".mrna1" from...the headers
updated 4.2 years ago • 2822462298
Hello, I have a list of headers and I need to extract the corresponding sequences from the fasta file. How can I do it? Kindly let me know. The header file looks like this >...>TRINITY_DN74659_c0_g1_i1 >TRINITY_DN74659_c0_g1_i1 >TRINITY_DN74698_c0_g1_i1 The fasta file looks like this >TRINITY_DN74697_c0_g1_i1 len=243 path=[221:0-242] [-1, 221, -2] GTATGTCCCACCAGACAC…
updated 22 months ago • Princy
of interest. I am almost done with the script. But I would also like to include gene names in the fasta headers. By default, it only includes coordinates in the fasta headers. Below is my script: >coords=Chr1 1000 2000 forward...>TTTGGGGTTATAAATTATTAGAAGTT...... I was wondering if there is a way to include the gene name in the fasta header. Thanks, R
updated 7.1 years ago • RT
I am a newbie to Linux... I would like to modify the headers of a fasta file. **My header is like: >100123_00010T gene=100123_00010** **And, I would like to have headers like "100123_00010
updated 12 months ago • hellokwmin
Hi, I have a fasta file in which some records share the same header, as shown below. They have different sequences but the same header. How can I merge them or what...should I do? I want to run orthoMCL but it requires unique headers. ``` >c12358_g1_i9 >c12358_g1_i9
updated 21 months ago • Mehmet
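A minimal Python sketch for the duplicate-header question above, assuming that renaming is acceptable (rather than merging the records): the first occurrence of a header is kept as-is, and later occurrences get a numeric suffix. The function name `make_headers_unique` and the `_2`, `_3` suffix scheme are assumptions for illustration. The sequences are toy data.

```python
from collections import Counter


def make_headers_unique(lines):
    """Yield FASTA lines, appending _2, _3, ... to repeated headers."""
    seen = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            seen[line] += 1
            if seen[line] > 1:
                # second and later copies get the running count as a suffix
                line = f"{line}_{seen[line]}"
        yield line


# Toy sequences; only the header comes from the post.
records = [">c12358_g1_i9", "AAA", ">c12358_g1_i9", "CCC"]
print("\n".join(make_headers_unique(records)))
```

One caveat: if the file already contains a header such as `>c12358_g1_i9_2`, the generated suffix could collide with it, so the output is worth checking for duplicates afterwards.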
Hello, I am trying to convert my vcf files to fasta. However, after aligning to reference, vcf ID from the header disappears, and bcftools/vcftools are writing only reference...seq name in file header. Like > NC_xxxx.1 Any ideas? I run consensus script like for file in $inpath/*.vcf ; do echo $file bname=$(basename $file) echo...base name is …
updated 3.5 years ago • storm1907
Does anyone have a handy method for making a fasta header comply with the UniProt header specifications? http://www.uniprot.org/help/fasta-headers In particular, I would
updated 7.2 years ago • nickp60
Hi, I have multiple fasta files and I want to change the headers. For this I am using: awk '/^>/{print ">C1_" ++i; next}{print}' C1_pandaseq.fasta > C1_pandaseq_new.fasta...input fasta: >M03419:60:656544:1:1101:25150:3877:1 CCTACGGGTGGCTGCAGTGGGGAATTTTGGACAA >M03419:60:656544:1:1101:8498:4267:1...>C1_3 CCTACGGGTGGCAGCAGTGGGGAATATTGGACAATGC…
updated 7.2 years ago • bioinformaticssrm2011
Hi, I have a protein fasta file whose headers look like '>evm.model.chr.9.52'. There are 30k+ proteins. I have performed functional annotations...Now, I am performing some analysis and I want to add at least the protein name or even a GO term to the fasta header, so it would make things a lot easier for me. I want something like: >evm.model.chr.9.52 GO:1234678 Can I do it with
updated 15 months ago • ahmadjoyyia
Hello All, I have a multi fasta file with millions of sequences. I want to duplicate a part of the header and join it to the header itself with a pipe, while...another part (of the header) should be deleted. Let's say I have a fasta file, "input.fasta," which looks like this: >Gene1 wbdfwbf ATGCCGATGCAGTGACG...f 1 < input.fasta > out1.fasta` for deleting spa…
updated 2.1 years ago • bionix
reference genome sequence using the BWA software, and it gave me a .sam file. I used samtools SAM to FASTA to convert the aligned reads to fasta file. I want to look at assembly statistics and also evaluate completeness with...BUSCO. I received the following error: **The character "/" is present in the fasta header >A00600:204:HFMJ3DSX3:3:1101:3640:1125/1, which will crash Reader. Please…
updated 18 months ago • hpalk42
Hello, I have a text file with thousands of unique sequences in fasta format. Each read has a header in the following format: 122391_Tcount2352_Acount2352_Bcount0_length293 It's obvious...was used at some point in the pipeline. I'm curious to see if anyone here has encountered this header format before and can tell me which part of the sequence header represents the count of reads. Thanks …
updated 5.3 years ago • genya35
I have a fasta file with hundreds of sequences and their respective headers. The headers (all of them) are in the format >ABCD [id_123] (gene_XYZ) [protein_ijk] [protein_id=qqq] [123..899] .......sequence............ >…
updated 7.2 years ago • leo1985.arnab
Hello I have a fasta file with sequence headers written as ``` >0|quiver|1..2075|- >0|quiver|2210..3058|- >0|quiver|3112..4169|- ``` and so on till around
updated 20 months ago • utkarsh.sood
Hi, I would like to create a new fasta file from the original genome fasta and a vcf file. The fasta file will only have full gene sequences included. I can use...o sample_SNV.fasta -V sample_SNV_selected.vcf -L ref_gene.bed But I would like the output fasta to have the gene names as the header. For instance the current fasta output from gatk is: >1 chr01:2350 AGAAAGGACAGAAAAA…