Batch rename contigs in multiple assembly files
3.4 years ago
dangohh ▴ 10

Hi,

I have hundreds of assemblies, each with between 50 and 100 contigs. All contigs are named Contig_1, Contig_2, etc., but each assembly also has an ID that I want on the contigs for downstream analysis. So the ID of the file should be prepended to the contig name, i.e.:

<fileid>_Contig_1 instead of only Contig_1

Please help!

assembly • 930 views
3.4 years ago

You could run sed 's/^>/>fileid_/' on your files, replacing fileid with the ID you want to use.

Integrate this in a bash loop if you want to run it on a whole list of files.
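
A bash loop around a sed substitution of this kind might look like the sketch below. It assumes the file ID is simply the file name without its .fasta extension, and that the ID contains no characters special to sed (&, / or backslash). The function names prefix_contigs and rename_all and the output directory renamed are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: prefix every FASTA header in a file with the file's ID.
# Assumption: the ID is the file name minus its .fasta extension.
prefix_contigs() {
  local fasta="$1"
  local fileid
  fileid="$(basename "$fasta" .fasta)"
  # Header lines start with '>'; insert "<fileid>_" right after it.
  sed "s/^>/>${fileid}_/" "$fasta"
}

# Apply prefix_contigs to every .fasta file in the current directory.
# Results go to a separate directory (hypothetical name: renamed),
# so the original assemblies are left untouched.
rename_all() {
  local outdir="${1:-renamed}"
  local f
  mkdir -p "$outdir"
  for f in *.fasta; do
    [ -e "$f" ] || continue   # glob matched nothing
    prefix_contigs "$f" > "$outdir/$f"
  done
}
```

Calling rename_all from the directory holding the assemblies would write one renamed copy per input file; grep '^>' on a few of the outputs is a quick way to confirm the headers look as expected.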

awk '/>/{sub(">","&"FILENAME"_");sub(/\.fasta/,x)}1' sample_1.fasta

That worked great. How do I now put it in a loop for all the .fasta files in my directory?

(Definitely signing up for an intro to bash scripting) :D
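
For anyone unpicking the awk one-liner above, here is the same logic spread over several lines with comments. This is a sketch rather than a different method: the wrapper name rename_headers is made up, the pattern is anchored to the start of the line, and an explicit empty string stands in for the unset variable x. It assumes the file is passed without a directory part, since FILENAME holds the path exactly as given.

```shell
# Sketch: the one-liner's logic, spelled out.
# rename_headers is a hypothetical wrapper; pass it a file such as sample_1.fasta.
rename_headers() {
  awk '
    /^>/ {                          # act on header lines only
      sub(">", "&" FILENAME "_")    # "&" is the matched ">", so ">" becomes ">sample_1.fasta_"
      sub(/\.fasta/, "")            # drop the leftmost ".fasta", the one from the file name
    }
    1                               # always-true pattern: print every line
  ' "$1"
}
```

Running rename_headers sample_1.fasta prints the renamed records to standard output, just as the one-liner does.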

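for i in *.fasta ; do
  # write a renamed copy to a new file (the output name is only an example); the original stays untouched
  awk '/>/{sub(">","&"FILENAME"_");sub(/\.fasta/,x)}1' "$i" > "${i%.fasta}_renamed.fa"
done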