Question: Extract text between delimiters and use it to rename the same file
7 months ago, akansha.gitanjali wrote:

I have a file that looks like this:

>sp|Q5T4S7|UBR4_HUMAN E3 ubiquitin-protein ligase UBR4 OS=Homo sapiens OX=9606 GN=UBR4 PE=1 SV=1
362 S AGC AQQVRTGSTSSKEDD 2.108 0.386

364 S AGC QVRTGSTSSKEDDYE 0.556 0.386

555 S AGC LQRQRKGSMSSDASA 4.466 0.386

625 S AGC ESSPRVKSPSKQAPG 2.518 0.386

904 T AGC DSNSRRATTPLYHGF 3.45 0.386

1049 S AGC SSRLRISSYVNWIKD 0.972 0.386

1473 T AGC AWLTRMTTSPPKDSD 0.463 0.386

1504 S AGC TYIVRENSQVGEGVC 1.114 0.386

1787 S AGC EEKPKKSSLCRTVEG 1.593 0.386

1941 T AGC DSSKRKLTLTRLASA 1.859 0.386

I used cut -d'|' -f2 output2.txt | head -1 to output Q5T4S7. Now I want to use this text to rename the same file. How do I do that?

bash shell linux • 201 views

Assuming that each file contains only one header, and that the protein ID is always in the same position as in the example file, please try the following bash one-liner:

for i in *.fa; do echo cp "$i" "$(awk -F '|' '/^>/ { print $2 ".fasta"; exit }' "$i")"; done

As written, the script is a dry run: it only prints the cp commands that would copy each old file name to its new file name. Remove echo once you have checked that the old and new file names are correct. The new files get a .fasta extension so that the *.fa glob does not pick them up if the loop is run again.
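Once the dry run looks right, the same loop could be wrapped with a couple of guards. This is only a sketch under the assumptions stated above (one header per file, ID in the second |-separated field); copy_by_accession and demo_dir are hypothetical names used for illustration, and the demo runs in a throwaway directory so no real files are touched.

```shell
#!/usr/bin/env bash
# Sketch: copy each *.fa file in a directory to <ID>.fasta, with guards.
# copy_by_accession is a hypothetical helper name, not part of any tool.
copy_by_accession() {
    local dir=$1 f id target
    for f in "$dir"/*.fa; do
        [ -e "$f" ] || continue                 # the glob matched nothing
        # Second '|'-separated field of the first header line, e.g. Q5T4S7
        id=$(awk -F'|' '/^>/ { print $2; exit }' "$f")
        if [ -z "$id" ]; then
            echo "skipping $f: no ID found in header" >&2
            continue
        fi
        target="$dir/$id.fasta"
        if [ -e "$target" ]; then
            echo "skipping $f: $target already exists" >&2
            continue
        fi
        cp -- "$f" "$target"
    done
}

# Demo in a temporary directory (assumed layout, for illustration only)
demo_dir=$(mktemp -d)
printf '>sp|Q5T4S7|UBR4_HUMAN E3 ubiquitin-protein ligase UBR4\n362 S AGC AQQVRTGSTSSKEDD 2.108 0.386\n' > "$demo_dir/output2.fa"
copy_by_accession "$demo_dir"
ls "$demo_dir"
```

Skipping files with an empty ID or an existing target avoids creating a file named just ".fasta" and avoids silently overwriting an earlier copy.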

modified 7 months ago • written 7 months ago by cpad011213k
7 months ago, RamRS (Houston, TX) wrote:

Please note that this is a Linux/Unix question and not necessarily a bioinformatics one. You should search Stack Overflow for answers to questions like these.

Copying over my answer from the other location where you asked this question (which is something you should not do):

Here is a two-step process that is easier to understand:

NEW_NAME=$(head -n 1 output.txt | cut -d'|' -f2)
mv output.txt "$NEW_NAME"

And here is a single line that uses command substitution (which runs in a subshell) to build the new name. I personally prefer this one, as it is an extension of a more generic renaming operation (mv "$filename" "${filename/pattern/replacement}"):

mv output.txt "$(head -n 1 output.txt | cut -d'|' -f2)"
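For reference, the generic renaming form mentioned above can be combined with the extracted ID so that the file keeps its extension. This is a minimal sketch that assumes bash (the ${var/pattern/replacement} expansion is not POSIX sh); demo_dir, filename, id and new_name are illustrative names, and the file is created in a temporary directory.

```shell
#!/usr/bin/env bash
# Sketch: rename using bash pattern substitution, keeping the extension.
demo_dir=$(mktemp -d)
printf '>sp|Q5T4S7|UBR4_HUMAN E3 ubiquitin-protein ligase UBR4\n' > "$demo_dir/output2.txt"

filename=output2.txt
# ID from the second '|'-separated field of the first line
id=$(head -n 1 "$demo_dir/$filename" | cut -d'|' -f2)

# ${var/pattern/replacement} replaces the first match of pattern in $var,
# so only the "output2" part changes and ".txt" is kept.
new_name=${filename/output2/$id}

mv -- "$demo_dir/$filename" "$demo_dir/$new_name"
echo "$new_name"
```

The one-liner above renames the file to the bare ID; substituting only part of the name is one way to preserve the extension if that matters.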

PS: I also reordered cut | head to head | cut, so that cut processes only the first line instead of the whole file.
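A quick way to convince yourself that both orders give the same ID is to run them side by side on a small file. The sketch below uses a temporary file with a few lines from the example in the question; demo_file and the two variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare "cut | head" with "head | cut" on a small sample file.
demo_file=$(mktemp)
printf '>sp|Q5T4S7|UBR4_HUMAN E3 ubiquitin-protein ligase UBR4\n362 S AGC AQQVRTGSTSSKEDD 2.108 0.386\n364 S AGC QVRTGSTSSKEDDYE 0.556 0.386\n' > "$demo_file"

# Original order: cut reads every line, then head keeps the first result
id_cut_first=$(cut -d'|' -f2 "$demo_file" | head -n 1)

# Reordered: head passes on only the first line, so cut handles one line
id_head_first=$(head -n 1 "$demo_file" | cut -d'|' -f2)

echo "$id_cut_first $id_head_first"
rm -f "$demo_file"
```

On a file this small the difference is not measurable; the reordering matters mainly for large inputs.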
