How to remove a substring from multiple file names?
20 months ago

Hello, I have files with the following names:

VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz
VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_1.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_2.fastq.gz

Using bash, I want to delete all the "ACTG" parts (along with the L00M field) so that I end up with files named VIR3A_1.fastq.gz, VIR3A_2.fastq.gz, and so on.

Thanks for your time :)

string bash • 865 views

With rename (-n only previews the changes; drop it to actually rename the files):

$ ls
VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz  VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz

$ rename -nv 's/^(\w+_).*_(.*)$/$1$2/' *.gz

Using expression: sub { use feature ':5.32'; s/^(\w+_).*_(.*)$/$1$2/ }
'VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz' would be renamed to 'VIR3A_1.fastq.gz'
'VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz' would be renamed to 'VIR3A_2.fastq.gz'

With bash (this only prints the old and new names; replace echo with mv to rename):

$ for i in *.fastq.gz; do echo "$i" "${i%%_*}_${i##*_}"; done

VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz VIR3A_1.fastq.gz
VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz VIR3A_2.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_1.fastq.gz VIR3B_1.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_2.fastq.gz VIR3B_2.fastq.gz
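The loop above relies on two bash parameter expansions. As a minimal sketch of what they do, assuming one of the file names from the question (the variable names f, prefix, suffix, shortest_suffix_cut and shortest_prefix_cut are only illustrative):

```bash
#!/usr/bin/env bash
# Illustration only: one file name from the question, held in a variable.
f="VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz"

# ${f%%_*} removes the LONGEST suffix matching "_*",
# i.e. everything from the first underscore to the end.
prefix="${f%%_*}"

# ${f##*_} removes the LONGEST prefix matching "*_",
# i.e. everything from the start through the last underscore.
suffix="${f##*_}"

# The single-character forms remove the SHORTEST match instead.
shortest_suffix_cut="${f%_*}"   # cuts only from the last underscore on
shortest_prefix_cut="${f#*_}"   # cuts only through the first underscore

echo "prefix:  $prefix"
echo "suffix:  $suffix"
echo "new:     ${prefix}_${suffix}"
echo "%_*:     $shortest_suffix_cut"
echo "#*_:     $shortest_prefix_cut"
```

So % and %% trim from the end of the value, # and ## trim from the start, and doubling the character switches from the shortest match to the longest one.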

Thanks for your reply! Out of curiosity, what do the "%%" and the "##" do here?

20 months ago
ls *.fastq.gz | awk -F _ '{printf("mv \"%s\" \"%s_%s\"\n",$0,$1,$NF);}' > script.bash

Check script.bash, then invoke it with bash script.bash


Yeah, this worked, thanks so much!

20 months ago
ATpoint 70k
for i in *.fastq.gz; do mv "$i" "$(echo "$i" | awk -F "_" '{print $1"_"$4}')"; done
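
A one-liner like the one above overwrites silently if two files map to the same new name. A more defensive variant could look like the sketch below. It is only one possible way to do it: strip_middle is a hypothetical helper name, and the demo runs in a scratch directory created with mktemp rather than on real data.

```bash
#!/usr/bin/env bash
# strip_middle DIR  (hypothetical helper, not part of any tool)
# Renames SAMPLE_<anything>_N.fastq.gz to SAMPLE_N.fastq.gz inside DIR.
strip_middle() {
    local dir="$1" f base new
    # Require at least two underscores so already-renamed files are left alone.
    for f in "$dir"/*_*_*.fastq.gz; do
        [ -e "$f" ] || continue              # the glob matched nothing
        base="${f##*/}"                      # file name without the directory
        new="${base%%_*}_${base##*_}"        # first field + "_" + last field
        if [ -e "$dir/$new" ]; then
            echo "skip: $new already exists" >&2
            continue
        fi
        mv -- "$f" "$dir/$new"
    done
}

# Demo on empty placeholder files in a temporary directory.
demo="$(mktemp -d)"
touch "$demo/VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz" \
      "$demo/VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz"
strip_middle "$demo"
ls "$demo"
```

To use it on real files, call strip_middle with the directory that holds them, for example strip_middle . for the current directory.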