How to remove a substring from multiple file names?
2.8 years ago

Hello, I have files with the following names:

VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz
VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_1.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_2.fastq.gz

I want to delete all the "ACTGs" so that I end up with files like VIR3A_1.fastq.gz, VIR3A_2.fastq.gz, and so on, using bash.

Thanks for your time :)

string bash • 1.6k views

With rename (the Perl version; -n only previews the renames, remove it to apply them):

$ ls
VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz  VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz

$ rename -nv 's/^(\w+_).*_(.*)$/$1$2/' *.gz

Using expression: sub { use feature ':5.32'; s/^(\w+_).*_(.*)$/$1$2/ }    
'VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz' would be renamed to 'VIR3A_1.fastq.gz'
'VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz' would be renamed to 'VIR3A_2.fastq.gz'

With bash (this only prints the old and new names; nothing is renamed yet):

$ for i in *.fastq.gz; do echo "$i" "${i%%_*}_${i##*_}"; done

VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz VIR3A_1.fastq.gz
VIR3A_CCGCGGTT-CTAGCGCT_L00M_2.fastq.gz VIR3A_2.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_1.fastq.gz VIR3B_1.fastq.gz
VIR3B_TTATAACC-TCGATATC_L00M_2.fastq.gz VIR3B_2.fastq.gz
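
To go from the preview above to an actual rename, something along these lines should work. It is only a sketch: the function name rename_fastq is made up for illustration, and it assumes you run it from the directory that holds the files (no "_" in any leading path) and that the sample name is everything before the first "_". It skips a file rather than overwrite an existing one.

```shell
# Hypothetical helper (name chosen for illustration only):
# keep the text before the first "_" and the text after the last "_".
rename_fastq() {
    local f new
    for f in "$@"; do
        [ -e "$f" ] || continue            # unmatched glob or missing file
        case "$f" in
            *_*) ;;                        # has at least one "_": proceed
            *)   continue ;;               # nothing to strip
        esac
        new="${f%%_*}_${f##*_}"
        [ "$f" = "$new" ] && continue      # already in the short form
        if [ -e "$new" ]; then
            echo "skip: $new already exists" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}
```

Call it as: rename_fastq *.fastq.gz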

Thanks for your reply! Out of curiosity, what do the "%%" and the "##" do here?
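
Those are bash parameter expansions that trim a glob pattern off a variable's value: "%" trims from the end, "#" trims from the start, and doubling the character takes the longest match instead of the shortest. A small illustration using one of the file names from the question (the variable names f, prefix and suffix are just for this example):

```shell
f="VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz"

prefix="${f%%_*}"   # %% drops the longest suffix matching "_*"  -> VIR3A
suffix="${f##*_}"   # ## drops the longest prefix matching "*_"  -> 1.fastq.gz
echo "${prefix}_${suffix}"   # VIR3A_1.fastq.gz

# The single-character forms drop the shortest match instead:
echo "${f%_*}"      # VIR3A_CCGCGGTT-CTAGCGCT_L00M
echo "${f#*_}"      # CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz
```

So ${i%%_*} keeps everything before the first underscore, and ${i##*_} keeps everything after the last one.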

2.8 years ago
ls *.fastq.gz | awk -F _ '{printf("mv \"%s\" \"%s_%s\"\n",$0,$1,$NF);}' > script.bash

Check script.bash, then run it with: bash script.bash


Yeah, this worked, thanks so much!

2.8 years ago
ATpoint 81k
for i in *.fastq.gz; do mv "$i" "$(echo "$i" | awk -F "_" '{print $1"_"$4}')"; done
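
Note that $4 relies on every name having exactly four "_"-separated fields, as the names in the question do. If that might not hold, a variant using awk's $NF (the last field) is a bit more forgiving. A rough sketch, where short_name is a made-up helper name:

```shell
# Hypothetical helper: print first field + "_" + last field of a "_"-separated name.
short_name() {
    printf '%s\n' "$1" | awk -F '_' '{ print $1 "_" $NF }'
}

short_name "VIR3A_CCGCGGTT-CTAGCGCT_L00M_1.fastq.gz"   # VIR3A_1.fastq.gz
```

It can then be used in the loop as: for i in *.fastq.gz; do mv "$i" "$(short_name "$i")"; done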