batch renaming FASTQ files
6 weeks ago
amitpande74 ▴ 20

Hi,

I have 40 fastq files named as

SRR1313062_GSM1401648_ZR-75-1_Homo_sapiens_RNA-Seq_1.fastq.gz  
SRR1313062_GSM1401648_ZR-75-1_Homo_sapiens_RNA-Seq_2.fastq.gz

until

SRR1313131_GSM1401717_06-01-A009b_Homo_sapiens_RNA-Seq_1.fastq.gz
SRR1313131_GSM1401717_06-01-A009b_Homo_sapiens_RNA-Seq_2.fastq.gz

I want to change these names to

SRR1313062_1.fastq.gz  
SRR1313062_2.fastq.gz

and so on until

SRR1313131_1.fastq.gz 
SRR1313131_2.fastq.gz

I have tried:

rename  's/(\w_).*_(R[0-9])_.*(.fastq.gz)/$1$2$3/' filenames

and

brename -p '^(\w+)_.+_.+(_R[12]_).+' -r '${1}$2.fastq.gz' filenames

but neither command renames anything. Can someone kindly help?

fastq rename batch • 132 views

If the _GSM..._RNA-Seq part is common to all names, keep it simple (-n is a dry run; drop it to actually rename):

$ rename -n  's/_GSM.*_RNA-Seq//' *.gz

With GNU parallel (remove --dry-run to actually move the files):

parallel --dry-run mv {} /new_folder/{=s/_GSM.\*_RNA-Seq//=} ::: *.gz

If you would like to use a regex with capture groups, use the following:

$ rename -n  's/^(.*)_GSM.*(_[12])/$1$2/' *.gz
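If neither rename nor parallel is installed, the same result can probably be had with plain shell parameter expansion. This is only a sketch, assuming every file name starts with the SRR accession followed by _GSM and ends in _1.fastq.gz or _2.fastq.gz, as in the question; short_name is a made-up helper name, not part of any tool. It prints the mv commands rather than running them:

```shell
# short_name (hypothetical helper): keep the text before the first "_GSM"
# and the text after the last "_" (e.g. "1.fastq.gz"), joined by "_".
short_name() {
    printf '%s_%s\n' "${1%%_GSM*}" "${1##*_}"
}

# Dry run: print the mv commands instead of executing them.
for f in *_GSM*_[12].fastq.gz; do
    [ -e "$f" ] || continue    # skip the literal pattern if nothing matches
    echo mv -n -- "$f" "$(short_name "$f")"
done
```

Remove the echo once the printed commands look right; mv -n refuses to overwrite an existing file.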
6 weeks ago
tshtatland ▴ 80

Use rename:

Dry run:

rename -n 's{ ^ ( [^_]* ) .* ( _[12]\.fastq\.gz ) $ }{$1$2}x' *.fastq.gz

Actual rename:

rename 's{ ^ ( [^_]* ) .* ( _[12]\.fastq\.gz ) $ }{$1$2}x' *.fastq.gz

The command-line utility rename comes in many flavors. The Perl-based ones should work for this task; note that the rename shipped with util-linux uses a different, non-regex syntax. I used rename version 1.601 by Aristotle Pagaltzis. To install it, simply download its Perl script and place it in a directory on your $PATH. Or install rename using conda, like so:

conda install rename
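Before renaming for real, it may be worth checking the pattern and looking for name collisions. The sketch below applies what should be an equivalent pattern with sed -E (not with rename itself), so treat it as an approximation; preview_name is a made-up helper name:

```shell
# preview_name (hypothetical helper): show what a file name would become.
# Keeps everything before the first "_" plus the trailing _1/_2.fastq.gz.
preview_name() {
    printf '%s\n' "$1" | sed -E 's/^([^_]*).*(_[12]\.fastq\.gz)$/\1\2/'
}

# Try the pattern on one of the names from the question.
preview_name SRR1313062_GSM1401648_ZR-75-1_Homo_sapiens_RNA-Seq_1.fastq.gz
# prints SRR1313062_1.fastq.gz

# List any target name that would be produced more than once;
# no output here means no two files collapse to the same name.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue    # skip the literal pattern if nothing matches
    preview_name "$f"
done | sort | uniq -d
```

If the last loop prints anything, two or more files would end up with the same name and the pattern needs adjusting before running rename without -n.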