Question: How to rename multiple fastq files
ENK wrote:

Hellooo geeks.. I want to rename multiple files:

The original names are as follows:

GEN191010_N_NBS0_lib94256_1700_1_R1.fastq
GEN191010_N_NBS0_lib94256_1700_1_R2.fastq
GEN191010_N_NBXBS10_lib94257_1700_1_R1.fastq
GEN191010_N_NBXBS10_lib94257_1700_1_R2.fastq

However, I want the final names like this:

NBS0_1_R1.fastq
NBS0_1_R2.fastq
NBXBS10_1_R1.fastq
NBXBS10_1_R2.fastq

Your help will be very much appreciated.

modified 6 weeks ago by Karma • written 6 weeks ago by ENK

helloooooooo newbie. What have you tried ?

written 6 weeks ago by Pierre Lindenbaum

We need to know more about your files. Are the strings you want removed the same in all cases as this suggests?

If not, do they follow a regular pattern?

written 6 weeks ago by Joe

Thank you all for your suggestions. I used Mensur's and it worked just fine. Much appreciated.

written 6 weeks ago by ENK

shenwei356 wrote:

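Lots of the solutions from others work, but I'd like to recommend a safer tool of mine, brename, in case you accidentally overwrite files with others, which is a common mishap when batch-renaming files with regular expressions.

brename checks all operations before execution for safety. Note that the pattern below yields NBS0_R1.fastq rather than the requested NBS0_1_R1.fastq; to keep the number before the read label, the second capture group can be changed to (\d+_R[12]).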

$ brename --include-filters  '.fastq$' --ignore-ext  \
 -p 'GEN191010_N_(.+?_).+(R[12])' -r '$1$2' --dry-run
[INFO] main options:
[INFO]   ignore case: false
[INFO]   search pattern: GEN191010_N_(.+?_).+(R[12])
[INFO]   include filters: .fastq$
[INFO]   search paths: ./
[INFO] 
[INFO] checking: [ ok ] 'GEN191010_N_NBS0_lib94256_1700_1_R1.fastq' -> 'NBS0_R1.fastq'
[INFO] checking: [ ok ] 'GEN191010_N_NBS0_lib94256_1700_1_R2.fastq' -> 'NBS0_R2.fastq'
[INFO] checking: [ ok ] 'GEN191010_N_NBXBS10_lib94257_1700_1_R1.fastq' -> 'NBXBS10_R1.fastq'
[INFO] checking: [ ok ] 'GEN191010_N_NBXBS10_lib94257_1700_1_R2.fastq' -> 'NBXBS10_R2.fastq'
[INFO] 4 path(s) to be renamed
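
For plain mv loops, one part of that safety net can be approximated by hand. The following is only a sketch and not how brename works internally: safe_mv is a hypothetical helper name, and it covers a single check (never rename onto an existing file), which also catches two source files that map to the same new name, since the second rename finds the target already taken.

```shell
#!/bin/sh
# safe_mv OLD NEW: rename OLD to NEW, but refuse if NEW already exists.
# (safe_mv is a hypothetical helper, not part of brename or coreutils.)
safe_mv() {
    if [ -e "$2" ]; then
        echo "skip: '$2' already exists, not overwriting it with '$1'" >&2
        return 1
    fi
    mv -- "$1" "$2"
}

# Usage (commented out so that pasting the block renames nothing):
# safe_mv GEN191010_N_NBS0_lib94256_1700_1_R1.fastq NBS0_1_R1.fastq
```

Replacing mv with safe_mv in any of the loops in this thread turns a silent overwrite into a visible "skip" message.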
modified 6 weeks ago • written 6 weeks ago by shenwei356

ATpoint wrote:

Using the field splitting function of awk. This assumes that formatting is the same for all files. The advantage is that you do not need any regex that alters the fields directly (like deleting numbers or characters) but you simply split them by their common delimiter _ and then select those you want to build the final file name.

for i in *.fastq
  do
  # fields: a[3] = sample, a[6] = number before the read label, a[7] = read + extension
  mv -- "$i" "$(echo "$i" | awk '{split($1,a,/_/); print a[3]"_"a[6]"_"a[7]}')"
  done
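
The same field-splitting idea can also be done in the shell itself, without awk. This is a minimal sketch that assumes every file name has exactly the seven underscore-separated fields shown in the question; new_name is a hypothetical helper name, and the loop only prints the mv commands until the echo is removed.

```shell
#!/bin/sh
# new_name FILE: print fields 3, 6 and 7 of FILE (split on "_"),
# i.e. sample, number before the read label, and read + extension.
new_name() {
    (   # subshell, so IFS and the positional parameters do not leak out
        IFS=_
        set -f          # no globbing while the name is being split
        set -- $1       # unquoted on purpose: split on underscores
        printf '%s_%s_%s\n' "$3" "$6" "$7"
    )
}

for f in *.fastq; do
    [ -e "$f" ] || continue                # glob matched nothing
    echo mv -- "$f" "$(new_name "$f")"     # drop 'echo' to really rename
done
```

Printing the commands first gives a quick way to confirm that the chosen fields produce the intended names before any file is touched.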
modified 5 weeks ago • written 6 weeks ago by ATpoint

Mensur Dlakic wrote:

Below is a shell script that replaces defined strings inside the names of a group of files with the same extension. It is probably overkill in your case, since you can simply enter 4 mv commands instead of the 3 needed with this script. First save the script as fix-name.com and make it executable (chmod +x fix-name.com). You also need to have (t)csh and agrep installed, which I guess is not a given these days. I am sure someone will come up with a better bash script in no time.

In your case, enter the following (one call for the common prefix, and one for each library ID, since the two samples carry different IDs):

fix-name.com fastq GEN191010_N_ ""
fix-name.com fastq _lib94256_1700 ""
fix-name.com fastq _lib94257_1700 ""

The script:

#!/bin/tcsh
if ( "$1" == "" ) then
    echo ""
    echo " This script renames all files with a given extension by"
    echo " replacing part of their names with user specified strings."
    echo ""
    echo " The correct syntax is:"
    echo ""
    echo " fix-name.com <file extension> <replace what> <replace with>"
    echo ""
    echo " For example, to rename all *junk.txt files so that junk"
    echo " is removed from their names, use this command:"
    echo " "
    echo " fix-name.com txt junk ''"
    echo " "
    echo " First argument (file extension without .) has to be entered."
    echo " The defaults are junk and an empty string, which means"
    echo " removing junk from file names."
    echo ""
    exit 9
endif

if ( "$2" == "" ) then
    setenv STR1 "junk"
else
    setenv STR1 "$2"
endif

if ( "$3" == "" ) then
    setenv STR2 ""
else
    setenv STR2 "$3"
endif

find . -maxdepth 1 -name "*.$1" -print | agrep "$STR1" | sort > tmp-list1
cp tmp-list1 tmp-list2
perl -pi -e 's/\.\///g' tmp-list2
perl -pi -e 's/$ENV{"STR1"}/$ENV{"STR2"}/g' tmp-list2
perl -pi -e 's/\.\//mv /g' tmp-list1
paste -d" " tmp-list1 tmp-list2 > tmp-list
source tmp-list >& /dev/null
rm tmp-list tmp-list1 tmp-list2
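
For readers without (t)csh, the core of the script (replace one literal string in every matching file name) can be sketched in plain sh. This is not a port of the script above: replace_in_name is a hypothetical helper, it replaces only the first occurrence, and it treats the search string literally rather than as a regular expression. The variables ext, old and new stand in for the script's three arguments.

```shell
#!/bin/sh
# replace_in_name NAME OLD NEW: print NAME with the first literal
# occurrence of OLD replaced by NEW (NAME is printed unchanged if
# OLD does not occur in it).
replace_in_name() {
    case $1 in
        *"$2"*)
            before=${1%%"$2"*}     # text before the first OLD
            after=${1#*"$2"}       # text after the first OLD
            printf '%s\n' "$before$3$after"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

ext=fastq; old=GEN191010_N_; new=
for f in *."$ext"; do
    [ -e "$f" ] || continue                       # glob matched nothing
    target=$(replace_in_name "$f" "$old" "$new")
    [ "$target" = "$f" ] && continue              # nothing to replace
    echo mv -- "$f" "$target"                     # drop 'echo' to rename
done
```

As with the script, the loop would be run once per string to remove, changing old each time.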
written 6 weeks ago by Mensur Dlakic

Joe wrote:

Making assumptions about the consistency of your files:

for file in /path/to/*.fastq ; do
    mv "$file" "$(echo "$file" | sed -e 's/GEN191010_N_//gi'  -e 's/lib[0-9]\{5\}_[0-9]\{4\}_//gi')"
done

NB: untested code.

modified 6 weeks ago • written 6 weeks ago by Joe

Joe wrote:

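An alternative (and the best) approach, using the Perl-based rename (installed as rename, prename or file-rename depending on the distribution; the util-linux rename uses a different syntax):

rename -nv 's/GEN191010_N_(.*)_lib[0-9]{5}_[0-9]{4}_/${1}_/gi' *.fastq

The ${1}_ in the replacement puts back the underscore that the pattern consumes after the sample name; without it the result would be NBS01_R1.fastq. Drop the -n once you're happy with the substitutions, and it will actually perform the renaming.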
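
Where the Perl rename is not available, the same preview-then-apply workflow can be sketched with sed and mv. This is an assumption-laden sketch rather than a drop-in equivalent: DRY_RUN and strip_ids are hypothetical names, and the pattern assumes the GEN<digits>_N_<sample>_lib<digits>_<digits>_ layout from the question.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch: 1 (the default) only prints the plan.
DRY_RUN=${DRY_RUN:-1}

# strip_ids NAME: drop the GEN..._N_ prefix and the _lib..._<digits>_ part,
# keeping the sample name followed by a single underscore.
strip_ids() {
    printf '%s\n' "$1" |
        sed 's/^GEN[0-9]*_N_\(.*\)_lib[0-9]*_[0-9]*_/\1_/'
}

for f in *.fastq; do
    [ -e "$f" ] || continue             # glob matched nothing
    new=$(strip_ids "$f")
    [ "$new" = "$f" ] && continue       # pattern did not match, leave as is
    if [ "$DRY_RUN" = 1 ]; then
        echo "would rename: $f -> $new"
    else
        mv -- "$f" "$new"
    fi
done
```

Saved as a script, it could be run once as is to inspect the plan, then again with DRY_RUN=0 in the environment to apply it.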

modified 4 weeks ago by RamRS • written 6 weeks ago by Joe

Malcolm.Cook wrote:

GNU Parallel lets you harness the power of perl regular expressions:

parallel --dry-run mv {=Q($_)=} {=Q(s/GEN\d+_N_(\w+)_lib\d+_\d+_(\d+_R\d)/$1_$2/)=} ::: *.fastq

Note: run it once with --dry-run to make sure it does what you want, then run it again without it to do the deed.

modified 4 weeks ago by RamRS • written 6 weeks ago by Malcolm.Cook

parallel --dry-run mv {} '{=s/GEN\d+_N_(\w+)_lib\d+_\d+_(\d+_R\d)/$1_$2/=}' ::: *.fastq

written 5 weeks ago by ole.tange

Hi @Ole - I thought the Q() would allow the recipe to work even if the filenames had whitespace in them, but I guess that was unneeded over-protection... yes?

written 5 weeks ago by Malcolm.Cook