QIIME 2 manifest file for multiple paired-end reads
2.2 years ago
Bioinfonext ▴ 320

Hi,

I have multiple paired-end read files from amplicon sequencing, and I am trying to write a bash script to generate a manifest file. The manifest file should contain information like this:


    sample-id,filename,direction
    a,a_1_R1_001.fastq.gz,forward
    b,b_2_R1_001.fastq.gz,forward
    c,c_3_R1_001.fastq.gz,forward

Location of the files: /mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s

The file names look like this:

Soil-33_S73_L001_R1_001.fastq Soil_9_S42_R1_001.fastq

Soil-33_S73_L001_R2_001.fastq Soil_9_S42_R2_001.fastq

Script:

    echo "sample-id,absolute-filepath,direction" > manifest.csv
    raw_data='/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s/'

    # Since the format asks to separate 'forward' and 'reverse', iterate over R1 first, then run the same loop for R2
    for sampleID in $(ls ${raw_data}/*gz | cut -d'-' -f2-4 | sort | uniq)
    do
        path=$(find $raw_data -name "*$sampleID*R1*")
        echo "$sampleID,$path,forward" >> manifest.csv
    done

    # Iterating for R2
    for sampleID in $(ls ${raw_data}/*gz | cut -d'-' -f2-4 | sort | uniq)
    do
        path=$(find $raw_data -name "*$sampleID*R2*")
        echo "$sampleID,$path,reverse" >> manifest.csv
    done

But it shows this error:

    find: warning: Unix filenames usually don't contain slashes (though pathnames do). That means that '-name `*/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_9_S42_R2_001.fastq.gz*R2*'' will probably evaluate to false all the time on this system. You might find the '-wholename' test more useful, or perhaps '-samefile'. Alternatively, if you are using GNU grep, you could use 'find ... -print0 | grep -FzZ `*/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_9_S42_R2_001.fastq.gz*R2*''.
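The slash in the `-name` pattern can be traced back to `cut`: when a line contains no delimiter, `cut` prints it unchanged, so for file names without a hyphen the loop variable ends up holding the full path. A minimal sketch of that behavior follows; the `/data` directory is a hypothetical stand-in for the real location, and the two file names are taken from the listing above with `.gz` appended.

```shell
# /data is a hypothetical directory used only for illustration
with_hyphen='/data/Soil-33_S73_L001_R1_001.fastq.gz'
without_hyphen='/data/Soil_9_S42_R1_001.fastq.gz'

# The line contains one '-', so fields 2-4 are everything after it;
# the directory and the "Soil" prefix are lost
id_a=$(printf '%s\n' "$with_hyphen" | cut -d'-' -f2-4)

# The line contains no '-', so cut passes the whole path through unchanged
id_b=$(printf '%s\n' "$without_hyphen" | cut -d'-' -f2-4)

echo "$id_a"
echo "$id_b"
```

Neither value is a usable sample ID: the first has lost its prefix and still carries the read suffix, and the second is a full path, which `find -name` cannot match.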


The resulting manifest.csv looks like this:

6_S22_L001_R1_001.fastq.gz,,forward

6_S22_L001_R2_001.fastq.gz,,forward

7_S32_L001_R1_001.fastq.gz,,forward

7_S32_L001_R2_001.fastq.gz,,forward

/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_10_S52_R1_001.fastq.gz,,forward

/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_10_S52_R2_001.fastq.gz,,forward

/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_11_S62_R1_001.fastq.gz,,forward

/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_11_S62_R2_001.fastq.gz,,forward

But the output should look like this:

Soil_10_S52,/mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s//Soil_10_S52_R1_001.fastq.gz,forward
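One possible way to get output in that shape is to loop over the forward files directly and derive both the sample ID and the reverse file name from each one, instead of parsing `ls` with `cut`. The sketch below is not a tested solution for this data set. It assumes that every forward file has `_R1_` in its name, that its mate differs only by `_R2_`, and that the sample ID is everything before `_R1_`. The function name `make_manifest` is made up for illustration.

```shell
# Sketch: write a paired-end manifest for all *_R1_*.fastq.gz files in a directory.
# Assumptions (not confirmed for this data set): the mate of each R1 file has the
# same name with _R2_, and the sample ID is the file name up to "_R1_".
# The function body runs in a subshell, so its variables do not leak out.
make_manifest() (
    raw_data=$1   # directory holding the fastq.gz files (pass an absolute path)
    out=$2        # manifest file to write

    echo "sample-id,absolute-filepath,direction" > "$out"

    for fwd in "$raw_data"/*_R1_*.fastq.gz; do
        # If the glob matched nothing, the literal pattern is left over; skip it
        [ -e "$fwd" ] || continue

        base=${fwd##*/}                        # file name without the directory
        sample_id=${base%_R1_*}                # e.g. Soil_10_S52
        rev="${fwd%_R1_*}_R2_${fwd##*_R1_}"    # same path with R1 swapped for R2

        echo "$sample_id,$fwd,forward" >> "$out"
        if [ -e "$rev" ]; then
            echo "$sample_id,$rev,reverse" >> "$out"
        fi
    done
)
```

It could then be called as `make_manifest /mnt/scratch/users/3052771/Amplicon_data_july_2018/16S_Analysis/soil_16s manifest.csv`. Because the paths are built from the directory argument, no `find` call is needed and the warning above does not arise.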