I am interested in automating the combining of multiple fastq.gz files. As you can see in the image below I have already indexed my reference genome with BWA and I have combined the fastq.gz files from my COL strain. Because this Arabidopsis is a test set, I could do it manually. However, in the near future I plan to do the same thing with over 500 strains. So I would like to automate the process.
I have used the linux command cat *_1.fastq.gz > Combined_1.fastq.gz to get it to work at the COL strain. Should I create some sort of bash loop for things like this?
I have already aligned these Combined_1 and Combined_2 to the reference genome and I have used these results with BWA sampe to get a SAM file. So far, results look promising. But, could this process also be automated?