4.4 years ago
dante
Say I have a folder with three subfolders containing exome sequence data:
folder1 has files
folder1_file.rgfile
folder1_file.r1.fq.gz
folder1_file.r2.fq.gz
folder2 has files
folder2_file.rgfile
folder2_file.r1.fq.gz
folder2_file.r2.fq.gz
folder3 has files
folder3_file.rgfile
folder3_file.r1.fq.gz
folder3_file.r2.fq.gz
Each of the three .rgfiles contains one of the following read group lines:
@RG\tID:C47BPACXX:8\tPL:illumina\tPU:HISEQ2000B:567:C47BPACXX:8\tLB:folder1_file-test\tSM:folder1_file\tDS:folder1_file-A-test
@RG\tID:C45C8ACXX:4\tPL:illumina\tPU:HISEQ2000B:568:C45C8ACXX:4\tLB:folder2_file-test\tSM:folder2_file\tDS:folder2_file-B-test
@RG\tID:C5YA7ACXX:4\tPL:illumina\tPU:HISEQ2000B:571:C5YA7ACXX:4\tLB:folder3_file-test\tSM:folder3_file\tDS:folder3_file-C-test
How do I write a bash script that uses a for loop to align each of these samples to the genome (say, GRCh37.fa), passing the read group (RG) line from the matching .rgfile? I would like to use bwa to do the alignment.
I would really appreciate it if someone could show how to do this.
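For reference, here is a rough sketch of the kind of loop I have in mind. It assumes each subfolder holds exactly one .rgfile whose name shares its prefix with the paired FASTQ files, that the read group line is the first line of that file, and that GRCh37.fa has already been indexed with bwa index. The REF, THREADS and DATA_DIR variables, the align_sample function name and the .sam output naming are my own placeholders, not anything prescribed by bwa.

```bash
#!/usr/bin/env bash
set -eu

# Assumed defaults (not part of the data layout above); override via the environment.
REF="${REF:-GRCh37.fa}"      # reference FASTA, already indexed with `bwa index`
THREADS="${THREADS:-4}"      # number of threads for `bwa mem -t`
DATA_DIR="${DATA_DIR:-.}"    # parent directory holding folder1, folder2, folder3

# Align the sample described by a single .rgfile (hypothetical helper).
align_sample() {
    local rgfile="$1"
    local prefix="${rgfile%.rgfile}"     # e.g. folder1/folder1_file
    local r1="${prefix}.r1.fq.gz"
    local r2="${prefix}.r2.fq.gz"
    local rg

    # Skip samples whose paired FASTQ files do not match the .rgfile prefix.
    if [ ! -f "$r1" ] || [ ! -f "$r2" ]; then
        echo "Skipping ${prefix}: missing r1 or r2 FASTQ" >&2
        return 0
    fi

    # Read the @RG line; bwa mem converts the literal \t sequences to tabs.
    rg="$(head -n 1 "$rgfile")"

    # Paired-end alignment, writing SAM next to the input files.
    bwa mem -t "$THREADS" -R "$rg" "$REF" "$r1" "$r2" > "${prefix}.sam"
}

# Loop over every .rgfile one level below DATA_DIR.
for rgfile in "$DATA_DIR"/*/*.rgfile; do
    [ -e "$rgfile" ] || continue         # glob matched nothing
    align_sample "$rgfile"
done
```

Saved as, say, align_all.sh (a made-up name), it would be run from the parent folder with something like `REF=/path/to/GRCh37.fa THREADS=8 bash align_all.sh`. A folder whose FASTQ names do not share the .rgfile prefix is skipped with a warning rather than aligned.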