Creating a script to run STAR alignment on multiple fastq files
22 months ago
mropri ▴ 150

I have fastq files named in the following way:

rep1_A01_R2.fastq rep1_A02_R2.fastq rep1_B01_R2.fastq rep1_B02_R2.fastq rep1_C01_R2.fastq rep1_C02_R2.fastq rep1_D01_R2.fastq rep1_D02_R2.fastq

and many more with the same convention of letters and numbers.

I know how to run STAR. I was wondering whether there is a way to avoid listing every file by hand, and instead write a command that picks up each fastq file in turn and runs the alignment on it. I appreciate any help.

RNA-seq STAR Alignment • 2.2k views
22 months ago

https://raw.githubusercontent.com/alexdobin/STAR/d14a0a992f94ba3a64c26dd08ac58e2b4ab134f3/doc/STARmanual.pdf

Multiple samples can be mapped in one run with a single output. This is equivalent to concatenating the read files before mapping, except that distinct read groups can be used in --outSAMattrRGline command to keep track of reads from different files. For single-end reads use a comma separated list (no spaces around commas), e.g.: --readFilesIn sample1.fq,sample2.fq,sample3.fq

For paired-end reads, use comma separated list for read1, followed by space, followed by comma separated list for read2, e.g.: --readFilesIn s1read1.fq,s2read1.fq,s3read1.fq s1read2.fq,s2read2.fq,s3read2.fq
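If pooling is what you want, the comma-separated list does not have to be typed by hand. Here is a rough bash sketch, assuming single-end files that match `rep1_*_R2.fastq` in the current directory and no spaces in the file names. `join_by_comma`, `rg_line` and `GENOME_DIR` are names I made up for illustration, the index path is a placeholder, and using the file name as the read group ID is just one possible choice.

```shell
#!/usr/bin/env bash

# Join all arguments with commas and no spaces, the form --readFilesIn expects.
join_by_comma() {
    local IFS=,
    echo "$*"
}

# Build one read-group entry per file, separated by " , " (comma surrounded by
# spaces), using the file name without its .fastq extension as the ID.
rg_line() {
    local out="" f id
    for f in "$@"; do
        id=$(basename "$f" .fastq)
        out+="${out:+ , }ID:${id}"
    done
    echo "$out"
}

# Hypothetical location of the STAR genome index; adjust to your setup.
GENOME_DIR=${GENOME_DIR:-/path/to/star_index}

shopt -s nullglob                 # an unmatched pattern expands to nothing
files=(rep1_*_R2.fastq)

# Only run when there are input files and STAR is on the PATH.
if (( ${#files[@]} > 0 )) && command -v STAR >/dev/null; then
    # rg_line output is left unquoted on purpose so it splits into words.
    STAR --runThreadN 4 \
         --genomeDir "$GENOME_DIR" \
         --readFilesIn "$(join_by_comma "${files[@]}")" \
         --outSAMattrRGline $(rg_line "${files[@]}") \
         --outFileNamePrefix pooled_
fi
```

Note that this produces a single pooled output, so the read groups are the only thing that tells the input files apart afterwards.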


I think OP is looking for a simple loop rather than pooling everything into a single run or doing RG magic.


That would be helpful as well, if you have something in mind. I tried writing a for loop with sed but got stuck.
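For the loop, you probably don't need sed at all: `basename` (or bash parameter expansion) can strip the extension. Something along these lines might work as a starting point, assuming single-end reads, files matching `rep1_*_R2.fastq` in the current directory, and an existing STAR index. `align_all`, `GENOME_DIR`, `THREADS` and `STAR_BIN` are names invented for this sketch, and the index path is a placeholder.

```shell
#!/usr/bin/env bash

# Hypothetical settings; override them from the environment if needed.
GENOME_DIR=${GENOME_DIR:-/path/to/star_index}   # placeholder index location
THREADS=${THREADS:-4}
STAR_BIN=${STAR_BIN:-STAR}                      # set to "echo" for a dry run

# Run one STAR alignment per fastq file given as an argument.
align_all() {
    local fq sample
    for fq in "$@"; do
        # Skip anything that is not an existing file (e.g. an unmatched glob).
        [ -e "$fq" ] || continue
        # rep1_A01_R2.fastq -> rep1_A01_R2, used as the output prefix.
        sample=$(basename "$fq" .fastq)
        "$STAR_BIN" --runThreadN "$THREADS" \
                    --genomeDir "$GENOME_DIR" \
                    --readFilesIn "$fq" \
                    --outSAMtype BAM SortedByCoordinate \
                    --outFileNamePrefix "${sample}_"
    done
}

align_all rep1_*_R2.fastq
```

Each sample ends up with its own set of outputs prefixed with the sample name, e.g. `rep1_A01_R2_Aligned.sortedByCoord.out.bam`. Running with `STAR_BIN=echo` first prints the commands without executing them, which is a cheap way to check the file names before committing to a long run.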


Thank you!
