STAR alignment on Windows
5 months ago
tryseq ▴ 10

I'm sorry, this is a very beginner-level question, but can I run the STAR aligner on a Windows machine? I have remote access to a Linux machine and have used STAR that way in the past, but can I also run it on my laptop if I have Python installed? (I'm hoping to align my data faster by using two machines.)

RNA-seq STAR • 1.4k views

STAR is very memory hungry, so unless the reference genome is small or you have a really good laptop (32 GB of RAM or more), you won't be able to run STAR on it. For example, STAR needs ~30 GB of RAM to map against the human genome.
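
A quick way to check this before starting a run is to compare the machine's total RAM with the rough requirement mentioned above. This is only a sketch: `total_ram_gb` and `enough_ram_for` are hypothetical helper names, the 32 GB threshold is the rule of thumb from this reply rather than an exact figure, and the script assumes a Linux-style `/proc/meminfo` (which WSL also provides).

```shell
#!/usr/bin/env bash
# Sketch: compare total RAM with a rough requirement before launching STAR.
# total_ram_gb and enough_ram_for are hypothetical helper names.

# Print total RAM in whole GiB (rounded down), read from a meminfo-style file.
# Defaults to /proc/meminfo, which reports MemTotal in kB.
total_ram_gb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "$meminfo"
}

# Succeed (exit status 0) if at least $1 GiB of RAM is present.
enough_ram_for() {
    local needed_gb="$1"
    local meminfo="${2:-/proc/meminfo}"
    [ "$(total_ram_gb "$meminfo")" -ge "$needed_gb" ]
}

# 32 is the rule-of-thumb figure for a human-sized index, not a hard limit.
if enough_ram_for 32; then
    echo "RAM looks sufficient for a human-sized STAR index"
else
    echo "Only $(total_ram_gb) GiB of RAM: consider a smaller genome or another aligner"
fi
```

Because the reported total is rounded down, a machine sold as "16 GB" will typically print 15.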


Oh, I see. I forgot about that part of STAR. I only have 16 GB of RAM and need to align to the mouse genome, which I'm assuming will also be quite memory heavy (though not as much as the human genome). Thank you for your help!

5 months ago
Gordon Smyth ★ 3.4k

As far as I know, STAR doesn't run under Windows 10.

Would you consider another aligner? Rsubread has been compared favorably to STAR in performance and runs very nicely under Windows:

https://academic.oup.com/nar/article/47/8/e47/5345150

Rsubread doesn't need any Unix subsystem or any third-party software except for R. You just install it as an R package. It works the same under Unix, Windows 10 or Mac. It has relatively low memory demands, and there are options to split up the index if you need to reduce memory usage further. I find Rsubread as fast on my Windows 10 laptop as on my institution's Linux systems, unless I start using a large number of Linux cores simultaneously.

5 months ago

Use the Windows Subsystem for Linux (WSL) and treat it like a typical Linux install. If you're doing any bioinformatics on a Windows machine, you'll want to get acquainted with it.
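
Once WSL is set up, it can help to know whether a shell is actually running inside it and whether the working directory sits on a Windows drive or on the Linux filesystem. The sketch below assumes the usual WSL behaviour (the kernel version string contains "Microsoft", and Windows drives are mounted under `/mnt/<drive letter>`, which is the default but can be changed); `is_wsl` and `on_windows_mount` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: detect WSL and whether a path is on a Windows drive.
# is_wsl and on_windows_mount are hypothetical helper names.

# Succeed if the kernel version string mentions Microsoft (any letter case).
is_wsl() {
    local version_file="${1:-/proc/version}"
    grep -qi 'microsoft' "$version_file" 2>/dev/null
}

# Succeed if the path is /mnt/<single letter> or below it.
# Assumes the default WSL mount point for Windows drives.
on_windows_mount() {
    case "$1" in
        /mnt/[a-z]|/mnt/[a-z]/*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_wsl; then
    if on_windows_mount "$PWD"; then
        echo "WSL, working on a Windows drive ($PWD)"
    else
        echo "WSL, working on the Linux filesystem ($PWD)"
    fi
else
    echo "Not running under WSL"
fi
```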


STAR cannot work with the NTFS filesystem!

So the data should be on an ext4 partition in WSL2.

But mounting an ext4 partition in WSL2 is not easy so far: you need to be on Windows 10 Build 20211 or higher to access this feature. You can join the Windows Insider Program to get the latest preview builds.

Updated: From the STAR source code:

    "Exiting because of FATAL ERROR: could not create FIFO file " + tmpFifo + "\n"
    "SOLUTION: check the if run directory supports FIFO files.\n"
    "If run partition does not support FIFO (e.g. Windows partitions FAT, NTFS), "
    "re-run on a Linux partition, or point --outTmpDir to a Linux partition.\n"
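
Whether a given directory supports FIFO files can be probed directly by trying to create one with `mkfifo`. This is a minimal sketch, not something taken from STAR itself; `supports_fifo` is a hypothetical helper name, and the probe file name is arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: probe whether a directory supports FIFO (named pipe) files.
# supports_fifo is a hypothetical helper name.

# Succeed if a named pipe can be created in the given directory.
# The probe file is removed again if it was created.
supports_fifo() {
    local dir="$1"
    local probe="$dir/.fifo_probe_$$"
    if mkfifo "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Check the directory given as the first argument, or the current one.
run_dir="${1:-$PWD}"
if supports_fifo "$run_dir"; then
    echo "$run_dir supports FIFO files"
else
    echo "$run_dir does not support FIFO files"
fi
```

If the probe fails for the run directory, the error text quoted above suggests either re-running on a Linux partition or pointing `--outTmpDir` at one.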


Actually, STAR can work on the NTFS filesystem. Their error message just isn't as helpful as it could be.

The FIFO file error on an NTFS filesystem only happens when you feed .fastq.gz files to STAR and use the "--readFilesCommand" flag to specify a way to unzip the .gz files. If you give STAR .fastq files (not gzipped) and omit the "--readFilesCommand" flag, STAR works just fine with WSL2 on an NTFS filesystem. For example, the following code works well on my Windows 10 laptop without mounting an ext partition.

For people new to STAR, please note that the arguments passed to STAR in the code below are probably not appropriate for your RNA-seq data and/or your computer. For one thing, this code assumes you have more than 40 GB of RAM and at least 15 CPU threads on your computer. It also assumes you are aligning paired-end reads, have already generated a STAR index for your reference genome in the ../STAR_Genome_Reference folder, and that your reads are 151 bp long (hence --sjdbOverhang 150, which is the read length minus 1). If you're new to this, please read the STAR manual entry for every flag you plan to use to make sure you understand what it does.

    # Create a directory to store the STAR alignment results
    mkdir -p ../STAR_Alignments

    # Unzip the fastq files manually so STAR does not need to create a FIFO file,
    # which fails when the run directory is on an NTFS filesystem under WSL2
    gunzip -k "./SampleA_1.fastq.gz"
    gunzip -k "./SampleA_2.fastq.gz"

    # Run the STAR alignment
    STAR \
        --genomeDir ../STAR_Genome_Reference \
        --runThreadN 15 \
        --outSAMtype BAM SortedByCoordinate \
        --readFilesIn "SampleA_1.fastq" "SampleA_2.fastq" \
        --sjdbOverhang 150 \
        --twopassMode Basic \
        --limitBAMsortRAM 40000000000 \
        --limitOutSJcollapsed 1000000 \
        --outFileNamePrefix "../STAR_Alignments/SampleA"

    # Delete the unzipped fastq files once the alignment is complete to save space
    rm "./SampleA_1.fastq"
    rm "./SampleA_2.fastq"
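
The `--sjdbOverhang 150` value above is the read length (151) minus 1. For other data, the value could be derived from the uncompressed FASTQ file, as in the sketch below. It only looks at the first read, so it assumes all reads have the same length (for reads of varying length, such as trimmed reads, the maximum read length minus 1 would be the value to use). `first_read_length` and `sjdb_overhang` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: derive a --sjdbOverhang value from an uncompressed FASTQ file.
# first_read_length and sjdb_overhang are hypothetical helper names.

# Print the length of the first read's sequence (line 2 of the FASTQ file).
# A trailing carriage return, if present, is not counted.
first_read_length() {
    awk 'NR == 2 { sub(/\r$/, ""); print length($0); exit }' "$1"
}

# Print the first read's length minus 1.
sjdb_overhang() {
    local len
    len="$(first_read_length "$1")"
    echo $(( len - 1 ))
}

# Example with the file name used above; skipped if the file is absent.
if [ -f "SampleA_1.fastq" ]; then
    echo "--sjdbOverhang $(sjdb_overhang "SampleA_1.fastq")"
fi
```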

Thank you for clarifying this point.


I was not aware of that, great point.


That is very intriguing! I did not know that existed. Even if I can't get around the RAM problem, I am going to set that up and try to use it!


16 GB might be enough for mouse, but yeah, WSL works with 98% of things now, the major exception being containers like Docker/Singularity.
