STAR Intron Motif Script Gives Segmentation fault Error
7 months ago
Y • 0

I have the following inputs:

# Define input directory containing FASTQ files
Input_directory="/path/to/fastq/folder"

# Define output directory for STAR output files
Output_directory="/path/to/output/directory"

# Define paths to reference files
Annotation_GTF="/path/to/Zebra/fish/GRCz11.110.chr.gtf"
Genome_FASTA="/path/to/soft/masked/Zebra/fish/primary_assembly.fa"
Reference="/path/to/soft/masked/STAR/created/reference/only/for/use/with/STAR"

# Define the number of threads to use
num_threads=4

To this script:

# Loop through each pair of paired FASTQ files in the input directory and subdirectories
for forward_file in "${Input_directory}"/*_R1.fq; do
    # Extract the file name without extension
    file_name=$(basename "${forward_file}" _R1.fq)

    # Extract the sample name from the file name
    sample_name="${file_name/_R1/}"

    # Path to the corresponding reverse FASTQ file
    reverse_file="${forward_file/_R1/_R2}"

    echo "Forward File: ${forward_file}"
    echo "Reverse File: ${reverse_file}"
    echo "Output Directory: ${Output_directory}"

    # Create a unique temporary directory for this sample
    TMPDIR="${Output_directory}/${file_name}___STAR_temporary_directory"

    echo "The temporary directory is: ${TMPDIR}"

    # Change working directory to the output directory
    # (the variable name is case-sensitive: Output_directory, not Output_Directory)
    cd "${Output_directory}"

    echo "Made temporary directory: ${TMPDIR}"

    # Run STAR alignment
    STAR \
        --genomeDir "${Reference}" \
        --readFilesIn "${forward_file}" "${reverse_file}" \
        --outFileNamePrefix "${Output_directory}/${sample_name}_Soft___" \
        --runThreadN "${num_threads}" \
        --genomeLoad NoSharedMemory \
        --outSAMtype BAM SortedByCoordinate \
        --outTmpDir "${TMPDIR}" \
        --outStd Log \
        --outSAMunmapped Within \
        --outSAMattributes Standard \
        --outSAMstrandField intronMotif \
        --sjdbGTFfile "${Annotation_GTF}" \
        --genomeFastaFiles "${Genome_FASTA}"

done

But I get the error:

line 65: 37898 Segmentation fault      (core dumped) STAR --genomeDir "${Reference}" --readFilesIn "${forward_file}" "${reverse_file}" --outFileNamePrefix "${Output_directory}/${sample_name}_Soft" --runThreadN "${num_threads}" --genomeLoad NoSharedMemory --outSAMtype BAM SortedByCoordinate --outTmpDir "${TMPDIR}" --outStd Log --outSAMunmapped Within --outSAMattributes Standard --outSAMstrandField intronMotif --sjdbGTFfile "${Annotation_GTF}" --genomeFastaFiles "${Genome_FASTA}"

Line 65 is the start of the for loop, but I cannot find any error there.

This happens with both STAR 2.7.11a and STAR 2.7.10b. Why is this occurring?

STAR Linux • 1.1k views

When STAR crashes, it is usually due to its excessive memory demands. First, replace --outSAMtype BAM SortedByCoordinate with --outSAMtype BAM Unsorted so that STAR writes an unsorted file. This will massively decrease memory usage. If you need a sorted BAM, run samtools sort afterwards. How much memory is available?

I give it 90 GB of vmem. That should be enough for fewer than 15 BAMs.

If you need a sorted BAM (which in 99% of cases you will not), always use --limitBAMsortRAM and set it to less than your vmem (say, 75G in your case). But like ATPoint says, it's best to output an unsorted BAM and then use samtools sort followed by whatever. Even if you need wiggle/bedgraph files, use alignReads mode with an unsorted BAM, sort and index the BAM, then use inputAlignmentsFromBAM mode with the sorted and indexed BAM file.
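
A minimal sketch of the unsorted-then-sort route described above, in case it helps. The function names, the DRY_RUN switch, the sorted file name, and the 2G per-thread sort memory are my own illustrative choices, not anything from the original script; the STAR and samtools flags are the standard ones.

```shell
#!/usr/bin/env bash
# Sketch: align to an unsorted BAM with STAR, then sort and index with samtools.
# DRY_RUN, run() and align_then_sort() are hypothetical helpers for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: align_then_sort <genome_dir> <R1.fq> <R2.fq> <out_prefix> <threads>
align_then_sort() {
    local genome_dir="$1" r1="$2" r2="$3" prefix="$4" threads="$5"

    # With "BAM Unsorted", STAR writes <prefix>Aligned.out.bam
    run STAR \
        --genomeDir "${genome_dir}" \
        --readFilesIn "${r1}" "${r2}" \
        --outFileNamePrefix "${prefix}" \
        --runThreadN "${threads}" \
        --outSAMtype BAM Unsorted \
        --outSAMunmapped Within \
        --outSAMstrandField intronMotif || return 1

    # -m is memory per sorting thread; 2G is an arbitrary example value
    run samtools sort -@ "${threads}" -m 2G \
        -o "${prefix}Aligned.sortedByCoord.bam" \
        "${prefix}Aligned.out.bam" || return 1

    run samtools index "${prefix}Aligned.sortedByCoord.bam"
}
```

Calling it with DRY_RUN=1 only prints the three commands, which is a cheap way to check the paths before submitting the job.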

I did what Ram suggested and added --limitBAMsortRAM 75000000000, but then I get: EXITING because of fatal ERROR: could not make temporary directory: /path/to/temporary/directory/ SOLUTION: (i) please check the path and writing permissions

Are you starting these jobs in parallel with that simple for loop? Looks like there are 15 samples? Each of those jobs will try to use all the memory you have, which would be why you are running out of RAM. Sorting the files afterwards would be an efficient operation, as has been suggested.

I don't believe that they are running in parallel. It's just a for loop.

/path/to/temporary/directory/

Is that really the path or did you camouflage it?

I work on an HPC and cannot share paths, which is why the real path is not written there.

Then say this upfront...

Anyway, the error is clear: make sure you can write to this location and have mkdir permissions.
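
A quick pre-flight check along these lines can be run before STAR. This is only a sketch: check_star_tmpdir is a made-up helper name, and the "must not already exist" test rests on my understanding that STAR wants to create the --outTmpDir directory itself and stops if it is already there, so verify that against the manual for your version.

```shell
#!/usr/bin/env bash
# Sketch: check that STAR should be able to create its temporary directory.
# check_star_tmpdir is a hypothetical helper, not part of STAR.

# Usage: check_star_tmpdir <tmp_dir>
# Returns 0 if the directory looks creatable, 1 otherwise.
check_star_tmpdir() {
    local tmp_dir="${1%/}"   # drop a trailing slash, if any
    local parent
    parent=$(dirname "${tmp_dir}")

    # Assumption: STAR creates this directory itself and fails if it exists
    if [[ -e "${tmp_dir}" ]]; then
        echo "ERROR: ${tmp_dir} already exists; remove it or pick another name" >&2
        return 1
    fi
    # The parent must exist ...
    if [[ ! -d "${parent}" ]]; then
        echo "ERROR: parent directory ${parent} does not exist" >&2
        return 1
    fi
    # ... and be writable and searchable so mkdir can succeed
    if [[ ! -w "${parent}" || ! -x "${parent}" ]]; then
        echo "ERROR: no write/search permission on ${parent}" >&2
        return 1
    fi
    return 0
}
```

Inside the loop this could be used as `check_star_tmpdir "${TMPDIR}" || continue`, so a leftover directory from a crashed run is reported instead of surfacing as a STAR error.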

For the run that failed with "EXITING because of fatal ERROR: could not make temporary directory", I had used 2.7.11a. I know some STAR versions have difficulty creating the temporary directory, so I am doing a run with version 2.7.10b to see.

I still get the error:

line 65: 29089 Segmentation fault

I am not sure why this is occurring.

You're running out of memory. I've never done this, but try using a better --genomeLoad option. See this post: STAR genomeLoad issue

You will need to figure out how to implement the load-genome-before-looping-over-samples step described by Devon.
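
A rough, untested sketch of what that could look like, using STAR's LoadAndExit, LoadAndKeep and Remove values for --genomeLoad. The function names and the DRY_RUN switch are made up for illustration. Two caveats, as far as I know: shared-memory loading does not combine with passing --sjdbGTFfile at the mapping step, so the annotation would need to go into the index at genomeGenerate time, and some clusters restrict shared memory. I also cannot promise this fixes the segfault.

```shell
#!/usr/bin/env bash
# Sketch: load the genome once into shared memory, map every sample, unload.
# DRY_RUN, run() and map_with_shared_genome() are hypothetical helpers.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: map_with_shared_genome <genome_dir> <threads> <out_dir> <R1.fq>...
map_with_shared_genome() {
    local genome_dir="$1" threads="$2" out_dir="$3"
    shift 3
    local r1 r2 sample status=0

    # Load the genome into shared memory once, then exit
    run STAR --genomeDir "${genome_dir}" --genomeLoad LoadAndExit || return 1

    for r1 in "$@"; do
        sample=$(basename "${r1}" _R1.fq)
        r2="${r1%_R1.fq}_R2.fq"
        # Each run attaches to the genome that is already loaded
        run STAR \
            --genomeDir "${genome_dir}" \
            --genomeLoad LoadAndKeep \
            --readFilesIn "${r1}" "${r2}" \
            --outFileNamePrefix "${out_dir}/${sample}_" \
            --runThreadN "${threads}" \
            --outSAMtype BAM Unsorted \
            --outSAMstrandField intronMotif || { status=1; break; }
    done

    # Release the shared-memory copy even if a sample failed
    run STAR --genomeDir "${genome_dir}" --genomeLoad Remove
    return "${status}"
}
```

For example, `DRY_RUN=1 map_with_shared_genome "${Reference}" 4 "${Output_directory}" "${Input_directory}"/*_R1.fq` prints the load, per-sample, and remove commands without running anything.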

I will try to figure it out on my own, given what you all have mentioned. Thank you for your time.
