Question: STAR-RNA Segmentation Fault
Poetic_Premium6 (Kiel, Germany) wrote, 11 months ago:

Hey all, I've been stalking this website for a few months for help (with success), and now I've run into my first problem that I can't solve :O. I am a wet-lab biologist by training, so I'm still a noob at bioinformatics!

I am using STAR for E. coli reads, and lately I have been getting a "Segmentation fault" error during the mapping step. I have about 25 GB of free RAM, and read/write/execute permissions are all set for my user. I am using Debian and have tried both STAR 2.5 and 2.6. I do not think anything is wrong with my command:

STAR --runThreadN 3 --genomeDir /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/ --sjdbGTFfile /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_annotation.gtf --sjdbOverhang 100 --readFilesIn /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim1.fastq /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim2.fastq --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate Unsorted

Does anyone have a fix or some suggestions so I may continue troubleshooting?

Thanks for your time @Biostars, ~Jonathan J.S

Written 11 months ago by Poetic_Premium6 (modified 11 months ago by Jeffin Rockey)

Could you copy/paste the relevant log file, please?

Do not use --readFilesCommand zcat if your files are not compressed.

Did your index creation finish successfully?

What command did you use to create your index?

Written 11 months ago by Bastien Hervé (modified 11 months ago)
Bastien Hervé (Limoges, CBRS, France) wrote, 11 months ago:

Seems like you mixed up the two commands.

First, create the index:

STAR --runThreadN 3 --runMode genomeGenerate --genomeDir /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/ --genomeFastaFiles /path/to/reference/genome/Escherchia_coli_UTI89_reference_genome.fasta --sjdbGTFfile /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_annotation.gtf --sjdbOverhang 100

Then proceed to the alignment:

STAR --runThreadN 3 --runMode alignReads --genomeDir /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/ --readFilesIn /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim1.fastq /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim2.fastq --outSAMtype BAM SortedByCoordinate Unsorted
Written 11 months ago by Bastien Hervé (modified 11 months ago)

Hey Bastien, thank you for the clarification on splitting the commands. This way is cleaner for sure! While the genome generation has not been a problem, I am still getting the "segmentation fault" error during the alignment step.

Additionally, I think that --readFilesCommand zcat must follow the RNA-seq FASTQ files? Regardless, do you have any further suggestions? Thanks for your time!

Written 11 months ago by Poetic_Premium6

Could you please run the following command and copy/paste the result below:

ls -alt /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/

From the STAR manual: if the read files are compressed, use the --readFilesCommand UncompressionCommand option, where UncompressionCommand is the un-compression command that takes the file name as input parameter and sends the uncompressed output to stdout. For example, for gzipped files (*.gz), use --readFilesCommand zcat or --readFilesCommand gunzip -c. For bzip2-compressed files, use --readFilesCommand bunzip2 -c.

As you have UTI89_PostTrim1.fastq and not UTI89_PostTrim1.fastq.gz, you do not need --readFilesCommand

Written 11 months ago by Bastien Hervé (modified 11 months ago)
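Since the file extension and the actual file content can disagree, one way to decide whether --readFilesCommand is needed is to look at the first bytes of the read file instead of its name. The following is only a sketch: the function name read_files_command is made up for illustration, and it assumes that od and tr are available, as they are on a typical Debian system. It checks for the gzip magic bytes (1f 8b) and the bzip2 magic bytes ("BZh", 42 5a 68) and prints the matching option from the manual excerpt above, or nothing for plain text.

```shell
# Hypothetical helper: print the --readFilesCommand option that matches the
# actual content of a read file, or print nothing for an uncompressed file.
read_files_command() {
    # First three bytes of the file as a lowercase hex string, e.g. "1f8b08".
    magic=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
    case "$magic" in
        1f8b*)  printf '%s\n' '--readFilesCommand zcat' ;;        # gzip
        425a68) printf '%s\n' '--readFilesCommand bunzip2 -c' ;;  # bzip2
    esac
}
```

The output can be pasted after --readFilesIn in the alignment command; an empty result means the option should be left out.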

Good point, as it is redundant! Sure, here are the results:

-rw-rw-rw- 1 jonathan7 jonathan7 19483 Jun 4 09:53 Log.out

-rw-rw-rw- 1 jonathan7 jonathan7 236 Jun 4 09:53 Log.progress.out

drwxrwxrwx 10 jonathan7 jonathan7 4096 Jun 4 09:53 .

-rw-rw-rw- 1 jonathan7 jonathan7 0 Jun 4 09:53 Aligned.out.bam

-rw-rw-rw- 1 jonathan7 jonathan7 0 Jun 4 09:53 Aligned.sortedByCoord.out.bam

drwx------ 3 root root 4096 Jun 4 09:53 _STARtmp

-rw-rw-rw- 1 jonathan7 jonathan7 1565873619 Jun 4 09:46 SAindex

-rw-rw-rw- 1 jonathan7 jonathan7 42734764 Jun 4 09:46 SA

-rw-rw-rw- 1 jonathan7 jonathan7 2774 Jun 4 09:46 exonGeTrInfo.tab

-rw-rw-rw- 1 jonathan7 jonathan7 864 Jun 4 09:46 exonInfo.tab

-rw-rw-rw- 1 jonathan7 jonathan7 1031 Jun 4 09:46 geneInfo.tab

-rw-rw-rw- 1 jonathan7 jonathan7 5505024 Jun 4 09:46 Genome

-rw-rw-rw- 1 jonathan7 jonathan7 870 Jun 4 09:46 genomeParameters.txt

-rw-rw-rw- 1 jonathan7 jonathan7 6 Jun 4 09:46 sjdbInfo.txt

-rw-rw-rw- 1 jonathan7 jonathan7 0 Jun 4 09:46 sjdbList.fromGTF.out.tab

-rw-rw-rw- 1 jonathan7 jonathan7 0 Jun 4 09:46 sjdbList.out.tab

-rw-rw-rw- 1 jonathan7 jonathan7 4268 Jun 4 09:46 transcriptInfo.tab

-rw-rw-rw- 1 jonathan7 jonathan7 15 Jun 4 09:46 chrLength.txt

-rw-rw-rw- 1 jonathan7 jonathan7 39 Jun 4 09:46 chrNameLength.txt

-rw-rw-rw- 1 jonathan7 jonathan7 24 Jun 4 09:46 chrName.txt

-rw-rw-rw- 1 jonathan7 jonathan7 18 Jun 4 09:46 chrStart.txt

drwx------ 2 root root 4096 Jun 4 09:37 _STARgenome

drwxrwxrwx 5 jonathan7 jonathan7 4096 Jun 4 04:52 ..

-rw-rw-r-- 1 jonathan7 jonathan7 134 Jun 1 04:08 genes.fpkm_tracking

-rw-rw-r-- 1 jonathan7 jonathan7 134 Jun 1 04:08 isoforms.fpkm_tracking

-rw-rw-r-- 1 jonathan7 jonathan7 0 Jun 1 04:08 skipped.gtf

-rw-rw-r-- 1 jonathan7 jonathan7 0 Jun 1 04:08 transcripts.gtf

-rw-rw-rw- 1 jonathan7 jonathan7 206422016 May 31 09:34 UTI89_PostTrim2.fastq.gz

-rw-rw-rw- 1 jonathan7 jonathan7 0 May 24 04:59 G15489_htseq_counts_pos_sorted.out

-rw-rw-rw- 1 jonathan7 jonathan7 32 May 24 04:59 accepted_hits_uniq_sorted.bam.bai

-rw-rw-rw- 1 jonathan7 jonathan7 358 May 24 04:59 accepted_hits_uniq_sorted.bam

-rw-rw-rw- 1 jonathan7 jonathan7 348 May 24 04:59 accepted_hits_uniq.bam

-rw-rw-rw- 1 jonathan7 jonathan7 32 May 24 04:59 Aligned.bam.bai

-rw-rw-rw- 1 jonathan7 jonathan7 1803 May 23 07:47 Log.final.out

-rw-rw-rw- 1 jonathan7 jonathan7 0 May 23 07:47 SJ.out.tab

-rw-rw-rw- 1 jonathan7 jonathan7 348 May 23 07:31 Aligned.bam

-rw-rw-rw- 1 jonathan7 jonathan7 777 May 22 10:10 Aligned.out.sam

-rw-rw-rw- 1 jonathan7 jonathan7 626458 May 18 05:14 UTI89_annotation.gtf

drwxrwxrwx 2 jonathan7 jonathan7 4096 May 18 04:54 genomeDir

-rw-rw-rw- 1 jonathan7 jonathan7 5244844 May 18 04:30 UTI_genomic.fasta

-rw-rw-rw- 1 jonathan7 jonathan7 311377111 Apr 30 05:19 UTI89_PostTrim1.fastq.gz

-rw-rw-rw- 1 jonathan7 jonathan7 1002466424 Apr 30 05:19 UTI89_PostTrim1.fastq 

-rw-rw-rw- 1 jonathan7 jonathan7 1285209680 Apr 30 04:57 UTI89_PostTrim2.fastq 

-rw-rw-rw- 1 jonathan7 jonathan7 3893 Apr 24 08:35 index.html

Written 11 months ago by Poetic_Premium6 (modified 11 months ago)

Thanks. And what do you have in the log files?

Written 11 months ago by Bastien Hervé

Oh lovely, it is just a fatal input error now, based on a read input, rather than a "Segmentation Fault". Anyway, the Log.out file exceeds the character limit; which specific portions are needed for diagnostics?

Written 11 months ago by Poetic_Premium6 (modified 11 months ago)
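When Log.out is too long to paste, a short excerpt is usually enough for a first look. The sketch below is one possible way to produce it, not an official STAR tool: the function name star_log_excerpt and the choice of keywords (error, exiting, warning) and of 20 trailing lines are assumptions made for illustration.

```shell
# Hypothetical helper: print the parts of a STAR log that tend to matter
# for diagnosis, so they fit within a forum post.
star_log_excerpt() {
    # Numbered lines that mention an error, an exit message, or a warning.
    grep -n -i -E 'error|exiting|warning' "$1"
    echo '--- last 20 lines ---'
    # The end of the log shows what STAR was doing when it stopped.
    tail -n 20 "$1"
}
```

It would be called as star_log_excerpt Log.out from the directory that holds the log.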

Oh lovely, it is just a fatal input error now based on a read input rather than a "Segmentation Fault"

You are tilting me, OP :). You have fastq.gz, not fastq. Look:

-rw-rw-rw- 1 jonathan7 jonathan7 311377111 Apr 30 05:19 UTI89_PostTrim1.fastq.gz

-rw-rw-rw- 1 jonathan7 jonathan7 206422016 May 31 09:34 UTI89_PostTrim2.fastq.gz

Use this:

STAR --runThreadN 3 --runMode alignReads --genomeDir /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/ --readFilesIn /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim1.fastq.gz /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim2.fastq.gz --readFilesCommand gunzip -c --outSAMtype BAM SortedByCoordinate Unsorted

Use --readFilesCommand gunzip -c or --readFilesCommand zcat, as you wish; both seem to work.

Written 11 months ago by Bastien Hervé (modified 11 months ago)

Haha sorry for the tilt, as I mentioned in the header... I am a noob :P .

Besides, I kept both the gzipped and the uncompressed files in the directory to see whether it would make a difference in memory use, and I must have confused them in the original post. But I tried both options from your new command with the .gz files, and they still lead back to a segmentation fault:

STAR --runThreadN 3 --runMode alignReads --genomeDir /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/ --readFilesIn /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim1.fastq.gz /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim2.fastq.gz --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate Unsorted
Written 11 months ago by Poetic_Premium6 (modified 11 months ago)

I got this error once; the problem was with a gzipped file. So, one last try.

Remove the --readFilesCommand zcat parameter.

Then decompress your UTI89_PostTrim1.fastq.gz and UTI89_PostTrim2.fastq.gz.

Then retry with something like this:

STAR --runThreadN 3 --runMode alignReads --genomeDir /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/ --readFilesIn /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim1.fastq /home/jonathan7/Documents/Comp_Micro/STAR/Escherchia_coli_UTI89/UTI89_PostTrim2.fastq --outSAMtype BAM SortedByCoordinate Unsorted
Written 11 months ago by Bastien Hervé (modified 11 months ago)

Bastien, I tried this change too, without any luck. I think that I will have to reach out to the software developer, Alex, for help.

Thanks again for all of your help and time; it was greatly appreciated!!!

Written 11 months ago by Poetic_Premium6

Maybe your fastq files are corrupt.

Good luck with your investigation, and don't forget to post the answer below if you find one.

Written 11 months ago by Bastien Hervé (modified 11 months ago)
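On the suspicion that the FASTQ files are corrupt, a quick check is to test the gzip stream, confirm that the line count is a multiple of four, and confirm that both mates hold the same number of reads. The directory listing earlier in the thread shows the two mates with quite different sizes and modification dates, so this may be worth a look. The sketch below assumes standard four-line FASTQ records; the function names fastq_read_count and check_pair are made up for illustration.

```shell
# Hypothetical helper: print the number of reads in a FASTQ file (plain or
# .gz). Fails if the gzip stream is damaged or the file ends mid-record.
fastq_read_count() {
    case "$1" in
        *.gz)
            gzip -t "$1" || return 1              # verify the gzip stream
            lines=$(gzip -dc "$1" | wc -l)
            ;;
        *)
            lines=$(wc -l < "$1")
            ;;
    esac
    lines=$(printf '%s' "$lines" | tr -d ' ')     # strip padding from wc
    [ $((lines % 4)) -eq 0 ] || return 1          # 4 lines per record assumed
    echo $((lines / 4))
}

# Hypothetical helper: succeed only if both mates are intact and contain
# the same number of reads.
check_pair() {
    n1=$(fastq_read_count "$1") || return 1
    n2=$(fastq_read_count "$2") || return 1
    [ "$n1" -eq "$n2" ]
}
```

For example, check_pair UTI89_PostTrim1.fastq.gz UTI89_PostTrim2.fastq.gz && echo OK would print OK only when both checks pass. This does not validate read names or quality strings, so a passing result is a necessary condition, not proof that the files are sound.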

Thanks Bastien, I assumed that too and re-did QC to check. I'll post if anything comes up. Best wishes!

Written 11 months ago by Poetic_Premium6
Jeffin Rockey (Karimannoor) wrote, 11 months ago:

Most likely the small genome is the problem. I have run into seg-fault issues when dealing with very small genomes. Pasted below is a snippet from section 2.2.5 of the STAR 2.4 manual. Please see whether setting the parameter as advised solves the issue.

Very small genome.

For small genomes, the parameter --genomeSAindexNbases needs to be scaled down, with a typical value of min(14, log2(GenomeLength)/2 - 1). For example, for 1 megaBase genome, this is equal to 9, for 100 kiloBase genome, this is equal to 7.

Written 11 months ago by Jeffin Rockey (modified 11 months ago)
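The formula quoted from the manual can be evaluated directly from the reference FASTA. The sketch below is only illustrative: the function names are made up, and rounding to the nearest integer is an assumption, chosen because it reproduces the two examples in the manual excerpt (9 for 1 megabase, 7 for 100 kilobases). Note that --genomeSAindexNbases is a genome generation parameter, so the index has to be rebuilt with --runMode genomeGenerate for a new value to take effect.

```shell
# Hypothetical helper: total number of bases in a FASTA file, ignoring
# header lines and line breaks.
genome_length() {
    grep -v '^>' "$1" | tr -d '\n\r' | wc -c | tr -d ' '
}

# Hypothetical helper: suggested --genomeSAindexNbases for a genome length
# in bases, i.e. min(14, log2(length)/2 - 1), rounded to the nearest
# integer (the rounding rule is an assumption, see above).
sa_index_nbases() {
    awk -v len="$1" 'BEGIN {
        v = log(len) / log(2) / 2 - 1
        n = int(v + 0.5)
        if (n > 14) n = 14
        print n
    }'
}
```

With these definitions, sa_index_nbases "$(genome_length UTI_genomic.fasta)" gives the value to pass to the genome generation step; for a genome of about 5 megabases the result is 10.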

Hey Jeffin, I just tried this without any success in fixing the segmentation fault. But surely, you are correct...this would be great to keep in my code :P

Best, ~Jonathan J.S

Written 11 months ago by Poetic_Premium6