Hello All, I've run STAR on a batch of RNA-seq data. I did not have a .gtf file to run with STAR, so I ran the "two pass" mode, which output a .SAM file. Everything worked as expected.
I then used samtools to convert the .SAM to a sorted .BAM.
[tau2@cbsulm04 workdir]$ STAR --runThreadN 64 --genomeDir genome_indicies --readFilesIn All_Ldel_R1s.fq,All_Ldel_R2s.fq --twopassMode Basic —-outSAMtype BAM SortedByCoordinate --outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical
STAR --runThreadN 64 --genomeDir genome_indicies --readFilesIn All_Ldel_R1s.fq,All_Ldel_R2s.fq --twopassMode Basic —-outSAMtype BAM SortedByCoordinate --outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical
STAR version: 2.7.10b_alpha_220111 compiled: 2023-01-11T10:08:43-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
Dec 20 11:32:41 ..... started STAR run
Dec 20 11:32:43 ..... loading genome
Dec 20 11:33:37 ..... started 1st pass mapping
Dec 20 12:29:40 ..... finished 1st pass mapping
Dec 20 12:29:41 ..... inserting junctions into the genome indices
Dec 20 12:33:42 ..... started mapping
Dec 20 13:45:43 ..... finished mapping
Dec 20 13:45:46 ..... finished successfully
Next, I ran samtools to convert Aligned.out.sam to Aligned.sortedByCoord.out.bam. This ran successfully.
$ samtools view -@ 8 -o Ldel_aligned_out.bam Aligned.out.sam
Now I'm trying to convert the .BAM to a .GTF using StringTie.
$ stringtie Ldel_aligned_out.bam -p 8 -o Ldel_output.gtf
However, I get the following error for a single entry of the data in my standard output.
Error: the input alignment file is not sorted!
read LH00309:324:22HLW2LT4:2:1101:41380:1028 (start 24386289) found at position 24386289 on CM082554.1 when prev_pos=163416112
The terminal app (Mac) doesn't indicate that the process was automatically terminated.
My questions are: 1) will the process continue to run despite the error, and if so, 2) will I get the expected results? Also, why would there be a sorting error if samtools completed the .SAM to .BAM conversion without an error message?
Hopefully I formatted this message properly; apologies in advance if not. I do this rarely.
Best,
Todd
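On the sorting error: StringTie expects alignments ordered by reference and position, and the message reports a read at 24386289 on CM082554.1 arriving after position 163416112. `samtools view` only converts between formats and does not reorder records, which would explain why the conversion finished without complaint. As a rough illustration of the kind of ordering check involved (not StringTie's actual code), the sketch below scans SAM text and flags the first read whose position goes backwards within the same reference. The function name `check_coord_sorted` is made up for this example, and the check is simplified: it does not compare reference order against the @SQ header.

```shell
# check_coord_sorted (hypothetical helper): read SAM text on stdin and
# exit 0 if positions never decrease within a run of the same reference,
# or exit 1 after printing the first offending read.
# Simplification: reference order is NOT checked against the @SQ header.
check_coord_sorted() {
    awk -F'\t' '
        /^@/ { next }    # skip SAM header lines
        {
            # Column 3 is the reference name, column 4 the 1-based position.
            if ($3 == prev_ref && $4 + 0 < prev_pos) {
                printf "unsorted: %s at %s:%d, previous position %d\n", $1, $3, $4, prev_pos
                bad = 1
                exit     # jump to END, which sets the exit status
            }
            prev_ref = $3
            prev_pos = $4 + 0
        }
        END { exit bad + 0 }
    '
}

# Usage (needs samtools on the PATH, so it is left commented out):
# samtools view -h Ldel_aligned_out.bam | check_coord_sorted
```

Run on the BAM from the `samtools view` step, a check like this should stop at the first out-of-order read, much as StringTie did.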
Thanks Pierre, I have the .BAM file from the sort command I mistakenly used. Is there any reason that I can't just sort that .BAM file and avoid having to go back to the .SAM? And under what circumstances is it necessary to index?
Please show any error message.
Indexing is needed for random access to genomic regions in the BAM.
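On the follow-up question: `samtools sort` accepts BAM input, so going back to the .SAM should not be necessary. A minimal sketch of the remaining steps follows, reusing the file names and thread counts from the original post. The function name `sort_index_assemble` and the output name `Ldel_aligned_sorted.bam` are placeholders, not anything from the thread. The indexing step is included only to show where it would go; as far as I know StringTie itself does not need the index.

```shell
# sort_index_assemble (hypothetical helper): coordinate-sort an existing
# BAM, index it, then run StringTie on the sorted file.
sort_index_assemble() {
    in_bam=$1       # unsorted BAM, e.g. Ldel_aligned_out.bam
    sorted_bam=$2   # path for the coordinate-sorted BAM (placeholder name)
    out_gtf=$3      # path for the StringTie GTF

    # samtools sort orders by coordinate by default; -@ adds worker threads.
    samtools sort -@ 8 -o "$sorted_bam" "$in_bam" || return 1

    # Optional here: the index enables random access to regions, for
    # example in a genome browser or with "samtools view file.bam region".
    samtools index "$sorted_bam" || return 1

    # Same StringTie invocation as in the post, now on the sorted BAM.
    stringtie "$sorted_bam" -p 8 -o "$out_gtf"
}

# Usage (commented out so that sourcing this file has no side effects):
# sort_index_assemble Ldel_aligned_out.bam Ldel_aligned_sorted.bam Ldel_output.gtf
```

Indexing has to come after sorting, since `samtools index` requires a coordinate-sorted file.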