STAR --readFilesIn error when using Unmapped.out.mate1/2 files
11 months ago
Hello, I'm very new to STAR. I am trying to run STAR on the output of its --outReadsUnmapped Fastx option (the Unmapped.out.mate1/2 files). Although they are FASTQ files, STAR keeps giving me this error:

EXITING because of fatal input ERROR: could not open readFilesIn=


And this is my command

STAR  --runThreadN 12      \
--genomeSAindexNbases 10  \
--genomeDir ${PROJECT_DIR}ref_bacteria/ \
--readFilesIn ${PROJECT_DIR}align/Sample_1/Sample_1Unmapped.out.mate1 \
--outFileNamePrefix ${PROJECT_DIR}align/Sample_1/Sample_1 \
--outFilterMismatchNoverLmax 0.02 \
--outSAMtype BAM SortedByCoordinate \
--outSAMattributes All \
-quantMode GeneCounts \
--outSAMunmapped Within \
--sjdbGTFfile ${PROJECT_DIR}ref_bacteria/genes.gtf \
--sjdbOverhang 100;


done

Can you tell me why I cannot use these Unmapped.out.mate1/2 files?

Thanks.

star RNA-Seq
There is no --outReadsUnmapped Fastx in your command line

--readFilesIn ${PROJECT_DIR}align/Sample_1/Sample_1Unmapped.out.mate1 should be a path to a FASTQ file, like: --readFilesIn ${PROJECT_DIR}align/Sample_1/Sample_1.fastq

I think they are using the --outReadsUnmapped Fastx output as the input for this command.
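For anyone following along, the two-run workflow being described can be sketched roughly as below. This is only an illustration under assumptions: the ref_host/ directory, the "_bacteria_" output prefix, and the function names are hypothetical and not from the original post; only ref_bacteria/ and the Sample_1 layout come from the command above.

```shell
# Sketch only: ref_host/ and the "_bacteria_" suffix are assumed names.
PROJECT_DIR="${PROJECT_DIR:-/path/to/project/}"   # expected to end in "/"
PREFIX="${PROJECT_DIR}align/Sample_1/Sample_1"

# Run 1: align to a first genome and write the unmapped reads out as FASTQ.
# $1 is the original FASTQ file for the sample.
first_pass() {
    STAR --runThreadN 12 \
        --genomeDir "${PROJECT_DIR}ref_host/" \
        --readFilesIn "$1" \
        --outReadsUnmapped Fastx \
        --outFileNamePrefix "${PREFIX}"
}

# Run 2: feed run 1's Unmapped.out.mate1 back in through --readFilesIn.
second_pass() {
    STAR --runThreadN 12 \
        --genomeDir "${PROJECT_DIR}ref_bacteria/" \
        --readFilesIn "${PREFIX}Unmapped.out.mate1" \
        --outFileNamePrefix "${PREFIX}_bacteria_"
}
```

Note the space after --readFilesIn and the quoted "${...}" expansions. For paired-end data the second run would need both the mate1 and mate2 files after --readFilesIn.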

Is the ${PROJECT_DIR} variable ending in / ? Can you head the Sample_1Unmapped.out.mate1 file?

Yes. ${PROJECT_DIR} starts and ends with /. And I can head that file, and it is certainly a FASTQ file.

It shows:

@A00718:115:HT7HLDSXX:4:1128:26549:26616 0:N:  00
GTCACCATGATGTCAGAGACAGGAATAACCTAAAATCCTCTGAGGGGTAGGTAATTCCAGACCTGGTGTTAAAAGGCCCCTCAGCAACCTTTTGTCATCAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00718:115:HT7HLDSXX:4:1128:3613:26631 0:N:  00
GTTCAGCACAAACACTCCCTTGTCCACAGCCACTAGCCCAACTCGCGCCCCCTGTTTAACTTCAATACTGAGTGTCGTTTGAAGCCCAGGTGCGAGATGTT
+
::FFFF:F,:,,FF,F,FFFF,F,FFFFFF:F,FF::,:,F:F:,,,F,,::F:,,F,,,,FF:FF,F:F:,F,FF,:,,:,:FFFF,,:,,,::FFF:,F
@A00718:115:HT7HLDSXX:4:1128:6198:26631 0:N:  00
ATCAAATACAAAGCTTTTTACAAAATTTTGAAGGCTGAACTCACTATGCACTAAGAGTTGTGCAAAGGGATTTACATATGTAATCTCAGTTAGTACTCAAA
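The manual check above (does the path resolve, and does head show FASTQ records?) could also be scripted. Here is a minimal sketch; looks_like_fastq is a hypothetical helper name, and it only inspects the first record, so it is a sanity check rather than a validator.

```shell
# Hypothetical helper: succeeds if the file is readable and its first
# four lines look like one FASTQ record ("@" header, then "+" on line 3).
looks_like_fastq() {
    [ -r "$1" ] || { echo "cannot read: $1" >&2; return 1; }
    head -n 4 "$1" | awk '
        NR == 1 && substr($0, 1, 1) != "@" { bad = 1 }
        NR == 3 && substr($0, 1, 1) != "+" { bad = 1 }
        END { exit (bad || NR < 4) ? 1 : 0 }'
}

# Example (path taken from the command in the question):
# looks_like_fastq "${PROJECT_DIR}align/Sample_1/Sample_1Unmapped.out.mate1" \
#     && echo "readable and FASTQ-like"
```

If this reports "cannot read", the problem is the path that the shell builds from ${PROJECT_DIR}, not the file format.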

benformatics is right. I used the --outReadsUnmapped Fastx output from another run as input.

Hello! I may be late to this topic, but I'm trying to do a very similar analysis and I'm quite desperate, because I don't really know how to.

Did you finally manage to analyze those Unmapped.out.mate1/2 files? Is there any way to convert them to .fastq format?

And, apart from that, which bacteria reference database did you use?

I'm in my first days of analyzing RNA-seq data and I just need to move on to analyzing those unmapped sequences.

Thank you so much in advance :)

Hello, the unmapped files from STAR are FASTQ files. If you do:

head Sample_1Unmapped.out.mate1


You will see they are FASTQ-formatted, but I don't know why STAR does not add the .fastq extension.

Anyway, you can just add the extension by renaming the file:

mv Sample_1Unmapped.out.mate1 Sample_1Unmapped.out.mate1.fastq
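If you have paired-end data or many samples, the same rename can be wrapped in a small function. This is just a sketch; add_fastq_ext is a hypothetical name, and it assumes the files follow the <prefix>Unmapped.out.mate1/2 pattern seen in this thread.

```shell
# Hypothetical helper: add a .fastq extension to STAR's unmapped-read files.
# $1 is the --outFileNamePrefix that was used for the STAR run.
add_fastq_ext() {
    for mate in mate1 mate2; do
        src="${1}Unmapped.out.${mate}"
        # mate2 is only written for paired-end runs, so skip missing files
        [ -e "$src" ] || continue
        mv -- "$src" "${src}.fastq"
    done
}

# Example: add_fastq_ext align/Sample_1/Sample_1
```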

Thank you so much!! I didn't expect it to be that easy... but indeed it worked :) Thank you!