Empty BAM file upon mapping with STAR
3.4 years ago
makwana.kd ▴ 50

I am using STAR to align RNA-seq reads to a reference genome index that I also created with STAR. I ran QC on every fastq file before starting the mapping jobs. The problem is that some of the fastq files produce empty BAM files, whereas others map without any problem. I am attaching the Log.final.out files. The first one is from a run that had no problem:

more Log.final.out2
Started job on |       May 14 16:55:48
Started mapping on |       May 14 16:59:56
Finished on |       May 14 17:06:24
Mapping speed, Million of reads per hour |       97.45

                      Number of input reads |       10503242
Average input read length |       66
Uniquely mapped reads number |       6887932
Uniquely mapped reads % |       65.58%
Average mapped length |       65.33
Number of splices: Total |       471329
Number of splices: Annotated (sjdb) |       460539
Number of splices: GT/AG |       462632
Number of splices: GC/AG |       3610
Number of splices: AT/AC |       302
Number of splices: Non-canonical |       4785
Mismatch rate per base, % |       0.49%
Deletion rate per base |       0.01%
Deletion average length |       1.93
Insertion rate per base |       0.02%
Insertion average length |       1.83
Number of reads mapped to multiple loci |       2756203
% of reads mapped to multiple loci |       26.24%
Number of reads mapped to too many loci |       38499
% of reads mapped to too many loci |       0.37%
% of reads unmapped: too many mismatches |       0.00%
% of reads unmapped: too short |       7.23%
% of reads unmapped: other |       0.58%
Number of chimeric reads |       0
% of chimeric reads |       0.00%


The second Log.final.out file looks like the following:

more Log.final.out
Started job on |       May 14 17:25:16
Started mapping on |       May 14 17:33:08
Finished on |       May 14 17:33:10
Mapping speed, Million of reads per hour |       0.00

                      Number of input reads |       0
Average input read length |       0
Uniquely mapped reads number |       0
Uniquely mapped reads % |       0.00%
Average mapped length |       0.00
Number of splices: Total |       0
Number of splices: Annotated (sjdb) |       0
Number of splices: GT/AG |       0
Number of splices: GC/AG |       0
Number of splices: AT/AC |       0
Number of splices: Non-canonical |       0
Mismatch rate per base, % |       -nan%
Deletion rate per base |       0.00%
Deletion average length |       0.00
Insertion rate per base |       0.00%
Insertion average length |       0.00
Number of reads mapped to multiple loci |       0
% of reads mapped to multiple loci |       0.00%
Number of reads mapped to too many loci |       0
% of reads mapped to too many loci |       0.00%
% of reads unmapped: too many mismatches |       0.00%
% of reads unmapped: too short |       0.00%
% of reads unmapped: other |       0.00%
Number of chimeric reads |       0
% of chimeric reads |       0.00%


I did not change the command between the two jobs (apart from the fact that the input files are in different directories). Here is the generic command:

STAR --genomeDir /users/PFS0231/cls0226/Createindex2/ --runThreadN 1 --readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/ file.fastq --outSAMtype BAM Unsorted

I am perplexed as to why I am having this problem. Any help is really appreciated.

Thanks

3.4 years ago
--readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/ file.fastq


It looks like there is a space between your directory and the file basename, so the shell passes them to STAR as two separate arguments instead of a single path.

Try:

--readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/file.fastq
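If you launch many jobs, a small pre-flight check can catch this kind of typo before STAR even starts. Below is a minimal sketch in Python, not part of STAR itself: `check_read_files` is a hypothetical helper name, and it assumes the command is available as a single string. It splits the command the way a POSIX shell would and reports every `--readFilesIn` argument that is not a regular file.

```python
import os
import shlex


def check_read_files(command):
    """Return the --readFilesIn arguments that are not regular files."""
    tokens = shlex.split(command)  # split like a POSIX shell would
    if "--readFilesIn" not in tokens:
        return []
    start = tokens.index("--readFilesIn") + 1
    bad = []
    for token in tokens[start:]:
        if token.startswith("--"):  # the next STAR option begins here
            break
        # each argument may be a comma-separated list of files
        for path in token.split(","):
            if not os.path.isfile(path):
                bad.append(path)
    return bad


if __name__ == "__main__":
    cmd = (
        "STAR --genomeDir /users/PFS0231/cls0226/Createindex2/ --runThreadN 1 "
        "--readFilesIn /users/PFS0231/cls0226/output/ALzt14-1/232/ file.fastq "
        "--outSAMtype BAM Unsorted"
    )
    for path in check_read_files(cmd):
        print("not a regular file:", path)
```

With the command from the question, the directory and the bare `file.fastq` are checked as two separate paths, which is exactly the split caused by the stray space.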

Also, for future posts, note how much easier this is to spot in a code block than in the plain text you posted.
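Since only some of your samples were affected, it may also help to scan all the Log.final.out files for runs that read nothing. This is a rough sketch, assuming the `key | value` layout shown in the logs above; `parse_final_log` and `has_no_input` are made-up helper names, not anything STAR provides.

```python
import sys


def parse_final_log(text):
    """Parse the 'key | value' lines of a STAR Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:  # skip blank lines and lines without a value
            continue
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def has_no_input(text):
    """Return True if the log reports zero input reads."""
    return parse_final_log(text).get("Number of input reads") == "0"


if __name__ == "__main__":
    # usage: python check_logs.py sample1/Log.final.out sample2/Log.final.out ...
    for path in sys.argv[1:]:
        with open(path) as handle:
            if has_no_input(handle.read()):
                print(path, "reports 0 input reads")
```

Any log it flags is worth comparing against the exact command that produced it, starting with the `--readFilesIn` paths.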

Thank you so much, Manuel. It was such a silly mistake and now I feel stupid. Your time and help are much appreciated.

No problem! :) I've done similar if not exactly the same thing more than once haha.

Mark the answer as accepted if the issue is resolved.