Probably a simple fix, but we all have to start somewhere. I am trying to align reads to a Trinity-generated transcriptome using STAR and am currently troubleshooting. I ran an alignment with just one of my samples (this sample was among those used to generate the transcriptome). The average input read length was 141 (the reads were originally sequenced at 150 bp), so intuitively I would not expect 99% of reads to be unmapped as "too short", which is what the output reports.
First, I built the index:
Slurm command = ... --wrap="STAR --runThreadN 20 --runMode genomeGenerate --genomeDir ...path_to_index --genomeFastaFiles Trinity.fasta --genomeSAindexNbases 14"
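One thing that may be worth checking for the index step: the STAR manual suggests scaling --genomeSAindexNbases down to min(14, log2(GenomeLength)/2 - 1) for small references, and setting --genomeChrBinNbits to min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))) when there are many reference sequences, as is typical for a Trinity assembly. The sketch below computes both from a FASTA file. The function name suggest_star_index_params and the default read length of 150 are assumptions for illustration, not part of STAR, and whether these settings explain the result above is not established.

```shell
# Hypothetical helper (not part of STAR): suggest index parameters from a FASTA.
# Usage: suggest_star_index_params <fasta> [read_length]
suggest_star_index_params() {
    awk -v readlen="${2:-150}" '
        /^>/ { nref++; next }              # count reference sequences (header lines)
        { total += length($0) }            # sum the sequence lengths
        END {
            if (total == 0 || nref == 0) exit 1
            # min(14, log2(GenomeLength)/2 - 1), rounded down
            sa = int(log(total) / log(2) / 2 - 1)
            if (sa > 14) sa = 14
            # min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))), rounded down
            per_ref = total / nref
            if (per_ref < readlen + 0) per_ref = readlen + 0
            bin = int(log(per_ref) / log(2))
            if (bin > 18) bin = 18
            print "--genomeSAindexNbases", sa, "--genomeChrBinNbits", bin
        }
    ' "$1"
}
```

Running suggest_star_index_params Trinity.fasta 150 prints the two flags with the suggested values, which can then be compared against the ones used in the genomeGenerate command.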
Then I ran the alignment:
Slurm command = ... --wrap="STAR --readFilesCommand zcat --readFilesIn <in1> <in2> --genomeDir <.../index> --runThreadN 20 --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within"
Any thoughts? Could this be a problem with how the index was built, or with the alignment itself? The final alignment summary is below.
Number of input reads | 42852270
Average input read length | 141
UNIQUE READS:
Uniquely mapped reads number | 217
Uniquely mapped reads % | 0.00%
Average mapped length | 126.74
Number of splices: Total | 0
Number of splices: Annotated (sjdb) | 0
Number of splices: GT/AG | 0
Number of splices: GC/AG | 0
Number of splices: AT/AC | 0
Number of splices: Non-canonical | 0
Mismatch rate per base, % | 7.60%
Deletion rate per base | 0.00%
Deletion average length | 0.00
Insertion rate per base | 0.01%
Insertion average length | 1.00
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 267726
% of reads mapped to multiple loci | 0.62%
Number of reads mapped to too many loci | 24294
% of reads mapped to too many loci | 0.06%
UNMAPPED READS:
Number of reads unmapped: too many mismatches | 0
% of reads unmapped: too many mismatches | 0.00%
Number of reads unmapped: too short | 42555398
% of reads unmapped: too short | 99.31%
Number of reads unmapped: other | 4635
% of reads unmapped: other | 0.01%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
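To compare runs while troubleshooting, the unmapped categories can be pulled out of a summary like the one above and the percentages recomputed from the raw counts. This is a minimal sketch that assumes one "key | value" entry per line, as in STAR's Log.final.out, and that the "Number of reads unmapped: ..." count lines are present; the function name star_unmapped_summary is made up for illustration.

```shell
# Hypothetical helper: summarize unmapped-read categories from a STAR final log.
# Reads the file named in $1, or standard input if no file is given.
star_unmapped_summary() {
    awk -F'|' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the label
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim padding around the value
        }
        key == "Number of input reads" { total = val + 0 }
        key ~ /^Number of reads unmapped: / && total > 0 {
            sub(/^Number of reads unmapped: /, "", key)
            # category, read count, percentage of all input reads
            printf "%s\t%d\t%.2f%%\n", key, val, 100 * val / total
        }
    ' "$@"
}
```

For example, star_unmapped_summary Log.final.out prints one tab-separated line per unmapped category, which makes it easy to see whether a parameter change moves reads out of the "too short" category.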