Hello, everyone.
I am trying to count aligned reads using Salmon in alignment-based mode.
My command is:
salmon quant -t transcript.fa -l A -a aligned_result.bam -o salmon_quant
I am encountering the following error:
Transcript chr8 appeared in the BAM header, but was not in the provided FASTA file.
The preprocessing steps I performed are as follows:
- Reference genome indexing (reference from NCBI).
- Mapping using STAR.
- Converting SAM to BAM (samtools).
- Sorting (Picard SortSam).
- Checking for duplicate reads (Picard MarkDuplicates).
- Splitting reads that contain Ns in the CIGAR string (GATK4 SplitNCigarReads).
- Recalibration (GATK4 BQSR).
- Sorting (samtools).
- Quantifying transcript-level expression with Salmon.
I obtained the transcript FASTA file from https://www.gencodegenes.org/human/ and used it for the analysis. I also tried creating the transcript FASTA file with gffread, but the same error persists.
Could this be a problem with the transcript FASTA file or the BAM file?
Thank you in advance for your help.
Oh, that was my mistake. I accidentally used GENCODE instead of NCBI in step 1. I'll proceed by running Salmon for the mapping itself. Thank you.
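For anyone who hits the same message: one quick way to see which file is at fault is to compare the reference names in the BAM header (the SN: fields of the @SQ lines) with the record names in the transcript FASTA. Below is a minimal sketch in Python, assuming the header has first been dumped to text with `samtools view -H aligned_result.bam`. The function names and the toy header/FASTA strings are made up for illustration and are not part of Salmon or samtools. A name such as chr8 is a chromosome name rather than a transcript ID, so if it shows up as missing, the BAM was probably aligned against a genome rather than against the transcripts in the FASTA.

```python
def bam_header_names(header_text):
    """Return reference names (SN: fields) from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # skip @HD, @PG, @RG and other header lines
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def fasta_names(fasta_text):
    """Return the set of FASTA record names (first whitespace-delimited token after '>')."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def missing_from_fasta(header_text, fasta_text):
    """List BAM reference names that have no matching FASTA record, in header order."""
    known = fasta_names(fasta_text)
    return [name for name in bam_header_names(header_text) if name not in known]


# Toy inputs (illustrative only, not real data).
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr8\tLN:1000\n"
    "@SQ\tSN:ENST00000456328.2\tLN:1657\n"
)
fasta = ">ENST00000456328.2 example description\nACGT\n"

print(missing_from_fasta(header, fasta))  # ['chr8']
```

On real files, the header text would come from the `samtools view -H` output and the FASTA text from transcript.fa. An empty list means every reference in the BAM has a matching FASTA record.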