I'm trying to quantify my ONT samples after using minimap2 to align them to the genome.
The samples were extracted using a direct-RNA protocol and were therefore mapped against the FASTA file from the Ensembl repository that lists the chromosomes (the genome), not the mouse transcriptome.
REFERENCE='Mmu.GrCm39.fa' # chromosomes
minimap2 -ax splice -k 14 -uf \
--secondary=no -G 25000 -t 24 ${REFERENCE} file.fastq > file.sam
Now the BAM file lists the chromosomes in its header:
@HD VN:1.6 SO:coordinate
@SQ SN:1 LN:195154279
@SQ SN:10 LN:130530862
@SQ SN:11 LN:121973369
@SQ SN:12 LN:120092757
@SQ SN:13 LN:120883175
@SQ SN:14 LN:125139656
@SQ SN:15 LN:104073951
@SQ SN:16 LN:98008968
@SQ SN:17 LN:95294699
...
If I understand correctly, salmon quant requires a transcriptome as its reference.
Does that mean I have to re-run minimap2 against the mouse transcriptome, Mus_musculus.GRCm39.cdna.all.fa, instead?
Are there other quantification workflows that do not require re-running the mapping against the transcriptome?
Thanks
Assa
In case you were not aware, ONT also provides a transcriptome workflow for long reads: https://github.com/epi2me-labs/wf-transcriptomes
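If you do end up re-mapping against the transcriptome, a minimal sketch of that route might look like the following. It assumes a recent salmon release that has the --ont option for alignment-based mode, and it assumes samtools is installed. The function name quantify_ont, the output directory layout, and the choices of -N 10 and -l SF (forward-stranded, as expected for direct RNA) are illustrative assumptions, not settings taken from the original post.

```shell
#!/usr/bin/env bash
# Sketch only: re-map ONT direct-RNA reads to a transcriptome FASTA and
# quantify with salmon in alignment-based mode.
# quantify_ont is a hypothetical helper name; all paths are placeholders.
quantify_ont() {
    local transcripts="$1"    # e.g. Mus_musculus.GRCm39.cdna.all.fa
    local reads="$2"          # e.g. file.fastq
    local outdir="$3"         # hypothetical output directory
    local threads="${4:-24}"  # defaults to the thread count used above

    mkdir -p "$outdir" || return 1

    # map-ont preset: no spliced alignment is needed against transcripts.
    # -N 10 (assumed value) keeps secondary alignments so that reads
    # compatible with several isoforms can be assigned by salmon.
    # The BAM is left unsorted so alignments of a read stay together.
    minimap2 -ax map-ont -N 10 -t "$threads" "$transcripts" "$reads" \
        | samtools view -b -o "$outdir/transcriptome.bam" - || return 1

    # --ont selects salmon's long-read alignment model;
    # -l SF assumes forward-stranded single-end reads (direct RNA).
    salmon quant --ont -p "$threads" -t "$transcripts" -l SF \
        -a "$outdir/transcriptome.bam" -o "$outdir/salmon"
}
```

It could then be called as, for example, quantify_ont Mus_musculus.GRCm39.cdna.all.fa file.fastq quant_out 24. The same transcript FASTA has to be passed to both minimap2 and salmon so that the reference names in the BAM header match.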