I have mouse RNA-seq data (single-end, stranded, reverse strand), which I mapped with STAR using a GTF annotation. I ran STAR in a mode that also generates a bam file of the reads mapping to the transcriptome.
For my purpose I'd like to retain only reads that map to transcripts annotated as protein_coding in the GTF. Those reads would be my total, meaning TPMs will be calculated based on that slice of the pie rather than on all reads.
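
To make "that slice of the pie" concrete, here is a minimal sketch of the arithmetic only: rescaling TPMs so that the protein_coding transcripts alone sum to one million. The function name, transcript names, and values are hypothetical, and this is not necessarily what Salmon would report when run on a subsetted bam, since read assignment can differ.

```python
def renormalize_tpm(tpm, keep):
    """Rescale the TPMs of the transcripts in `keep` so they sum to 1e6."""
    subset = {tx: value for tx, value in tpm.items() if tx in keep}
    total = sum(subset.values())
    if total == 0:
        raise ValueError("no expression in the selected transcripts")
    return {tx: value / total * 1e6 for tx, value in subset.items()}


# Hypothetical toy values: two protein_coding transcripts and one lncRNA.
tpm = {"tx_pc1": 300000.0, "tx_pc2": 100000.0, "tx_lnc1": 600000.0}
protein_coding = {"tx_pc1", "tx_pc2"}

rescaled = renormalize_tpm(tpm, protein_coding)
for tx, value in sorted(rescaled.items()):
    print(tx, round(value, 1))
```

In this toy case the two retained transcripts keep their 3:1 ratio, but their values now sum to 1,000,000 on their own.
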
What I did was start from that bam and then subset it with a bed file which only includes the transcripts that are annotated as protein_coding. This reduces the number of mapped reads from 11,653,865 to 3,483,962.
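
For reference, the protein_coding selection could be sketched roughly as below. This is an illustration, not the exact procedure used here: the function name and the example GTF lines are hypothetical, and it assumes the biotype is stored in a `transcript_biotype` attribute (Ensembl-style GTF) or a `transcript_type` attribute (GENCODE-style GTF).

```python
import re

# Assumption: Ensembl GTFs use transcript_biotype, GENCODE GTFs use transcript_type.
BIOTYPE_RE = re.compile(r'transcript_(?:biotype|type) "([^"]+)"')
TX_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def protein_coding_transcripts(gtf_lines):
    """Return the set of transcript IDs whose biotype is protein_coding."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only look at transcript-level records
        biotype = BIOTYPE_RE.search(fields[8])
        tx_id = TX_ID_RE.search(fields[8])
        if biotype and tx_id and biotype.group(1) == "protein_coding":
            ids.add(tx_id.group(1))
    return ids


# Hypothetical GTF records, for illustration only.
example_gtf = [
    "#!genome-build example\n",
    '1\tsrc\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "TX_A"; transcript_biotype "protein_coding";\n',
    '1\tsrc\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "TX_A"; transcript_biotype "protein_coding";\n',
    '1\tsrc\ttranscript\t2000\t2500\t.\t-\t.\tgene_id "G2"; transcript_id "TX_B"; transcript_biotype "lncRNA";\n',
    '2\tsrc\ttranscript\t50\t700\t.\t+\t.\tgene_id "G3"; transcript_id "TX_C"; transcript_type "protein_coding";\n',
]

coding_ids = protein_coding_transcripts(example_gtf)
print(sorted(coding_ids))
```

With a real annotation, the lines would come from the open GTF file rather than a hard-coded list.
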
When I use Salmon to quantify expression from that subsetted bam, Salmon crashes (so does MMSEQ), but it doesn't if I give it the un-subsetted bam.
Does anyone have any idea why it's crashing?