cellranger count aggregated to transcripts
8 weeks ago
jomo018 ▴ 620

I am interested in counting reads over transcripts rather than over genes in 10x single-cell data.

I guess this can be done by manipulating the GTF file and creating a modified reference. However, I am not sure whether cellranger count will properly handle exons that belong to multiple transcripts.

Is there a standard / recommended way for doing this?

8 weeks ago
benformatics ★ 2.5k

I don't think there is any built-in compatibility. You would probably need to generate your own GTF.

On that note, the standard scRNA-seq 10x kit is 3'-biased. So in the vast majority of cases the only transcripts you could differentiate between would be those that differ in their 3'UTR/last exon. You wouldn't be able to reliably distinguish alternatively spliced transcripts, unless you have really good coverage of their exon-exon junctions (which is possible in some cases). Assuming your data is 3'-biased, your goal seems like a bit of a reach unless you are hoping to answer a specific question or are looking at alternative polyadenylation specifically.

EDIT: To answer your question directly I would use something like https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/references and then give all the transcripts unique IDs so they are treated like genes.
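
A minimal sketch of that GTF rewrite, assuming a GENCODE/Ensembl-style attribute column with `gene_id`, `transcript_id` and (optionally) `gene_name`. The function names and file paths are hypothetical, and rows without a `transcript_id` (e.g. gene rows) are simply dropped here, which is an assumption you should check against what your reference build expects. Note also that, as far as I know, cellranger only counts reads that map confidently to a single "gene", so reads falling in exons shared by several transcripts would likely be discarded rather than split.

```python
import re

# Patterns for the GTF attribute column (9th field).
_TX_RE = re.compile(r'transcript_id "([^"]+)"')
_GENE_ID_RE = re.compile(r'gene_id "[^"]*"')
_GENE_NAME_RE = re.compile(r'gene_name "([^"]*)"')


def transcript_as_gene(line):
    """Rewrite one GTF line so its transcript is presented as a gene.

    Returns the rewritten line, the unchanged line for comments,
    or None for rows that should be dropped (no transcript_id).
    """
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    match = _TX_RE.search(fields[8])
    if match is None:
        return None  # assumption: gene-level rows are not needed
    tx = match.group(1)
    attrs = fields[8]
    if _GENE_ID_RE.search(attrs):
        attrs = _GENE_ID_RE.sub(lambda _: f'gene_id "{tx}"', attrs)
    else:
        attrs = f'gene_id "{tx}"; ' + attrs
    name = _GENE_NAME_RE.search(attrs)
    if name:
        # Keep the original symbol visible, e.g. ABC_T1.
        new_name = f'gene_name "{name.group(1)}_{tx}"'
        attrs = _GENE_NAME_RE.sub(lambda _: new_name, attrs)
    fields[8] = attrs
    return "\t".join(fields) + "\n"


def rewrite_gtf(src_path, dst_path):
    """Stream src_path to dst_path, applying transcript_as_gene per line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            out = transcript_as_gene(line)
            if out is not None:
                dst.write(out)
```

The rewritten GTF would then go through the custom-reference steps described at the link above.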

To add, there is some coverage beyond the first 1-2 kb upstream of the poly-A site, but I wouldn't put much confidence in it.

Yes, I am trying to establish if a specific cluster of cells tends to express different transcripts of a gene, whereas other cells are more "targeted".

If you are just wondering about one gene then I would do the mapping manually from the BAM to the transcripts of that gene - using something like GenomicRanges.
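
A rough sketch of that manual assignment, written in plain Python as a stand-in for the overlap operations GenomicRanges would provide in R. It assumes you have already extracted, for each read over the gene, its cell barcode, UMI and aligned blocks as half-open `(start, end)` intervals (for example from the `CB`/`UB` tags and something like pysam's `get_blocks()`), and that you have each transcript's exon intervals. All names here are hypothetical. A read is treated as compatible with a transcript if every aligned block lies inside one of its exons; splice-junction matching and strand are not checked.

```python
from collections import defaultdict


def compatible_transcripts(blocks, transcripts):
    """Return sorted IDs of transcripts whose exons contain every block.

    blocks: list of (start, end) aligned blocks of one read.
    transcripts: dict mapping transcript ID -> list of (start, end) exons.
    """
    hits = []
    for tx_id, exons in transcripts.items():
        if all(
            any(ex_start <= b_start and b_end <= ex_end
                for ex_start, ex_end in exons)
            for b_start, b_end in blocks
        ):
            hits.append(tx_id)
    return sorted(hits)


def count_unique_umis(reads, transcripts):
    """Count distinct UMIs per (cell, transcript), unambiguous reads only.

    reads: iterable of (cell_barcode, umi, blocks).
    Reads compatible with zero or several transcripts are skipped.
    """
    umis = defaultdict(set)
    for cell, umi, blocks in reads:
        hits = compatible_transcripts(blocks, transcripts)
        if len(hits) == 1:
            umis[(cell, hits[0])].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Toy example: two transcripts sharing their first exon.
transcripts = {
    "T1": [(100, 200), (300, 400)],
    "T2": [(100, 200), (500, 600)],
}
reads = [
    ("c1", "u1", [(150, 200), (300, 350)]),  # spliced, only fits T1
    ("c1", "u1", [(310, 360)]),              # same UMI, counted once
    ("c1", "u2", [(120, 180)]),              # shared exon, ambiguous
    ("c2", "u3", [(520, 580)]),              # only fits T2
]
counts = count_unique_umis(reads, transcripts)
```

Given the 3' bias discussed above, expect many reads to land in shared regions and be skipped as ambiguous, so the per-cell counts from this kind of approach will be sparse.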