3 months ago
wooh
Dear All,
I am trying to estimate transcript counts/abundance in a single-cell RNA sequencing dataset. I have used the alevin-fry pipeline, but it requires a transcript-to-gene mapping file for the quantification step that produces the count matrix. Does anyone have advice on how to obtain a transcript x cell matrix instead of a gene x cell matrix? I want to use a fast, memory-efficient protocol, and alevin-fry seemed to be the most user-friendly. My overall aim is to look at the expression of a certain isoform of a gene in different cells. Thanks!
What kind of single-cell data are you using? 10x data is 3'-biased, so you may not have the information needed to resolve expression at the transcript level.
Hi, it is 10x Genomics data. I was also reading about other tools that have been developed for transcript-level quantification, such as scasa and scalpel, and wondering whether they are better. Thank you!
10x data is 3'-biased unless you are using the specific 5' kit. So this is a characteristic of the data, not a question of which program to use. You can try the programs you mention, but you will need to evaluate the results.
Thank you! Yes, it's a public dataset, so unfortunately I have no control over that. It did not use the specific 5' kit, so I wanted to see how these programs work to offset the bias, but as you say, the characteristics of the dataset drastically limit this.
Do you have experience using tools like scasa/scalpel or alevin-fry for 10x Genomics data?
The main issue is that there will be an inherent lack of information (due to the aforementioned 3' bias) in tagged-end data. Nonetheless, if you want to get estimated transcript-level counts from alevin-fry, you can simply provide it with a transcript <-> transcript mapping (each transcript mapped to itself) in place of the transcript <-> gene file.
Thank you, that makes sense. When I did this previously, the quantification result assigned exactly equal values to two transcripts of the gene in certain clusters, and low expression otherwise. This suggests to me that reads are not being assigned correctly because of the bias, but I don't understand how there has been an exact 0.5 split.
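For anyone wanting to try the transcript <-> transcript mapping suggested above, here is a minimal sketch of how such a file could be generated. It assumes the mapping is a two-column, tab-separated file (transcript name, then the name it maps to), that the index was built on a plain transcriptome FASTA rather than a spliced+intronic reference, and that the transcript names in the index are the first whitespace-delimited token of each FASTA header. The function names and file names are hypothetical, not part of alevin-fry.

```python
def transcript_ids(fasta_path):
    """Yield the first whitespace-delimited token of each FASTA header line."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:  # skip headers with no name
                    yield tokens[0]


def write_identity_t2g(fasta_path, out_path):
    """Write a two-column TSV that maps every transcript ID to itself.

    Returns the number of transcripts written.
    """
    count = 0
    with open(out_path, "w") as out:
        for tid in transcript_ids(fasta_path):
            out.write(f"{tid}\t{tid}\n")
            count += 1
    return count


# Example call (hypothetical file names):
# write_identity_t2g("transcriptome.fa", "t2t_identity.tsv")
```

The resulting file would then be passed wherever the transcript-to-gene file is normally supplied. If the names in the output do not match the names stored in the index exactly (for example with pipe-delimited GENCODE headers), the header parsing would need to be adjusted.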
Apologies, one sequencing protocol describes loading the cell suspension into chips with 3 prime v3 chemistry and barcoding with 10x Genomics to capture RNA, which was reverse transcribed, with sequencing libraries made using reagents from the Chromium single cell 5 prime v3 kit and the single cell VDJ enrichment kit. Is this what you mean by the specific 5' kit?
There is a specific 5' kit: https://www.10xgenomics.com/products/universal-five-prime-gene-expression
The difference between the 3' and 5' kits is here: https://kb.10xgenomics.com/hc/en-us/articles/360000939852-What-is-the-difference-between-Single-Cell-3-and-5-Gene-Expression-libraries