Question

Single-cell RNA Sequencing - transcript abundance estimation using Alevin-Fry

3 months ago

wooh • 0

Dear All,

I am trying to estimate transcript counts/abundance on a single-cell RNA sequencing dataset. I have used the alevin-fry pipeline, but it requires a transcript-to-gene mapping file to carry out the quantification step and obtain the quant matrix. Does anyone have advice on how to obtain a transcript x cell matrix instead of a gene x cell matrix? I want to use a fast, less memory-intensive protocol, and alevin-fry seemed to be the most user friendly. My overall aim is to look at the expression of a particular isoform of a gene across different cells. Thanks!

Alevin-Fry single-cell sequencing Salmon RNA • 836 views

ADD COMMENT • link updated 11 weeks ago by GenoMax 153k • written 3 months ago by wooh • 0

What kind of single-cell data are you using? 10x data is 3'-biased, so you may not have the information needed to resolve expression at the transcript level.

ADD REPLY • link 3 months ago by GenoMax 153k

Hi, it is 10x Genomics data. I have also been reading about other tools developed for transcript-level quantification, such as scasa and scalpel, and wondering whether they are better. Thank you!

ADD REPLY • link 3 months ago by wooh • 0

10x data is 3'-biased unless you are using the specific 5' kit. So this is a characteristic of the data, not a question of which program to use. You can try the programs you mention, but you will need to evaluate the results.

ADD REPLY • link 3 months ago by GenoMax 153k

Thank you! Yes, it's a public dataset, so unfortunately I have no control over that, and it did not use the specific 5' kit. I wanted to see how well these programs offset the bias, but as you say, the characteristics of the dataset drastically limit this.

Do you have experience using tools like scasa/scalpel or alevin-fry on 10x Genomics data?

ADD REPLY • link 3 months ago by wooh • 0

The main issue is that there will be an inherent lack of information (due to the aforementioned 3' bias) in tagged-end data. Nonetheless, if you want to get estimated transcript-level counts from alevin-fry, you can simply provide it with a transcript <-> transcript mapping (each transcript mapped to itself) in place of the transcript <-> gene file.
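
As a minimal sketch of that suggestion, assuming the usual two-column, tab-separated transcript-to-gene file (transcript ID, then gene ID), the snippet below rewrites such a mapping so that every transcript maps to itself. The function name `identity_t2g` and the IDs in the demonstration are hypothetical, and whether the resulting estimates are meaningful still depends on the 3' bias discussed above.

```python
import io


def identity_t2g(t2g_in, t2g_out):
    """Copy a transcript-to-gene TSV, replacing each gene ID with the transcript ID.

    Assumption (not confirmed in this thread): each non-empty line starts
    with the transcript ID, followed by a tab and the gene ID.
    Returns the number of transcripts written.
    """
    seen = set()
    for line in t2g_in:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        transcript_id = line.split("\t")[0]
        if transcript_id in seen:
            continue  # keep a single row per transcript
        seen.add(transcript_id)
        t2g_out.write(f"{transcript_id}\t{transcript_id}\n")
    return len(seen)


# Small in-memory demonstration with made-up IDs.
demo_in = io.StringIO("tx1\tgeneA\ntx2\tgeneA\ntx3\tgeneB\n")
demo_out = io.StringIO()
n_written = identity_t2g(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

With real data, the same function can be called on two open file handles (the original mapping for reading, a new file for writing), and the new file is then supplied wherever the transcript-to-gene file was supplied before.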

ADD REPLY • link 3 months ago by Rob 7.1k

Thank you, that makes sense. When I did this previously, the quantification assigned exactly equal values to two transcripts of the gene in certain clusters, and otherwise low-level expression. This suggests to me that reads are not being assigned correctly because of the bias, but I don't understand how there has been an exact 0.5 split.
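
One possible explanation for an exact 0.5 split, offered as a toy sketch and not as a description of alevin-fry's actual code: if the UMIs in those cells are compatible with both transcripts and none is compatible with only one, an EM-style allocation that starts from equal abundances has nothing to break the tie and leaves the counts evenly divided. The function `em_two_transcripts` and its inputs are hypothetical.

```python
def em_two_transcripts(n_unique_a, n_unique_b, n_ambiguous, n_iter=100):
    """Toy EM for two transcripts (A and B) of one gene.

    n_unique_a, n_unique_b: UMIs compatible with only A or only B.
    n_ambiguous: UMIs compatible with both transcripts.
    Returns the estimated counts (count_a, count_b).
    Illustration only; this is not alevin-fry's implementation.
    """
    total = n_unique_a + n_unique_b + n_ambiguous
    if total == 0:
        return 0.0, 0.0
    frac_a = 0.5  # uniform starting point
    for _ in range(n_iter):
        # E-step: share ambiguous UMIs in proportion to current abundances.
        ambiguous_to_a = n_ambiguous * frac_a
        # M-step: re-estimate the fraction of UMIs belonging to A.
        frac_a = (n_unique_a + ambiguous_to_a) / total
    return frac_a * total, (1.0 - frac_a) * total


# Only ambiguous UMIs: the tie is never broken.
print(em_two_transcripts(0, 0, 10))  # -> (5.0, 5.0)

# A few UMIs unique to A pull the ambiguous ones towards A.
print(em_two_transcripts(2, 0, 10))
```

In this toy model, even a small amount of transcript-specific evidence moves the estimate away from an even split, which is consistent with the point made above that 3'-tagged data often lacks such evidence.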

ADD REPLY • link 3 months ago by wooh • 0

Apologies, one sequencing protocol describes loading the cell suspension into chips with 3' v3 chemistry and barcoding with 10x Genomics to capture RNA, which was then reverse transcribed, with sequencing libraries made using reagents from the Chromium Single Cell 5' v3 kit and the Single Cell VDJ enrichment kit. Is this what you mean by the specific 5' kit?

ADD REPLY • link 11 weeks ago by wooh • 0

There is a specific 5' kit: https://www.10xgenomics.com/products/universal-five-prime-gene-expression

The difference between the 3' and 5' kits is here: https://kb.10xgenomics.com/hc/en-us/articles/360000939852-What-is-the-difference-between-Single-Cell-3-and-5-Gene-Expression-libraries

ADD REPLY • link 11 weeks ago by GenoMax 153k