Question

UMI introduction and deduplication with SMARTer-Seq

3 months ago

rayanelkholdi • 0

Hello everyone,

I have a question about SMARTer technology. I'm planning to use SMART-Seq Total RNA Pico Input with UMIs (ZapR Mammalian), where the very first step is fragmentation of the RNA. Since fragmentation happens before the first PCR, I don't understand how the UMIs can be used for deduplication. If the RNA is fragmented before the UMIs are introduced, it seems to me that different fragments of the same RNA molecule will receive different UMIs, so they would be counted as several original RNA molecules. Or am I wrong? How should I deduplicate in that case?

Thanks in advance!

RNAseq sequencing UMI RNA • 969 views

updated 3 months ago by i.sudbery 22k • written 3 months ago by rayanelkholdi • 0

"1st PCR I don't understand how you can deduplicate the UMIs"

UMIs are deduplicated only at the level of each RNA fragment that is reverse-transcribed and then amplified by PCR. Each UMI thus marks an individual fragment of RNA that was reverse-transcribed, not the original intact molecule. See Figure 2 in the manual. Counting is then done on the UMI-deduplicated fragments aligned to the reference.
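To make the fragment-level idea concrete, here is a minimal sketch of deduplication, assuming reads have already been aligned and are represented as hypothetical `(gene, start, umi)` tuples. The read data and the `deduplicate` helper are made up for illustration; real tools such as UMI-tools also account for sequencing errors in the UMI, which this sketch does not.

```python
# Hypothetical aligned reads: (gene, alignment start, UMI).
# These values are invented for illustration only.
reads = [
    ("geneA", 100, "ACGT"),
    ("geneA", 100, "ACGT"),  # PCR duplicate of the first read
    ("geneA", 100, "ACGT"),  # another PCR duplicate
    ("geneA", 450, "TTAG"),
    ("geneA", 450, "TTAG"),  # PCR duplicate
    ("geneA", 100, "GGCA"),  # same position, different UMI: a distinct fragment
]


def deduplicate(reads):
    """Keep one read per (gene, start, UMI) combination, preserving order."""
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


deduped = deduplicate(reads)
print(len(reads), "reads ->", len(deduped), "unique fragments")
```

Note that the two reads at position 100 with different UMIs are kept as separate fragments: the UMI distinguishes PCR copies from independent fragments, but it says nothing about which original RNA molecule a fragment came from.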

3 months ago by GenoMax 154k

Thank you for your answer! So if, for example, I start with one transcript of Protein A (that is, 1 RNA molecule) and it gets fragmented into 4 fragments, each of these fragments will get its own UMI, and at the end I will have counted 4 transcripts of Protein A instead of 1?

3 months ago by rayanelkholdi • 0

Those 4 fragments will align to the gene for protein A and will be counted as 1 copy, assuming "gene" level summarization (which is what many use).

3 months ago by GenoMax 154k

It depends on how you do the quantification. If you use an EM-based quantifier such as RSEM or Salmon, the fragments would probably all come out in the wash as one transcript, but remember that these tools compute relative, not absolute, transcript numbers. If you use straightforward count-based quantification, you'll get 4 counts. However, if a transcript were fragmented into 4 pieces, it is highly unlikely that the sequencing would capture all 4. Most likely you'll see only one of the fragments in the data. I've always been sceptical of claims that UMIs allow you to calculate absolute transcript numbers.
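As a rough illustration of the count-based case, the sketch below tallies hypothetical UMI-deduplicated fragments per gene and then estimates how likely it is that all 4 fragments of one molecule end up in the data. The fragment tuples, the `prob_all_captured` helper, and the per-fragment capture probability of 0.1 are all assumptions chosen for illustration, and the calculation assumes fragments are captured independently.

```python
from collections import Counter

# Hypothetical deduplicated fragments from ONE geneA molecule that was
# fragmented into 4 pieces, each tagged with its own UMI.
deduped_fragments = [
    ("geneA", 100, "ACGT"),
    ("geneA", 450, "TTAG"),
    ("geneA", 900, "GGCA"),
    ("geneA", 1300, "CATG"),
]

# Simple count-based quantification: one count per deduplicated fragment.
gene_counts = Counter(gene for gene, _start, _umi in deduped_fragments)
print("geneA count:", gene_counts["geneA"])  # 4 counts from a single molecule


def prob_all_captured(p_capture, n_fragments):
    """Probability that every fragment is captured, assuming independence."""
    return p_capture ** n_fragments


# 0.1 is an assumed, illustrative capture probability, not a measured value.
print("P(all 4 captured):", prob_all_captured(0.1, 4))
```

Under these assumptions, seeing all 4 fragments of one molecule is rare, which is why the over-counting is usually small in practice, while the counts still should not be read as absolute molecule numbers.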

3 months ago by i.sudbery 22k