Question

Single-cell RNA-seq (CEL-Seq2) read position shift?

6.8 years ago

zhe ▴ 10

I have a single-cell RNA-seq data set prepared with the CEL-Seq2 protocol (Hashimshony et al. 2016). After assigning the reads to each cell and aligning them with Bowtie2, I looked at the reads in the Integrative Genomics Viewer (IGV). I noticed that many of the aligned regions contain reads that have different start positions but the same unique molecular identifier (UMI) sequence. These reads are in close proximity (usually < 5 nt between adjacent reads) and have equal lengths, so the shift is not due to clipping. Should I regard these reads as separate fragments, or are they generated during the PCR amplification step and derived from the same fragment? What is the possible source of these reads? How are they generated?

Thank you.

RNA-Seq alignment next-gen sequencing sequence

updated 5.2 years ago by i.sudbery 19k • written 6.8 years ago by zhe ▴ 10

score 0 · Answer 1 · 2019-01-22

Sorry for the delay, but I thought I should add this for the record.

CEL-Seq2 adds the UMI at the RT stage of the sample prep, but fragmentation happens later. Thus multiple copies of the same original molecule can be fragmented at different points and so have different mapping co-ordinates. This is a common property of most 3' tagging protocols. When analysing this sort of data, the genomic mapping position of a read is not informative as to whether the read is a duplicate or not. Instead, all reads from the same cell that map to the same gene with the same (or a related) UMI should be regarded as duplicates.
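
To make the counting rule concrete, here is a minimal Python sketch. It is only an illustration, not what UMI-tools does internally: the function name, the tuple layout and the toy reads are all made up for the example, and it only collapses exactly identical UMIs (it does not merge "related" UMIs that differ by a sequencing error).

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell, gene), ignoring mapping position.

    `reads` is an iterable of (cell, gene, umi, position) tuples.
    This layout is hypothetical and chosen only for the illustration.
    """
    umis_seen = defaultdict(set)
    for cell, gene, umi, _position in reads:
        # The position is deliberately ignored: fragmentation happens after
        # the UMI is attached, so copies of one molecule can start anywhere.
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy data: three reads with the same UMI but shifted start positions.
reads = [
    ("cell1", "GeneA", "ACGTAC", 1000),
    ("cell1", "GeneA", "ACGTAC", 1003),
    ("cell1", "GeneA", "ACGTAC", 1007),
    ("cell1", "GeneA", "TTGCAA", 1003),
    ("cell1", "GeneB", "ACGTAC", 5200),
]

counts = count_molecules(reads)
print(counts)
```

In this toy input the three shifted reads sharing UMI ACGTAC in GeneA count as one molecule, so GeneA gets a count of 2 and GeneB a count of 1.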

If you use UMI-tools, you can achieve this with the --per-gene option. The gene each read maps to can be determined in two ways. First, you could map to the transcriptome, where the contig is the transcript (or gene) ID. You would then also pass the --per-contig option to UMI-tools. Alternatively, reads can be assigned to genes using more recent versions of featureCounts, with the assignment encoded in a tag in the BAM file. In this case, pass the --gene-tag=XT option to UMI-tools.
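
As a rough sketch, the two variants could be assembled like this in Python. The helper name `dedup_command` and the BAM file names are hypothetical; the flags are the ones mentioned above plus `-I`/`-S` for the input and output files. The sketch only builds and prints the command lines, it does not run them.

```python
import shlex


def dedup_command(in_bam, out_bam, gene_source):
    """Build a `umi_tools dedup` command line for gene-level deduplication.

    gene_source is "transcriptome" (contig name = transcript/gene ID) or
    "featurecounts" (gene assignment stored in the XT tag of the BAM file).
    """
    cmd = ["umi_tools", "dedup", "--per-gene"]
    if gene_source == "transcriptome":
        cmd.append("--per-contig")
    elif gene_source == "featurecounts":
        cmd.append("--gene-tag=XT")
    else:
        raise ValueError(f"unknown gene_source: {gene_source!r}")
    # Input and output BAM files (the file names used below are placeholders).
    cmd += ["-I", in_bam, "-S", out_bam]
    return cmd


# Print both variants so they can be inspected before running anything.
for source in ("transcriptome", "featurecounts"):
    print(shlex.join(dedup_command("mapped.bam", "deduplicated.bam", source)))
```

Passing the returned list to `subprocess.run` would execute the command, assuming UMI-tools is installed and the input BAM is sorted and indexed.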