4.3 years ago by
For a start:
The main difference is that you usually have to split (demultiplex) your reads by some sort of cell barcode to work out which reads belong to which cell.
The barcode will need to be trimmed off before aligning, but from there the basic alignment and quantification steps are similar, with each cell treated like its own sample.
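To make the barcode step concrete, here is a minimal sketch, assuming the barcode sits at the very start of each read and has a fixed length. The function name `demultiplex`, the 6-base length, and the example barcodes are made up for illustration; real protocols differ in where the barcode is and how long it is.

```python
from collections import defaultdict

BARCODE_LEN = 6  # assumed length; check your protocol


def demultiplex(reads, known_barcodes, barcode_len=BARCODE_LEN):
    """Group (name, seq, qual) reads by their leading barcode.

    The barcode is trimmed from both the sequence and the quality
    string so the remainder can go straight to the aligner. Reads
    whose prefix is not a known barcode go to "unassigned" untrimmed.
    """
    by_cell = defaultdict(list)
    for name, seq, qual in reads:
        barcode = seq[:barcode_len]
        if barcode not in known_barcodes:
            by_cell["unassigned"].append((name, seq, qual))
            continue
        by_cell[barcode].append((name, seq[barcode_len:], qual[barcode_len:]))
    return dict(by_cell)


# Toy reads: barcode (first 6 bases) followed by the cDNA sequence.
reads = [
    ("r1", "AAACGTTTGGCA", "IIIIIIIIIIII"),
    ("r2", "CCCTAGGATTAC", "IIIIIIIIIIII"),
    ("r3", "AAACGTCCGGTA", "IIIIIIIIIIII"),
    ("r4", "NNNNNNACGTAC", "IIIIIIIIIIII"),
]
cells = demultiplex(reads, {"AAACGT", "CCCTAG"})
```

This only does exact matching. In practice you would probably also want to tolerate a mismatch or two in the barcode, which this sketch does not attempt.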
Also, each cell is usually sequenced more shallowly than a regular library because there are fewer molecules in the input, so you start getting duplicate reads earlier. You may want to remove duplicates, for example with Picard's MarkDuplicates.
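Just to show the idea behind duplicate removal, here is a rough sketch that keeps one read per cell, chromosome, start position and strand. This is a simplification and not how Picard works internally; the dictionary fields (`cell`, `chrom`, `pos`, `strand`) and the function name are assumptions for the example.

```python
def remove_duplicates(alignments):
    """Keep the first alignment seen for each (cell, chrom, pos, strand).

    Duplicates are only collapsed within a cell, since the same
    position in two different cells is two independent observations.
    """
    seen = set()
    kept = []
    for aln in alignments:
        key = (aln["cell"], aln["chrom"], aln["pos"], aln["strand"])
        if key in seen:
            continue  # same cell, same position, same strand: duplicate
        seen.add(key)
        kept.append(aln)
    return kept


alignments = [
    {"name": "r1", "cell": "AAACGT", "chrom": "chr1", "pos": 100, "strand": "+"},
    {"name": "r2", "cell": "AAACGT", "chrom": "chr1", "pos": 100, "strand": "+"},
    {"name": "r3", "cell": "CCCTAG", "chrom": "chr1", "pos": 100, "strand": "+"},
    {"name": "r4", "cell": "AAACGT", "chrom": "chr1", "pos": 100, "strand": "-"},
]
unique = remove_duplicates(alignments)
kept_names = [aln["name"] for aln in unique]
```

For real data you would run a proper tool on the BAM file rather than roll your own.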
There might be a lot of runs of Ts if the reads were often sequenced into the poly(A) tail. That would depend on the protocol. I do not think it is a problem inherent to the data being single cell; it happens sometimes with regular libraries too.
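If you want to check how common the T runs are, something like the following would do as a quick look. It is a minimal sketch: the 10-base threshold and both function names are arbitrary choices for illustration, not anything taken from a particular protocol.

```python
import re


def fraction_with_poly_t(seqs, min_run=10):
    """Fraction of reads containing a run of at least min_run Ts."""
    if not seqs:
        return 0.0
    pattern = re.compile(r"T{%d,}" % min_run)
    hits = sum(1 for seq in seqs if pattern.search(seq))
    return hits / len(seqs)


def trim_leading_poly_t(seq, min_run=10):
    """Remove a run of at least min_run Ts from the start of a read."""
    match = re.match(r"T{%d,}" % min_run, seq)
    return seq[match.end():] if match else seq


seqs = ["T" * 12 + "ACGT", "ACGTACGT", "ACG" + "T" * 10, "TTTT"]
frac = fraction_with_poly_t(seqs)
trimmed = trim_leading_poly_t("T" * 12 + "ACGT")
```

If the fraction is high, a dedicated adapter and poly(A) trimming tool is probably a better choice than trimming by hand.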
The downstream analyses are different. I would look up Rahul Satija's papers. He has published some single-cell analyses and has a package for single-cell work. I do not know how available it is yet.