6.8 years ago
rob.costa1234 ▴ 310
I am working on the expression profile of a limited number of rare human genome in a disease versus control groups. As has been reported in the literature, single-cell data has many problems with limited data and quality. I am trying to find out whether any specific aligner and DE procedure is more robust than the others. I can use RPKM with TopHat, or use edgeR with raw counts, but I would like feedback from colleagues in case anyone has a better suggestion or has compared different aligners. Thanks
The major problem in this analysis is the high amount of noise. One way to reduce it is to use UMIs, as described here: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4412462/ . Another option is to have a lot of replicates to better estimate dispersion. Use DESeq2 to get DE genes; if neither of the above was applied, you will probably get very few DE genes.
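To illustrate why UMIs reduce noise, here is a minimal sketch of UMI-based counting: reads that share the same cell, gene, and UMI are treated as PCR duplicates of one original molecule and counted once. The function name `count_umis`, the tuple layout, and the toy reads are assumptions made for illustration only; real pipelines also correct for sequencing errors within the UMI, which this sketch ignores.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell, gene) pair.

    `reads` is an iterable of (cell, gene, umi) tuples, one per aligned read.
    This layout is a hypothetical simplification, not a real file format.
    """
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        # Identical UMIs for the same cell and gene collapse into one molecule.
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Toy data: the first two reads are PCR duplicates of the same molecule.
reads = [
    ("cell1", "GAPDH", "AACGT"),
    ("cell1", "GAPDH", "AACGT"),
    ("cell1", "GAPDH", "TTGCA"),
    ("cell2", "GAPDH", "AACGT"),
]
umi_counts = count_umis(reads)
```

The resulting molecule counts, rather than raw read counts, would then be the input to a count-based DE tool.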
Hi Rob, it is unclear what you are asking. Are you looking for an established pipeline for DE analysis of single-cell transcriptome sequencing? Do you have replicates? I think such a well-established pipeline does not exist yet because the data type is quite new. But you might have luck with standard RNA-seq tools like STAR, edgeR and DESeq. I would avoid RPKM-based methods, because they add assumptions about the influence of transcript length, which might or might not hold here.
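To make the transcript-length assumption concrete, here is a small sketch of the standard RPKM calculation (reads per kilobase of transcript per million mapped reads). The function name `rpkm` and the example numbers are made up for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    # Dividing by length assumes read count scales linearly with transcript
    # length, which may not hold for every single-cell protocol.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Two hypothetical genes with the same raw count but different lengths.
short_gene = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
long_gene = rpkm(read_count=500, gene_length_bp=4000, total_mapped_reads=10_000_000)
```

With identical raw counts, the longer gene ends up with half the RPKM of the shorter one, so any protocol in which read count does not scale with length would be distorted by this normalization.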
Yes, I have replicates, and I was planning to use TopHat and edgeR/DESeq. Is there any particular advantage of TopHat over the STAR aligner?
STAR is faster than TopHat, and it is often more sensitive in the presence of sequencing errors or imperfect reference genomes. You should try different tools though, because the analysis of single-cell data is not established enough to warrant clear guidelines.