I'm an R beginner trying to learn how to analyze single-cell seq data using Seurat tools. I wanted to try to work with a published Drop-seq UMI count dataset available via GEO:
I've encountered two errors trying to load this data into R:
1. Using `data <- read.table(file = '/path/to/file', sep = '\t')` results in a memory error: `Cannot allocate vector of size 125 Kb`. To work around this I've tried `memory.limit()` (I have 8 GB of RAM), but R always crashes.
- To get around the memory issue another way, I've tried `read.table.ffdf` with various combinations of parameters, e.g. `row.names = 1`, `header = TRUE`, but each attempt results in an error, e.g. `attempt to set 'rownames' on an object with no dimensions` and `more columns than column names`.
I think the issue comes down to the fact that I don't know what this data file looks like, and because it is a very large file (~4 GB) I haven't been able to open it to view it myself, even using LTFviewer. So does anybody have any tips on how to load large single-cell seq UMI count files for use in the Seurat pipeline? Would `read.table.ffdf` work if I found the correct parameters, or is there a better way to go about this altogether?
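(To sketch what I mean by not being able to view the file: I assume something like the following would at least show the first few lines without loading the whole table into memory -- the path is just a placeholder.)

```r
# Open a connection and read only the first few lines (placeholder path)
con <- file("/path/to/file", open = "r")
head_lines <- readLines(con, n = 5)
close(con)

# Print them to check for a header row and what the separator looks like
cat(head_lines, sep = "\n")
```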
Thank you!
Maybe this helps:
`data.table`s tend to be a bit more manageable for memory issues. I've put the results of the following lines of code here; let me know if that works for you.
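As a minimal sketch (the file path is a placeholder, and I'm assuming a tab-separated matrix with gene names in the first column):

```r
library(data.table)

# fread() is much faster and more memory-friendly than read.table()
dt <- fread("/path/to/file", sep = "\t", header = TRUE)

# Move the first column (gene names) into the rownames of a regular matrix,
# since Seurat expects a genes-by-cells count matrix
mat <- as.matrix(dt[, -1, with = FALSE])
rownames(mat) <- dt[[1]]
```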
To be precise -- the link above allows you to download a tar archive. You probably need to untar it. Then open up R and read in the three files from that tar archive with any function meant to read 10X CellRanger data, e.g. `DropletUtils::read10xCounts()`.

This worked perfectly and avoided the memory issues I was having! Thank you for the help!
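For anyone landing here later, the steps above might look something like this -- the archive and directory names are placeholders, not the actual GEO file names:

```r
library(DropletUtils)

# Unpack the GEO tar archive (placeholder name) into a directory
untar("GSExxxxxx_RAW.tar", exdir = "counts_dir")

# read10xCounts() expects the directory containing matrix.mtx,
# barcodes.tsv, and genes.tsv/features.tsv
sce <- read10xCounts("counts_dir")

# The UMI counts are stored as a sparse matrix, which keeps memory usage low
counts(sce)
```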