8 months ago
Sky
I am trying to process a dataset (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&page_size=10&acc=SRR16053948&display=data-access) using CellRanger v7. The only problem is that there is only one read. From my understanding, CellRanger requires two reads. From what I can see, no BAM file was uploaded, only the FASTQ. Can I still process this data with CellRanger?
This is the experiment you're looking for. It has 4 FASTQ files.
I understand that there are 4 FASTQ files, but each FASTQ file only has one read. CellRanger seems to require two reads (https://kb.10xgenomics.com/hc/en-us/articles/115003802691-How-do-I-prepare-Sequence-Read-Archive-SRA-data-from-NCBI-for-Cell-Ranger).
For example, this dataset has one read (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&page_size=10&acc=SRR16053948&display=metadata),
while this one has two reads and an index read (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR11772848&display=metadata).
I hope this helps to clarify my question.
I'm not going to download the files, so maybe you can show me previews here. When you say "each fastq file only has one read", do you mean each file has only 4 lines in it? That seems impossible. SRA says they have 1.2G bases per file, so "only one read" does not make any sense to me.
That's why I attached the links as an example. You don't have to download the data to look at the metadata and see that it says "one read per spot"
I don't think "one read per spot" means one read in total. It cannot be - the gzipped file is ~1GB in size.
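One way to check this directly is to count the records in a downloaded file and look at how long the sequences are. The following is only a minimal sketch: `summarize_fastq` is a hypothetical helper name, it assumes standard four-line FASTQ records, and the path you pass in is whatever your download produced.

```python
import gzip
from collections import Counter


def summarize_fastq(path, max_records=None):
    """Count FASTQ records and tally the length of each sequence line.

    Assumes standard 4-line records (header, sequence, '+', quality).
    `max_records` optionally stops early so large files can be sampled.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = Counter()
    n_records = 0
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:  # end of file
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            lengths[len(sequence)] += 1
            n_records += 1
            if max_records is not None and n_records >= max_records:
                break
    return n_records, lengths
```

A file with millions of records, each holding a single sequence, would be consistent with "one read per spot" describing the layout of each record rather than the total number of reads.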
Then I guess the question changes to how do I process only one file with CellRanger. When you unzip the fastq, there is still only one file. Normally it will unzip into two or three so you can rename them R1, R2, I1, etc.
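For reference, the renaming mentioned here usually targets the bcl2fastq-style pattern that Cell Ranger looks for, `[Sample]_S1_L00[Lane]_[Read Type]_001.fastq.gz`. A small sketch of building such a name follows; `cellranger_name` is a hypothetical helper, and the defaults (sample index 1, lane 1) are assumptions that are commonly used when renaming SRA downloads.

```python
VALID_READ_TYPES = {"R1", "R2", "I1", "I2"}


def cellranger_name(sample, read_type, lane=1, sample_index=1):
    """Build a bcl2fastq-style FASTQ filename for one read type.

    `sample` is any sample label (e.g. an SRR accession); `read_type`
    must be one of R1, R2, I1, I2.
    """
    if read_type not in VALID_READ_TYPES:
        raise ValueError(f"unexpected read type: {read_type!r}")
    return f"{sample}_S{sample_index}_L{lane:03d}_{read_type}_001.fastq.gz"


# Example: the name a barcode/UMI read file would take for this run.
print(cellranger_name("SRR16053948", "R1"))
# SRR16053948_S1_L001_R1_001.fastq.gz
```

Renaming only fixes the filename. Cell Ranger still needs both the barcode/UMI read and the cDNA read, so a single file renamed this way is not enough on its own.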
I think there's some serious communication gap - unzipping FQ does not yield multiple files. Can you download all 4 files from https://www.ncbi.nlm.nih.gov/sra/SRX12340615 and paste the first 12 lines of each file please?
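If it helps, the preview being asked for can be produced without unzipping anything. This is a rough sketch: `head_fastq` is a hypothetical helper name, and the filename in the usage comment is a placeholder for whatever your download step produced.

```python
import gzip
from itertools import islice


def head_fastq(path, n_lines=12):
    """Return the first `n_lines` lines of a FASTQ file (gzipped or plain)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # islice stops after n_lines, so the whole file is never read.
        return [line.rstrip("\n") for line in islice(handle, n_lines)]


# Usage (placeholder path, adjust to your own files):
# for line in head_fastq("SRR16053948.fastq.gz"):
#     print(line)
```

Twelve lines correspond to three FASTQ records, which is enough to see the header format and the length of each sequence.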
Sorry for the delayed reply. I have included my code and the output from downloading the four files. Each dataset only resulted in one fastq file when I am expecting -R1 and -R2 so I can input it into cellranger since cellranger requires two inputs, not one. Unless there is a way to get around that.
I did want to note that SRR16053948 and 49 are the exact same size (992.4Mb) while 50 and 51 are both 1,018.8Mb.
Output:
Normally I am used to the output being something along the lines of SRR16053948-R1 and SRR16053948-R2. The files normally split automatically so I am not sure why they are not splitting now.
https://www.ncbi.nlm.nih.gov/sra/?term=SRR16053948
Metadata indicates that this is: