SRA download question
4.0 years ago
awells2 ▴ 20

I am trying to download data from SRA using fastq-dump. The run I want is SRR10349891. The sequencing design is 10x Genomics V2, and the layout is Paired. Based on some of the Cell Ranger tutorials (e.g. https://bioinformaticsworkbook.org/dataAnalysis/RNA-Seq/Single_Cell_RNAseq/Chromium_Cell_Ranger.html), I would expect the command:

fastq-dump --split-files SRR10349891

to yield two files SRR10349891_1.fastq and SRR10349891_2.fastq.

However, I only get one file (SRR10349891.fastq). All of the online resources I have seen require both .fastq files from this process in order to analyze this type of data (using methods like scPipe). Is there only one file available for download from this experiment?

If so, are there any methods for converting this single .fastq file into two files (one containing the transcripts, and one containing the barcodes/UMIs)?

The other option I have thought about is downloading the available .bam files for this experiment and then using the 10x Genomics bamtofastq tool to convert them back to fastq files.

Thank you!

rna-seq scrna-seq sra 10x genomics • 1.0k views
4.0 years ago
ATpoint 81k

There is indeed only one read: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR10349891 Either the authors simply did not upload R1, or they somehow included the cell barcode (CB) and UMI in the headers of R2, although that would be odd. Check whether there is a methods section somewhere. NCBI often hosts "orphan" datasets with no background information, and I would be careful with any dataset that comes without metadata or a methods text. You can also try contacting the authors.
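One quick way to see what the single file actually contains is to tabulate read lengths. This is only a sketch, assuming a plain four-lines-per-record FASTQ and the usual 10x v2 layout (R1 = 16 bp barcode + 10 bp UMI = 26 bp, R2 = cDNA, commonly 98 bp); the function name `read_length_histogram` is made up for illustration:

```shell
# Print "length count" pairs for the reads in a FASTQ file.
# Assumes 4 lines per record, so sequence lines are lines 2, 6, 10, ...
read_length_histogram() {
    awk 'NR % 4 == 2 { counts[length($0)]++ }
         END { for (len in counts) print len, counts[len] }' "$1" | sort -n
}

# Example (file name taken from the question):
# read_length_histogram SRR10349891.fastq
```

If essentially all reads are around the cDNA length, the barcode/UMI read is probably just missing from the upload. Reads of roughly R1 + R2 length would instead hint that the two reads were concatenated, though that is a guess to confirm with the submitters.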


Thank you for the response, that helps clear things up a bit. On the page you linked, there are two bam files (SF11232_1_possorted_genome_bam.bam.1, SF11232_2_possorted_genome_bam.bam.1). I downloaded both of them and ran the 10x Genomics bamtofastq executable on each, which gave me an R1 and an R2 fastq file per bam. Do you know why two .bam files were uploaded instead of one (does each .bam file contain unique information)? I am not familiar with the standard practices for uploading these files. Thank you again!
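A rough way to check whether the two bam files carry different reads is to count how many read names the converted fastq files share. This is a sketch only: `shared_read_count` is a made-up helper, it expects uncompressed four-line FASTQ (decompress gzipped output first), and it assumes the read names in the converted files match the original sequencer names:

```shell
# Count read IDs (first field of each header line) present in both FASTQ files.
# Keeps every ID from the first file in memory, so try it on a subset first.
shared_read_count() {
    awk 'FNR % 4 == 1 {
             id = $1
             if (FILENAME == ARGV[1]) first[id] = 1
             else if ((id in first) && !(id in seen)) { seen[id] = 1; shared++ }
         }
         END { print shared + 0 }' "$1" "$2"
}

# Example with hypothetical file names:
# shared_read_count bam1_R1.fastq bam2_R1.fastq
```

A count of zero would suggest the two bam files hold separate sets of reads (for example two lanes or two libraries), but it does not say which; the headers of the bam files or the authors are the better source for that.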
