Question: SRA download question
5 months ago, awells2 wrote:

I am trying to download data from SRA using fastq-dump. The run I want is SRR10349891. The sequencing design is 10x Genomics V2, and the layout is paired. Based on some of the Cell Ranger tutorials (https://bioinformaticsworkbook.org/dataAnalysis/RNA-Seq/Single_Cell_RNAseq/Chromium_Cell_Ranger.html), I would expect the command:

fastq-dump --split-files SRR10349891

to yield two files SRR10349891_1.fastq and SRR10349891_2.fastq.

However, I only get one file (SRR10349891.fastq). All of the online resources I have seen require both .fastq files from this process in order to analyze this type of data (using methods like scPipe). Is there only one file available for download from this experiment?

If so, are there any methods for converting this single .fastq file into two files (one containing the transcript reads and one containing the barcodes/UMIs)?
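A split like this only makes sense if each record in the single file is the barcode read and the transcript read joined end to end, which is an assumption here and may well not hold for this run. Under that assumption, and assuming the usual 10x Genomics v2 layout of a 16 nt cell barcode followed by a 10 nt UMI, a minimal sketch could look like the following (the function names are illustrative, not from any existing tool):

```python
from typing import Iterator, TextIO, Tuple

# Assumption: 10x Genomics 3' v2 chemistry, 16 nt cell barcode + 10 nt UMI.
BARCODE_UMI_LEN = 26


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or an unexpected blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def split_fastq(src: TextIO, out_r1: TextIO, out_r2: TextIO,
                prefix_len: int = BARCODE_UMI_LEN) -> int:
    """Write the first prefix_len bases to out_r1 and the rest to out_r2.

    Records that are not longer than prefix_len are skipped, because they
    cannot contain both a barcode/UMI prefix and a transcript part.
    Returns the number of records written.
    """
    written = 0
    for header, seq, qual in read_fastq(src):
        if len(seq) <= prefix_len:
            continue
        out_r1.write(f"{header}\n{seq[:prefix_len]}\n+\n{qual[:prefix_len]}\n")
        out_r2.write(f"{header}\n{seq[prefix_len:]}\n+\n{qual[prefix_len:]}\n")
        written += 1
    return written
```

It would be called with three open text files, for example `split_fastq(open("SRR10349891.fastq"), open("R1.fastq", "w"), open("R2.fastq", "w"))`. If the reads in the file are transcript-only, this would just cut real cDNA sequence in two, so the layout needs to be confirmed first.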

The other option I have thought about is downloading the available .bam files for this experiment and then using the 10x Genomics bamtofastq tool to convert them back to fastq files.

Thank you!

5 months ago, ATpoint (Germany) wrote:

There is indeed only one read: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR10349891 Either the authors simply did not upload R1, or they somehow included the cell barcode (CB) and UMI in the headers of R2, which would be odd. Check whether there is a methods section somewhere. There are often "orphan" datasets on NCBI with no background information, and I would be careful with any dataset that comes without metadata and a methods description. You can also try contacting the author.
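One quick way to look at both possibilities is to inspect the first records of the downloaded file: the headers show whether anything barcode-like was appended, and the read lengths show whether the file could be a barcode-only read. The helper below is a minimal sketch with an illustrative name, assuming a plain 4-line-per-record FASTQ:

```python
from collections import Counter
from typing import Iterable, List, Tuple


def summarize_fastq(lines: Iterable[str], max_reads: int = 100_000,
                    n_headers: int = 3) -> Tuple[Counter, List[str]]:
    """Return (read-length counts, first few headers) for a FASTQ stream.

    Only the first max_reads records are examined, so this stays fast on
    large files.
    """
    lengths: Counter = Counter()
    headers: List[str] = []
    for i, line in enumerate(lines):
        if i // 4 >= max_reads:
            break
        field = i % 4  # 0 = header, 1 = sequence, 2 = '+', 3 = quality
        text = line.rstrip("\n")
        if field == 0 and len(headers) < n_headers:
            headers.append(text)
        elif field == 1:
            lengths[len(text)] += 1
    return lengths, headers
```

For example, `summarize_fastq(open("SRR10349891.fastq"))` returns the length counts and the first three headers. Read lengths well above 26 nt rule out a barcode-only file, but on their own they cannot show whether a barcode prefix is present, so this is a sanity check rather than proof.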


Thank you for the response, that helps clear things up a bit. On the page you linked, there are two bam files (SF11232_1_possorted_genome_bam.bam.1, SF11232_2_possorted_genome_bam.bam.1). I downloaded both of them and ran the 10x Genomics bamtofastq executable on each, which gave me an R1 and an R2 fastq file per bam file. Do you know why two .bam files were uploaded instead of one (does each .bam file contain unique information)? I am not familiar with any standard practices for uploading these files. Thank you again!

Reply written 5 months ago by awells2