Question: Uniting SRA Data into a single file
2.3 years ago, grecoandreauni (0) wrote:

I want to write R code that reads two or more SRA files and unites them into a single file. The problem is that I don't understand how to work with SRA data in R. From what I have read, the suggestion is either to convert the files to FASTQ or to download them directly in FASTQ format, but even then I still don't understand how I am supposed to load the data into R in the first place.

R fastq sra • 848 views
modified 2.3 years ago by Santosh Anand (4.6k) • written 2.3 years ago by grecoandreauni (0)

I don't think you should do this in R. Please read up on fastq-dump, a tool from the SRA Toolkit (for example: fastq-dump --outdir /opt/fastq/ --split-files /home/[USER]/ncbi/public/sra/SRR925811.sra), convert the SRA files to FASTQ, and work with those. There are far more resources and help available for that format.

modified 2.3 years ago • written 2.3 years ago by Biomonika (Noolean) (3.0k)

Thanks for the reply, but this is for a college project and I was specifically asked to use R. Would it be possible to treat the FASTQ files as text files and simply unite the two text files using R? That would be enough for what I was asked to do.

modified 2.3 years ago • written 2.3 years ago by grecoandreauni (0)

When you say "unite the two txt file using R", do you mean concatenating the files (R1_1, R1_2, R2_1, R2_2) or interleaving the reads from the two files (e.g. R1_1, R2_1, R1_2, R2_2, etc.)?

To answer your question: fastq files are plain text (in their uncompressed form) so you could treat them as text files in terms of readability.
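If plain concatenation is what you need, a minimal base R sketch could look like the following. It assumes uncompressed FASTQ files with exactly four lines per record (no line-wrapped sequences) and reads each file fully into memory, so it is only suitable for small files. The function name concat_fastq and the file names in the usage comment are made up for illustration.

```r
# Concatenate FASTQ files as plain text: all records of the first file,
# followed by all records of the second, and so on.
# Assumption: each record occupies exactly 4 lines (header, sequence, "+", qualities).
concat_fastq <- function(inputs, output) {
  stopifnot(length(inputs) >= 1, all(file.exists(inputs)))
  out <- file(output, open = "wt")
  on.exit(close(out))
  n_records <- 0L
  for (path in inputs) {
    lines <- readLines(path)
    # Refuse files that do not hold a whole number of 4-line records
    if (length(lines) %% 4L != 0L) {
      stop("not a whole number of 4-line records: ", path)
    }
    writeLines(lines, out)
    n_records <- n_records + length(lines) %/% 4L
  }
  n_records  # total number of records written
}

# Hypothetical usage (file names are placeholders):
# concat_fastq(c("SRR000648.fastq", "SRR000657.fastq"), "combined.fastq")
```

Interleaving would need a different loop that alternates four-line records from each file, so it is worth settling which of the two you mean first.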

written 2.3 years ago by genomax (62k)

I mean concatenating. Someone suggested that I use the ShortRead package, so I'll see whether I can achieve what I'm trying to do with that; otherwise I'll try concatenating them as text files. Thanks for your time!
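For reference, a rough sketch of what the ShortRead route might look like. It assumes the Bioconductor package ShortRead is installed and that the SRA runs have already been converted to uncompressed FASTQ. The wrapper name concat_fastq_shortread and the file names in the usage comment are hypothetical.

```r
# Sketch only: concatenate FASTQ files by reading each one with ShortRead
# and appending its reads to a single output file.
concat_fastq_shortread <- function(inputs, output) {
  if (!requireNamespace("ShortRead", quietly = TRUE)) {
    stop("the Bioconductor package 'ShortRead' is required")
  }
  # Start from a clean output file, then append every input in turn
  if (file.exists(output)) file.remove(output)
  for (path in inputs) {
    reads <- ShortRead::readFastq(path)  # loads the reads as a ShortReadQ object
    ShortRead::writeFastq(reads, output, mode = "a", compress = FALSE)
  }
  invisible(output)
}

# Hypothetical usage (file names are placeholders):
# concat_fastq_shortread(c("SRR000648.fastq", "SRR000657.fastq"), "combined.fastq")
```

Unlike treating the files as text, this loads the reads into R objects, so the sequences and qualities can be inspected before anything is written out.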

written 2.3 years ago by grecoandreauni (0)

2.3 years ago, Santosh Anand (4.6k) wrote:

SRA queries in R are handled by the SRAdb package. See an example and more details here.

written 2.3 years ago by Santosh Anand (4.6k)

Thanks for the reply. I already have the SRAdb package and used it to download some SRA files (specifically "SRR000648" and "SRR000657") by following the tutorial. What I would like to do next is load them into R as data and unite them into a single file, but I don't understand how to load an SRA file into R using SRAdb. As far as I can tell, SRAdb only lets me query the database and download SRA files.

modified 2.3 years ago • written 2.3 years ago by grecoandreauni (0)

Could you post the question exactly as it was asked? It is really not clear to me what uniting two files means. And if it is just concatenating two FASTA files, why would someone insist on doing it in R? A basic 'cat' command in the shell will do that.

written 2.3 years ago by Santosh Anand (4.6k)

What I was asked to do was "build a single matrix of profile expression using multiple GEO and SRA data". I have done this with the GEO data, but I had no idea what to do with the SRA data. After talking with other people, I understand that you can't get expression profiles from SRA data directly as you can with GEO, because SRA files contain reads rather than expression profiles.

So I thought the best thing to do next would be to create a single FASTQ file from multiple files, since that is the closest thing to what I was asked.

I talked to the person who asked me to do this, and he said "it's ok if you can't extract profile expression from SRA data as long as you explain why", so I guess part of the confusion stems from the fact that the question itself wasn't formulated correctly.

modified 2.3 years ago • written 2.3 years ago by grecoandreauni (0)