How to write some sequences to a FASTQ-like (.fq) file?
7.8 years ago
Amirosein ▴ 70

Hi

I have a DNAStringSet containing some sequences of length 25. I want to make an .fq file from it, but first I need to add a "specific" header (which is stored in a vector) for each read. The data looks like this:

A DNAStringSet instance of length 2
 width seq                             names               
 [1]    25 AGCTTTTCATTCTGACTGCAACGGG    coli
 [2]    25 ATTTCCTTACTCAACCCCGAAACGC    coli

And the result that I want looks like this:

@coli:1500
AGCTTTTCATTCTGACTGCAACGGG
+
XXXXXXXXXXXXXXXXXXXXXXXX
@coli:1700
ATTTCCTTACTCAACCCCGAAACGC
+
XXXXXXXXXXXXXXXXXXXXXXXX

I can simply build each line and write it to the file with cat(), but this is really slow for large datasets. So I found writeXStringSet(data, format="fastq"), but I can't work out how to prepare my object for this function.

My question is:

How do I prepare my object for this function? Or is there any other easy and quick way of doing this?

Thanks all
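
One quick way that avoids a cat() call per line is to build every four-line record in a single vectorized paste0() and write them all at once. This is only a minimal sketch in base R: the vectors `seqs`, `ids` and `pos` and the output path are hypothetical stand-ins (in practice they would come from something like as.character(data), names(data) and the header vector), and "X" is just a placeholder quality character.

```r
# Hypothetical inputs standing in for the DNAStringSet and the header vector;
# with Biostrings loaded these could be as.character(data) and names(data).
seqs <- c("AGCTTTTCATTCTGACTGCAACGGG", "ATTTCCTTACTCAACCCCGAAACGC")
ids  <- c("coli", "coli")
pos  <- c(1500, 1700)  # the "specific" header values, one per read

# One placeholder quality character per base, so each quality line
# is exactly as long as its sequence line.
quals <- strrep("X", nchar(seqs))

# Build all four-line FASTQ records at once (vectorized over reads).
records <- paste0("@", ids, ":", pos, "\n", seqs, "\n+\n", quals)

# A single write call instead of one cat() per line.
out_file <- tempfile(fileext = ".fq")  # hypothetical output path
writeLines(records, out_file)
```

Each quality line needs the same number of characters as its read (25 here). If you would rather stay inside Biostrings, writeXStringSet() with format="fastq" should also accept a `qualities` argument (a BStringSet of the same shape as the reads), as far as I can tell from its documentation, after setting names(data) to the headers you want.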

No offense, but can I ask why you want to do this? Because fastq is a specific format for sequencing reads, so for example the rows for each read are important:

@read meta info
read
+
read quality scores

If you convert this data the way you've outlined in your post, I'm not entirely clear what your plan is for using it.

Yeah, of course :) I want to use bowtie to find some fragments of one genome in another one, so to send these fragments to bowtie I need to prepare them as reads.

7.8 years ago
mforde84 ★ 1.4k

Oh, you'd be best off using BLASTN actually. There is a command line version as well, besides the one hosted on the NCBI website.
