How to retrieve and download sequence information
9.1 years ago
Affan ▴ 300

I have created a position weight matrix (PWM) based on transcription factor binding sites (TFBS) in FANTOM 4.

In my code (R), I have trained my PWM on the TFBS in chr1 and chr2. Now I want to use this PWM to scan chr3-chr22 to assess its accuracy.

What is the best way to retrieve a "stitched" string of chr3-chr22 (or even individual chromosomes, if a single string is too large)?

I tried using the DAS server, but it doesn't work without giving coordinates.

Doing my own homework, I see that both the BioConductor and SeqinR packages for R can do this, but I can't figure out the right workflow/code to retrieve this information.

For what it's worth, I do have hg18 downloaded as separate .fa files. I am fairly certain that there is a function in SeqinR/BioConductor to read these FASTA files. Is this the best way to do this?
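To make the local-file route concrete, here is a minimal sketch in base R, with no packages required. It assumes each .fa file holds a single record (one ">" header followed by sequence lines) and that the files are named chr3.fa through chr22.fa in one directory; the function names and the directory layout are hypothetical. Converting to uppercase discards any soft-masking, which may or may not be what you want. `read.fasta()` in SeqinR and `readDNAStringSet()` in Biostrings (BioConductor) are the packaged equivalents and are likely the better choice for whole chromosomes.

```r
# Read one single-record FASTA file (e.g. chr3.fa) into one uppercase string.
# Assumption: the file contains exactly one record; with several records,
# their sequences would simply be concatenated.
read_chromosome <- function(path) {
  lines <- readLines(path)
  # Drop header lines (starting with ">") and blank lines.
  seq_lines <- lines[!startsWith(lines, ">") & nzchar(lines)]
  toupper(paste(seq_lines, collapse = ""))
}

# Read several chromosomes into a named list, one string per chromosome.
# Hypothetical layout: files named chr<N>.fa inside `fasta_dir`.
# Files that do not exist are skipped silently.
read_chromosomes <- function(fasta_dir, chrom_numbers = 3:22) {
  paths <- file.path(fasta_dir, paste0("chr", chrom_numbers, ".fa"))
  names(paths) <- paste0("chr", chrom_numbers)
  paths <- paths[file.exists(paths)]
  lapply(paths, read_chromosome)
}
```

Keeping one string per chromosome, rather than pasting everything into a single string, avoids spurious PWM hits that span the junction between two chromosomes and keeps memory use per object manageable.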

sequencing databases • 1.9k views

What do you mean by a 'stitched' string of chr3-chr22? It sounds as if you want the genomic sequences of the chromosomes pasted together, but I suspect you actually want something else.

