Question: Obtaining Random Sequences From Genbank
Asked 8.5 years ago by Anima Mundi (Italy):

Hello,

I would like to know whether there is a way to obtain random sequences from GenBank's RefSeq collection. I would also like to know whether a list of valid IDs for the different sequence classes is available somewhere.

list sequence refseq random genbank • 1.7k views

Please clarify "random sequences". Do you want to retrieve sequences at random? If so, why? Or do you want to generate random sequences using a RefSeq sequence as a seed?

Reply written 8.5 years ago by Neilfws
Answered 8.5 years ago by Bio_X2Y (Ireland):

The most comprehensive way of getting a full list of RefSeq IDs and sequences would be to download the large release files from RefSeq via their FTP site. Beware - the file structure is complex, and you will need to do some background reading to figure it out. You will also need to decide things like:

  • do you want all sequences, or just one species, e.g. human?
  • do you care which version of RefSeq you are looking at?

A simpler (but less comprehensive) method is to download a file from the UCSC Table Browser. For example, select "RefSeq Genes" in the track field and "refGene" in the table field, then click "get output". This will generate a file listing about 40,000 human NM and NR sequences. (You can get other species by selecting different options.) This list will be a subset of the full RefSeq release, but should be good enough for most purposes.

Once you have the list of IDs, you can look up the corresponding sequences via the RefSeq interface.
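
As a rough illustration of the Table Browser route, the sketch below pulls the unique accessions out of a tab-separated download. It assumes the file keeps the Table Browser header line (starting with `#`) and that the accession sits in a column called `name`, as it does in the refGene table. The function name `read_refseq_ids` and the sample rows are made up for the example.

```python
import csv
import io


def read_refseq_ids(handle, id_column="name"):
    """Return the sorted, de-duplicated accessions from a tab-separated table.

    Assumes the first line is a header and that `id_column` names the
    column holding the accession (an assumption about the download).
    """
    reader = csv.reader(handle, delimiter="\t")
    # The header's first field carries a leading '#', e.g. '#bin'; strip it.
    header = [field.lstrip("#") for field in next(reader)]
    idx = header.index(id_column)
    # The same accession can appear on several rows (e.g. alternate loci),
    # so collect into a set before sorting.
    ids = {row[idx] for row in reader if len(row) > idx and row[idx]}
    return sorted(ids)


# Made-up rows standing in for a real download.
example = io.StringIO(
    "#bin\tname\tchrom\n"
    "0\tNM_000001\tchr1\n"
    "0\tNR_000002\tchr2\n"
    "1\tNM_000001\tchr1_alt\n"
)
ids = read_refseq_ids(example)
print(ids)
```

With a real file you would pass `open("refGene.txt", newline="")` instead of the in-memory example; the file name is only a placeholder.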

Answered 8.5 years ago by Anima Mundi (Italy):

Yes, I want to retrieve random sequences. By random, I mean sequences chosen from a group of RefSeq genes in as unbiased a way as possible. I am interested in generating lists of randomly grouped genes to use as a control.

The Table Browser method helps, thank you. Indeed, I am interested only in species-specific sequences, and I find it easy to repeat the download periodically.
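
One way to build such control groups, sketched here under the assumption that a species-specific list of IDs is already in hand: draw each group uniformly, without replacement, using Python's `random` module. The helper name `random_control_groups` and the placeholder IDs are hypothetical, not part of any RefSeq or UCSC tool.

```python
import random


def random_control_groups(ids, group_size, n_groups, seed=None):
    """Draw `n_groups` independent random groups of `group_size` IDs.

    Within a group no ID is repeated; different groups may overlap
    because each one is sampled from the full pool.
    """
    pool = sorted(set(ids))  # de-duplicate so every ID is equally likely
    if group_size > len(pool):
        raise ValueError("group_size exceeds the number of unique IDs")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return [rng.sample(pool, group_size) for _ in range(n_groups)]


# Placeholder IDs standing in for a real accession list.
pool = [f"ID_{i}" for i in range(100)]
groups = random_control_groups(pool, group_size=10, n_groups=3, seed=42)
for group in groups:
    print(group)
```

Sampling from the de-duplicated pool matters because a table can list the same accession on several rows, which would otherwise give those genes a higher chance of being picked.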


Hi wiee, two quick tips - if you want to clarify a question, you should either (a) edit your original question, or (b) post a comment under your question. You should not post your clarification as an answer since this is confusing for other readers. Also, I see you have now opened at least 6 accounts in BioStar. The intended way to use BioStar is to reuse a single account - this way you accumulate reputation, and other people can get to know you as part of the community.

Reply written 8.5 years ago by Bio_X2Y

I never meant to cause any trouble here. The reason I have several accounts is that I never actually registered as a user: every time my cookies change, the account changes too. I thought this was acceptable, since I just filled in the "Ask question" form; the multiple accounts are a result of how the forum works technically. For the same reason, when the site does not recognise me as wiee I lose the ability to comment, and that forces me to reply to others' comments in a new answer.

Reply written 8.5 years ago by Anima Mundi

I really appreciate this site, but if you think that registration should be mandatory, then in my opinion you should not allow guests to post. By using the same nickname repeatedly, I intended to be recognisable, but I consider that a courtesy, not a duty. Sorry for the frankness, and thank you anyway for the time you spent helping me.

Reply written 8.5 years ago by Anima Mundi