Looking For Fasta Files Of Intergenic Regions For Mouse (Mm9) And Worm (Ce10)
11.0 years ago

I'm looking for a fasta file of all intergenic regions for mouse (mm9) and worm (ce10). Ideally, I want to find the equivalent of the NotFeature.fasta file for yeast (S288c), which can be found here:

http://downloads.yeastgenome.org/sequence/S288C_reference/intergenic/

This file "contains DNA sequences which are not contained within a feature. Features include ORF, ARS, CEN, rRNA, tRNA, snRNA, snoRNA, RNA genes, LTRs, telomeric elements and transposons."

Any pointers?

11.0 years ago
neal.platt ▴ 240

If you can find annotations for all the features you are interested in and convert them to BED format, you can use the complementBed program in Bedtools (bedtools complement) to produce a BED file of the regions lacking annotations. It expects the annotations sorted by chromosome and start position, plus a genome file listing the size of each chromosome. You can then use getfasta (again, from Bedtools) with the genome FASTA file to extract the sequences of the non-annotated regions produced by complementBed.

I am sure there are more efficient ways but this will get you started.
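
To make the two steps above concrete, here is a minimal pure-Python sketch of what the complement and sequence-extraction steps compute. It is not Bedtools itself: the function names (complement_intervals, to_fasta), the toy genome, and the feature coordinates are all made up for illustration, and the whole genome is assumed to fit in memory as a dict. Coordinates are 0-based and half-open, as in BED.

```python
from collections import defaultdict


def complement_intervals(features, chrom_sizes):
    """Return BED-style (chrom, start, end) regions not covered by any feature.

    Coordinates are 0-based and half-open, as in BED files.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))

    gaps = []
    for chrom in sorted(chrom_sizes):
        size = chrom_sizes[chrom]
        cursor = 0  # first position not yet covered by a feature
        for start, end in sorted(by_chrom.get(chrom, [])):
            if start > cursor:
                gaps.append((chrom, cursor, start))
            # Overlapping or nested features only ever push the cursor forward.
            cursor = max(cursor, end)
        if cursor < size:
            gaps.append((chrom, cursor, size))
    return gaps


def to_fasta(regions, genome):
    """Format regions as FASTA records, given a dict of chrom -> sequence."""
    records = []
    for chrom, start, end in regions:
        records.append(f">{chrom}:{start}-{end}\n{genome[chrom][start:end]}")
    return "\n".join(records)


# Toy data (hypothetical, for illustration only).
genome = {"chrI": "ACGTACGTACGTACGTACGT"}  # 20 bp
features = [("chrI", 2, 6), ("chrI", 4, 9), ("chrI", 15, 18)]
chrom_sizes = {chrom: len(seq) for chrom, seq in genome.items()}

intergenic = complement_intervals(features, chrom_sizes)
print(to_fasta(intergenic, genome))
```

For real mm9 or ce10 data you would still want Bedtools, since it streams the files instead of loading everything into memory. The sketch is only meant to show why the features have to be sorted and why the chromosome sizes are needed: without them the last gap on each chromosome cannot be closed.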


Thanks Neal, this is a good solution. Bedtools is amazingly useful.

For mouse, I've decided to use the 1000 bp segments upstream of all annotated transcription start sites, which can be downloaded directly here:

http://hgdownload-test.cse.ucsc.edu/goldenPath/mm9/bigZips/
