Retrieve genes with start and end position located on the mouse chromosome 8 between positions 10000 and 1000000
3.7 years ago

I need to annotate some data, so I need all genes whose start and end positions lie on mouse chromosome 8 between positions 10,000 and 1,000,000 (this is just an example).

gene genome R • 879 views

big thanks already - will try this out now

3.7 years ago

Get GFF3-formatted mouse gene annotations from GENCODE, e.g. release M25 for mm10:

$ wget -qO- ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M25/gencode.vM25.annotation.gff3.gz \
    | gunzip --stdout - \
    | awk '$3 == "gene"' - \
    | convert2bed -i gff - \
    > gencode.vM25.genes.bed

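Then use bedops from the BEDOPS toolkit to retrieve genes for your ad-hoc range of interest. The -e 100% option (--element-of) keeps only genes that overlap the range over their entire length: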

$ echo -e 'chr8\t10000\t1000000' | bedops -e 100% gencode.vM25.genes.bed - > answer.bed

The file answer.bed will contain M25/mm10 gene annotations contained entirely within the ad-hoc interval on chr8.
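
If BEDOPS is not installed, the same containment test can be approximated with plain awk on the BED file. This is only a minimal sketch and not part of the answer above: the function name genes_within is hypothetical, and it assumes the first three BED columns are chromosome, start and end, compared exactly as stored in the file.

```shell
# Hypothetical helper: print BED records that lie entirely inside
# the interval chrom:start-end.
# Assumes columns 1-3 are chromosome, start, end (standard BED layout).
genes_within() {
    awk -F '\t' -v c="$2" -v s="$3" -v e="$4" \
        '$1 == c && $2 + 0 >= s + 0 && $3 + 0 <= e + 0' "$1"
}

# Usage, mirroring the interval used above:
# genes_within gencode.vM25.genes.bed chr8 10000 1000000 > answer.bed
```

A gene that only partly overlaps the interval is dropped, which matches the intent of requiring 100% overlap.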

3.7 years ago
GenoMax 141k

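Using Entrez Direct (EDirect). The command below lists genes on mouse chromosome 8 (RefSeq accession NC_000074) with their start, end and orientation; you can filter by position as needed.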

$ esearch -db gene -query "Mus musculus [ORGN]" | efetch -format tabular | grep "NC_000074" | awk -F "\t" '{OFS="\t"}{print $6,$13,$14,$15}'
Casp3   46617291        46639698        plus
Cdh1    106603350       106670247       plus
Hmox1   75093618        75100593        plus
Tubb3   123411553       123422015       plus
Itgb1   128685654       128733579       plus
Mmp2    92827290        92853421        plus
Il15    82331624        82403252        minus
Insr    3150922         3279649         minus
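
To narrow that table down to a coordinate window, one more awk step can be appended to the pipeline. The sketch below is an assumption-laden illustration rather than part of the original answer: the function name filter_by_range is hypothetical, and it expects tab-separated rows of symbol, start, end and orientation on standard input, as printed by the command above.

```shell
# Hypothetical helper: read "symbol<TAB>start<TAB>end<TAB>orientation" rows
# on stdin and keep those lying entirely inside [min, max].
# Rows with an empty start or end field are skipped.
filter_by_range() {
    awk -F '\t' -v min="$1" -v max="$2" \
        '$2 != "" && $3 != "" && $2 + 0 >= min + 0 && $3 + 0 <= max + 0'
}

# Usage: append to the esearch pipeline above, e.g.
# ... | awk -F "\t" '{OFS="\t"}{print $6,$13,$14,$15}' | filter_by_range 10000 1000000
```

None of the example rows shown above fall between 10000 and 1000000, so with that window they would all be filtered out; a window of 3000000 to 4000000 would keep only Insr.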