Question: Gene symbol list
siyavash_damdar wrote, 16 months ago:

Hi, I have many genomic coordinates (approximately 1000). How can I find the gene symbols located at these coordinates? Thanks, Siavash

Tags: snp chip-seq next-gen sequence
See this post, it may help.

Is There An Easy Way Of Getting Gene Symbols From Genomic Coordinates?

(comment written 16 months ago by natasha.sernova)
Alex Reynolds (Seattle, WA USA) wrote, 16 months ago:

You can use BEDOPS bedmap with the --echo-map-id-uniq option to map IDs from a BED file of gene annotations onto a list of intervals of interest:

$ bedmap --echo --echo-map-id-uniq coordinates.bed genes.bed > answer.bed

You supply coordinates.bed yourself. It must be sorted, because bedmap expects sorted input; BEDOPS sort-bed will do this.
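
If sort-bed is not at hand, a plain sort(1) call can produce a comparable ordering. This is only a sketch: the file names demo_coordinates.bed and demo_coordinates.sorted.bed are made up for illustration, and the input is assumed to be plain tab-delimited BED.

```shell
# Hypothetical demo input: three unsorted intervals (tab-delimited BED3).
printf 'chr2\t500\t600\nchr1\t2000\t2100\nchr1\t100\t200\n' > demo_coordinates.bed

# Sort by chromosome name (byte order), then numerically by start and end.
# LC_ALL=C keeps the chromosome ordering independent of the current locale.
LC_ALL=C sort -k1,1 -k2,2n -k3,3n demo_coordinates.bed > demo_coordinates.sorted.bed

cat demo_coordinates.sorted.bed
```

The sorted file, rather than the original, is what would then be passed to bedmap.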

How you generate genes.bed depends on your organism and reference genome. Here's an example of how to get this file for human hg38:

$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_27/gencode.v27.annotation.gff3.gz \
    | gunzip -c - \
    | awk '$3 == "gene"' - \
    | convert2bed -i gff - \
    | awk -vOFS="\t" '{ match($0, /gene_name=(.*);level/, a); $4=a[1]; print $0; }' - \
    > genes.bed

Going back to the bedmap --echo-map-id-uniq command, the file answer.bed will contain each interval from coordinates.bed, along with the HGNC symbols of the Gencode v27 gene annotations that overlap it.
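
One caveat on the awk step: the three-argument form of match() is a GNU awk (gawk) extension, so that line may fail under other awk implementations. A minimal portable sketch of the same extraction, using only POSIX match(), RSTART and RLENGTH, is below. The attribute string is made up for illustration and is not taken from the Gencode file.

```shell
# Hypothetical GFF3-style attribute string, for illustration only.
attrs='ID=gene1;gene_id=gene1;gene_type=protein_coding;gene_name=ABC1;level=2'

# match() records where "gene_name=<value>" starts and how long it is;
# substr() then skips the 10 characters of the "gene_name=" prefix.
symbol=$(printf '%s\n' "$attrs" \
    | awk '{
        if (match($0, /gene_name=[^;]+/))
            print substr($0, RSTART + 10, RLENGTH - 10)
    }')

echo "$symbol"
```

In the pipeline above, the same match()/substr() pair could be used to set $4 in place of a[1]. Because it stops at the next semicolon, it also does not depend on level being the attribute that follows gene_name.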

cpad0112 (India) wrote, 16 months ago:

Intersect your coordinates with a gene annotation file using bedtools intersect.
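
A minimal sketch of what that might look like, assuming a BED file of gene annotations with the symbol in column 4. The demo file names and the GENE_A/GENE_B records are invented for illustration; substitute your own coordinates and annotation files.

```shell
# Hypothetical demo inputs (tab-delimited); in practice use your own files.
printf 'chr1\t150\t160\n' > demo_query.bed
printf 'chr1\t100\t200\tGENE_A\nchr1\t500\t600\tGENE_B\n' > demo_genes.bed

# -wa -wb writes each query interval next to every gene record it overlaps.
if command -v bedtools >/dev/null 2>&1; then
    bedtools intersect -a demo_query.bed -b demo_genes.bed -wa -wb > demo_overlaps.bed
    cat demo_overlaps.bed
else
    echo "bedtools not found on PATH" >&2
fi
```

Each output line pairs one query interval with one overlapping gene, so an interval that overlaps several genes appears on several lines.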
