Find closest gene to chromosome location
8.4 years ago
D H ▴ 20

Hello!

I have started a gene enrichment analysis (I haven't done this before), and I have a dataset that contains gene expression data. The dataset has a column with the gene names.

However, some entries in that column are chromosome locations instead of gene names. I want to find the gene closest to each of these chromosome locations.

I'm using R 2.15.2 (if that helps).

What are my options?

Thank you in advance!

gene R • 4.8k views
8.4 years ago
igor 13k

bedtools closest
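
For readers new to the tool, here is a minimal sketch of how such a call might look. The file names, coordinates and gene names (GENE_A, GENE_B) are made up for illustration; in practice the gene file would come from an annotation for your genome build. Both inputs are sorted first, since bedtools closest expects sorted input, and the call is skipped if bedtools is not installed.

```shell
# Hypothetical query location and gene annotation (chr, start, end[, name]).
printf 'chr1\t1000\t2000\n' > locations.bed
printf 'chr1\t2600\t4000\tGENE_B\nchr1\t100\t500\tGENE_A\n' > genes.bed

# bedtools closest expects both inputs sorted by chromosome, then start.
sort -k1,1 -k2,2n locations.bed > locations.sorted.bed
sort -k1,1 -k2,2n genes.bed > genes.sorted.bed

# -a is the query file, -b the annotation; -d appends the distance in bp.
if command -v bedtools >/dev/null 2>&1; then
    bedtools closest -a locations.sorted.bed -b genes.sorted.bed -d > closest.bed
    cat closest.bed
else
    echo "bedtools not found on PATH; skipping the closest step" >&2
fi
```

Each output line holds the query location, the nearest gene record, and the distance between them as the last column.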


I see that this utility takes as an input a BED file. I only have a csv file though.

Is it possible to convert csv to BED?

(I'm sorry for the questions but I'm very new to this)


The simplest version of a BED file is three tab-separated columns (chromosome, start position, end position). You can use Excel to extract those three columns and save them as a tab-delimited text file.
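
If you would rather skip Excel, the same three-column file can be produced on the command line. The sketch below is only an illustration under assumptions: the file name expression.csv, its contents, and the chr:start-end spelling of the locations are all hypothetical, and the CSV is assumed to have no quoted fields or embedded commas. The coordinates are copied unchanged, so check whether your source uses 0-based or 1-based starts (BED starts are 0-based).

```shell
# Hypothetical sample input: the first column mixes gene names and locations.
cat > expression.csv <<'EOF'
gene,sample1,sample2
BRCA1,5.2,6.1
chr2:3000-3500,1.4,2.2
chr1:1000-2000,3.3,0.9
TP53,7.0,6.8
EOF

# Skip the header, keep rows whose first field looks like chr:start-end,
# and write them as tab-separated chr, start, end, sorted by position.
awk -F',' 'NR > 1 && $1 ~ /^chr[^:]+:[0-9]+-[0-9]+$/ {
    split($1, a, /[:-]/)
    print a[1] "\t" a[2] "\t" a[3]
}' expression.csv | sort -k1,1 -k2,2n > locations.bed

cat locations.bed
```

Rows with ordinary gene names are dropped, so locations.bed contains only the entries that still need a gene assigned.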


[Disparaging comment about Excel deleted]


If you use Excel, be sure to clean up the exported file. Excel can save tab-delimited text files, but it writes them with non-Unix line endings.

If you exported from Excel on Mac, convert the carriage returns to newlines:

$ tr '\r' '\n' < input.fromExcelForMac.txt | sort-bed - > input.fixed.bed

If you exported your data from Excel on Windows, strip the carriage returns instead:

$ tr -d '\r' < input.fromExcelForWindows.txt | sort-bed - > input.fixed.bed

Then run closest-features (part of BEDOPS, like sort-bed) to find the nearest feature for each location. The features.bed file must be sorted as well:

$ closest-features --closest input.fixed.bed features.bed > answer.bed