Question: Identify the nearest common SNP to a given rare variant
bisansamara wrote (15 months ago):

I have a list of rare single nucleotide variations (SNVs), and I would like to identify the nearest common SNP on either side of each one.


The input information would be the coordinates of the SNVs (something like this: chr7:127991052, chr18:321720, chr5:76174154, etc.). The output I need is the nearest SNP to each SNV (both upstream and downstream).


What tools could perform this task? Your help is much appreciated!

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote (15 months ago):

Assuming the file is sorted, print the matching line together with the line before and the line after it:

grep -A 1 -B 1 -F 'chr7:127991052' input.txt
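
If the rare variants and the common SNPs sit in two separate lists, the same sorted-neighbour idea can be sketched with sort and awk. This is only an illustration under assumptions, not something from the answer above: the file names rare.txt, common.txt, merged.tsv and nearest.tsv are made up, the toy coordinates for the common SNPs are invented, and both lists are assumed to hold one chrom:pos entry per line.

```shell
# Hypothetical inputs, one "chrom:pos" per line (file names and data are assumptions).
printf '%s\n' chr7:127991052 chr18:321720 > rare.txt
printf '%s\n' chr7:127990000 chr7:127995000 chr18:300000 chr5:100 > common.txt

# Tag every record with its origin, then sort by chromosome and numeric position.
{
  awk -F: '{ print $1 "\t" $2 "\trare" }' rare.txt
  awk -F: '{ print $1 "\t" $2 "\tcommon" }' common.txt
} | LC_ALL=C sort -k1,1 -k2,2n > merged.tsv

# Walk the sorted records once. Each rare variant is held back until the next
# common SNP on the same chromosome shows up (or the chromosome ends).
awk -F'\t' '
  # Print all held-back rare variants with their upstream SNP and the given downstream SNP.
  function emit(down,   i) {
    for (i = 1; i <= n; i++) print pending[i] "\t" up[i] "\t" down
    n = 0
  }
  $1 != chrom    { emit("NA"); chrom = $1; last = "NA" }  # new chromosome: no downstream SNP
  $3 == "common" { emit($2); last = $2; next }            # common SNP closes pending variants
                 { n++; pending[n] = $1 ":" $2; up[n] = last }
  END            { emit("NA") }
' merged.tsv > nearest.tsv

cat nearest.tsv
```

Each output line holds the rare variant, the position of the nearest common SNP upstream, and the nearest one downstream, with NA where there is none on that chromosome. With the toy data above, chr18:321720 gets 300000 upstream and NA downstream, and chr7:127991052 gets 127990000 and 127995000.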

Alex Reynolds (Seattle, WA USA) wrote (15 months ago):

Via BEDOPS:

$ vcf2bed < list.vcf > list.bed
$ vcf2bed < snps.vcf > snps.bed
$ closest-features list.bed snps.bed > answer.bed

This seems like a good tool, but it's taking forever to convert the dbSNP file (downloaded from NCBI) from csv to bed. Any suggestions? Note that I'm using Cygwin on Windows to run the commands.

(reply written 15 months ago by bisansamara)

To get around limitations in /tmp, you could specify an alternative temporary directory for sorting. Add --sort-tmpdir=<dir> and (for example) --max-mem=2g to the conversion command, so that sorting writes its intermediate files to that directory instead of /tmp and uses 2 GB of system memory. See --help or the online documentation for more detail.

(reply written 15 months ago by Alex Reynolds)