No overlap of chromosome and chromosome end position in RSID update
20 months ago
hi.there • 0

So I've moved this question to a new post. I am new to genetic data preprocessing so forgive me if this is a novice mistake.

I've been trying to update rsids on a bim file in ADNI based on chromosome and chromosome end positions.

I've extracted the RSIDs, chromosomes, and chromosome end positions from the ADNI bim file and put them in a separate file.

I have additionally gone to UCSC and downloaded every rsid with its chromosome and chromosome positions via these commands:

curl -O https://hgdownload.gi.ucsc.edu/goldenPath/hg38/database/snp151Common.txt.gz

gunzip -c snp151Common.txt.gz | cut -f 2,4,5,12,17 | grep single.exact | cut -f 1-3 > onlySNPs.tsv

sort -k3 -u onlySNPs.tsv | sort -k1,2 -u > onlySNPs.uniqLocAndId.tsv

I have then tried to join the two files on chromosome and chromosome position by concatenating the two fields in both files.

awk 'FNR==NR{a[$1]=$2 FS $3;next}{ print $0, a[$1]}' updatedonlySNPS.uniqLocAndId.txt fromOriginalBim.txt > combined.txt

Unfortunately, no third column is generated, which leads me to assume that there is no overlap of chromosome and chromosome end position between the two files. As a sanity check, I searched for matches for the first 30 rows, and indeed there are none. Does anybody know what I may be doing wrong? I have noticed that, in addition to RSIDs, the bim id column contains ids with common variant numbers prefixed with 'CNVI' and ids prefixed with 'MITO'. Any help/education would be deeply appreciated.

plink chrom update rsid • 477 views

I have then tried to join the two files

Use join: https://linux.die.net/man/1/join
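
A minimal sketch of what a join-based approach could look like, on made-up toy data. The file names (ucsc.tsv, toy.bim, matched.tsv) and their contents are hypothetical. It assumes the UCSC side carries a "chr" prefix on chromosome names while the bim file uses bare chromosome codes, so the prefix is stripped before building a "chrom:pos" key. It also assumes both files are on the same genome build, which is worth checking for the real data since the download above is hg38.

```shell
#!/bin/sh
# Toy example only: writes a few small files into the current directory.
set -eu
export LC_ALL=C   # use the same collation for sort and join

# UCSC-style rows (tab-separated): chrom, chromEnd, rsID.
printf 'chr1\t1000\trs111\nchr2\t2000\trs222\n' > ucsc.tsv

# PLINK .bim rows: chrom, variant ID, cM, bp position, allele 1, allele 2.
printf '1\tSNP_A\t0\t1000\tA\tG\n2\tSNP_B\t0\t2000\tC\tT\n3\tSNP_C\t0\t3000\tG\tA\n' > toy.bim

# Build a "chrom:pos" key in column 1 of each file. The "chr" prefix is
# dropped on the UCSC side (assumption: the bim file has no such prefix).
# join requires both inputs to be sorted on the join field.
awk -F'\t' -v OFS='\t' '{ sub(/^chr/, "", $1); print $1 ":" $2, $3 }' ucsc.tsv \
  | sort -k1,1 > ucsc.keyed.tsv
awk -v OFS='\t' '{ print $1 ":" $4, $2 }' toy.bim \
  | sort -k1,1 > bim.keyed.tsv

# Keep only keys present in both files. Output columns: key, old ID, rsID.
join -t "$(printf '\t')" bim.keyed.tsv ucsc.keyed.tsv > matched.tsv
cat matched.tsv
# 1:1000	SNP_A	rs111
# 2:2000	SNP_B	rs222
```

If the real files still produce no matches after this, comparing a few keys from each keyed file side by side (for example with head) should show whether the chromosome naming or the coordinates differ.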
