How to convert dbSNP from UCSC to GTF
8 months ago
octpus616 ▴ 100

Hi,

I am trying to use software that requires the full dbSNP annotation from UCSC in GTF format.

The UCSC Table Browser offers an online service that can export tables in GTF format, but it does not work when the file is large.

I also tried downloading dbSNP from UCSC, e.g. https://hgdownload.cse.ucsc.edu/goldenpath/hg38/database/snp151.txt.gz, and converting it with genePredToGtf, but that does not work either and fails with:

invalid unsigned integer: "A"

So, how can I get the dbSNP data from UCSC in GTF format?


GTF is usually used for genes. It's not a very efficient format for SNPs. Which software requires GTF for SNPs?


REDItools, a tool for detecting RNA editing. See https://github.com/BioinfoUNIBA/REDItools/blob/master/README_1.md#filtertablepy for more detail.

8 months ago

genePredToGtf cannot work here: it converts genePred format to GTF, so the input must be in genePred format. The dbSNP tables are not in genePred format because they are not gene predictions. GTF is a poor match for this type of data, and REDItools made an odd choice, since GTF has fields for exons and frames that don't make sense for SNPs. Either way, you can use awk to convert the file to a very bare-bones GTF file, which is probably good enough for this software:

curl -s https://hgdownload.cse.ucsc.edu/goldenpath/hg38/database/snp151.txt.gz | zcat | awk '{OFS="\t"; print ($2,"ucsc", "snp", $3,$4, "1", "+", "0", "")}' > snp151.gtf
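A slightly stricter variant of the one-liner above, as a sketch only. UCSC database tables store 0-based starts while GTF coordinates are 1-based, so this version adds 1 to chromStart. It also writes "." for the score and frame columns and puts the rs ID (column 5, `name`) into an attribute. The function name `snp_to_gtf`, the attribute key `snp_id`, the handling of zero-length insertions, and the sample rows are all illustrative assumptions; nothing here is known to be required by REDItools.

```shell
# Convert UCSC snpNNN.txt rows (tab-separated, on stdin) to bare-bones GTF (stdout).
# Columns used: $2 = chrom, $3 = chromStart (0-based), $4 = chromEnd, $5 = name (rs ID).
snp_to_gtf() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    gtf_start = $3 + 1                           # GTF is 1-based, UCSC tables are 0-based
    gtf_end = $4
    if (gtf_end < gtf_start) gtf_end = gtf_start # assumption: pad zero-length insertions to 1 bp
    print $2, "ucsc", "snp", gtf_start, gtf_end, ".", "+", ".", "snp_id \"" $5 "\";"
  }'
}

# Two made-up rows (hypothetical IDs): a 1 bp SNP and a zero-length insertion.
printf '585\tchr1\t10019\t10020\trsTEST1\n585\tchr1\t10055\t10055\trsTEST2\n' | snp_to_gtf

# With the real table it would be used like the original command:
#   curl -s https://hgdownload.cse.ucsc.edu/goldenpath/hg38/database/snp151.txt.gz | zcat | snp_to_gtf > snp151.gtf
```

Setting the field separator to a tab explicitly avoids mis-splitting if any column contains spaces. The strand is still hard-coded to "+", as in the original command.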

It seems to work, thanks for the help. I also think that using GTF for these filtering steps is strange, so I am considering writing a program that uses VCF or BED files to filter the results instead.
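If the filtering were rebuilt around BED, the conversion from the same UCSC table could be sketched as below. The table's chromStart/chromEnd are already 0-based and half-open, like BED, so columns 2 to 5 (chrom, chromStart, chromEnd, name) can pass through unchanged. The function name `snp_to_bed` and the sample row are hypothetical.

```shell
# Keep chrom, chromStart, chromEnd and name from a UCSC snpNNN.txt stream as 4-column BED.
# cut splits on tabs by default, which matches the table dump.
snp_to_bed() {
  cut -f2-5
}

# One made-up row with a hypothetical ID.
printf '585\tchr1\t10019\t10020\trsTEST1\t0\t+\n' | snp_to_bed
```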
