UCSC Table Browser Filter Constraints for MAF > 5%
3.7 years ago
dumeir ▴ 20

Hi,

I've been trying to obtain SNPs that have a MAF > 5% with the UCSC Table Browser. I tried using a free-form query of:

minorAlleleFreq > 0.05

but the SNP output is the same each time and does not seem to have been filtered (the second-to-last row of the sample below has a MAF < 0.05).

chrom   chromStart  chromEnd    name    minorAlleleFreq
Filtering on 0 columns
chr1    10177   10179   rs367896724 0.425319,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,
chr1    10352   10353   rs555500075 0.4375,-inf,-inf,-inf,-inf,0.00381679,-inf,-inf,-inf,-inf,-inf,-inf,
chr1    11007   11008   rs575272151 0.0880591,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,
chr1    11011   11012   rs544419019 0.0880591,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,
chr1    13109   13110   rs540538026 0.0267572,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,
chr1    13115   13116   rs62635286  0.0970447,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,-inf,

Screenshot of Filter section:



Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

Please see How to add images to a Biostars post to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here)


Thanks for your help!


The table schema shows that minorAlleleFreq is not a numeric column, so it cannot be filtered with the kind of simple free-form query you're using. I'm not sure how to achieve what you want.

dbSNP153 table schema from UCSC

3.6 years ago
Luis Nassar ▴ 650

Hello,

Unfortunately, there is no way to complete this query in the Table Browser. As RamRS pointed out, the minorAlleleFreq field in dbSnp153Common.bb is not a numeric column. It is an array indexed by frequency-reporting project, with 1000Genomes first, which should cover almost all of the > 5% variants.
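To make the array layout concrete, here is a small sketch that splits one minorAlleleFreq value copied from the question's output and prints the positions that carry a frequency. The variable names are only illustrative, and reading `-inf` as "no frequency reported for that project" is an assumption based on the sample output, not something confirmed here.

```shell
# One minorAlleleFreq value copied from the sample output in the question
freqs='0.4375,-inf,-inf,-inf,-inf,0.00381679,-inf,-inf,-inf,-inf,-inf,-inf,'

# Split on commas and print "index value" for every slot that holds a number.
# Assumption: -inf marks a project that reported no frequency; the empty
# field after the trailing comma is skipped as well.
reported=$(printf '%s\n' "$freqs" | awk -F',' '{
    for (i = 1; i <= NF; i++)
        if ($i != "" && $i != "-inf") print i, $i
}')
printf '%s\n' "$reported"
```

Position 1 is the 1000Genomes slot described above; any other position that prints belongs to a different reporting project.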

You can, however, use the bigBedToBed utility from the command line, supplying the URL to the UCSC download server directly to get the information you are looking for. Our utilities can be found here: http://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads

Once you have downloaded the utility for your respective operating system, you can run the following on your terminal to extract all the variants from the 1000 Genomes project with minor allele frequency > 5%:

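bigBedToBed https://hgdownload.soe.ucsc.edu/gbdb/hg38/snp/dbSnp153Common.bb stdout | \
    awk -F"\t" '
                { tgpFreq = $10;                    # column 10 holds the minorAlleleFreq array
                  sub(/,.*/, "", tgpFreq);          # keep only the first entry (1000 Genomes)
                  if (tgpFreq + 0 > 0.05) {print;}  # "+ 0" forces a numeric comparison
                }' > dbSnp153_1000Genomes_Gt5Pct.bed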
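As a quick offline check of the filter logic, the sketch below applies the same first-entry test to the six sample rows from the question. It assumes the five-column Table Browser layout shown there, so the frequency array is column 5 rather than column 10, and the arrays are shortened to their first two entries for readability. The variable names `rows` and `kept` are hypothetical.

```shell
# Sample rows from the question, tab-separated; printf repeats the format
# for each group of five arguments.
rows=$(printf '%s\t%s\t%s\t%s\t%s\n' \
    chr1 10177 10179 rs367896724 '0.425319,-inf,' \
    chr1 10352 10353 rs555500075 '0.4375,-inf,' \
    chr1 11007 11008 rs575272151 '0.0880591,-inf,' \
    chr1 11011 11012 rs544419019 '0.0880591,-inf,' \
    chr1 13109 13110 rs540538026 '0.0267572,-inf,' \
    chr1 13115 13116 rs62635286  '0.0970447,-inf,')

# Print the rs ID of every row whose first array entry exceeds 0.05.
# Adding 0 forces awk to compare numbers rather than strings.
kept=$(printf '%s\n' "$rows" | awk -F'\t' '{
    freq = $5
    sub(/,.*/, "", freq)
    if (freq + 0 > 0.05) print $4
}')
printf '%s\n' "$kept"
```

Five of the six IDs should be printed; rs540538026, whose first entry is 0.0267572, is the one that gets dropped.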

If you have any follow-up questions, our help desk can always be reached at genome@soe.ucsc.edu. You may also send questions to genome-www@soe.ucsc.edu if they contain sensitive data. For any Genome Browser questions on Biostars, the UCSC tag is the best way to ensure visibility by the team.


Please use the 101010 button instead of <pre> tags - the tags take away from the site's functionality.

