Get GC Content from UCSC with perl script (mysql)

7.3 years ago

Shicheng Guo ★ 9.5k

Hi All,

I have a BED file containing hundreds of human genomic regions. I want to get some basic characteristics for these regions, such as GC content. Is there a Perl script that could do this without downloading the FASTA files for these regions?

I know that if you download the FASTA files for these regions, you can use the following script to calculate GC content:

http://alrlab.research.pdx.edu/aquificales/scripts/get_gc_content.pl
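For readers who cannot reach the linked script, here is a minimal sketch of the same idea, written in Python rather than Perl. It is not the linked script: the function names `gc_content` and `gc_per_record` are hypothetical, and it assumes that GC content means (G + C) / (A + C + G + T), with N and other ambiguous bases ignored.

```python
from typing import Dict, Iterable, List, Optional


def gc_content(seq: str) -> float:
    """Return the GC fraction of seq, counting only A, C, G and T."""
    seq = seq.upper()  # soft-masked (lowercase) bases count too
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # A sequence with no unambiguous bases gets 0.0 instead of an error.
    return gc / total if total else 0.0


def gc_per_record(fasta_lines: Iterable[str]) -> Dict[str, float]:
    """Map each FASTA header (without '>') to the GC fraction of its sequence."""
    sequences: Dict[str, List[str]] = {}
    name: Optional[str] = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            sequences[name] = []
        elif name is not None:
            sequences[name].append(line)
    return {n: gc_content("".join(parts)) for n, parts in sequences.items()}


if __name__ == "__main__":
    # Toy records; real input would be the lines of a FASTA file.
    example = [">chr1:100-108", "ACGTGGCC", ">chr2:5-9", "atNN"]
    for region, gc in gc_per_record(example).items():
        print(f"{region}\t{gc:.3f}")
```

To run it on a real file, pass an open file handle, for example `gc_per_record(open("regions.fa"))`, where `regions.fa` is a placeholder name.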

Or something like this query against the public UCSC MySQL server, which retrieves chromosome sizes:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from  hg19.chromInfo" > hg19.chrom.sizes

Sincerely,

myql ucsc gc-content perl • 3.2k views

updated 14 months ago by Ram 44k • written 7.3 years ago by Shicheng Guo ★ 9.5k

The post in the current version seems to be missing a valid question or is incomplete. Can you take a look and amend as needed?

7.3 years ago by GenoMax 144k

I am trying to understand your question. Is it as follows:

How can I find the nucleotide composition (GC content and such) of genomic regions from a BED file using online tools, without downloading the reference FASTA file to my server or computer, and preferably using the UCSC server for the computation?

I believe public Galaxy instances might be the best way to handle such projects. You can import data (such as BED files for particular regions) into Galaxy directly from UCSC, with no need to download anything to your server or computer.
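If a scripted route is preferred over Galaxy, one possible sketch (in Python rather than Perl) is to request each region's sequence from UCSC over HTTP and compute GC content locally, so that only the regions themselves are transferred. This assumes the UCSC REST API offers a `getData/sequence` endpoint that accepts `genome`, `chrom`, `start` and `end` and returns JSON with a `dna` field, with coordinates following the BED convention; please check the current UCSC API documentation before relying on it. All function names below are hypothetical.

```python
import json
from typing import Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API; verify against the UCSC documentation.
API = "https://api.genome.ucsc.edu/getData/sequence"


def sequence_url(genome: str, chrom: str, start: int, end: int) -> str:
    """Build the request URL for one region (coordinates assumed BED-style)."""
    query = urlencode({"genome": genome, "chrom": chrom, "start": start, "end": end})
    return API + "?" + query


def parse_bed_line(line: str) -> Tuple[str, int, int]:
    """Return (chrom, start, end) from the first three columns of a BED line."""
    chrom, start, end = line.split()[:3]
    return chrom, int(start), int(end)


def gc_fraction(dna: str) -> float:
    """GC fraction over A, C, G and T only; N and other symbols are ignored."""
    dna = dna.upper()
    gc = dna.count("G") + dna.count("C")
    total = gc + dna.count("A") + dna.count("T")
    return gc / total if total else 0.0


def gc_from_payload(payload_text: str) -> float:
    """Compute GC from a JSON response, assuming the sequence is under 'dna'."""
    return gc_fraction(json.loads(payload_text)["dna"])


def region_gc(genome: str, bed_line: str) -> float:
    """Fetch one BED region from UCSC (network access needed) and return its GC."""
    chrom, start, end = parse_bed_line(bed_line)
    with urlopen(sequence_url(genome, chrom, start, end)) as response:
        return gc_from_payload(response.read().decode("utf-8"))
```

A call such as `region_gc("hg19", "chr1\t1000000\t1000500")` would issue one request per region. For hundreds of regions, it is considerate to pause between requests.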

7.3 years ago by Petr Ponomarenko ★ 2.8k
