Where can I download the RefSeq gene coding regions data? Is it from UCSC?
I would appreciate it if you could give me the related FTP links.
You can get it from the UCSC Table Browser: http://genome.ucsc.edu/cgi-bin/hgTables
I have downloaded the file, but the third column shows exon, CDS, start_codon and stop_codon regions. I want the coordinates of all exon and UTR regions. Is there a direct way to download such a file without converting files as above?
If you click the 'describe table schema' button it will show you exactly what data will be in the downloaded file. The 'RefSeq Genes' table includes two comma-separated lists of exon start and exon end coordinates. It's relatively straightforward to take this and split it into a list of just exonic regions (in BED file format or something).
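As a rough illustration of that split, here is a minimal Python sketch. It assumes the usual refGene column order (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...), which you should confirm with 'describe table schema'. The function name and the example row are made up for illustration.

```python
def refgene_exons_to_bed(line, has_bin=True):
    """Turn one tab-separated refGene row into a list of BED6 tuples, one per exon.

    Assumes the standard refGene column order; set has_bin=False if your
    export lacks the leading 'bin' column.
    """
    fields = line.rstrip("\n").split("\t")
    if has_bin:
        fields = fields[1:]  # drop the leading 'bin' column
    name, chrom, strand = fields[0], fields[1], fields[2]
    # exonStarts / exonEnds are comma-separated lists with a trailing comma
    starts = [int(x) for x in fields[8].split(",") if x]
    ends = [int(x) for x in fields[9].split(",") if x]
    # Exons are numbered in genomic order here, not in transcript order
    return [
        (chrom, start, end, f"{name}_exon{i}", 0, strand)
        for i, (start, end) in enumerate(zip(starts, ends), 1)
    ]


# Hypothetical row, not a real transcript
example_row = "0\tNM_TEST\tchr1\t+\t100\t1000\t120\t900\t3\t100,400,800,\t150,500,1000,\t0\tGENE1\tcmpl\tcmpl\t0,1,2,"
exons = refgene_exons_to_bed(example_row)
for record in exons:
    print("\t".join(str(value) for value in record))
```

The coordinates in refGene are already 0-based, half-open, so they can go into BED unchanged.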
Download the human and mouse refGene tables from UCSC with wget:
wget -c -O mm9.refGene.txt.gz http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/refGene.txt.gz
wget -c -O mm10.refGene.txt.gz http://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/refGene.txt.gz
wget -c -O hg19.refGene.txt.gz http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz
wget -c -O hg38.refGene.txt.gz http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/refGene.txt.gz
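Once a refGene.txt.gz file is downloaded, the UTR regions asked about above can be derived from the exon and CDS coordinates. The following is only a sketch under assumptions: the column order is taken to be the standard refGene layout with a leading bin column, the helper names are hypothetical, and transcripts with cdsStart equal to cdsEnd are treated as non-coding and skipped.

```python
import gzip


def utr_regions(exon_starts, exon_ends, cds_start, cds_end, strand):
    """Return (start, end, label) for the exonic parts outside the CDS."""
    if cds_start >= cds_end:
        return []  # assumed non-coding: no CDS, so no UTR is defined
    left, right = ("5UTR", "3UTR") if strand == "+" else ("3UTR", "5UTR")
    regions = []
    for start, end in zip(exon_starts, exon_ends):
        if start < cds_start:  # exon part upstream of the CDS on the genome
            regions.append((start, min(end, cds_start), left))
        if end > cds_end:  # exon part downstream of the CDS on the genome
            regions.append((max(start, cds_end), end, right))
    return regions


def iter_refgene_utrs(path):
    """Yield (chrom, start, end, name_label, strand) for each UTR piece in a refGene.txt.gz dump."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            f = line.rstrip("\n").split("\t")
            # Assumed layout: bin, name, chrom, strand, txStart, txEnd,
            # cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...
            name, chrom, strand = f[1], f[2], f[3]
            starts = [int(x) for x in f[9].split(",") if x]
            ends = [int(x) for x in f[10].split(",") if x]
            for start, end, label in utr_regions(starts, ends, int(f[6]), int(f[7]), strand):
                yield chrom, start, end, f"{name}_{label}", strand


# Small check on made-up coordinates
print(utr_regions([100, 400, 800], [150, 500, 1000], 120, 900, "+"))
```

Calling `iter_refgene_utrs("hg19.refGene.txt.gz")` on one of the files above would then stream the UTR pieces, which can be written out as BED lines.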
By the way, they contain the same annotation as the hg19.refseq.bed12 file you would use with RSeQC to infer the library type (stranded or unstranded):
infer_experiment.py -r knowngene.hg19.bed12 -i L007Aligned.sortedByCoord.out.bam
Using the link provided by Ashutosh in the comment to your question, select your genome using the top row of dropdown menus.
On the second row, make sure you have "Genes and Gene Prediction" selected. Then, choose "RefSeq Genes".
Any SQL or Perl script to do it?
This might work