How To Download All The Introns From UCSC
12.2 years ago
Anima Mundi ★ 2.9k

Hello,

I would like to know how to download the FASTA sequences of all the introns from a given UCSC genome assembly.

fasta intron assembly ucsc • 8.2k views
12.2 years ago

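The following command line prints the coordinates of every intron in the UCSC knownGene table (hg19 in this example). The output columns are transcript name, chromosome, strand, intron start, intron end and intron number, counted in the direction of transcription.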

$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/knownGene.txt.gz" |\
gunzip -c  |\
awk -F '\t' '{ exonCount=int($8);split($9,exonStarts,"[,]"); split($10,exonEnds,"[,]"); for(i=1;i<exonCount;i++) {printf("%s\t%s\t%s\t%s\t%s\tIntron_%d\n",$1,$2,$3,exonEnds[i],exonStarts[i+1],($3=="+"?i:exonCount-i));}}'

uc001aaa.3    chr1    +    12227    12612    Intron_1
uc001aaa.3    chr1    +    12721    13220    Intron_2
uc010nxq.1    chr1    +    12227    12594    Intron_1
uc010nxq.1    chr1    +    12721    13402    Intron_2
uc010nxr.1    chr1    +    12227    12645    Intron_1
uc010nxr.1    chr1    +    12697    13220    Intron_2
uc009vis.2    chr1    -    14829    14969    Intron_3
uc009vis.2    chr1    -    15038    15795    Intron_2
uc009vis.2    chr1    -    15942    16606    Intron_1
uc009vit.2    chr1    -    14829    14969    Intron_8
(...)
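
To go from these coordinates to FASTA, one option is to turn the table into BED first. The following is only a sketch: the function name `introns_to_bed` and the `transcript_Intron_N` naming scheme are my own choices, not part of the command above. UCSC tables store 0-based starts and exclusive ends, which is the same convention BED uses, so the intron start and end can be copied unchanged.

```shell
#!/bin/sh
# Convert the six-column intron table printed above
# (name, chrom, strand, start, end, label) into BED6
# (chrom, start, end, name, score, strand).
# introns_to_bed is a hypothetical helper name used for illustration.
introns_to_bed() {
  awk 'BEGIN { OFS = "\t" }
       # keep complete rows only, and skip empty intervals (end <= start)
       NF >= 6 && $5 > $4 { print $2, $4, $5, $1 "_" $6, 0, $3 }'
}

# Demo on two rows copied from the output above.
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  uc001aaa.3 chr1 + 12227 12612 Intron_1 \
  uc009vis.2 chr1 - 14829 14969 Intron_3 |
  introns_to_bed
```

Appending `| introns_to_bed > introns.bed` to the curl/gunzip/awk pipeline would write the whole table as BED. Assuming you have a local genome FASTA for the same assembly (called `hg19.fa` here, a placeholder), something like `bedtools getfasta -s -name -fi hg19.fa -bed introns.bed -fo introns.fa` should then extract strand-aware intron sequences.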

Very useful answer, thanks. I accepted Wen's solution because it is "ready to go".

12.2 years ago
Wen.Huang ★ 1.2k

In the UCSC Table Browser, select the appropriate track (e.g. a gene annotation track), set the output format to "sequence", and click "get output". On the next page select "genomic"; you will then be offered options for which regions to include, among them "introns".


Thanks. Download started ;).
