I want to download intron sequences from UCSC and select specific numbered introns, e.g. introns 2, 7, 8, 12, 16, 19 and 20 of the gene BRCA1 from FoundationOneCDx.
Here are my steps.
Download all intron sequences from UCSC.
Tools -> Table Browser -> clade: Mammal; genome: Human; assembly: hg19; group: Genes and Gene Predictions; track: NCBI RefSeq; table: RefSeq All -> paste list: BRCA1 (for example) -> get output -> Sequence Retrieval Region Options: Introns (One FASTA record per region (exon, intron, etc.) with...) -> get sequence: FASTA file
Select the specified introns.
The FASTA file contains 130 intron sequences from different NM transcripts, all with "strand=-", and some introns share the same chromosome coordinates. Here is my idea for getting the specified introns:
extract all intron chromosome coordinates -> remove duplicates -> sort by chromosome coordinate
The filtered result:
1 range=chr17:41197800-41199679
2 range=chr17:41199701-41201157
3 range=chr17:41199701-41203099
4 range=chr17:41201192-41203099
5 range=chr17:41203115-41209088
6 range=chr17:41209133-41215369
7 range=chr17:41215371-41215910
8 range=chr17:41215949-41219644
9 range=chr17:41219693-41222964
10 range=chr17:41223236-41226367
11 range=chr17:41226519-41228524
12 range=chr17:41228609-41231370
13 range=chr17:41228609-41234440
14 range=chr17:41228612-41234440
15 range=chr17:41231397-41234440
16 range=chr17:41234573-41242980
17 range=chr17:41243030-41243471
18 range=chr17:41243030-41246780
19 range=chr17:41246858-41247882
20 range=chr17:41247920-41249280
21 range=chr17:41249287-41251811
22 range=chr17:41251875-41256158
23 range=chr17:41251878-41256158
24 range=chr17:41256259-41256904
25 range=chr17:41256954-41258492
26 range=chr17:41256954-41258514
27 range=chr17:41258531-41267762
28 range=chr17:41258531-41276053
29 range=chr17:41267777-41276053
30 range=chr17:41276113-41277218
31 range=chr17:41276113-41277307
32 range=chr17:41276113-41277313
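For reference, the extract / de-duplicate / sort step can be sketched in Python roughly as follows. This is only a minimal sketch: it assumes each FASTA header contains a `range=chrN:start-end` field followed later by a `strand=` field, and the record names (`tx1_0` etc.), pad fields and sequences in the example are made up for illustration.

```python
import re

# Assumed header shape (only "range=" and "strand=" are taken from the file
# described above; everything else in the example headers is hypothetical):
# >name range=chr17:41197800-41199679 ... strand=- ...
HEADER_RE = re.compile(r"range=(chr[\w.]+):(\d+)-(\d+).*?strand=([+-])")

def unique_sorted_ranges(fasta_lines):
    """Collect unique (chrom, start, end, strand) tuples from FASTA headers,
    sorted by chromosome name, then start, then end."""
    seen = set()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        m = HEADER_RE.search(line)
        if m:
            chrom, start, end, strand = m.groups()
            seen.add((chrom, int(start), int(end), strand))
    return sorted(seen, key=lambda r: (r[0], r[1], r[2]))

# Hypothetical input: three records, two of which share the same coordinates.
example = [
    ">tx1_0 range=chr17:41199701-41201157 5'pad=0 3'pad=0 strand=- repeatMasking=none",
    "ACGT",
    ">tx2_0 range=chr17:41197800-41199679 5'pad=0 3'pad=0 strand=- repeatMasking=none",
    "ACGT",
    ">tx3_0 range=chr17:41199701-41201157 5'pad=0 3'pad=0 strand=- repeatMasking=none",
    "ACGT",
]

for i, (chrom, start, end, strand) in enumerate(unique_sorted_ranges(example), 1):
    print(i, f"range={chrom}:{start}-{end}")
```

Note that de-duplicating across transcripts discards which NM accession each intron came from, so the resulting line numbers are positions in a merged list rather than intron numbers of any single transcript.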
My questions are:
(1) Do lines 2, 7, 8, 12, 16, 19 and 20 of the list above correspond to introns 2, 7, 8, 12, 16, 19 and 20 of BRCA1?
(2) Does it make a difference whether the gene is on strand=+ or strand=- when retrieving and numbering intron sequences?
(3) How should I deal with overlapping chromosome coordinates in the result above, e.g. lines 2, 3 and 4?
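One way to explore questions (2) and (3) would be to number the introns within each transcript instead of across the merged list. The sketch below assumes that intron numbers follow the 5'->3' order of the transcript, so that on the minus strand intron 1 is the one with the highest genomic coordinates. The transcript labels `txA` and `txB` and the way the coordinates are assigned to them are hypothetical; only the coordinates themselves come from lines 1-4 of the list above.

```python
from collections import defaultdict

def number_introns(records):
    """records: iterable of (transcript, start, end, strand) tuples.
    Returns {transcript: [(intron_number, start, end), ...]} with introns
    numbered in transcript order (assumption: 5'->3' numbering)."""
    by_tx = defaultdict(list)
    for tx, start, end, strand in records:
        by_tx[tx].append((start, end, strand))

    numbered = {}
    for tx, introns in by_tx.items():
        strand = introns[0][2]
        # Plus strand: ascending genomic start. Minus strand: descending.
        ordered = sorted(introns, key=lambda r: r[0], reverse=(strand == "-"))
        numbered[tx] = [(i, s, e) for i, (s, e, _) in enumerate(ordered, 1)]
    return numbered

# Hypothetical grouping: txA has two short introns where txB has one long one.
records = [
    ("txA", 41197800, 41199679, "-"),
    ("txA", 41199701, 41201157, "-"),
    ("txA", 41201192, 41203099, "-"),
    ("txB", 41197800, 41199679, "-"),
    ("txB", 41199701, 41203099, "-"),
]

for tx, introns in number_introns(records).items():
    for num, start, end in introns:
        print(tx, f"intron {num}", f"chr17:{start}-{end}")
```

Under these assumptions, overlapping ranges such as lines 2, 3 and 4 would simply belong to different transcripts, and the intron number of a given range would depend on which transcript is taken as the reference.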
Maybe my reasoning and steps are completely wrong; any help will be appreciated.