I have a BED file (chr start end) and I want to know how to annotate its regions with CCDS to get the following format:
chr start end
10 100177320 100177483 HPS1|NM_000195_cds_0
10 100177931 100178014 HPS1|NM_000195_cds_0
You could download CCDS as BED from the UCSC Genome Browser, sort both files (e.g., with BEDOPS sort-bed, since bedmap expects sorted input), and then run a BEDOPS bedmap operation over your regions of interest (e.g., roi.bed) and the CCDS dataset (e.g., ccds.bed):
$ bedmap --echo --echo-map-id-uniq roi.bed ccds.bed > answer.bed
The file answer.bed will contain each region of interest (ROI), followed by a semicolon-delimited list of the unique ID values of the CCDS annotations that overlap it. By default, bedmap separates the ROI from this list with a pipe character (|).
Other --echo-map-* options are available, if you need more than the ID value. See the documentation for more information.
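Two details can trip this up in practice: UCSC downloads usually name chromosomes "chr10" while the ROI lines in the question use "10", and the default pipe delimiter collides with IDs such as HPS1|NM_000195_cds_0. Here is a minimal sketch of one way to prepare the inputs. The example_* file names and the CCDS interval are made up for illustration, and the plain `sort` call is assumed to be an acceptable stand-in for sort-bed on simple files like these.

```shell
#!/bin/sh
set -eu

# Tiny illustrative inputs (hypothetical; swap in your real roi.bed / ccds.bed).
printf '10\t100177931\t100178014\n10\t100177320\t100177483\n' > example_roi.bed
printf 'chr10\t100177300\t100177500\tHPS1|NM_000195_cds_0\n' > example_ccds.bed

# Strip a leading "chr" so chromosome names match the "10"-style ROI file,
# then sort by chromosome (lexicographic), start and end (numeric).
sed 's/^chr//' example_ccds.bed \
  | LC_ALL=C sort -k1,1 -k2,2n -k3,3n > example_ccds.sorted.bed
LC_ALL=C sort -k1,1 -k2,2n -k3,3n example_roi.bed > example_roi.sorted.bed

# Run bedmap only if BEDOPS is installed. --delim '\t' puts a tab between the
# ROI and the ID list instead of the default "|".
if command -v bedmap >/dev/null 2>&1; then
  bedmap --echo --echo-map-id-uniq --delim '\t' \
    example_roi.sorted.bed example_ccds.sorted.bed > example_answer.bed
fi
```

If your ROI file already uses "chr"-prefixed names, skip the sed step so that both files keep the same naming.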