How to annotate ChIP peaks based on NCBI sequence names?
3.4 years ago
iridha • 0

I have ChIP-seq peaks aligned to the NCBI genome, so the seqnames are NCBI sequence accessions, like the following:

                    seqnames              ranges strand |      Conc
                       <Rle>           <IRanges>  <Rle> | <numeric>
  X0001.0806584 NC_000003.12   75668920-75669635      * |  11.10671
  X0002.1092998 NC_000005.10   34190588-34192289      * |   8.45169
  X0002.1092999 NC_000005.10   34190588-34192289      * |   8.45169
  X0003.1726283 NC_000009.12 137101991-137103797      * |   8.30861

Although I used human_NCBI_GRCh38p12 for the alignment, the annotation I get from Bioconductor uses chromosome-style sequence names, like the following:

library(TxDb.Hsapiens.UCSC.hg38.knownGene)
ucsc.hg38.knownGene <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene)

seqnames              ranges strand |     gene_id
               <Rle>           <IRanges>  <Rle> | <character>
          1    chr19   58345178-58362751      - |           1
         10     chr8   18391282-18401218      + |          10
        100    chr20   44619522-44652233      - |         100
       1000    chr18   27950966-28177130      - |        1000
  100009613    chr11   70072434-70075433      - |   100009613

The annotation does not work because the seqnames of the peaks and of the annotation do not match.

peaks_annotated <- annotatePeakInBatch(Peaks, AnnotationData = ucsc.hg38.knownGene)

Using GCF_000001405.39_GRCh38.p13_genomic.gtf directly results in about 1 million genes for only 2000 peaks.

Any help in solving this problem is highly appreciated.
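
For reference, one direction that might work is to rename the peak seqnames to the UCSC style before annotating. The following is only a minimal sketch, not a verified solution: `refseq_to_ucsc` is a hypothetical helper (not part of any package), and it assumes the usual RefSeq convention for the primary human chromosomes (NC_000001 to NC_000022 for chr1 to chr22, NC_000023 for chrX, NC_000024 for chrY, NC_012920 for chrM). Anything else, such as unplaced scaffolds, is returned as NA.

```r
# Hypothetical helper: map RefSeq chromosome accessions to UCSC-style names.
# Assumption: only the primary assembled molecules are handled; other
# accessions (e.g. NT_/NW_ scaffolds) yield NA.
refseq_to_ucsc <- function(acc) {
  base <- sub("\\.[0-9]+$", "", acc)                 # drop the version suffix
  out <- rep(NA_character_, length(acc))
  is_chr <- grepl("^NC_0000(0[1-9]|1[0-9]|2[0-4])$", base)
  num <- as.integer(sub("^NC_0+", "", base[is_chr])) # e.g. "NC_000003" -> 3
  labels <- c(as.character(1:22), "X", "Y")          # 23 -> X, 24 -> Y
  out[is_chr] <- paste0("chr", labels[num])
  out[base %in% "NC_012920"] <- "chrM"               # mitochondrial genome
  out
}

# Seqnames taken from the peaks shown above
peak_seqnames <- c("NC_000003.12", "NC_000005.10", "NC_000009.12")
refseq_to_ucsc(peak_seqnames)
# [1] "chr3" "chr5" "chr9"

# With GenomicRanges/GenomeInfoDb loaded, the mapping could then be applied
# to the peaks before calling annotatePeakInBatch (untested here):
#   new_names <- refseq_to_ucsc(seqlevels(Peaks))
#   Peaks <- keepSeqlevels(Peaks, seqlevels(Peaks)[!is.na(new_names)],
#                          pruning.mode = "coarse")
#   seqlevels(Peaks) <- refseq_to_ucsc(seqlevels(Peaks))
```

Peaks on sequences that cannot be mapped are dropped in this sketch, which may or may not be acceptable depending on the analysis.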

ChIP-Seq alignment annotation sequence • 425 views