How to convert an hg38 narrowPeak file to an hg19 narrowPeak file
8 months ago
koushik.vf09 ▴ 30

I have a narrowPeak file in hg38 coordinates that I need to convert to hg19 so I can use it to find differential looping in a HiChIP pipeline. My general idea was to first convert the narrowPeak file into a BED file with the following command:

cut -f 1-6 MCF10A_H3k27ac_hg38.narrowPeak > MCF10A_H3k27ac_hg38_edited.bed


and then lift this BED file over to hg19, but that way I lose the remaining narrowPeak columns. I also can't simply add those columns back to the converted BED file, because some of the records fail to convert in the liftOver tool, so the rows no longer line up.

Is there a better way to solve this problem?

Thank you!

hg38 liftOver ChIPseq narrowPeak • 558 views
8 months ago
Luis Nassar ▴ 610

Hello,

narrowPeak is an extended BED format. In this case the first 6 fields conform to BED, and they are followed by 4 additional fields: https://genome.ucsc.edu/FAQ/FAQformat.html#format12
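
Before lifting over, it can help to confirm that the file really has 10 tab-separated fields on every line. The following is only a minimal sketch: the temporary file and the variable names (peaks, field_counts, bad_lines) are made up for illustration, and the three records are the example records used in this answer.

```shell
# Hypothetical sample file holding the three example records from this answer.
peaks=$(mktemp)
printf 'chr1\t9356548\t9356648\t.\t0\t.\t182\t5.0945\t-1\t50\n' > "$peaks"
printf 'chr1\t9358722\t9358822\t.\t0\t.\t91\t4.6052\t-1\t40\n' >> "$peaks"
printf 'chr1\t9361082\t9361182\t.\t0\t.\t182\t9.2103\t-1\t75\n' >> "$peaks"

# Tally how many lines have each tab-separated field count.
field_counts=$(awk -F'\t' '{ n[NF]++ } END { for (k in n) print k " fields: " n[k] " lines" }' "$peaks")
echo "$field_counts"

# Count lines that do not have exactly 10 fields (6 BED fields + 4 narrowPeak fields).
bad_lines=$(awk -F'\t' 'NF != 10' "$peaks" | wc -l | tr -d ' ')
echo "lines without 10 fields: $bad_lines"
```

If bad_lines is not 0, the file may be space-delimited or contain header lines, which is worth sorting out before running liftOver.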

You can pass the -bedPlus=6 argument to the liftOver utility to preserve all columns in the file.

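Some of this you may be familiar with, but you would need the hg38-to-hg19 chain file: https://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/hg38ToHg19.over.chain.gz (the download is gzipped; the command below refers to it under its decompressed name, hg38ToHg19.over.chain).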

Then you can invoke liftOver on the narrowPeak files directly:

cat exampleHg38.narrowPeak
chr1    9356548 9356648 .       0       .       182     5.0945  -1  50
chr1    9358722 9358822 .       0       .       91      4.6052  -1  40
chr1    9361082 9361182 .       0       .       182     9.2103  -1  75

./liftOver -bedPlus=6 exampleHg38.narrowPeak hg38ToHg19.over.chain exampleHg19.narrowPeak unmapped.txt
Mapping coordinates

cat exampleHg19.narrowPeak
chr1    9416607 9416707 .   0   .   182 5.0945  -1  50
chr1    9418781 9418881 .   0   .   91  4.6052  -1  40
chr1    9421141 9421241 .   0   .   182 9.2103  -1  75
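
Since some records can fail to convert, it may be worth looking at unmapped.txt after the run. liftOver writes each failed record there, preceded by a line starting with "#" that gives the reason. The sketch below builds a hypothetical stand-in for that file (the coordinates and the mix of reasons are invented for illustration) and then counts the failed records and tallies the reasons. The variable names are made up as well.

```shell
# Hypothetical stand-in for liftOver's unmapped output:
# a '#' reason line followed by the original record.
unmapped=$(mktemp)
printf '#Deleted in new\n' > "$unmapped"
printf 'chr1\t100\t200\t.\t0\t.\t10\t2.5\t-1\t50\n' >> "$unmapped"
printf '#Partially deleted in new\n' >> "$unmapped"
printf 'chr2\t300\t400\t.\t0\t.\t20\t3.5\t-1\t40\n' >> "$unmapped"
printf '#Deleted in new\n' >> "$unmapped"
printf 'chr3\t500\t600\t.\t0\t.\t30\t4.5\t-1\t30\n' >> "$unmapped"

# Records that failed to convert are the non-comment lines.
n_failed=$(grep -vc '^#' "$unmapped")
echo "unmapped records: $n_failed"

# Tally the failure reasons, most frequent first.
reasons=$(grep '^#' "$unmapped" | sort | uniq -c | sort -rn)
echo "$reasons"
```

On a real run you would point the two grep commands at your own unmapped.txt instead of the temporary file. Because -bedPlus=6 carries the extra columns along, the records that do convert keep their narrowPeak fields, so nothing has to be joined back afterwards.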


If you have any follow-up questions, our public help desk can always be reached at genome@soe.ucsc.edu. You may also send questions to genome-www@soe.ucsc.edu if they contain sensitive data. For any Genome Browser questions on Biostars, the UCSC tag is the best way to ensure visibility by the team.


Thank you very much! This solved my issue.