Input file requirements for hg38
6.5 years ago
ekram ▴ 10

I'm trying to run the 20/20+ tool on a set of somatic mutations that were called against the hg38 reference genome. Could someone please let me know whether the pre-computed scores and the trained classifiers (the Rdata files) depend on the reference genome version?

next-gen SNP sequencing • 1.6k views
6.5 years ago
Collin ▴ 1000

The trained classifier would work, but the pipeline that fetches and computes the features for you would not. As you suspected, the pre-computed scores are based on information from hg19, and the gene annotations in the BED file likewise use hg19 coordinates. I recommend using UCSC's liftOver to convert the start and end positions of your mutations to hg19, and then running 20/20+. I haven't done this yet, but the files will likely be updated in the future so that 20/20+ works with hg38.
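For anyone attempting the liftover route, here is a minimal sketch of the round trip in Python. It only handles the file conversion around the UCSC liftOver step and is not part of 20/20+. The column names (`Chromosome`, `Start_Position`, `End_Position`) and the 1-based inclusive coordinate convention are assumptions based on the usual MAF layout, and the `mut<N>` identifiers are made up for illustration, so adjust them to your own file.

```python
def mutations_to_bed(rows):
    """Turn mutation records into BED lines that liftOver can read.

    Assumes 1-based inclusive positions (MAF style); BED is 0-based half-open,
    so only the start is shifted by one. The name column carries a hypothetical
    row identifier so lifted records can be matched back afterwards.
    """
    lines = []
    for i, row in enumerate(rows):
        start = int(row["Start_Position"]) - 1
        end = int(row["End_Position"])
        lines.append(f"{row['Chromosome']}\t{start}\t{end}\tmut{i}")
    return lines


def apply_lifted_bed(rows, lifted_bed_lines):
    """Copy lifted coordinates back onto the records.

    Mutations that liftOver could not map are absent from its output
    and are dropped here.
    """
    lifted = {}
    for line in lifted_bed_lines:
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        # Back to 1-based inclusive coordinates.
        lifted[name] = (chrom, int(start) + 1, int(end))

    converted = []
    for i, row in enumerate(rows):
        key = f"mut{i}"
        if key not in lifted:
            continue
        chrom, start, end = lifted[key]
        new_row = dict(row)
        new_row.update(Chromosome=chrom,
                       Start_Position=str(start),
                       End_Position=str(end))
        converted.append(new_row)
    return converted


# Between the two steps, run liftOver on the BED file yourself, e.g.
#   liftOver muts_hg38.bed hg38ToHg19.over.chain.gz muts_hg19.bed unmapped.bed
# (file names other than the UCSC chain file are placeholders).
```

Note that this only moves the coordinates. If a mutation lands on the opposite strand or the reference base differs between assemblies, the alleles would need to be checked separately.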


Thanks a lot for the reply! It helps a lot. I didn't mention the gene annotation BED file since that is perhaps easier to get. Shouldn't it be enough to reformat any hg38 transcript file to BED12, replace the transcript name with the gene name, and keep only the longest transcript by CDS length? It would be great to actually have the hg38 versions of the required files. Looking forward to that. Thanks once again.
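The reformatting described above could be sketched roughly as follows. This is only an illustration of the "longest transcript by CDS length" selection on BED12 input, not the procedure used to build the 20/20+ annotation file. The `tx_to_gene` mapping is a hypothetical dictionary you would have to build from your own annotation source, and CDS length is taken as the overlap between the exon blocks and the thickStart/thickEnd interval.

```python
def cds_length(fields):
    """Sum the bases of each exon block that fall inside thickStart..thickEnd."""
    chrom_start = int(fields[1])
    thick_start, thick_end = int(fields[6]), int(fields[7])
    # blockSizes / blockStarts are comma-separated and may end with a comma.
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    rel_starts = [int(x) for x in fields[11].rstrip(",").split(",")]

    total = 0
    for size, rel in zip(sizes, rel_starts):
        block_start = chrom_start + rel
        block_end = block_start + size
        total += max(0, min(block_end, thick_end) - max(block_start, thick_start))
    return total


def longest_cds_per_gene(bed12_lines, tx_to_gene):
    """Keep one BED12 line per gene: the transcript with the longest CDS.

    The name column is replaced by the gene name. Transcripts missing from
    tx_to_gene or without a coding region are skipped; ties keep the first
    transcript seen.
    """
    best = {}
    for line in bed12_lines:
        fields = line.rstrip("\n").split("\t")
        gene = tx_to_gene.get(fields[3])
        if gene is None:
            continue
        length = cds_length(fields)
        if length == 0:
            continue
        if gene not in best or length > best[gene][0]:
            fields[3] = gene
            best[gene] = (length, "\t".join(fields))
    return [best[gene][1] for gene in sorted(best)]
```

Called with the BED12 lines of a transcript file and a transcript-to-gene dictionary, it returns one line per gene, sorted by gene name.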


The issue is that the pre-computed scores are annotated against the transcript chosen in the BED file I provide, so I really do have to provide updated data files for all of them together for it to work. Our lab has updated some of the score information for hg38, but it hasn't yet been put into a format usable by 20/20+.


Thanks a lot! That's helpful to know.

5.4 years ago
james • 0

Hi Collin,

Any update on the hg38 files for 20/20+?

thanks
