What does a bp column in gene level data mean?
13 months ago
DN99 ▴ 20

Sorry if this is a simple question. I'm looking at the ExAC data, specifically the gene constraint scores TSV (from https://gnomad.broadinstitute.org/downloads), and I see that each row, one per gene, has a bp column.

What does the bp column mean in relation to the gene? Is it the number of base pairs that gene has? Or is it a specific position of that gene on the chromosome, like the start point?

The header and first few rows of the data I'm looking at are:

transcript          gene    chr n_exons tx_start    tx_end      bp      mu_syn  mu_mis  mu_lof  n_syn   n_mis   n_lof   exp_syn exp_mis exp_lof syn_z   mis_z   lof_z   pLI n_cnv   exp_cnv cnv_z
ENST00000263100.3   A1BG    19  8       58858387    58864803    1488    1.22623810613e-05   2.31370910656e-05   1.00149904809e-06   87  170 8   104.728743317   199.807808895   12.3013823748 1.07397341153102  1.03143095067218    1.21484488615106    9.0649236354772e-05 3   3.60990172920741    0.111439851405077
ENST00000373995.3   A1CF    10  11      52566488    52610547    1785    6.39891945771e-06   1.54440933739e-05   1.8987381109e-06    86  168 9   76.6988402846   178.585954564   25.9365837039   -0.658403707874478  0.387458005554534   3.29427011505961    0.00361970078438154 NaN NaN NaN
ENST00000318602.7   A2M     12  36      9220418     9268445     4425    1.76240458841e-05   4.04871757669e-05   3.98398665823e-06   187 393 16  187.602696614   414.516709098   51.7060915327   0.0272791650795749  0.516917311081667   4.9188222793601 0.000540114865271392    3   8.70631909864876    0.833503390443042
ENST00000299698.7   A2ML1   12  35      8975247     9027607     4365    1.7870125509e-05    4.01510386566e-05   3.7123159726e-06    226 502 42  216.075661755   467.040245986   56.0645988342   -0.41854907644017   -0.791240082514084  1.86068429240947    1.32902210264609e-22    63  11.8468312922777    -2.28143080359217

gene exac gnomad database genetics • 493 views

We can only guess what the column means right now. Can you show us the first few lines of the file? Edit your post and add them there. See this post for formatting tips: How to Use Biostars Part-3: Formatting Text and Using GitHub Gists


Thank you for your response. I've had a go at adding the first few lines of the data. I've also been trying to find their README file for more information, but I haven't found it yet.


It doesn't seem to be described in their Supplementary Material, pages 74-77 (as mentioned on the webpage). You may want to email them to be sure, but I think bp corresponds to the number of exonic bases in the transcript. The first transcript ID seems to match a GRCh37-annotated transcript, but the GRCh37 Ensembl site seems to be down right now, so I'm unable to verify.
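
A quick sanity check on the four rows quoted in the question is possible without the annotation. The sketch below is only illustrative: the values are copied by hand from the question, the function name `summarize` is made up here, and treating "much smaller than the genomic span and a multiple of three" as a sign of coding bases is my assumption, not something the ExAC documentation is quoted as saying.

```python
# Values copied from the four rows quoted in the question:
# (gene, n_exons, tx_start, tx_end, bp)
ROWS = [
    ("A1BG", 8, 58858387, 58864803, 1488),
    ("A1CF", 11, 52566488, 52610547, 1785),
    ("A2M", 36, 9220418, 9268445, 4425),
    ("A2ML1", 35, 8975247, 9027607, 4365),
]


def summarize(rows):
    """Compare bp with the genomic span (tx_end - tx_start) of each transcript."""
    out = []
    for gene, n_exons, tx_start, tx_end, bp in rows:
        span = tx_end - tx_start  # genomic distance covered, introns included
        out.append({
            "gene": gene,
            "n_exons": n_exons,
            "span": span,
            "bp": bp,
            "bp_fraction_of_span": round(bp / span, 3),
            # Assumption: a length that is a multiple of 3 hints at whole codons.
            "bp_divisible_by_3": bp % 3 == 0,
        })
    return out


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(row)
```

In all four rows bp is far smaller than the span (1488 versus 6416 for A1BG) and is a multiple of three, so it is not a chromosomal position or the full gene length, and it is consistent with a count of coding (exonic) bases. Four rows cannot confirm that, so asking the ExAC/gnomAD team is still worthwhile.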


I agree, and I will get in touch with them. Exonic bases does sound correct to me too. Thank you for looking into it!