Hi all,
From https://en.wikipedia.org/wiki/Upstream_open_reading_frame: An upstream open reading frame (uORF) is an open reading frame (ORF) within the 5' untranslated region (5'UTR) of an mRNA. uORFs can regulate eukaryotic gene expression. Translation of the uORF typically inhibits downstream expression of the primary ORF.
I wonder whether there is an easy way to find evidence that a uORF is actually translated. For example, in https://www.bioinformatics.uni-muenster.de/tools/uorfdb/download/index.hbi?lang=en there is:
>>> 279332
$1 Taxon : Homo sapiens
$2 Assembly : hg38
$3 Chr : chr10
$4 Symbol : CREM
$5 GeneID : 1390
$6 GenBankID :
$7 SymbolAliases : CREM-2; ICER; hCREM-2
$8 GeneNames : cAMP responsive element modulator; cAMP-responsive element modulator; CREM 2alpha-b protein; CREM 2beta-a protein; cAMP response element modulator; inducible cAMP early repressor ICER
$9 NCBIID : NM_001394623.1
$10 TranscrStart : 35126845
$11 TranscrEnd : 35212953
$12 Strand : +
$13 TranscrLength : 2518
$14 TLSlength : 549
$15 CDSstart : 35188276
$16 CDSend : 35211398
$17 TranscrKozakContext : TTAACAATGA
$18 TranscrKozakStrength : adequate
$19 ExonStarts : 35126845; 35148367; 35188199; 35206894; 35211253
$20 ExonEnds : 35127193; 35148491; 35188388; 35207051; 35212953
$21 uORF_ID : NM_001394623.1_GTG.7
$22 uORFstart : 35188214
$23 uORFend : 35188274
$24 uORFstartCodon : GTG
$25 uORFstopCodon : TAA
$26 uORFlength : 60
$27 uORFCDSdistance : 2
$28 uORF5'-capDistance : 487
$29 uORFkozakContext : CCCAAGGTGG
$30 uORFkozakStrength : strong
$31 uORFtype : non-overlapping
$32 uORFreadingFrame : 3
$33 uORFnucleotideSeq : GTGGAACAATCCAGATTTCTAACCCAGGATCTGATGGTGTTCAGGGACTGCAGGCATTAA
$34 uORFaminoSeq : MEQSRFLTQDLMVFRDCRH*
$35 SharedStartCodon :
<<< 279332
Here the uORF is translated to MEQSRFLTQDLMVFRDCRH.
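As a sanity check, the translation in the record can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes the standard genetic code and assumes that the non-canonical GTG start codon is read as methionine (which is what the `uORFaminoSeq` field suggests). The function name `translate_uorf` is my own and does not come from uORFdb.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_uorf(nt: str) -> str:
    """Translate a uORF nucleotide sequence, codon by codon.

    Assumption: whatever the start codon is (here GTG), the first
    residue is reported as methionine.
    """
    nt = nt.upper()
    codons = [nt[i:i + 3] for i in range(0, len(nt) - 2, 3)]
    protein = [CODON_TABLE[codon] for codon in codons]
    if protein:
        protein[0] = "M"
    return "".join(protein)


# Values copied from the uORFdb record above.
uorf_nt = "GTGGAACAATCCAGATTTCTAACCCAGGATCTGATGGTGTTCAGGGACTGCAGGCATTAA"
uorf_start, uorf_end = 35188214, 35188274

peptide = translate_uorf(uorf_nt)
print(peptide)                      # MEQSRFLTQDLMVFRDCRH*
print(uorf_end - uorf_start)        # 60, same as uORFlength
```

The peptide to search for is the translation without the trailing `*` (the stop codon).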
Is there any database where I can check whether this peptide has been observed in vivo? I would look for something like a mass-spectrometry database, but I don't really know this field of research.
For example, I found http://pepquery2.pepquery.org/ but I don't know the correct way to search for such a sequence (dataset? task type? ...) or how to interpret the output:
>>> 2
$1 peptide : MEQSRFLTQDLMVFRDCRH
$2 modification : Oxidation of M@12[15.9949];Carbamidomethylation of C@17[57.0215];TMT 10-plex of peptide N-term@0[229.1629]
$3 n : 341
$4 spectrum_title : f06183_Prot_10_F06:24104:3
$5 charge : 3
$6 exp_mass : 2713.340475287064
$7 tol_ppm : -11.605709723058858
$8 tol_da : -0.03148987647364265
$9 isotope_error : 0
$10 pep_mass : 2713.30898541059
$11 mz : 905.4541015625
$12 score : 33.13133690871885
$13 n_db : 0
$14 total_db : 173
$15 n_random : 0
$16 total_random : 1000
$17 pvalue : 9.99000999000999e-4
$18 rank : 1
$19 n_ptm : 1
$20 confident : No
$21 ref_delta_score : 9.171662518568212
$22 mod_delta_score : -34.25203993290825
$23 mod_filtering : Removed
<<< 2
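To get a feel for what some of these columns mean, I tried to recompute them from the other fields. The sketch below is based only on the numbers in this one record: the formulas are my guesses that happen to reproduce the values, not something taken from the PepQuery documentation, and the function names are hypothetical.

```python
import math

PROTON_MASS = 1.007276466  # Da, mass of a proton

# Values copied from the PepQuery record above.
record = {
    "charge": 3,
    "mz": 905.4541015625,
    "exp_mass": 2713.340475287064,
    "pep_mass": 2713.30898541059,
    "tol_ppm": -11.605709723058858,
    "tol_da": -0.03148987647364265,
    "n_random": 0,
    "total_random": 1000,
    "pvalue": 9.99000999000999e-4,
}


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral precursor mass from the observed m/z and charge state."""
    return (mz - PROTON_MASS) * charge


def mass_error_ppm(pep_mass: float, exp_mass: float) -> float:
    """Theoretical minus experimental mass, in ppm of the theoretical mass."""
    return (pep_mass - exp_mass) / pep_mass * 1e6


def permutation_pvalue(n_random: int, total_random: int) -> float:
    """Guessed formula: (better-scoring random peptides + 1) / (random peptides + 1)."""
    return (n_random + 1) / (total_random + 1)


print(neutral_mass(record["mz"], record["charge"]))          # close to exp_mass
print(record["pep_mass"] - record["exp_mass"])               # close to tol_da
print(mass_error_ppm(record["pep_mass"], record["exp_mass"]))  # close to tol_ppm
print(permutation_pvalue(record["n_random"], record["total_random"]))  # close to pvalue
```

If these guesses are right, the p-value of about 0.001 is simply the smallest value reachable with 1000 random peptides, and the record still reports `confident : No` and `mod_filtering : Removed`, so I would not read this match as evidence on its own.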
Or is there any other way to look up a peptide in a proteomics database?
(cross-posted: https://github.com/bzhanglab/PepQuery/issues/63 )
Thanks.
Looks interesting, but it's broken: the Firefox console shows a 404 for https://www.ebi.ac.uk/pride/ws/archive/v2/peptidesummary?keyword=MEQSRFLTQDLMVFRDCRH&page=0&pageSize=20
There is an API that you could try: https://www.ebi.ac.uk/pride/markdownpage/prideutilities