Dear ALL,
I am trying to get all SNPs in the 5' and 3' UTRs of the transcripts of the gene BRCA1. I found a way to get some of this information using the Entrez utilities (if anyone knows another way to do this, please share it with me):
esearch -db gene -query "BRCA1" | elink -target snp | efetch -format xml
This gives me an XML file with rs entries like the one below (this entry is for rsID 864622723). Could you please help me with the questions that follow this XML output?
<Rs rsId="**864622723**" snpClass="snp" snpType="notwithdrawn" molType="genomic" bitField="050060080A05000010020100" taxId="9606"><Validation/><Create build="146" date="2016-02-05 11:14"/ ><Update build="146" date="2017-02-06 13:30"/><Sequence exemplarSs="1966658404"><Seq5>AGAATCCTAGAGATACTGAAGATGTTCCTTGGATAACACTAAATAGCAGC</Seq5><Observed>A/G</Observed><Seq3>TTCAGAAAGT TAATGAGTGGTTTTCCAGAAGTGATGAACTGTTAGGTTCT</Seq3></Sequence><Ss ssId="1966658404" handle="CLINVAR" batchId="1062407" locSnpId="SCV000262150" subSnpClass="snp" orient="forward" strand="t op" molType="genomic" buildId="146" methodClass="sequence" validated="by-submitter"><Sequence><Seq5>AGAATCCTAGAGATACTGAAGATGTTCCTTGGATAACACTAAATAGCAGC</Seq5><Observed>A/G</Observed><S eq3>TTCAGAAAGTTAATGAGTGGTTTTCCAGAAGTGATGAACTGTTAGGTTCT</Seq3></Sequence></Ss><Assembly dbSnpBuild="150" genomeBuild="38.3" groupLabel="GRCh38.p7" current="true" reference="true"><Comp onent componentType="contig" accession="NT_010783.16" chromosome="17" start="26935980" end="81742541" orientation="fwd" gi="568802181" groupTerm="NC_000017.11" contigLabel="GCF_000001405.33"><MapLoc asnFrom="16158415" asnTo="16158415" locType="exact" alnQuality="1" orient="reverse" physMapInt="43094395" leftContigNeighborPos="16158414" rightContigNeighborPos="16158416" refAllele="T"><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007294" mrnaVer="3" protAcc="NP_009225" protVer="1" fxnClass="missense" readingFrame="1" allele="G" residue="V" aaP osition="378" soTerm="non_synonymous_codon"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007294" mrnaVer="3" protAcc="NP_009225" protVer="1" fxnClass="reference" readingFrame="1" allele="A" residue="I" aaPosition="378"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007297" mrnaVer="3" protAcc="NP_009228" protVer="2" fxnClass="reference" readingFrame="1" alle le="A" residue="I" aaPosition="331"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007297" mrnaVer="3" protAcc="NP_009228" protVer="2" fxnClass="missense" 
readingFrame="1" allele="G " residue="V" aaPosition="331" soTerm="non_synonymous_codon"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007298" mrnaVer="3" fxnClass="intron-variant" soTerm="intron_variant"/><F xnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007299" mrnaVer="3" fxnClass="intron-variant" soTerm="intron_variant"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007300" mrnaVer="3 " protAcc="NP_009231" protVer="2" fxnClass="reference" readingFrame="1" allele="A" residue="I" aaPosition="378"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007300" mrnaVer="3" pr otAcc="NP_009231" protVer="2" fxnClass="missense" readingFrame="1" allele="G" residue="V" aaPosition="378" soTerm="non_synonymous_codon"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc=" NR_027676" mrnaVer="1" fxnClass="nc-transcript-variant" allele="-" soTerm="nc_transcript_variant"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="U14680" mrnaVer="1" protAcc="AAA73985" protVer="1" fxnClass="reference" readingFrame="1" allele="A" residue="I" aaPosition="378"/><FxnSet geneId="672" symbol="BRCA1" mrnaAcc="U14680" mrnaVer="1" protAcc="AAA73985" protVer= "1" fxnClass="missense" readingFrame="1" allele="G" residue="V" aaPosition="378" soTerm="non_synonymous_codon"/></MapLoc></Component><SnpStat mapWeight="unique-in-contig" chromCount=" 1" placedContigCount="1" unplacedContigCount="0" seqlocCount="1" hapCount="0"/></Assembly><PrimarySequence dbSnpBuild="150" gi="555931" source="hgvs" accession="U14680.1"><MapLoc asnF rom="1253" asnTo="1253" locType="exact" alnQuality="1" orient="forward" leftContigNeighborPos="1252" rightContigNeighborPos="1254" refAllele="A"/></PrimarySequence><PrimarySequence db SnpBuild="150" gi="237681118" source="hgvs" accession="NM_007300.3"><MapLoc asnFrom="1366" asnTo="1366" locType="exact" alnQuality="1" orient="forward" leftContigNeighborPos="1365" ri ghtContigNeighborPos="1367" refAllele="A"/></PrimarySequence><PrimarySequence dbSnpBuild="150" gi="237681120" source="hgvs" 
accession="NM_007297.3"><MapLoc asnFrom="1274" asnTo="1274" locType="exact" alnQuality="1" orient="forward" leftContigNeighborPos="1273" rightContigNeighborPos="1275" refAllele="A"/></PrimarySequence><PrimarySequence dbSnpBuild="150" gi="2376 81126" source="hgvs" accession="NR_027676.1"><MapLoc asnFrom="1270" asnTo="1270" locType="exact" alnQuality="1" orient="forward" leftContigNeighborPos="1269" rightContigNeighborPos="1 271" refAllele="A"/></PrimarySequence><PrimarySequence dbSnpBuild="150" gi="237757283" source="hgvs" accession="NM_007294.3"><MapLoc asnFrom="1366" asnTo="1366" locType="exact" alnQua lity="1" orient="forward" leftContigNeighborPos="1365" rightContigNeighborPos="1367" refAllele="A"/></PrimarySequence><PrimarySequence dbSnpBuild="150" gi="262359905" source="hgvs" ac cession="NG_005905.2"><MapLoc asnFrom="123587" asnTo="123587" locType="exact" alnQuality="1" orient="forward" leftContigNeighborPos="123586" rightContigNeighborPos="123588" refAllele= "A"/></PrimarySequence><RsLinkout resourceId="1" linkValue="864622723"/><hgvs>AAA73985.1:p.Ile379Val</hgvs><hgvs>NC_000017.10:g.41246413T>C</hgvs><hgvs>NC_000017.11:g.43094396T> C</hgvs><hgvs>NG_005905.2:g.123588A>G</hgvs><hgvs>NM_007300.3:c.1135A>G</hgvs><hgvs>NM_007297.3:c.994A>G</hgvs><hgvs>NM_007298.3:c.787+348A>G</hgvs><hgvs>NM_007299.3:c.787 +348A>G</hgvs><hgvs>NM_007294.3:c.1135A>G</hgvs><hgvs>NP_009228.2:p.Ile332Val</hgvs><hgvs>NP_009225.1:p.Ile379Val</hgvs><hgvs>NP_009231.2:p.Ile379Val</hgvs><hgvs>NR_027676.1:n.1 271A>G</hgvs><hgvs>U14680.1:c.1135A>G</hgvs><Phenotype><ClinicalSignificance>Uncertain significance</ClinicalSignificance></Phenotype></Rs>
a) Is "<Seq5>AGAATCCTAGAGATACTGAAGATGTTCCTTGGATAACACTAAATAGCAGC</Seq5>" the 5' UTR? If not, could you please tell me which part of this entry is the 5' UTR, or how the 5' UTR can be extracted?
b) Is "<Seq3>TTCAGAAAGTTAATGAGTGGTTTTCCAGAAGTGATGAACTGTTAGGTTCT</Seq3>" the 3' UTR? If not, could you please tell me which part of this entry is the 3' UTR, or how the 3' UTR can be extracted?
c) In this entry, which allele in the 5' UTR is changed to which allele, and at what position?
d) In this entry, which allele in the 3' UTR is changed to which allele, and at what position?
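For reference, here is a minimal Python sketch of one way to inspect the per-transcript annotation in a record like the one above. It assumes that the functional class of a SNP is carried by the fxnClass attribute of each FxnSet element, and that UTR variants would be labelled "utr-variant-5-prime" and "utr-variant-3-prime". Those two labels do not appear in this entry, so treat them as an assumption to check against the dbSNP schema. The SAMPLE string is a trimmed-down copy of the rs864622723 entry, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the rs864622723 record shown above (illustration only).
SAMPLE = """<Rs rsId="864622723" snpClass="snp">
  <Sequence><Observed>A/G</Observed></Sequence>
  <Assembly groupLabel="GRCh38.p7">
    <Component chromosome="17">
      <MapLoc physMapInt="43094395" refAllele="T">
        <FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007294" fxnClass="missense" allele="G"/>
        <FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007294" fxnClass="reference" allele="A"/>
        <FxnSet geneId="672" symbol="BRCA1" mrnaAcc="NM_007298" fxnClass="intron-variant"/>
      </MapLoc>
    </Component>
  </Assembly>
</Rs>"""

# Assumed fxnClass labels for UTR variants; not present in the entry above.
UTR_CLASSES = {"utr-variant-5-prime", "utr-variant-3-prime"}


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}Rs' -> 'Rs', if there is one."""
    return tag.rsplit("}", 1)[-1]


def fxn_classes(xml_text):
    """Map each rsId in the XML text to the set of fxnClass values it carries."""
    root = ET.fromstring(xml_text)
    result = {}
    for elem in root.iter():  # iter() also visits the root element itself
        if local_name(elem.tag) != "Rs":
            continue
        result[elem.get("rsId")] = {
            child.get("fxnClass")
            for child in elem.iter()
            if local_name(child.tag) == "FxnSet" and child.get("fxnClass")
        }
    return result


def utr_snps(xml_text):
    """Keep only the rsIds that have at least one (assumed) UTR class."""
    hits = {}
    for rs_id, classes in fxn_classes(xml_text).items():
        utr = classes & UTR_CLASSES
        if utr:
            hits[rs_id] = utr
    return hits


print(fxn_classes(SAMPLE))
print(utr_snps(SAMPLE))
```

For this entry the sketch finds only the missense, reference and intron-variant classes, so under the assumption above it would not be reported as a UTR SNP in any of the listed transcripts.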
This information will help me retrieve the correct data for the respective rs number.
Thank you so much!
Regards, DK
For some reason, some of the lines are shown with strikethrough. Please ignore the strikethroughs and help me with your comments/solutions. Thanks
I added markup to your post to improve readability and fix the strikethrough problem. You can do this yourself by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.