Question

Looking For Frequency Of A Very Specific RNA Editing Event In Publicly Available RNA-Seq Data

11.0 years ago

alpha2zee ▴ 120

I am working on a ubiquitously and highly expressed gene whose ~1.5 kb transcripts get mutated at one and only one site. It is a nonsense mutation that changes a codon in the mRNA's ORF to a stop codon. The underlying biology is unknown, as is whether the mutation occurs in all cells or only in specific ones. The mutation frequency, disregarding the kinetics of transcript degradation, is estimated to vary between 0% and 5% in the only cell type in which it has been studied.

I am interested in examining publicly available raw RNA sequencing data to get an idea of the types of tissues (e.g., a specific cancer tissue) or cells that this mutation occurs in (as well as the mutation frequency).

Can anyone suggest how I should go about it?

I have looked for, but cannot find, a site where I could simply run a similarity search against raw RNA sequencing data. It seems I will have to download the raw data and run the search myself.

I am still looking for a way to get raw RNA sequencing data for the Cancer Genome Atlas (TCGA) project – perhaps they are not released to the public? – but I can get data for the Human Body Map and an ENCODE project as .sra files – see http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30611 and http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26284. These projects have examined a wide variety of cells/tissues.

Once I download such raw data, how do I actually program my search? I am fairly familiar with using R (and shell scripts). Note that I don't care if the variation I am looking for in the mRNA might be a genomic one (at the DNA level).

Thanks.

rna sequencing mutation tcga blast


score 4 · Answer 1 · 2013-05-13

You're right that you're not going to find a pre-loaded web tool that you can just query. You'll almost certainly have to download the BAM files and look for the event yourself. TCGA BAMs are available through CGHub: https://cghub.ucsc.edu/

Once you have the BAMs you're interested in, you can do it the manual way and just open them in a genome browser (like IGV) to look for evidence of your event. Alternatively, you could write a little script that runs samtools mpileup on that small region of each BAM file and grabs the read counts from its output.
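The scripted route could look something like the sketch below. It is only an illustration, not a script anyone in this thread posted: the function names `count_bases` and `site_counts` are made up, and it assumes an indexed BAM, the reference FASTA the reads were aligned to, and `samtools` on the PATH. The `-d` value is an arbitrary large depth cap chosen so that a highly expressed gene is not truncated.

```python
import re
import subprocess
from collections import Counter


def count_bases(pileup_bases, ref_base):
    """Tally base calls from the 5th column of one samtools mpileup line."""
    counts = Counter()
    i = 0
    while i < len(pileup_bases):
        c = pileup_bases[i]
        if c == "^":
            # Read-start marker; the next character encodes mapping quality.
            i += 2
            continue
        if c in "+-":
            # Indel written as +<len><seq> or -<len><seq>; skip all of it.
            m = re.match(r"\d+", pileup_bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        if c in ".,":
            # '.' and ',' mean "matches the reference" (forward/reverse strand).
            counts[ref_base.upper()] += 1
        elif c.upper() in "ACGTN":
            counts[c.upper()] += 1
        # '$', '*', '#', '<' and '>' carry no base call at this position.
        i += 1
    return counts


def site_counts(bam_path, fasta_path, chrom, pos):
    """Run samtools mpileup on a single 1-based position and count bases."""
    region = f"{chrom}:{pos}-{pos}"
    cmd = ["samtools", "mpileup", "-f", fasta_path, "-r", region,
           "-d", "1000000", bam_path]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        fields = line.split("\t")
        if len(fields) >= 5 and int(fields[1]) == pos:
            return count_bases(fields[4], fields[2])
    return Counter()  # no reads covered the site
```

With the counts in hand, the frequency of the event is the count of the mutant base divided by `sum(counts.values())`. Spliced reads that skip over the site show up as `<` or `>` in the pileup and are deliberately left out of that denominator.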

score 0 · Answer 2 · 2013-05-23

10.9 years ago

alpha2zee ▴ 120

Thank you for the suggestions. I was able to use samtools mpileup and a Python script for my analyses.
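The script itself was not posted. As one possible follow-up step, and purely as an assumption about what would be useful here, the per-sample read counts can be turned into a frequency with a confidence interval, since a 0% to 5% event is hard to tell apart from zero at low coverage. The sketch below uses the Wilson score interval, which is my choice and not something taken from the thread; `edit_frequency` is a hypothetical name.

```python
import math


def edit_frequency(alt_reads, depth, z=1.96):
    """Return (frequency, lower, upper) using the Wilson score interval.

    alt_reads: reads carrying the mutant base at the site
    depth:     all reads with a base call at the site
    z:         normal quantile; 1.96 gives roughly a 95% interval
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    p = alt_reads / depth
    z2 = z * z
    denom = 1 + z2 / depth
    center = (p + z2 / (2 * depth)) / denom
    half = z * math.sqrt(p * (1 - p) / depth + z2 / (4 * depth * depth)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)
```

For example, zero mutant reads out of 100 still leaves an upper bound of a few percent, so a sample like that cannot rule out the event at the frequencies described in the question.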
