Question: How to analyze unannotated lncRNA using RNA-seq data?
3.7 years ago by xiaoyonf (Baylor College of Medicine, Houston, Texas, USA) wrote:

Hi all,

I have a ~2 kb sequence of an unannotated lncRNA taken from the published literature. Since it is unannotated, I cannot search for it by name in a genome browser or data portal (e.g. UCSC, TCGA) to check its expression in RNA-seq datasets.

How can I analyze such an unannotated lncRNA using RNA-seq data, e.g., its expression across different subtypes of BC in the TCGA dataset?

Thanks, Xiaoyong

3.6 years ago by tiago211287 wrote:

If this feature is not annotated, counting and quantification programs will not 'see' it. I would first check for expression by inspecting the coordinates of this unannotated lncRNA in IGV or another genome viewer. (If you only have the sequence, align it to the reference genome first, for example with BLAT, to get the coordinates.) If no reads map to this position, the lncRNA is not detectably expressed in your dataset and there is nothing further to analyze.
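As a quick numeric complement to looking at the locus in IGV, the number of alignments overlapping the region can be counted with samtools. This is only a sketch: it assumes a coordinate-sorted, indexed BAM file, and the file name and coordinates in the example are hypothetical placeholders.

```shell
# Count alignments overlapping a region in a coordinate-sorted, indexed BAM.
# usage: count_reads_in_region <bam> <chr:start-end>
count_reads_in_region() {
  samtools view -c "$1" "$2"
}

# Example (hypothetical file and coordinates):
#   count_reads_in_region sample.bam chr1:1000000-1002000
```

A count of zero across all samples suggests the locus is not expressed in that dataset.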

If it is expressed, you can add a 'fake' entry to the annotation file (GTF) using the coordinates of the lncRNA, so that HTSeq, or Kallisto via a transcriptome built from that GTF, can see it.
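A minimal sketch of such a 'fake' entry, assuming a single-exon transcript. The chromosome, coordinates, strand, and IDs below are hypothetical placeholders, and a spliced transcript would need one exon line per exon. By default htseq-count counts `exon` features grouped by `gene_id`, so the exon line is the one that matters for HTSeq.

```shell
# Hypothetical placeholders: replace with the real locus of your lncRNA.
CHROM="chr1"; START=1000000; END=1002000; STRAND="+"
GENE_ID="novel_lncRNA_1"

# Print a transcript line and a single exon line in GTF format
# (9 tab-separated columns; GTF coordinates are 1-based, inclusive).
make_gtf_entry() {
  attrs="gene_id \"$GENE_ID\"; transcript_id \"$GENE_ID.1\";"
  for feature in transcript exon; do
    printf '%s\tcustom\t%s\t%s\t%s\t.\t%s\t.\t%s\n' \
      "$CHROM" "$feature" "$START" "$END" "$STRAND" "$attrs"
  done
}

# Keep a backup, then append the new entry to the annotation, e.g.:
#   cp annotation.gtf annotation.original.gtf
#   make_gtf_entry >> annotation.gtf
```

The chromosome name must match the naming used in the rest of the GTF and in the reference genome (e.g. `chr1` vs `1`).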

Afterwards, you can use a differential expression package (DESeq2, edgeR) to test whether it is over- or under-expressed.

For Kallisto, you can convert your modified GTF into a transcriptome FASTA file using gffread from the Cufflinks package, like this:

gffread -w transcriptome.fa -g Reference.genome.fa annotation.gtf

Afterwards, you can build an index from this transcriptome.fa with kallisto index and perform the quantification with kallisto quant.
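The two Kallisto steps might look roughly like the sketch below, assuming paired-end reads. The index name, output directory, and FASTQ file names are hypothetical; single-end data additionally needs `--single` together with `-l` and `-s` (fragment length mean and standard deviation).

```shell
# Build the index once, from the transcriptome produced by gffread.
build_index() {
  kallisto index -i transcriptome.idx transcriptome.fa
}

# Quantify one paired-end sample.
# usage: quantify_sample <output_dir> <reads_R1> <reads_R2>
quantify_sample() {
  kallisto quant -i transcriptome.idx -o "$1" "$2" "$3"
}

# Example (hypothetical file names):
#   build_index
#   quantify_sample sample1_out sample1_R1.fastq.gz sample1_R2.fastq.gz
```

Each output directory then contains an `abundance.tsv` with the per-transcript estimates.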

PS: Kallisto gives you both normalized values (TPM) and estimated raw counts. If you are going to use DESeq2, keep in mind that you must give it only the raw (unnormalized) counts as input, never normalized data.
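To see the two kinds of values side by side, the row for the lncRNA can be pulled out of Kallisto's `abundance.tsv`, whose columns are `target_id`, `length`, `eff_length`, `est_counts`, and `tpm`. The sketch below is illustrative only, and the transcript ID in the example is a hypothetical placeholder for whatever ID was used in the GTF.

```shell
# Print "est_counts<TAB>tpm" for one transcript from a kallisto abundance.tsv.
# Columns: target_id, length, eff_length, est_counts, tpm.
# usage: get_abundance <abundance.tsv> <transcript_id>
get_abundance() {
  awk -F'\t' -v id="$2" 'NR > 1 && $1 == id { print $4 "\t" $5 }' "$1"
}

# Example (hypothetical transcript ID):
#   get_abundance sample1_out/abundance.tsv novel_lncRNA_1.1
```

The `est_counts` column is the one relevant for DESeq2 (in R it is commonly imported with the tximport package, since the estimates are not integers), while `tpm` is the normalized value.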
