How can I get a count of mRNA + lncRNAs from Gencode + lncipedia?
3.2 years ago
njk639 • 0

Hi all,

I'm working with some high-depth, 100bp PE RNA-Seq data and we'd like to look at both mRNA and lncRNA.

Right now my workflow looks like the following:

  1. Align reads to human genome (GRCh38.primary_assembly.genome.fa) via STAR.

    STAR --runMode alignReads --runThreadN 8 --genomeDir INDEX_DIR_HERE --outSAMtype BAM Unsorted --readFilesIn FASTQ_FILEPATHS_HERE

  2. Generate count tables via featureCounts. I have been doing this twice: once with the GENCODE v36 primary assembly annotation, and once with the LNCipedia 5.2 annotation.

    featureCounts -T 8 -a GENCODE_OR_LNCIPEDIA_GTF -t exon -s 2 -p -g gene_id -o Counts.txt BAM_FILES

  3. I then use DESeq2 to get differential genes.

The issue I'm running into is redundancy between the GENCODE and LNCipedia annotations. Since some of the lncRNAs are also in the GENCODE annotation, they get counted twice. I've tried using biomaRt to convert Ensembl gene IDs to HGNC symbols, but this has not been very effective, as not all of the Ensembl lncRNA IDs in GENCODE have HGNC symbols.

What would be the easiest way for me to ensure I get accurate counts of mRNA and lncRNA in one table?

rna-seq lncRNA • 720 views

The GENCODE GTF does have lncRNAs in it. Are you excluding those during counting?


No, I'm not sure how to exclude those from GENCODE. That's sort of the problem. I want the more extensive listing of lncRNAs provided by LNCipedia while still getting all the "standard" genes from GENCODE.


You could simply filter those entries out with grep -v:

    grep -v lncRNA gencode.v36.primary_assembly.annotation.gtf > gencode_minus_lncRNA.gtf
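
A plain `grep -v lncRNA` drops every line that mentions lncRNA anywhere, which may also remove records from non-lncRNA genes if the string shows up in another attribute (for example a transcript_type value). A slightly stricter variant is sketched below: it matches only the `gene_type "lncRNA"` attribute and then concatenates the result with the LNCipedia GTF. Treat it as a rough sketch under assumptions, not a tested pipeline: the helper name `drop_gencode_lncrna`, the toy records, and all file names are made up for illustration, and it assumes the LNCipedia GTF uses the same genome build and chromosome naming as the GENCODE one.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print a GTF without records whose gene_type attribute
# is exactly "lncRNA". Header lines starting with '#' are kept.
drop_gencode_lncrna() {
    awk -F'\t' '/^#/ || $9 !~ /gene_type "lncRNA"/' "$1"
}

# Toy data, invented for illustration. In practice these would be the real
# GENCODE and LNCipedia GTF files.
workdir=$(mktemp -d)
printf '##description: toy GENCODE-style file\n' > "$workdir/gencode_toy.gtf"
printf 'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_id "ENSG_A"; gene_type "protein_coding"; transcript_type "protein_coding";\n' >> "$workdir/gencode_toy.gtf"
printf 'chr1\tHAVANA\texon\t300\t400\t.\t+\t.\tgene_id "ENSG_A"; gene_type "protein_coding"; transcript_type "lncRNA";\n' >> "$workdir/gencode_toy.gtf"
printf 'chr1\tHAVANA\texon\t900\t950\t.\t-\t.\tgene_id "ENSG_B"; gene_type "lncRNA"; transcript_type "lncRNA";\n' >> "$workdir/gencode_toy.gtf"
printf 'chr1\tlncipedia\texon\t900\t950\t.\t-\t.\tgene_id "lnc-TOY-1"; transcript_id "lnc-TOY-1:1";\n' > "$workdir/lncipedia_toy.gtf"

# Step 1: remove GENCODE lncRNA genes only (ENSG_B here); both ENSG_A exons stay.
drop_gencode_lncrna "$workdir/gencode_toy.gtf" > "$workdir/gencode_minus_lncRNA.gtf"

# Step 2: append the LNCipedia records to get a single annotation for counting.
cat "$workdir/gencode_minus_lncRNA.gtf" "$workdir/lncipedia_toy.gtf" > "$workdir/combined.gtf"

# Show the combined annotation.
cat "$workdir/combined.gtf"
```

The combined file could then be passed to featureCounts with `-a` in place of GENCODE_OR_LNCIPEDIA_GTF, so that mRNA and lncRNA counts end up in one table from a single run. It is probably worth checking first that no gene_id occurs in both inputs.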