Question: Arabidopsis thaliana RNA-Seq analysis: Is 68% transcript annotation acceptable/expected with the Ensembl reference and the new Tuxedo pipeline?
3 months ago, arctic wrote:

Dear all, I am new to the field. I have recently been using the new Tuxedo pipeline (HISAT2 aligner and StringTie assembler with "de novo" assembly) for RNA-Seq data from Arabidopsis thaliana (more details below). In my hands, the pipeline has identified ~26K transcripts, with ~15K being assigned a gene symbol from the reference GTF. Is this ratio (68% of transcripts being assigned gene symbols) within the expected range? If you have experience with Arabidopsis RNA-Seq data, your input is appreciated.

Thank you in advance for your reply.

More details on the data (if needed):
- Samples: 18
- RNA prep: SMART-Seq® v4 Ultra® Low Input RNA Kit for Sequencing (Clontech)
- Library prep: Nextera® DNA Library Prep (Illumina)
- Sequencing: NextSeq 500
- Cycles: 75 (paired-end)
- Ensembl references used: Arabidopsis_thaliana.TAIR10.dna.toplevel.fa, Arabidopsis_thaliana.TAIR10.45.gtf
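
For anyone wanting to recompute this kind of ratio from their own assembly, here is a minimal Python sketch. It assumes the assembled GTF marks reference-matched transcripts with a `ref_gene_name` attribute on the `transcript` lines, as StringTie per-sample output typically does; a merged GTF may use `gene_name` instead, so the attribute key is a parameter. The function name `annotated_fraction` and the example records are hypothetical and made up for illustration, not taken from the data above.

```python
import re

# Matches GTF attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def annotated_fraction(gtf_lines, attr_key="ref_gene_name"):
    """Count transcript records and how many carry `attr_key`.

    Returns (annotated, total, fraction). `attr_key` is an assumption
    about how the assembler labels reference-matched transcripts.
    """
    total = 0
    annotated = 0
    for line in gtf_lines:
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        total += 1
        if attr_key in attrs:
            annotated += 1
    fraction = annotated / total if total else 0.0
    return annotated, total, fraction


if __name__ == "__main__":
    # Made-up records, only to show the expected input shape
    example = [
        '1\tStringTie\ttranscript\t100\t900\t1000\t+\t.\t'
        'gene_id "STRG.1"; transcript_id "STRG.1.1"; ref_gene_name "GENE_A";',
        '1\tStringTie\texon\t100\t900\t1000\t+\t.\t'
        'gene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1";',
        '1\tStringTie\ttranscript\t2000\t2600\t1000\t-\t.\t'
        'gene_id "STRG.2"; transcript_id "STRG.2.1";',
    ]
    n_annot, n_total, frac = annotated_fraction(example)
    print(f"{n_annot} of {n_total} transcripts annotated ({frac:.1%})")
```

To run it on a real file, pass an open file handle instead of the list, for example `annotated_fraction(open("assembled.gtf"))`, where `assembled.gtf` is a placeholder file name.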

3 months ago, lieven.sterck (VIB, Ghent, Belgium) wrote:

Yes, I would say that is in line with expectations: roughly 70% "known" genes is about where we currently stand in Arabidopsis.


Great. Thank you for replying.

Reply written 3 months ago by arctic