Question: Arabidopsis thaliana RNA-Seq analysis: Is 68% transcript annotation acceptable/expected with the Ensembl reference and the new Tuxedo pipeline?
arctic wrote (3 months ago):

Dear all, I am new to the field. I have recently been using the new Tuxedo pipeline (HISAT2 aligner and StringTie assembler, with "de novo" assembly) on RNA-Seq data from Arabidopsis thaliana (more details below). In my hands, the pipeline identified ~26K transcripts, of which ~15K were assigned a gene symbol from the reference GTF. Is this ratio (68% of transcripts being assigned gene symbols) within the expected range? If you have experience with Arabidopsis RNA-Seq data, your input would be appreciated.

Thank you in advance for your reply.

More details on the data (if needed):
- Samples: 18
- RNA prep: SMART-Seq® v4 Ultra® Low Input RNA Kit for Sequencing (Clontech)
- Library prep: Nextera® DNA Library Prep (Illumina)
- Sequencing: NextSeq 500
- Cycles: 75 cycles (paired-end)
- Ensembl references used: Arabidopsis_thaliana.TAIR10.dna.toplevel.fa, Arabidopsis_thaliana.TAIR10.45.gtf

lieven.sterck (VIB, Ghent, Belgium) wrote (3 months ago):

Yes, I would say that is in line with expectations (about 70% "known" genes is indeed roughly where we stand in Arabidopsis).
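If you want to double-check the ratio on your own output, a minimal sketch along these lines could count how many assembled transcripts carry a reference gene label. It assumes a GTF in which matched transcripts have at least one of the attributes `ref_gene_name`, `gene_name` or `ref_gene_id`; those key names, the function names and the example records are assumptions for illustration, so check them against the attributes actually present in your file.

```python
import re

# Attribute keys assumed to mark a transcript that matched the reference
# annotation. These names are an assumption; adjust them to your own GTF.
REFERENCE_KEYS = ("ref_gene_name", "gene_name", "ref_gene_id")

# Matches GTF attribute pairs of the form: key "value"
ATTRIBUTE_PATTERN = re.compile(r'(\S+) "([^"]*)"')


def count_annotated_transcripts(gtf_lines, reference_keys=REFERENCE_KEYS):
    """Return (annotated, total) counts over 'transcript' features."""
    annotated = 0
    total = 0
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        attributes = dict(ATTRIBUTE_PATTERN.findall(fields[8]))
        total += 1
        if any(key in attributes for key in reference_keys):
            annotated += 1
    return annotated, total


def annotated_fraction(annotated, total):
    """Fraction of transcripts with a reference label (0.0 if none seen)."""
    return annotated / total if total else 0.0


# Hypothetical records, made up purely to illustrate the expected layout.
EXAMPLE_GTF = [
    "# example header line",
    '1\tStringTie\ttranscript\t100\t900\t1000\t+\t.\t'
    'gene_id "MSTRG.1"; transcript_id "MSTRG.1.1"; ref_gene_name "GENE_A";',
    '1\tStringTie\texon\t100\t900\t1000\t+\t.\t'
    'gene_id "MSTRG.1"; transcript_id "MSTRG.1.1"; ref_gene_name "GENE_A";',
    '1\tStringTie\ttranscript\t2000\t2900\t1000\t-\t.\t'
    'gene_id "MSTRG.2"; transcript_id "MSTRG.2.1";',
    '2\tStringTie\ttranscript\t50\t700\t1000\t+\t.\t'
    'gene_id "MSTRG.3"; transcript_id "MSTRG.3.1"; ref_gene_id "GENE_B";',
]

if __name__ == "__main__":
    annotated, total = count_annotated_transcripts(EXAMPLE_GTF)
    print(annotated, total, annotated_fraction(annotated, total))
```

To run it on real data, pass an open file handle instead of the example list, for instance `with open("merged.gtf") as handle: count_annotated_transcripts(handle)`, where `merged.gtf` is a placeholder for your own assembly file.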

arctic replied (3 months ago):

Great. Thank you for replying.