Question

Discover novel transcripts

0

9 months ago

Emy Alade ▴ 40

Hello everyone,

I am working on a project that aims to reconstruct the transcriptome from RNA-seq data (59 bp, paired-end reads) in order to discover new transcripts in mice. I have six different tissues, and for each tissue I have two WT replicates and two mutant replicates. Below is a brief description of my transcript assembly method.

1- Transcript reconstruction with StringTie

stringtie -p 8 -c 2 -j 2 "$BAM_FILE" -G "$GTF_FILE" -o "$OUTPUT_GTF"

2- Merging assembly files with StringTie, with and without using a reference

# with the reference annotation
stringtie --merge -G "$GTF_FILE" -o "$OUT_DIR/assembly_merged.gtf" "$INPUT_DIR"/assembly_Stringtie*.gtf
# without a reference
stringtie --merge -o "$OUT_DIR/assembly_merged.gtf" "$INPUT_DIR"/assembly_Stringtie*.gtf

3- Transcript quantification

stringtie -p 12 -e -B "$BAM_FILE" -G "$GTF_merged" -o "$OUTPUT_GTF"

Questions:

What do you think of this approach? (Is the method suitable for discovering new transcripts? Are there any steps or parameters that could be optimized to improve the results?)
Should I merge the replicates before assembly (Step 1)?
How should I choose between merging with or without a reference (Step 2)?

transcriptomic assembly stringtie stringtie--merge • 1.1k views

ADD COMMENT • link updated 9 months ago by Istvan Albert 103k • written 9 months ago by Emy Alade ▴ 40

1

59 bp reads?? Ouch :-) That's ancient data, is that possible?

And on a more constructive note: it's quite hard to get something meaningful from reads of that length, I'm afraid (especially in an assembly context).

ADD REPLY • link 9 months ago by lieven.sterck 16k

0

Yes, I have RNA-seq data with 59 bp read length.

So, do you think it’s not possible with this read length?

ADD REPLY • link 9 months ago by Emy Alade ▴ 40

1

It is possible (technically speaking), but I fear the results might be a bit disappointing.

ADD REPLY • link 9 months ago by lieven.sterck 16k

score 0 · Answer 1 · 2025-01-23

Take your transcripts and align them to the genome, and see what you get.

I would venture to say that you should merge the replicates. That way, transcripts that are rare in each sample can be detected, because pooling the samples increases their coverage.
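One way to pool replicates before assembly is to merge their BAM files per tissue and genotype. The sketch below is only an illustration: it assumes coordinate-sorted BAMs named `<tissue>_<genotype>_rep<N>.bam` (a hypothetical naming scheme, not something stated in the question) and `samtools` on the PATH. The helper names `list_groups` and `pool_replicates` are made up for this example. By default it only prints the `samtools merge` commands so they can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: pool replicate BAMs per tissue/genotype group before assembly.
# Assumptions (not from the thread): coordinate-sorted BAMs named
# <tissue>_<genotype>_rep<N>.bam in one directory, and samtools on PATH.

# Print the group prefixes (e.g. liver_WT) found in a directory.
list_groups() {
    local dir=$1 f
    for f in "$dir"/*_rep*.bam; do
        [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
        f=${f##*/}                # strip the directory part
        echo "${f%_rep*.bam}"     # strip the _rep<N>.bam suffix
    done | sort -u
}

# Print one "samtools merge" command per group (dry run by default).
# Set RUN=1 in the environment to execute the commands instead.
pool_replicates() {
    local dir=$1 out=$2 group
    list_groups "$dir" | while read -r group; do
        set -- samtools merge "$out/${group}_pooled.bam" "$dir/${group}"_rep*.bam
        if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
    done
}
```

For example, `pool_replicates bam_dir pooled_dir` prints the commands, and `RUN=1 pool_replicates bam_dir pooled_dir` runs them (the output directory must already exist). The pooled BAMs would then replace `$BAM_FILE` in the assembly step only; quantification in Step 3 would still be run on each replicate separately so that replicate-level variation is kept.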

I don't know what effect the reference has on merging. Frankly, it should help, as long as the reference is of decent quality.
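A rough way to see what the reference contributes is to tally, in a reference-guided merged GTF, how many transcripts are attributed to a reference gene and how many are not. The sketch below assumes that transcripts matched to the annotation carry a `ref_gene_id` attribute and that those without it are candidate novel transcripts; that is an assumption worth checking against the output of your StringTie version. The function name `count_novel` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: tally transcripts in a merged GTF by presence of ref_gene_id.
# Assumption: transcript lines lacking ref_gene_id are candidate novel
# transcripts; verify this against your own merged GTF before relying on it.
count_novel() {
    awk -F'\t' '
        $3 == "transcript" {
            # Column 9 holds the attribute string of the GTF record.
            if ($9 ~ /ref_gene_id "/) known++; else novel++
        }
        END { printf "known=%d novel=%d\n", known, novel }
    ' "$1"
}
```

Calling `count_novel "$OUT_DIR/assembly_merged.gtf"` prints a single line with the two counts. A very large share of unmatched transcripts, especially single-exon ones, may be a hint of fragmented assemblies rather than real novelty, which is a known risk with short reads.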