Question

Critique my Pipeline

5.2 years ago

sdbaney ▴ 10

I am working with a non-model organism, so there is no reference genome. I am trying to identify differentially expressed genes contributing to muscle function (force over length extended). We performed RNA-seq, assembled the reads de novo with Trinity, and constructed SuperTranscripts to use as the reference for alignment.

We identified some rRNA contamination and removed those sequences, but we are now seeing rather disappointing alignment rates with HISAT2. The lowest is in the 60s (%), and the majority are in the high 70s.

We are using StringTie and then Cuffdiff for differential expression. I know, I know, it's deprecated and everyone says not to use it. When I brought this up with my advisor he said "We aren't looking for minute differences, we're looking for large structural differences so the program we use to find those shouldn't matter too much since almost every program worth its salt should be able to pick up on those differences."

What are your thoughts? Is he right, or should I push to use a Salmon/kallisto/sleuth pipeline? With these new HISAT2 alignment rates, I'm feeling more apprehensive about the current pipeline.

Yes, we hope to publish.

RNA-Seq differential expression • 956 views

Our lowest being in the 60s majority in the high 70s %

That is not necessarily a bad thing, considering you don't have a well-characterized transcriptome. You should try STAR or BBMap to see if your results stay in the same ballpark.
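To compare aligners on equal footing, it can help to pull the overall rate out of each run's summary and flag the low samples. The sketch below assumes the HISAT2 summary contains a line ending in "% overall alignment rate"; the sample names, summary text, and the 70% threshold are made up for illustration and are not from this thread.

```python
import re

def overall_alignment_rate(summary_text):
    """Return the overall alignment rate (percent) found in a HISAT2-style
    summary, or None if no such line is present."""
    match = re.search(r"([\d.]+)% overall alignment rate", summary_text)
    return float(match.group(1)) if match else None

def flag_low_samples(rates, threshold=70.0):
    """Return the sorted names of samples whose rate is below the threshold.
    The threshold is an arbitrary illustration, not a recommended cutoff."""
    return sorted(
        name for name, rate in rates.items()
        if rate is not None and rate < threshold
    )

# Hypothetical summaries; in practice these would be read from the
# per-sample log files written by the aligner.
example_summaries = {
    "sample_A": "1000 reads; of these:\n65.40% overall alignment rate\n",
    "sample_B": "1000 reads; of these:\n78.10% overall alignment rate\n",
}

rates = {name: overall_alignment_rate(text)
         for name, text in example_summaries.items()}
low = flag_low_samples(rates)
print(rates)
print("Below threshold:", low)
```

Running the same tabulation on the STAR or BBMap logs (with the pattern adjusted to each tool's own summary format) would show whether the low samples are consistently low or whether the rate depends on the aligner.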

5.2 years ago by GenoMax 141k

If

"the program we use to find those shouldn't matter too much since almost every program worth its salt should be able to pick up on those differences."

Then there's no justification not to use something up to date. If your supervisor needs more convincing, tell them a reviewer would probably make the same criticism, so why not nip this problem in the bud?

5.2 years ago by Joe 21k

So you think I should avoid Cuffdiff? I keep reading about this and I see a pattern: transcriptome mapping (which is what I'm doing here, no?) focuses on the transcripts present and follows the kallisto/sleuth or Salmon/DESeq2/edgeR route, whereas Cuffdiff is better suited to genome mapping.

Given my question and my reads, with no genome to align to but rather SuperTranscripts from a de novo assembled transcriptome, I would be better served by the transcriptome-mapping approach. I just want to make sure I can explain this to my advisor and know what I'm talking about.
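As I understand it, one step the transcript-level route usually implies is collapsing per-transcript estimates to genes before gene-level testing. Here is a rough sketch of that step, assuming a Trinity-style gene-to-transcript map with two tab-separated columns (gene ID, then transcript ID); the IDs and counts are invented, and the function names are only illustrative, not part of any tool.

```python
from collections import defaultdict

def read_gene_trans_map(lines):
    """Build a transcript -> gene dictionary from lines of the assumed
    form 'gene_id<TAB>transcript_id'. Blank lines are skipped."""
    tx_to_gene = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        gene_id, transcript_id = line.split("\t")
        tx_to_gene[transcript_id] = gene_id
    return tx_to_gene

def sum_to_gene_level(transcript_counts, tx_to_gene):
    """Sum estimated counts over all transcripts of each gene.
    Transcripts missing from the map are kept under their own ID."""
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene_counts[tx_to_gene.get(transcript_id, transcript_id)] += count
    return dict(gene_counts)

# Invented example data in Trinity-like naming.
map_lines = [
    "TRINITY_DN1_c0_g1\tTRINITY_DN1_c0_g1_i1",
    "TRINITY_DN1_c0_g1\tTRINITY_DN1_c0_g1_i2",
    "TRINITY_DN2_c0_g1\tTRINITY_DN2_c0_g1_i1",
]
transcript_counts = {
    "TRINITY_DN1_c0_g1_i1": 10.0,
    "TRINITY_DN1_c0_g1_i2": 5.0,
    "TRINITY_DN2_c0_g1_i1": 7.0,
}

gene_counts = sum_to_gene_level(transcript_counts, read_gene_trans_map(map_lines))
print(gene_counts)
```

This is only plain summation to show the idea; established tools handle the import step with more care (for example, accounting for transcript length), so I would not use this in place of them.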

5.2 years ago by sdbaney ▴ 10