Question: Critique my Pipeline
sdbaney wrote:

I am working with a non-model organism, so there is no reference genome. We are trying to identify differentially expressed genes contributing to muscle function (force over length extended). We performed RNA-seq, assembled the transcriptome de novo with Trinity, and constructed SuperTranscripts to use as a reference for alignments.

We identified some rRNA contamination and removed those sequences, but we are now seeing rather disappointing alignment rates with HISAT2: our lowest is in the 60s, and the majority are in the high 70s (%).

We are using StringTie and then CuffDiff for differential expression. I know, I know, it's deprecated and everyone says not to use it. When I brought this up to my advisor he said "We aren't looking for minute differences, we're looking for large structural differences so the program we use to find those shouldn't matter too much since almost every program worth its salt should be able to pick up on those differences."

What are your thoughts? Is he right, or should I push to use a salmon/kallisto/sleuth pipeline? With these new HISAT2 alignment rates I'm feeling more apprehensive about the current pipeline.

Yes, we hope to publish.

written 27 days ago by sdbaney

Our lowest being in the 60s majority in the high 70s %

That is not necessarily a bad thing, considering you don't have a well-characterized transcriptome. You should try STAR or BBMap to see if your results stay in the same ballpark.
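One way to make the "same ballpark" check concrete is to pull the overall alignment rate out of each aligner's summary and compare the numbers. The sketch below assumes a HISAT2-style summary that ends with an "overall alignment rate" line; the function names and the 5-point tolerance are illustrative assumptions, not a recommended threshold, and other aligners report their rates in different formats.

```python
import re


def overall_alignment_rate(summary_text):
    """Return the overall alignment rate (in percent) from a HISAT2-style summary.

    Assumes the summary contains a line such as '78.40% overall alignment rate'.
    """
    match = re.search(r"([\d.]+)% overall alignment rate", summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


def same_ballpark(rate_a, rate_b, tolerance=5.0):
    """True if two rates differ by at most `tolerance` percentage points.

    The default tolerance is an arbitrary illustrative value.
    """
    return abs(rate_a - rate_b) <= tolerance


if __name__ == "__main__":
    # Made-up summary text and a made-up rate from a second aligner.
    hisat2_summary = "...\n78.40% overall alignment rate\n"
    hisat2_rate = overall_alignment_rate(hisat2_summary)
    other_aligner_rate = 75.0
    print(hisat2_rate, same_ballpark(hisat2_rate, other_aligner_rate))
```

If the rates from a second aligner land close to the HISAT2 ones, the reference is the more likely limiting factor than the aligner.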

written 27 days ago (modified 27 days ago) by genomax

If

"the program we use to find those shouldn't matter too much since almost every program worth its salt should be able to pick up on those differences."

Then there's no justification for not using something up to date. If your supervisor needs more convincing, tell them a reviewer would probably raise the same criticism, so why not nip this problem in the bud?

written 27 days ago by jrj.healey

So you think I should avoid CuffDiff? I keep reading about this and I see a pattern: transcriptome mapping (which is what I’m doing here, no?) focuses on the transcripts present and follows kallisto/sleuth or salmon/DESeq2/edgeR, whereas CuffDiff is better suited to genome mapping.

So my question is whether my reads, which have no genome to align to but only SuperTranscripts from a de novo assembled transcriptome, would be better suited to the transcriptome-mapping approach. I just want to make sure I can explain this to my advisor and know what I’m talking about.
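To illustrate what the transcriptome-mapping route involves downstream: quantifiers such as salmon or kallisto report estimates per transcript, so gene-level testing needs those values grouped by gene. The sketch below assumes Trinity-style transcript IDs (for example `TRINITY_DN1000_c115_g5_i1`, where the gene ID is everything before the `_i<N>` isoform suffix). The function names are hypothetical, and simple summing is only an illustration; in practice a dedicated importer such as tximport would normally handle this step.

```python
import re
from collections import defaultdict

# Trinity isoform suffix, e.g. the '_i1' in 'TRINITY_DN1000_c115_g5_i1'.
_ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def trinity_gene_id(transcript_id):
    """Strip the isoform suffix from a Trinity-style transcript ID."""
    return _ISOFORM_SUFFIX.sub("", transcript_id)


def sum_counts_by_gene(transcript_counts):
    """Sum per-transcript counts into per-gene counts.

    `transcript_counts` maps transcript ID -> estimated count.
    """
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene_counts[trinity_gene_id(transcript_id)] += count
    return dict(gene_counts)


if __name__ == "__main__":
    # Made-up counts for two isoforms of one gene and one isoform of another.
    counts = {
        "TRINITY_DN1_c0_g1_i1": 10.0,
        "TRINITY_DN1_c0_g1_i2": 5.0,
        "TRINITY_DN2_c0_g1_i1": 3.0,
    }
    print(sum_counts_by_gene(counts))
```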

written 27 days ago by sdbaney