Critique my Pipeline
5.2 years ago
sdbaney ▴ 10

I am working with a non-model organism, so there is no reference genome. We are trying to identify differentially expressed genes contributing to muscle function (force over length extended). We performed RNA-Seq, de novo assembled the reads with Trinity, and constructed supertranscripts to use as a reference for alignments.

We identified some rRNA contamination and removed those sequences, but we are now seeing rather disappointing alignment rates with HISAT2. Our lowest is in the 60s, with the majority in the high 70s (%).

We are using StringTie and then CuffDiff for differential expression. I know, I know, it's deprecated and everyone says not to use it. When I brought this up to my advisor he said "We aren't looking for minute differences, we're looking for large structural differences so the program we use to find those shouldn't matter too much since almost every program worth its salt should be able to pick up on those differences."

What are your thoughts, is he right? Or should I push to use a salmon/kallisto/sleuth pipeline? With these new HISAT2 alignments I'm feeling more apprehensive about this current pipeline.

Yes, we hope to publish.

RNA-Seq differential expression • 956 views

"Our lowest being in the 60s majority in the high 70s %"

That is not necessarily a bad thing considering you don't have a well characterized transcriptome. You should try STAR or bbmap to see if your results stay in the same ballpark.
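If it helps when comparing runs, the rates can be pulled out of the HISAT2 summaries programmatically rather than by eye. This is only a minimal sketch: it assumes you saved each sample's HISAT2 summary (the text HISAT2 prints to stderr, which ends in an "overall alignment rate" line) as a string, and the sample names, example numbers, 70% threshold, and function names are all made up for illustration.

```python
import re

# HISAT2 ends its alignment summary with a line such as
# "78.45% overall alignment rate". The numbers used below are invented.
RATE_RE = re.compile(r"([\d.]+)% overall alignment rate")


def overall_alignment_rate(summary_text):
    """Return the overall alignment rate in percent, or None if not found."""
    match = RATE_RE.search(summary_text)
    return float(match.group(1)) if match else None


def flag_low_samples(summaries, threshold=70.0):
    """Return the sorted names of samples whose rate is below the threshold."""
    rates = {name: overall_alignment_rate(text) for name, text in summaries.items()}
    return sorted(
        name for name, rate in rates.items() if rate is not None and rate < threshold
    )


# Hypothetical summaries for two samples (only the final line matters here).
example = {
    "sample_A": "...\n64.10% overall alignment rate\n",
    "sample_B": "...\n78.45% overall alignment rate\n",
}
print(flag_low_samples(example))
```

Running the same check on the output of a second aligner would need its own parser, since other tools report their rates in a different format.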


If

"the program we use to find those shouldn't matter too much since almost every program worth its salt should be able to pick up on those differences."

then there's no justification not to use something up to date. If your supervisor needs more convincing, tell them a reviewer would probably make the same criticism, so why not nip this problem in the bud?


So you think I should avoid CuffDiff? I keep reading about this and I see a pattern. Transcriptome mapping (which is what I'm doing here, no?) focuses on the transcripts present and follows the kallisto/sleuth or salmon/DESeq2/edgeR route, whereas CuffDiff is better suited to genome mapping.

Given my question and my reads, with no genome to align to but rather supertranscripts of a de novo assembled transcriptome, it seems the transcriptome-mapping route would be the better fit. I just want to make sure I can explain this to my advisor and know what I'm talking about.
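One practical detail if the quantification is run against the Trinity transcripts rather than the supertranscripts: gene-level summaries need a transcript-to-gene table. Here is a rough sketch of how that could be built, assuming the usual Trinity naming scheme in which the transcript ID is the gene ID plus an `_i<N>` isoform suffix. The header lines and function names are hypothetical examples, not taken from this data set.

```python
import re

# Trinity transcript IDs look like TRINITY_DN1000_c115_g5_i1, where the part
# before the trailing "_i<N>" isoform suffix is the Trinity "gene" ID.
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def trinity_gene_id(transcript_id):
    """Strip the isoform suffix to get the Trinity gene ID."""
    return ISOFORM_SUFFIX.sub("", transcript_id)


def tx2gene_from_fasta_lines(lines):
    """Return (transcript_id, gene_id) pairs for every FASTA header line."""
    pairs = []
    for line in lines:
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after ">".
            transcript_id = line[1:].split()[0]
            pairs.append((transcript_id, trinity_gene_id(transcript_id)))
    return pairs


# Hypothetical FASTA content; sequences are placeholders.
fasta_lines = [
    ">TRINITY_DN1000_c115_g5_i1 len=247",
    "ACGTACGT",
    ">TRINITY_DN1000_c115_g5_i2 len=310",
    "ACGTTTGA",
]
for transcript_id, gene_id in tx2gene_from_fasta_lines(fasta_lines):
    print(f"{transcript_id}\t{gene_id}")
```

The resulting two-column table is the shape of input that gene-level import steps typically expect; check it against your own Trinity.fasta headers before relying on it.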
