Question

Strategy for mapping RNA-seq reads to a de novo transcriptome assembly?

4.8 years ago

frolic_fern • 0

Hello,

I am new to bioinformatics (and Biostars...). I recently built de novo assemblies for the two species I am working on (no genome or transcriptome data is publicly available for either). I am using these assemblies to get expression data for both species. The goal is to compare the expression of certain orthogroups across the two species and look for differences (I know this is not the most accurate approach or best practice, but we are giving it a shot).

It was recommended to me that I:

1. Trim my reads (originally PE, 150 bp each) down to 50 bp and map them in single-end mode
2. Use HISAT2 to map, and htseq-count to get counts (rather than the automated Bowtie/RSEM pipeline that most people use after running Trinity)
3. NOT map the reads to the longest_orfs.cds file I got from TransDecoder (though I have to use the gff3 file output by TransDecoder when running HISAT2), but instead map them to my transcriptome assembly FASTA file (which has been filtered some)
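
To make the first recommendation (trimming to 50 bp) concrete, here is a minimal sketch of what I understand it to mean: keep only the first 50 bases, and the matching quality characters, of each read. It assumes plain, uncompressed four-line FASTQ records; the function name `trim_fastq_records` and the default length are my own illustrative choices, not part of any tool mentioned here.

```python
from typing import Iterable, Iterator, List


def trim_fastq_records(lines: Iterable[str], length: int = 50) -> Iterator[str]:
    """Hard-trim each four-line FASTQ record to its first `length` bases.

    Hypothetical helper for illustration only; assumes well-formed,
    uncompressed FASTQ with exactly four lines per record.
    """
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            yield header
            yield seq[:length]   # keep the 5' end of the read
            yield plus
            yield qual[:length]  # qualities must stay the same length as the sequence
            record = []


if __name__ == "__main__":
    # Tiny in-memory example: one 160 bp read trimmed to 50 bp.
    example = ["@read1\n", "ACGT" * 40 + "\n", "+\n", "I" * 160 + "\n"]
    for out_line in trim_fastq_records(example, length=50):
        print(out_line)
```

As I understand the advice, only one mate file (for example R1) would then be passed to the mapper, which is what makes the run single-end.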

I do not understand the reasoning behind #1 and #3. Can anyone explain why I should do things this way?

Does anyone know why I might use HISAT2/HTSeq over Bowtie/RSEM for a transcriptome assembly? I have not been able to find examples of HISAT2/HTSeq being used with de novo assemblies, which makes me concerned.

Thank you in advance for your help!

rna-seq transcriptome assembly mapping • 1.1k views
