Differential gene expression across multiple species
10 weeks ago

Hi friends, how are you? I need your help with my study project. I have RNA-seq data for 3 species (6 replicates per species); the species form a genus within the bedbugs. I would like to analyze differential expression among these species, but I am not sure whether the results I get are real or reflect methodological bias. At the moment I am using all the assemblies (18 = 6 per species) as the reference for quantifying expression, and the results look interesting, but I don't know whether they could be a technical artifact.

Here are my results:

[two images attached to the original post]


# build one reference index from all assemblies combined (n = 18)

kallisto index -i reference.idx reference.fasta

# quantify each sample (paired-end reads)

kallisto quant -i reference.idx -o output --rf-stranded -b 100 r1.fasta r2.fasta

# build the count and TMM-normalised expression matrices
Trinity/util/abundance_estimates_to_matrix.pl \
 --est_method kallisto --gene_trans_map reference.fasta.gene_trans_map \
 --name_sample_by_basedir --cross_sample_norm TMM --out_prefix outdir \
 sample1/abundance.tsv sample2/abundance.tsv ... sample18/abundance.tsv

Trinity/Analysis/DifferentialExpression/run_DE_analysis.pl --matrix  gene.counts.matrix --method edgeR --output out --dispersion 0.1

Trinity/Analysis/DifferentialExpression/analyze_diff_expr.pl --matrix gene.TMM.EXPR.matrix --max_genes_clust 1000000 -P 1e-3 -C 4

Trinity/Analysis/DifferentialExpression/define_clusters_by_cutting_tree.pl -R diffExpr.P1e-3_C4.matrix.RData --Ptree 60

What do you think the next step should be so that I can trust my results, or improve them?
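One thing that may be worth checking in the edgeR step above: there are six replicates per species, but the run_DE_analysis.pl call uses a fixed --dispersion 0.1 and gives no replicate grouping. The script accepts a --samples_file (tab-delimited: condition, then replicate name), which should let edgeR estimate dispersion from the replicates. Below is a minimal sketch for writing such a file. It assumes, purely for illustration, that sample1 to sample6 are the first species, sample7 to sample12 the second and sample13 to sample18 the third. The speciesA/B/C labels are placeholders, and the replicate names must match the column headers of the counts matrix.

```shell
# Write a Trinity-style samples file: <condition><TAB><replicate name>.
# ASSUMPTION: sample1-6, sample7-12 and sample13-18 are the three species,
# in that order; speciesA/B/C are placeholder labels, not real names.
out=samples.txt
: > "$out"
i=1
while [ "$i" -le 18 ]; do
  if   [ "$i" -le 6 ];  then species=speciesA
  elif [ "$i" -le 12 ]; then species=speciesB
  else                       species=speciesC
  fi
  printf '%s\t%s\n' "$species" "sample$i" >> "$out"
  i=$((i + 1))
done

# Then, with Trinity installed (not run here), replace the fixed dispersion:
# Trinity/Analysis/DifferentialExpression/run_DE_analysis.pl \
#   --matrix gene.counts.matrix --method edgeR \
#   --samples_file samples.txt --output out
```

The same samples.txt can also be passed to analyze_diff_expr.pl via --samples, so that the heatmaps and clusters are labelled by species.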

Do you know whether each taxon was sequenced as a separate batch, or were all the samples sequenced together? The samples of each taxon cluster together; this can be biologically meaningful, but it could also be due to a batch effect.
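If the sequencing records are not at hand, the read headers can give a rough answer. This is a sketch only: it assumes the raw reads are available as FASTQ with standard Illumina headers (@instrument:run:flowcell:lane:tile:x:y ...), and the reads/<sample>_R1.fastq.gz layout is hypothetical. Samples sharing the same instrument, run, flowcell and lane were most likely sequenced together.

```shell
# Print instrument:run:flowcell:lane from the first read header of a FASTQ file.
# ASSUMPTION: standard Illumina headers; other platforms use other layouts.
batch_id() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | head -n 1 | sed 's/^@//' | cut -d: -f1-4
}

# Hypothetical layout: reads/<sample>_R1.fastq.gz
for fq in reads/*_R1.fastq.gz; do
  [ -e "$fq" ] || continue   # skip when the glob matches nothing
  printf '%s\t%s\n' "$fq" "$(batch_id "$fq")"
done
```

This only covers the sequencing run. Differences in RNA extraction or library preparation dates would not show up here.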

Hi, buddy, sorry for the delay. All the samples were sequenced in the same batch and under the same conditions. My concern is that, because I don't have a reference genome or transcriptome, the expression estimates I obtain may not be real. I chose to combine all the assemblies (n = 18, 6 per species) into a single reference and to quantify every sample against this "super reference". My collaborators are unsure about the results, although the methodology is consistent with "good practices".

I would like to be certain that I can continue with this study.
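One quick sanity check on the "super reference" is to compare how well each sample pseudoaligns to it. kallisto quant writes a run_info.json in each output directory, which in recent versions includes p_pseudoaligned and p_unique. The sketch below tabulates them, assuming each sample's output is in its own directory named sample1 ... sample18 (hypothetical names) and that the installed kallisto version writes those fields.

```shell
# Pull one numeric field out of a kallisto run_info.json without needing jq.
# usage: get_field <run_info.json> <field name>
get_field() {
  sed -n "s/.*\"$2\": *\([0-9.]*\).*/\1/p" "$1"
}

# ASSUMPTION: one kallisto output directory per sample, named sample*.
printf 'sample\tp_pseudoaligned\tp_unique\n'
for info in sample*/run_info.json; do
  [ -e "$info" ] || continue   # skip when the glob matches nothing
  printf '%s\t%s\t%s\n' "$(dirname "$info")" \
    "$(get_field "$info" p_pseudoaligned)" \
    "$(get_field "$info" p_unique)"
done
```

If p_pseudoaligned is similar across the three species, the reference is probably not favouring one of them. A low p_unique would not be surprising here, since the same transcript is likely present in several of the 18 assemblies.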


