Hi friends, how are you? I need your help with my study project. I have RNA-seq data for 3 species (6 replicates per species); the species form a genus within bedbugs. I would like to analyze differential expression among these species, but I am not sure whether the results I get are real or methodological artifacts. At the moment I am using all assemblies (18 = 6 per species) as the reference for quantifying expression, and I have interesting results, but I don't know whether they could be due to technical bias.
All samples were sequenced in the same batch and under the same conditions. My concern is that, because I don't have a genome or transcriptome to use as a reference, I may be obtaining "non-real" expression data. I chose to combine all assemblies (n = 18, 6 per species) into a single reference and to quantify every sample against this "super reference". My collaborators are unsure about the results, even though the methodology is consistent with "good practices".
I would like to be sure that I can continue with this study. This is what I ran:
    # build the reference index from all assemblies (n = 18)
    kallisto index -i reference.idx reference.fasta   # reference.fasta = all assemblies combined

    # quantify each sample (paired-end reads)
    kallisto quant -i reference.idx -o output --rf-stranded -b 100 r1.fasta r2.fasta

    # build the count and TMM-normalized expression matrices
    Trinity/util/abundance_estimates_to_matrix.pl \
        --est_method kallisto --gene_trans_map reference.fasta.gene_trans_map \
        --name_sample_by_basedir --cross_sample_norm TMM --out_prefix outdir \
        sample1, sample2 ...sample18

    # differential expression with edgeR
    Trinity/Analysis/DifferentialExpression/run_DE_analysis.pl \
        --matrix gene.counts.matrix --method edgeR --output out --dispersion 0.1

    # extract DE genes (P <= 1e-3, fold-change cutoff C = 4) and cluster them
    Trinity/Analysis/DifferentialExpression/analyze_diff_expr.pl \
        --matrix gene.TMM.EXPR.matrix --max_genes_clust 1000000 -P 1e-3 -C 4

    # cut the cluster tree at 60% of its height
    Trinity/Analysis/DifferentialExpression/define_clusters_by_cutting_tree.pl -R \
        diffExpr.P1e-3_C4.matrix.RData --Ptree 60