I built a reference for my species of interest from the Ensembl cDNA file and aligned my data to it with kallisto. I was then asked to add the ncRNA sequences and an EGFP sequence to the reference, so I concatenated the three files with cat and rebuilt the reference. I aligned my samples again, used tximport to get gene-level counts, and now have a few questions.
- When I aligned to the cDNA only, the gene 18SrRNA-Psi had 150560 counts. After aligning to cDNA+ncRNA, that gene has 90 counts. However, the new reference contains more genes whose names start with 18SrRNA. Could some of the counts have been assigned to the other 18SrRNA genes, and is that why this gene has fewer counts with the new reference? I only checked the first 10 rows of my files when I noticed this; the other 9 genes have the same counts as before.
- The txi.kallisto$abundance and txi.kallisto$counts tables from the cDNA+ncRNA alignment have an extra row at the top that contains only zeros. This did not happen with the cDNA-only reference. What could be causing it?
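
One way to check both points would be to compare the two gene-level count tables directly. The following is only a minimal sketch in Python, assuming each txi.kallisto$counts table has been exported to a tab-separated file with gene IDs in the first column and one column per sample. The function names and the toy numbers are made up for illustration and are not my real data. It sums the counts over every gene whose ID starts with 18SrRNA, so the family total can be compared between the two references, and it lists rows that are zero in every sample together with their gene ID (an empty ID would show up as '').

```python
import csv
import io


def load_counts(handle):
    """Read a gene-by-sample count table (TSV, first column = gene ID)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    table = {}
    for row in reader:
        if not row:  # skip completely blank lines
            continue
        table[row[0]] = [float(x) for x in row[1:]]
    return samples, table


def family_totals(table, prefix):
    """Sum counts per sample over every gene whose ID starts with `prefix`."""
    members = [gene for gene in table if gene.startswith(prefix)]
    n_samples = len(next(iter(table.values()))) if table else 0
    totals = [0.0] * n_samples
    for gene in members:
        for i, value in enumerate(table[gene]):
            totals[i] += value
    return members, totals


def all_zero_rows(table):
    """Return the gene IDs whose counts are zero in every sample."""
    return [gene for gene, values in table.items() if all(v == 0 for v in values)]


# Toy tables (hypothetical values, NOT real results) standing in for the
# exported cDNA-only and cDNA+ncRNA count matrices.
OLD_TSV = "gene\ts1\ts2\n18SrRNA-Psi\t100\t200\nGeneA\t5\t7\n"
NEW_TSV = (
    "gene\ts1\ts2\n"
    "\t0\t0\n"  # a row with an empty gene ID and only zeros
    "18SrRNA-Psi\t10\t20\n"
    "18SrRNA-2\t60\t120\n"
    "18SrRNA-3\t30\t60\n"
    "GeneA\t5\t7\n"
)

_, old_table = load_counts(io.StringIO(OLD_TSV))
_, new_table = load_counts(io.StringIO(NEW_TSV))

old_members, old_totals = family_totals(old_table, "18SrRNA")
new_members, new_totals = family_totals(new_table, "18SrRNA")

print("cDNA only:  ", old_members, old_totals)
print("cDNA+ncRNA: ", new_members, new_totals)
print("all-zero rows in new table:", all_zero_rows(new_table))
```

If the family totals are similar between the two tables, that would be consistent with the reads having been redistributed among the 18SrRNA genes rather than lost. If the all-zero row turns out to have an empty or unexpected gene ID, the transcript-to-gene mapping for the added sequences would be the first place I would look, although that is only a guess at this point.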