Hi everyone,
I am trying to generate counts from paired-end RNA-seq data from a PDX (patient-derived xenograft) sample, and I am not sure whether the counting was done correctly. Here are the special counters from the end of the htseq-count output:
__no_feature 4086152
__ambiguous 1018383
__too_low_aQual 355475
__not_aligned 482193
__alignment_not_unique 5079336
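To judge whether these counters are actually high, it helps to see them as a share of everything htseq-count tallied, including the counts assigned to genes. Below is a minimal sketch of that calculation, assuming the usual htseq-count output layout (one "name count" pair per line, special counters starting with "__"). The function name summarize_htseq and the file name counts.txt are made up for illustration.

```shell
# Summarize an htseq-count output file: counts assigned to genes and each
# special counter (__no_feature, __ambiguous, ...) as a percentage of the
# grand total of all lines in the file.
# Note: the total is whatever htseq-count tallied, which is not necessarily
# identical to the number of sequenced read pairs.
summarize_htseq() {
    awk '
        NF < 2 { next }                                   # skip blank or malformed lines
        /^__/  { special[$1] = $2; order[++n] = $1; total += $2; next }
               { assigned += $2; total += $2 }            # ordinary gene line
        END {
            if (total == 0) { print "no counts found"; exit 1 }
            printf "%-24s %12d %6.2f%%\n", "assigned_to_genes", assigned, 100 * assigned / total
            for (i = 1; i <= n; i++)
                printf "%-24s %12d %6.2f%%\n", order[i], special[order[i]], 100 * special[order[i]] / total
        }
    ' "$1"
}

# Example call (counts.txt is a hypothetical file name):
# summarize_htseq counts.txt
```

With the percentages in hand it is easier to tell whether, for example, __no_feature is a large or a small part of the total.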
To create the reference, I merged the human and mouse reference genomes.
Since both human and mouse reference genomes had the "chr #" naming scheme, I renamed all the mouse chromosomes to "m_chr #".
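For reference, here is a minimal sketch of how such a renaming could be done so that the FASTA and the GTF stay in sync. It assumes the rename is a plain "m_" prefix on the sequence name; the function names and the file names in the comments are hypothetical, and this is not necessarily the exact command that was run here.

```shell
# Prefix every FASTA header with "m_" (">chr1 ..." becomes ">m_chr1 ...").
# Sequence lines are left untouched.
prefix_fasta() {
    sed 's/^>/>m_/' "$1"
}

# Apply the same prefix to column 1 (the sequence name) of a GTF file.
# Comment lines starting with "#" and empty lines are passed through as-is.
prefix_gtf() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ || NF == 0 { print; next }
         { $1 = "m_" $1; print }' "$1"
}

# Example calls (hypothetical file names):
# prefix_fasta mouse.fna > mouse_renamed.fna
# prefix_gtf   mouse.gtf > mouse_renamed.gtf
```

The point of doing both files with the same prefix is that the name in column 1 of the GTF has to match the FASTA header exactly, otherwise reads aligned to those sequences cannot be assigned to any feature.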
Then, I merged them like so:
cat GCF_000001405.40_GRCh38.p14_genomic.fna renamed_renamed_GCF_000001635.27_GRCm39_genomic.fna > merged_genome.fna
I also merged their corresponding GTF files in the same way.
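One thing that can be checked at this point is whether every sequence name used in the merged GTF also exists as a header in the merged FASTA. The sketch below lists GTF sequence names with no matching FASTA header. The function name is made up, merged.gtf is a hypothetical file name, and the check assumes the sequence name is the header text up to the first whitespace.

```shell
# Print sequence names found in column 1 of a GTF that do not appear as a
# FASTA header. Empty output means every GTF name has a matching sequence.
gtf_names_missing_from_fasta() {
    fasta=$1
    gtf=$2
    names=$(mktemp)
    # FASTA sequence name = header text after ">" up to the first whitespace
    grep '^>' "$fasta" | sed 's/^>//' | awk '{ print $1 }' | LC_ALL=C sort -u > "$names"
    # GTF sequence name = column 1 of every non-comment, non-empty line;
    # comm -23 keeps names that occur only in the GTF list
    awk -F'\t' '!/^#/ && NF > 0 { print $1 }' "$gtf" | LC_ALL=C sort -u | LC_ALL=C comm -23 - "$names"
    rm -f "$names"
}

# Example call (merged.gtf is a hypothetical file name):
# gtf_names_missing_from_fasta merged_genome.fna merged.gtf
```

Any name printed by this check belongs to annotations that no alignment can ever overlap, which is worth ruling out before looking further at the counters.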
I used a HISAT2 / htseq-count pipeline. The __not_aligned count and the other counters above look very concerning. Does anybody with experience processing PDX RNA-seq data have any input?
Any advice would be greatly appreciated. Thank you
I also used fastp to QC the reads beforehand.