I used Trimmomatic to trim my paired-end RNA-Seq data and now have four output files: sample_1p.fq.gz, sample_1u.fq.gz, sample_2p.fq.gz, and sample_2u.fq.gz. I followed another question, which suggests aligning the paired and unpaired reads separately and then merging the alignments with samtools.
I aligned the paired reads first:
hisat2 -p 8 --dta -x "/path/to/Index_files/indexfile" -1 sample_1p.fastq.gz -2 sample_2p.fastq.gz -S sample_paired.sam
8763987 reads; of these:
  8763987 (100.00%) were paired; of these:
    1431464 (16.33%) aligned concordantly 0 times
    7029760 (80.21%) aligned concordantly exactly 1 time
    302763 (3.45%) aligned concordantly >1 times
    ----
    1431464 pairs aligned concordantly 0 times; of these:
      60100 (4.20%) aligned discordantly 1 time
    ----
    1371364 pairs aligned 0 times concordantly or discordantly; of these:
      2742728 mates make up the pairs; of these:
        1504242 (54.84%) aligned 0 times
        1055477 (38.48%) aligned exactly 1 time
        183009 (6.67%) aligned >1 times
91.42% overall alignment rate
Then I aligned the unpaired reads:
hisat2 -p 8 --dta -x "/path/to/Index_files/indexfile" -1 sample_1u.fastq.gz -2 sample_2u.fastq.gz -S sample_unpaired.sam
which failed with this error:
Error, fewer reads in the file specified with -2 than in file specified with -1
terminate called after throwing an instance of 'int'
Aborted (core dumped)
(ERR): hisat2-align exited with value 134
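To confirm that the two unpaired files really hold different numbers of reads, a quick count along these lines should work. This is only a sketch: count_reads is a helper name I made up, the file names are the ones from my hisat2 command, and it assumes gzipped FASTQ with exactly four lines per read.

```shell
# Count the reads in a gzipped FASTQ file.
# Assumes every record is exactly 4 lines (no wrapped sequences).
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Report the read count of each unpaired file, if it is present.
for f in sample_1u.fastq.gz sample_2u.fastq.gz; do
    if [ -f "$f" ]; then
        echo "$f: $(count_reads "$f") reads"
    fi
done
```

If the two counts differ, the -1/-2 error above is expected, because those options require the two files to contain mates in matching order.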
So the two unpaired files do not contain the same number of reads after trimming. My questions: Is it fine if I use only the paired reads for alignment, quantification, and downstream differential gene expression analysis? Are the unpaired reads important for downstream processing? Or was my trimming too aggressive, leaving unequal numbers of reads in the two files?
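If the unpaired reads are worth keeping, one option I am considering is to pass them to HISAT2 as single-end reads with its -U option (a comma-separated list of files) instead of -1/-2. The following is an untested sketch, not a verified workflow: align_unpaired and the DRY_RUN switch are names I made up for illustration, and the other hisat2 options are copied from my commands above.

```shell
# Sketch only: align unpaired reads as single-end input via -U
# instead of pairing them with -1/-2.
# align_unpaired is a hypothetical helper name, not part of HISAT2.
# Usage: align_unpaired INDEX OUT.sam READS1.fastq.gz [READS2.fastq.gz ...]
align_unpaired() {
    index=$1
    out=$2
    shift 2
    # Join the remaining arguments (FASTQ files) with commas for -U.
    files=$(printf '%s,' "$@")
    files=${files%,}
    # With DRY_RUN set to a non-empty value, the command is printed, not run.
    ${DRY_RUN:+echo} hisat2 -p 8 --dta -x "$index" -U "$files" -S "$out"
}

# Intended call (commented out so that nothing runs by accident):
# align_unpaired "/path/to/Index_files/indexfile" sample_unpaired.sam \
#     sample_1u.fastq.gz sample_2u.fastq.gz
```

Running it with DRY_RUN=1 first shows the exact command line before anything is aligned.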
Any help in this regard will be deeply appreciated. Thank you in advance.