I generated multiple BAM files with HISAT2 using the following commands.
For single-end reads:
hisat2 --max-intronlen 20000 -q -x hisat2_index/rheMac8_index -U myfile.fastq -S myfile.sam
For paired-end reads:
hisat2 --max-intronlen 15000 --score-min L,0,-0.1 --no-discordant -p 5 -q -x hisat2_index/rheMac8_index -1 myfile_1.fastq -2 myfile_2.fastq -S myfile.sam
Now I would like to extract uniquely mapped reads (reads aligned exactly once) from my BAM files so that I can distinguish reads that map to a gene from reads that map to its pseudogene. I tried a few awk commands to select reads with a MAPQ value of 60, but without success.
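A minimal sketch of one way to do this kind of filtering, not a definitive recipe. It assumes HISAT2 wrote the NH:i tag (number of reported alignments for the read or pair) on each record, and that samtools is installed for the BAM conversion. The function name filter_unique_sam and the output file names are hypothetical.

```shell
# Keep SAM header lines plus alignments whose NH tag is exactly 1.
# Reads SAM text on stdin and writes SAM text on stdout.
filter_unique_sam() {
    awk -F'\t' '
        /^@/ { print; next }              # pass header lines through unchanged
        {
            for (i = 12; i <= NF; i++)    # optional tags start at column 12
                if ($i == "NH:i:1") { print; break }
        }
    '
}

# Usage (assumes samtools is available; file names are placeholders):
#   samtools view -h myfile.bam | filter_unique_sam | samtools view -b -o myfile.unique.bam -
#
# Alternative that relies on MAPQ only (column 5 of the SAM record):
#   samtools view -b -q 60 -o myfile.mapq60.bam myfile.bam
```

Matching the whole field against "NH:i:1" avoids accidentally keeping records such as NH:i:10, which a plain substring search would let through. For paired-end data both mates carry the same NH value, so pairs are kept or dropped together.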
I also changed the --score-min formula slightly to make the mapping more stringent, hoping to separate reads that map to the original gene from reads that map to the pseudogene. Does anyone have tips on options that might help to differentiate between reads mapped to the original gene and reads mapped to the pseudogene?
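To make the effect of the --score-min setting concrete, here is a small sketch of the arithmetic behind a linear function of the form L,a,b, where the minimum allowed alignment score is a + b * read_length. The helper name min_score and the read length of 100 are assumptions chosen for illustration; the second call uses L,0,-0.2, which is the default given in the HISAT2 manual as far as I know.

```shell
# Minimum alignment score allowed by a linear --score-min function L,<a>,<b>:
#   min_score = a + b * read_length
min_score() {
    # $1 = constant term a, $2 = coefficient b, $3 = read length
    awk -v a="$1" -v b="$2" -v len="$3" 'BEGIN { printf "%.1f\n", a + b * len }'
}

min_score 0 -0.1 100   # setting used in the paired-end command above
min_score 0 -0.2 100   # default according to the HISAT2 manual (assumed here)
```

A threshold closer to zero leaves less room for mismatch and gap penalties, so fewer imperfect alignments are reported for a read of the same length.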
I would be grateful for any help with these issues.