Question: Extracting uniquely mapped reads from bams
gravatar for gokberk
10 months ago by
gokberk60 wrote:

Hi everyone!

I generated multiple bam files with hisat2 using following commands:

for single-end reads:

hisat2 --max-intronlen 20000 -q -x hisat2_index/rheMac8_index -U myfile.fastq -S myfile.sam

for paired-end reads:

hisat2 --max-intronlen 15000 --score-min L,0,-0.1 --no-discordant -p 5 -q -x hisat2_index/rheMac8_index -1 myfile_1.fastq -2 myfile_2.fastq -S myfile.sam

Now, I would like to extract uniquely mapped (aligned exactly 1 time) reads from my bam files to be able to differentiate between reads that map to a gene and its pseudogene. I tried a few things with awk to select reads with mapq value of 60, but could not succeed.

I also changed the --score-min formula slightly to make the mapping more stringent and differentiate between reads that map to the original gene and to the pseudogene. I was wondering if anyone would have any tips about options that might help to differentiate between original vs. pseduogene mapped reads.

I would be more than happy if anyone could help me with these issues.

Best, Gökberk

hisat2 rna-seq • 450 views
ADD COMMENTlink modified 8 months ago by Biostar ♦♦ 20 • written 10 months ago by gokberk60

I think this biostars thread might be helpful in your case.

ADD REPLYlink written 10 months ago by Nitin Narwade440

Also see How to select uniquely and concordantly reads from hisat2 alingment for raw read count and please use google and the search function. This has been adressed before.

ADD REPLYlink written 10 months ago by ATpoint32k

Thanks for your responses Nitin and ATpoint. I've already seen the thread ATpoint mentioned, however the suggested code there gave me an empty bam file for some reason. That's why I opened a new issue. And on top of that, I was hoping to find some suggestions for mapping RNA-seq reads to a gene and its pseudogenes. Cheers.

edit: I've just tried the samtools view -bq 1 file.bam > unique.bam command in the thread Nitin has pointed. However, after I do that I still have some reads with mapq score other than 60 (as far as I know, uniquely mapped reads have a mapq score of 60).

ADD REPLYlink modified 10 months ago • written 10 months ago by gokberk60

If you read the samtools view manual, you will see that that filter takes a quality score as a filter. If you want MAPQ scores of 60 or greater you should change the value from 1 to 60.

-q INT   only include reads with mapping quality >= INT [0]
ADD REPLYlink modified 8 months ago • written 8 months ago by benformatics1.5k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1725 users visited in the last hour