Upstream pseudogene causing MAPQ 0 and exclusion during variant calling
14 months ago

Hello,

Just upstream of GBA1 is a pseudogene that is quite similar to GBA1. I think this is why most of my mapped reads have MAPQ 0 across this region (I use bwa-mem2 with default settings), and the variants in this region are not listed in the corresponding VCF file.

Can I remedy this somehow? Mask the pseudogene and map again?

Sincerely, Joel
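
Before changing the reference, it may be worth confirming the "I think?" above, i.e. that the reads in the region really do have MAPQ 0. The following is a minimal sketch, not a definitive tool: it assumes the alignments are available as SAM text lines, and the function name `mapq_zero_fraction` and its arguments are made up for illustration. It relies only on the standard SAM column order (FLAG in column 2, RNAME in column 3, POS in column 4, MAPQ in column 5).

```python
def mapq_zero_fraction(sam_lines, chrom, start, end):
    """Count mapped reads starting in chrom:start-end (1-based, inclusive).

    Returns (n_mapq0, n_total). Hypothetical helper for illustration only.
    """
    n_zero = 0
    n_total = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        flag = int(fields[1])
        rname = fields[2]
        pos = int(fields[3])
        mapq = int(fields[4])
        if flag & 0x4:  # 0x4 = read unmapped
            continue
        if rname != chrom or not (start <= pos <= end):
            continue
        n_total += 1
        if mapq == 0:
            n_zero += 1
    return n_zero, n_total
```

One way to use it is to pipe the output of `samtools view` for the region of interest into a small script that passes `sys.stdin` as `sam_lines`. The region coordinates have to be supplied by the user; none are assumed here.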

Mapping Variant-calling masking • 799 views

Technically you can mask it, but what if the variant comes from the pseudogene and not the gene you "want"?


Yes, well... surely it's better to get some false positives along with the true positives than no calls at all. Is there another way to do this?

13 months ago

Hey, I found the solution: use an hg38 reference that does not include ALT contigs, e.g.: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.27_GRCh38.p12/GRCh38_major_release_seqs_for_alignment_pipelines/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz

Don't ask me about the 000/001/405 parent folders. I have no idea what they mean. See here for more(?) info: https://lh3.github.io/2017/11/13/which-human-reference-genome-to-use
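
To check whether the reference you mapped against contains ALT contigs in the first place, you can scan its FASTA headers. This is a rough sketch under one assumption: that ALT contigs follow the UCSC-style naming used in the GRCh38 analysis sets, where their names end in `_alt` (e.g. `chr1_KI270762v1_alt`). References with other naming schemes would need a different test. The helper names `open_text` and `alt_contigs` are hypothetical.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed FASTA file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def alt_contigs(fasta_lines):
    """Return names of sequences whose header name ends in '_alt'.

    Assumes UCSC-style ALT naming; this is a heuristic, not a guarantee.
    """
    names = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        parts = line[1:].split()  # the name is the first token after '>'
        if parts and parts[0].endswith("_alt"):
            names.append(parts[0])
    return names
```

For a real file, something like `with open_text("reference.fna.gz") as fh: print(len(alt_contigs(fh)))` should print 0 for a no-ALT analysis set, where `reference.fna.gz` is a placeholder for your own file.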


"Don't ask me about the 000/001/405 parent folders. I have no idea what they mean."

They are simply the digits of the accession number (GCA_000001405), split into groups of three (000/001/405) to create a file system hierarchy.
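
As a small illustration of that layout, here is a sketch that rebuilds the directory part of the URL from the accession and the assembly name. The function name `ncbi_assembly_dir` is made up, and the pattern is inferred from the URL posted in this thread rather than taken from an official specification, so treat it as an assumption.

```python
import re


def ncbi_assembly_dir(accession, assembly_name,
                      base="https://ftp.ncbi.nlm.nih.gov/genomes/all"):
    """Build the FTP directory for an assembly such as GCA_000001405.27.

    Hypothetical helper; the layout is inferred from the URL in this thread.
    """
    m = re.fullmatch(r"(GC[AF])_(\d{9})\.(\d+)", accession)
    if m is None:
        raise ValueError(f"unexpected accession format: {accession!r}")
    prefix, digits, _version = m.groups()
    # Split the nine digits into three groups of three: 000 / 001 / 405
    chunks = [digits[i:i + 3] for i in range(0, 9, 3)]
    return "/".join([base, prefix, *chunks, f"{accession}_{assembly_name}"])
```

Calling `ncbi_assembly_dir("GCA_000001405.27", "GRCh38.p12")` gives the same parent directory as in the link above.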
