What's the difference between mapping PE FASTQs to the whole genome and to a specified target region?
8.3 years ago
winter_li ▴ 60

Hi,

I built a new reference from a single gene sequence.

Step 1: Map the PE FASTQs to the complete genome (hg19)

Step 2: Map the PE FASTQs to only part of the genome sequence, e.g. a gene region

What's the difference between Step 1 and Step 2? I want to use bwa mem.

Best

genome sequence alignment gene • 1.5k views
8.3 years ago
agata88 ▴ 870

It depends on whether your gene has a pseudogene. If it does, mapping to the whole human genome is more correct; if you map only to your specified region, all of your reads (from both the gene and the pseudogene) will be mapped there. If your next step includes variant detection, Step 2 will produce a lot of mistakes (false positives). It also depends on how much the pseudogene differs from the gene: if the difference is small, mapping to the whole human genome can go wrong as well. Best, Agata
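To make the pseudogene point concrete, here is a toy sketch in Python. It is not how bwa mem works internally: it just places a read at the ungapped position with the fewest mismatches. The sequences and the names `gene`, `pseudogene`, `hamming`, and `best_hit` are made up for illustration.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_hit(read, reference):
    """Return (name, offset, mismatches) of the best ungapped placement.

    `reference` is a dict of {sequence_name: sequence}. This is a toy
    stand-in for an aligner, used only to illustrate the idea.
    """
    best = None
    for name, seq in reference.items():
        for offset in range(len(seq) - len(read) + 1):
            mismatches = hamming(read, seq[offset:offset + len(read)])
            if best is None or mismatches < best[2]:
                best = (name, offset, mismatches)
    return best


# Made-up sequences: the pseudogene differs from the gene at one base.
gene = "ACGTACGTTAGCCGATAGGCTA"
pseudogene = "ACGTACGTTAGCGGATAGGCTA"

# A read that truly originates from the pseudogene.
read = pseudogene[6:18]

whole_genome = {"gene": gene, "pseudogene": pseudogene}
target_only = {"gene": gene}

hit_whole = best_hit(read, whole_genome)
hit_target = best_hit(read, target_only)

print("whole genome:", hit_whole)
print("target only: ", hit_target)
```

With both sequences in the reference, the read goes back to the pseudogene with no mismatches. With only the gene in the reference, the same read is forced onto the gene with one mismatch, and that mismatch is exactly what a variant caller would report as a false positive.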

8.3 years ago

Option 1: whole-genome mapping is slower, but it is more accurate because each read maps to the most similar region the aligner finds.

Option 2: much faster, but may give wrong results (i.e. reads that originated from another part of the genome get placed in the target region).

It really depends on how the sample preparation step was designed, but if you can afford the run time of whole-genome mapping, I would strongly recommend that option to avoid false-positive findings.
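One way to get the accuracy of Option 1 while still ending up with only the target region is to map to the whole genome first and subset afterwards. Below is a minimal sketch that only builds the command lines and does not run anything. The file names, the region string, the MAPQ cutoff of 20, and the helper name `whole_genome_then_subset` are all placeholders I made up, so adjust them to your data.

```python
import shlex


def whole_genome_then_subset(ref_fa, r1, r2, region,
                             threads=4, min_mapq=20, prefix="sample"):
    """Build shell commands: map PE reads to the full reference with bwa mem,
    then keep only well-mapped reads overlapping `region`.

    All arguments are placeholders; nothing is executed here.
    """
    ref, fq1, fq2 = (shlex.quote(p) for p in (ref_fa, r1, r2))
    bam = shlex.quote(f"{prefix}.genome.sorted.bam")
    sub = shlex.quote(f"{prefix}.target.bam")
    return [
        # Index the whole-genome reference (needed once per reference).
        f"bwa index {ref}",
        # Map both mates against the whole genome and coordinate-sort.
        f"bwa mem -t {threads} {ref} {fq1} {fq2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # Extract the target region, dropping reads below the MAPQ cutoff
        # (reads that fit the gene and a pseudogene equally well tend to
        # get a low mapping quality).
        f"samtools view -b -q {min_mapq} -o {sub} {bam} {shlex.quote(region)}",
        f"samtools index {sub}",
    ]


commands = whole_genome_then_subset(
    ref_fa="hg19.fa",                 # placeholder path
    r1="reads_R1.fastq.gz",           # placeholder path
    r2="reads_R2.fastq.gz",           # placeholder path
    region="chr1:1000000-1100000",    # placeholder coordinates
)

for cmd in commands:
    print(cmd)
```

The printed commands can be pasted into a shell once the paths are real. Region extraction with `samtools view` needs the BAM to be sorted and indexed, which is why those steps come first.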
