Question: Low mapping rate for human NGS PE reads to hs37d5 genome
5 weeks ago by
Ginsea Chen130
Chinese Academy of Tropical Agricultural Sciences, Danzhou, China

Dear all.

I sequenced DNA samples from a human using NGS technology and mapped the reads (length 90 bp) to the human genome (version: hs37d5). The mapping rate I get is quite low: a normal sample is higher than 99%, while mine is 88%. I collected all unmapped reads (243118 reads, flag of bam is 0) and tried to find their origin, but I can't find any hits in the NCBI nr database. Only 2430 reads contain index sequences and only 210 reads contain adapter sequences.

So my question is: what should I do to find the reason for this low mapping rate? If you have any suggestions, please tell me.

Thanks.
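A quick way to double-check both the rate and the unmapped set is to tally reads by SAM flag. The sketch below is only an illustration and assumes the alignments are available as SAM text lines (for example from `samtools view -h`); the function name `mapping_rate` and the demo records are made up for this example. In the SAM specification the unmapped bit is 0x4, and secondary (0x100) and supplementary (0x800) records are skipped here so each read is counted once.

```python
def mapping_rate(sam_lines):
    """Return (mapped, total, rate) over primary alignment records."""
    mapped = total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        # Skip secondary (0x100) and supplementary (0x800) records.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        if not flag & 0x4:  # bit 0x4 set means the read is unmapped
            mapped += 1
    rate = mapped / total if total else 0.0
    return mapped, total, rate


# Hypothetical records: one mapped pair, one unmapped pair,
# and one supplementary record that is ignored.
demo = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t90M\t=\t300\t290\t*\t*",
    "r1\t147\tchr1\t300\t60\t90M\t=\t100\t-290\t*\t*",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r2\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r1\t2147\tchr2\t50\t0\t40M50H\t=\t300\t0\t*\t*",
]

if __name__ == "__main__":
    print(mapping_rate(demo))
```

If the reads selected as "unmapped" really carry flag 0, it may be worth re-checking the selection, since unmapped reads are expected to have bit 0x4 set.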

ADD COMMENTlink written 5 weeks ago by Ginsea Chen130

Did you run fastqc to check if you might have carryover of adapters or other overrepresented sequences?

ADD REPLYlink written 5 weeks ago by ATpoint26k

I have used cutadapt to trim adapter sequences and our in-house script to filter low-quality reads. I have never used FastQC. Thanks for your suggestion, I will try it.

ADD REPLYlink written 5 weeks ago by Ginsea Chen130

I ran FastQC on all unmapped reads. The per-base sequence quality looks like this: [image] and the per-base sequence content looks like this: [image]

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Ginsea Chen130

This unmapped data does not appear to be of great quality (median values around Q24). As others have said, 88% is not a bad alignment rate by any means. You may want to take some of the unmapped reads and BLAST them to see if they are contaminants.
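Picking "some of the unmapped reads" can be done with a small random sample written out as FASTA for a BLAST search. The following is a rough sketch, assuming the unmapped reads are in standard four-line FASTQ; the helper names (`read_fastq`, `reservoir_sample`, `to_fasta`) and the demo records are invented for illustration.

```python
import random


def read_fastq(lines):
    """Yield (name, sequence) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line
        next(it)  # quality string (not needed for BLAST)
        yield header[1:].split()[0], seq


def reservoir_sample(records, n, seed=0):
    """Uniformly sample up to n records in one pass (reservoir sampling)."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < n:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                sample[j] = rec
    return sample


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Hypothetical FASTQ content with five short reads.
demo_fastq = []
for k, s in enumerate(["ACGT", "TTGA", "GGCC", "ATAT", "CAGT"]):
    demo_fastq += [f"@read{k} extra", s, "+", "I" * len(s)]

if __name__ == "__main__":
    picked = reservoir_sample(read_fastq(demo_fastq), 2)
    print(to_fasta(picked), end="")
```

The resulting FASTA can then be pasted into the BLAST web form or passed to a local search.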

ADD REPLYlink written 5 weeks ago by genomax74k

I have searched all unmapped reads against the NCBI nr database and did not find any matching records.

ADD REPLYlink written 5 weeks ago by Ginsea Chen130

There is not much you can do in that case. These could simply be sequencing artifacts.

Note: Did you do a translated BLAST search, since you mention nr? How about a blastn search against nt?

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by genomax74k

Agreed. Given that base quality, the results appear to be fine; I've seen worse mapping rates. I suggest you proceed with the downstream analysis and see whether it runs without issues. If so, don't worry about the mapping rate.

ADD REPLYlink written 5 weeks ago by ATpoint26k

Thanks to @ATpoint and @genomax. We have analyzed these samples with low mapping rates, and we observed that samples with a mapping rate lower than 95% always contain some abnormal SNP/indel variants surrounded by many soft-clipped bases, as follows.

[image]

I am not sure whether there is a direct relationship between the low mapping rate and the many soft-clipped reads around SNP/indel variants, but I always observe many SNP/indel variants surrounded by soft-clipped bases in samples whose mapping rate is lower than 95%.
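To put a number on this observation, one could compare the fraction of soft-clipped bases between samples. Here is a minimal sketch, assuming the CIGAR strings have already been extracted from the BAM (sixth SAM column); the function names and the 20% threshold are arbitrary choices for illustration, not an established cutoff.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_fraction(cigar):
    """Fraction of the read's bases that are soft-clipped (op 'S')."""
    if cigar == "*":  # no alignment, so nothing to measure
        return 0.0
    ops = CIGAR_OP.findall(cigar)
    # M, I, S, = and X consume bases of the read; D, N, H and P do not.
    read_len = sum(int(n) for n, op in ops if op in "MIS=X")
    clipped = sum(int(n) for n, op in ops if op == "S")
    return clipped / read_len if read_len else 0.0


def count_heavily_clipped(cigars, threshold=0.2):
    """Count reads whose soft-clipped fraction exceeds the threshold."""
    return sum(soft_clip_fraction(c) > threshold for c in cigars)


if __name__ == "__main__":
    # Hypothetical CIGAR strings for 90 bp reads.
    demo = ["90M", "60M30S", "10S70M10S", "45M2D45M", "*"]
    for c in demo:
        print(c, round(soft_clip_fraction(c), 3))
    print("heavily clipped:", count_heavily_clipped(demo))
```

Running this on a sample with a normal mapping rate and on one below 95% would show whether the soft clipping is really enriched in the latter.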

ADD REPLYlink written 5 weeks ago by Ginsea Chen130

I would say 88% is not actually very low; it is within acceptable limits, and we usually get 85-95%. In any case, you might also run FastQC on your raw data to see if you have any adapters or overrepresented sequences.
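For a rough idea of what such a check looks for, the sketch below counts identical read sequences and adapter-containing reads in plain Python. It is not a replacement for FastQC: the function names are invented, the 0.1% default threshold is an assumption modelled on FastQC's overrepresented-sequences report, and the adapter string used in the demo is the common Illumina adapter prefix, which may not match your library.

```python
from collections import Counter


def overrepresented(seqs, min_fraction=0.001):
    """Return (sequence, count, fraction) for sequences above min_fraction."""
    counts = Counter(seqs)
    total = sum(counts.values())
    return [
        (s, c, c / total)
        for s, c in counts.most_common()
        if c / total > min_fraction
    ]


def adapter_hits(seqs, adapter):
    """Count reads that contain the adapter as an exact substring."""
    return sum(adapter in s for s in seqs)


if __name__ == "__main__":
    # Hypothetical reads: one sequence repeated six times, two with adapter.
    reads = ["ACGTACGT"] * 6 + [
        "TTTTAGATCGGAAGAGC",
        "GGGGCCCC",
        "AGATCGGAAGAGCAAAA",
        "CCCCGGGG",
    ]
    print(overrepresented(reads, min_fraction=0.5))
    print(adapter_hits(reads, "AGATCGGAAGAGC"))
```

Exact substring matching misses adapters with sequencing errors or partial adapters at read ends, so low counts here do not rule out carryover.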

ADD REPLYlink written 5 weeks ago by grant.hovhannisyan1.8k

For samples with a mapping rate lower than 95%, we observe many soft-clipped reads around SNP or indel variants, as in the supplementary figure (https://photos.app.goo.gl/FTxpyvn2qZJDGhnC8). So we think the unknown cause of the low mapping rate may affect the accuracy of variant detection in the target samples.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Ginsea Chen130

The link is not functional. Please upload the image to a public image host such as ImgBB and then paste the full link, including the suffix (e.g. .png), into the image field:

[image]

ADD REPLYlink written 5 weeks ago by ATpoint26k

Please try again; I have fixed it.

ADD REPLYlink written 5 weeks ago by Ginsea Chen130