Illumina Custom amplicon data issue
7.6 years ago
Mamta ▴ 460

Hi all,

Recently we had an Illumina custom amplicon panel prepared for our project. Illumina provided us with a sequence coverage score, where anything greater than 60% counts as good. We ran the data at around 24 samples per flow cell. The coverage is highly variable, and the variant calls look suspicious: many of the same variants are called in most of the samples.

I need suggestions on whether this data can be cleaned up somehow, or whether it is still usable in its current condition.

Thanks in advance!

Mamta

custom-amplicon illumina sequencing coverage • 2.0k views
7.6 years ago
rbagnall ★ 1.7k

Hello Mamta,

I have been using the TruSight Cardiomyopathy sequencing panel from Illumina, which is an off-the-shelf amplicon enrichment kit. Like you, I also sequenced 24 indexed samples per lane on a MiSeq.

I find that after de-multiplexing the 24 samples, the percentage of reads per individual is quite variable: 1% of reads came from one individual, whereas up to 6% came from another. This is quite reproducible too; samples amplified with the N307 and E501 index primers yielded the fewest reads. Illumina suggested that "it may be index specific, especially since the pattern is reproducible across the runs. Indexes will have varying amounts of PCR efficiency and demultiplexing efficiency and it's just natural variation that we would expect to see."
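If you want to check this on your own run, here is a minimal Python sketch of the calculation. It assumes you already have a read count per sample from your de-multiplexing summary; the function names, the sample labels, and the 0.5 cut-off are all made up for illustration and are not from any Illumina tool.

```python
def read_share(counts):
    """Return each sample's percentage of the total de-multiplexed reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads in any sample")
    return {sample: 100.0 * n / total for sample, n in counts.items()}


def under_represented(counts, min_fraction_of_even=0.5):
    """List samples whose share is below a fraction of the even share.

    With 24 samples the even share is 100 / 24, about 4.2%.
    The 0.5 default is an arbitrary illustrative threshold.
    """
    shares = read_share(counts)
    even_share = 100.0 / len(counts)
    cutoff = min_fraction_of_even * even_share
    return sorted(s for s, pct in shares.items() if pct < cutoff)


if __name__ == "__main__":
    # Hypothetical read counts, not real data.
    example = {"sample_A": 100, "sample_B": 600, "sample_C": 300}
    print(read_share(example))
    print(under_represented(example))
```

If the same index pairs keep showing up in the under-represented list across runs, that would fit the index-specific explanation Illumina gave me.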

Coverage of different amplicons within a sample is variable too, but I'd say this is expected. With an average coverage of ~300 fold, almost all regions have sufficient coverage.

I'm not sure why you have the same variant calls in most samples. Apart from the obvious errors like contamination or a problem with de-multiplexing, I guess that if there is a partial duplicate of, or a sequence homologous to, your intended target, you may have sequenced that too. I see this for some regions when exome sequencing and aligning to GRCh37: a pseudogene copy gets sequenced but is not represented in the reference sequence, so its reads misalign to the intended target. You could check how 'unique' your intended target is by looking at the region in the UCSC browser with the mappability track displayed.
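To see which calls are shared across most samples, something like the Python sketch below may help. It assumes you have already parsed each sample's calls into (chrom, pos, ref, alt) tuples; the function name, the example variants, and the 0.8 threshold are hypothetical choices for illustration.

```python
from collections import Counter


def recurrent_variants(calls_by_sample, min_fraction=0.8):
    """Return variants seen in at least min_fraction of samples.

    calls_by_sample maps a sample name to an iterable of
    (chrom, pos, ref, alt) tuples. The result maps each recurrent
    variant to the fraction of samples carrying it.
    """
    n_samples = len(calls_by_sample)
    if n_samples == 0:
        return {}
    seen_in = Counter()
    for variants in calls_by_sample.values():
        # set() so a variant listed twice in one sample counts once
        seen_in.update(set(variants))
    return {
        variant: count / n_samples
        for variant, count in seen_in.items()
        if count / n_samples >= min_fraction
    }


if __name__ == "__main__":
    # Hypothetical calls, not real data.
    shared = ("chr1", 1000, "A", "G")
    private = ("chr2", 2000, "C", "T")
    calls = {
        "sample_A": [shared, private],
        "sample_B": [shared],
        "sample_C": [shared],
    }
    print(recurrent_variants(calls))
```

Variants that come out of this at a high fraction are the ones worth checking first for low mappability or a homologous copy elsewhere in the genome, although a common polymorphism would also look like this.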


Hi rbagnall,

Can we talk about this? I have a few questions and don't know anyone else who has used custom amplicons (I spoke to Illumina, but that didn't help much). Would it be possible to chat by email sometime?

Thanks!


Sure, send me an email. My address is my Biostars username followed by @hotmail.com.
