Question: Strange Pattern in BAM File
4
Can Holyavkin (Turkey) wrote 3.0 years ago:

We have encountered a strange pattern in a BAM file generated from amplicon sequencing (Nextera XT, Illumina MiSeq).

As you can see in the middle of the BAM file, the coverage forms a rectangular block. Half of the reads end at the right edge of the rectangle, while the other half end at the left edge.

What could cause such an abnormal coverage distribution? Could it be a structural variant?
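One way to put numbers on the "reads end at the edges" observation is to count how many alignments start and end at each reference position. The following is only a minimal sketch, assuming the alignments are available as SAM-style (POS, CIGAR) pairs; the function names and the toy alignments are hypothetical and not taken from the data discussed here.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def aligned_span(pos, cigar):
    """Return the (start, end) reference span of an alignment, 1-based inclusive."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + ref_len - 1


def boundary_counts(alignments):
    """Count how many alignments start and end at each reference position."""
    starts, ends = Counter(), Counter()
    for pos, cigar in alignments:
        start, end = aligned_span(pos, cigar)
        starts[start] += 1
        ends[end] += 1
    return starts, ends


# Hypothetical toy alignments: three reads stop at 149, two reads begin at 150.
toy = [(100, "50M"), (120, "30M20S"), (130, "20M30S"), (150, "20S30M"), (150, "10S40M")]
starts, ends = boundary_counts(toy)
print(ends.most_common(1), starts.most_common(1))
```

Sharp peaks in these counts, together with soft-clipped bases on the same side, would point to a breakpoint rather than ordinary amplicon-to-amplicon coverage variation.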

[four images not preserved]

7
Can Holyavkin (Turkey) wrote 3.0 years ago:

We found the real cause of this pattern: a duplication of a 200 bp segment within the rectangular area. We BLASTed the unmapped parts of the reads at the edges of the rectangular area and found that they match perfectly to a region inside this area.
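The check described above can be sketched in a few lines: pull the soft-clipped ends out of each read using its CIGAR string, then look for them inside the reference region. This is only an illustration under stated assumptions. The function names, the toy region, and the toy read are hypothetical, and a plain exact string search stands in for BLAST, so mismatches and reverse-complement hits are not handled.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(seq, cigar):
    """Return (left_clip, right_clip) of a read; '' where an end is not soft-clipped."""
    # Hard-clipped bases (H) are absent from SEQ, so ignore them here.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left = seq[:ops[0][0]] if ops and ops[0][1] == "S" else ""
    right = seq[len(seq) - ops[-1][0]:] if len(ops) > 1 and ops[-1][1] == "S" else ""
    return left, right


def find_in_region(clip, region, region_start, min_len=10):
    """Return 1-based reference positions where clip occurs exactly in region."""
    if len(clip) < min_len:
        return []  # very short clips match by chance too easily
    hits, i = [], region.find(clip)
    while i != -1:
        hits.append(region_start + i)
        i = region.find(clip, i + 1)
    return hits


# Hypothetical reference region beginning at reference position 1000.
region = "TTGACCGTAGCATCGGATCCAATGCTAGGCTTAACGGTACA"
# Toy read: 10 aligned bases followed by 12 clipped bases copied from
# earlier in the same region, as a read crossing a duplication junction might look.
read_seq = region[20:30] + region[5:17]
left, right = soft_clips(read_seq, "10M12S")
hits = find_in_region(right, region, region_start=1000)
print(right, hits)
```

A clipped end that maps back inside the same region, as in this toy case, is consistent with a duplication; a clip with no hit in the region would still need a real BLAST search against the whole genome.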

[image not preserved]

1

thanks for following up, interesting

Reply written 3.0 years ago by Istvan Albert
4
John (Germany) wrote 3.0 years ago:

You have an indel. Realign yo' reads with IndelRealigner.

[image not preserved]

2
igor (United States) wrote 3.0 years ago:

If it's amplicon sequencing, wouldn't you expect uneven coverage that corresponds to your amplicons?

Also, these are not randomly sheared libraries. The Nextera transposase cuts at certain sites, so you should expect to see more fragments at specific sequences.

1

Dear igor, you are right about the uneven coverage of Nextera kits. We see such changes especially at GC-rich sites. However, we did not come across such a pattern in exome sequencing. As you know, both the exome sequencing kit (Nextera Rapid Capture Exome) and the Nextera kits use the same transposase.

Reply written 3.0 years ago by Can Holyavkin
2
Can Holyavkin (Turkey) wrote 3.0 years ago:

I repeated the alignment and realignment steps with BWA and GATK. The results are quite different now.

The upper image was taken after alignment and realignment in CLC Genomics Workbench. The middle image was taken after alignment with BWA and realignment with GATK. The bottom image was taken after alignment with BWA only.

Could it be due to a partial duplication of this segment somewhere else?

[image not preserved]

1

Hmm, well that certainly did something, but your pileup still looks weird.

Now I'm thinking perhaps it wasn't an indel, but some contamination. I would definitely start by taking the sequence of DNA that mapped there and BLASTing it. I would also consider throwing up a mappability track for your reference genome to see if mappability in that region is lower than usual.

Reply written 3.0 years ago by John

Thank you, John. I will now check the rest of these trimmed reads to see whether they align somewhere else. However, I don't understand what kind of contamination could cause this pattern.

Reply written 3.0 years ago by Can Holyavkin
1
Chris Miller (Washington University in St. Louis, MO) wrote 3.0 years ago:

Could it be cDNA contamination, coupled with an isoform that isn't in your gene track? Check Ensembl to view known isoforms, and look for soft-clipping that matches up with the previous or next exons.

Otherwise, yeah, a repeat pileup is a good guess.

1

Dear Chris, thank you for your reply. We are not expecting cDNA in our sample, though: it contains only PCR-amplified products (long-range primer set) from genomic DNA. I also tried to remove repeats with the built-in module of CLC Genomics Workbench. Unfortunately, it didn't change the coverage pattern at all.

Reply written 3.0 years ago by Can Holyavkin
Powered by Biostar version 2.3.0