What can cause broken read-pairs (chimeric read-pairs) in a sequencing run?
11 weeks ago
William ★ 5.3k

What can cause broken read-pairs (chimeric read pairs) in a sequencing run?

We are finding an unexpectedly high percentage of chimeric read pairs, based on:

  • a lower than expected % of properly paired reads
  • a higher than expected % of read pairs in which both reads map uniquely but to different chromosomes
  • the color coding of reads by insert size in IGV

The difference from the expected % values for proper pairs and for mates mapping to different chromosomes is just a few percent, but this is enough to cause a significant increase in structural variant (SV) analysis time.
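To make the two percentages above concrete, here is a minimal sketch of how they could be tallied from SAM text (for example the output of `samtools view -h`). The function name `pair_stats` and the `min_mapq=20` cutoff for "uniquely mapped" are assumptions for illustration, not something taken from the original analysis. It counts reads rather than pairs, and it only checks the MAPQ of the read itself, not of its mate.

```python
# SAM flag bits used below (values from the SAM specification).
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def pair_stats(sam_lines, min_mapq=20):
    """Count paired, properly paired and inter-chromosomal reads in SAM text.

    min_mapq is a hypothetical stand-in for "uniquely mapped".
    """
    counts = {"paired": 0, "proper": 0, "diff_chrom": 0}
    for line in sam_lines:
        # Skip header lines and blank lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        mapq, rnext = int(fields[4]), fields[6]
        # Only look at primary alignments of paired reads.
        if not flag & PAIRED or flag & (SECONDARY | SUPPLEMENTARY):
            continue
        counts["paired"] += 1
        if flag & PROPER_PAIR:
            counts["proper"] += 1
        both_mapped = not flag & (UNMAPPED | MATE_UNMAPPED)
        # RNEXT is "=" (or the same name) when the mate is on the same reference.
        if both_mapped and mapq >= min_mapq and rnext not in ("=", "*", rname):
            counts["diff_chrom"] += 1
    return counts
```

Dividing `proper` and `diff_chrom` by `paired` gives percentages that can be compared between this run and an earlier run that behaved as expected.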

I think this is unrelated to barcode hopping, because:

  • There is only 1 barcode per read pair?
  • Multiple species were sequenced in the same sequencing run; if the reads of a broken read pair came from different species, they would be unmapped.

See also this IGV screenshot. Colored reads are uniquely mapped reads with an unexpected insert size relative to their mate. This is not a local pattern, but genome-wide.

IGV_chimeric_read_pairs
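As a rough illustration of what the coloring in the screenshot reflects, the sketch below labels a read by how its mate is placed, using the RNAME, RNEXT and TLEN fields of a SAM record. The function `classify_pair` and the `min_insert`/`max_insert` thresholds are hypothetical placeholders; sensible values depend on the library, and this is not IGV's own implementation.

```python
def classify_pair(rname, rnext, tlen, min_insert=50, max_insert=1000):
    """Label a mapped read by the placement of its mate.

    min_insert and max_insert are placeholder thresholds in bp.
    """
    if rnext == "*":
        return "mate_unmapped"
    # RNEXT is "=" (or repeats RNAME) when the mate is on the same reference.
    if rnext not in ("=", rname):
        return "inter_chromosomal"
    size = abs(tlen)
    if size > max_insert:
        return "larger_than_expected"
    if size < min_insert:
        return "smaller_than_expected"
    return "expected"
```

Counting these labels per chromosome or per window would be one way to confirm that the pattern is genome-wide rather than local.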

QC • 378 views

Reads are not out-of-sync in the input R1/R2 files, correct?

You mention that multiple species were sequenced in the same sequencing run. Have you done anything to bin or separate the reads before alignment that could have caused this?


I will double-check the FASTQ files and whether the IDs of the reads displayed as mates in IGV make sense. No custom processing has been done on the FASTQ files.
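One way to do the FASTQ check is to walk R1 and R2 in parallel and compare read IDs record by record. This is a minimal sketch that assumes well-formed 4-line FASTQ records; the helper names `read_ids` and `first_mismatch` are made up for illustration.

```python
from itertools import zip_longest


def read_ids(fastq_lines):
    """Yield the read ID of every 4-line FASTQ record.

    The leading "@", anything after the first whitespace, and a trailing
    /1 or /2 are dropped so that mates compare equal.
    """
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0 and line.strip():
            name = line.split()[0][1:]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_lines, r2_lines):
    """Return (record_index, id1, id2) for the first out-of-sync record, else None."""
    pairs = zip_longest(read_ids(r1_lines), read_ids(r2_lines))
    for n, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return n, id1, id2
    return None
```

The two arguments can be any iterables of lines, for example file handles opened with `gzip.open(path, "rt")` for compressed files. A result of `None` means the IDs agree for every record; an ID of `None` in the returned tuple means one file ended before the other.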


What is the relationship between the reference genome and the samples? I would expect something like this in a species with high levels of mobile genetic elements and a reasonably large evolutionary distance between the sample and reference genome populations.

Alternatively, the reference genome assembly may have collapsed or expanded many repetitive regions. Have you tried mapping to a repeat-masked reference?


The sample and reference are closely related, as also indicated by there being only 1 obvious SNP in a multi-kb region.
