Finding telomere sequences in exome-sequencing data
7.9 years ago
marki ▴ 60

Hi,

Exome-sequencing output typically contains some fraction (often 10–50%) of off-target sequence. How can I check whether the off-target sequences in a given exome-sequencing data set (BAM format) contain telomere sequences, for instance by looking for repeats of the TTAGGG hexamer?

sequencing next-gen • 1.9k views

Have you tried making a longish sequence of the hexamers and aligning to that? Presumably that'd work. Alternatively, I wouldn't be surprised if the k-mer analysis that FastQC does would show this.
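
A minimal sketch of the "longish sequence of the hexamers" idea: it writes a FASTA record of tandem TTAGGG copies that could serve as an extra alignment target. The repeat count (100), the record name `telomere_repeat`, and the 60-column line width are illustrative assumptions, not values from this thread.

```python
import sys

HEXAMER = "TTAGGG"  # telomeric repeat named in the question


def telomere_fasta(n_repeats=100, name="telomere_repeat", width=60):
    """Return one FASTA record made of n_repeats tandem copies of HEXAMER.

    n_repeats, name and width are arbitrary illustrative defaults.
    """
    seq = HEXAMER * n_repeats
    # Wrap the sequence at a fixed line width, as FASTA files usually are.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the record so it can be redirected into a file.
    sys.stdout.write(telomere_fasta())
```

Redirecting the output to a file gives a small reference that an aligner of your choice can index; reads would then need to be aligned against it (or against the genome plus this record) to see whether any land on the repeat.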

I would like someone to confirm this, but I think telomeres are masked in reference genome files, so no reads will map to them in your BAM file. However, there might be unmapped reads in your BAM file that correspond to telomeric sequences.

Yup, they're almost always hardmasked. This is true of most regions of constitutive heterochromatin.
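
Since the telomeric reads are expected among the unmapped ones, a rough way to look for them is to scan those read sequences for tandem hexamers directly. The sketch below assumes the unmapped reads have first been exported as SAM text (for example with `samtools view -f 4 sample.bam > unmapped.sam`; the file names are placeholders). The threshold of 4 consecutive repeats is an arbitrary assumption, and exact matching will miss reads whose repeats are interrupted by sequencing errors.

```python
import re
import sys

MIN_REPEATS = 4  # assumed threshold: consecutive hexamers needed to call a read telomeric

# Match the repeat on either strand: TTAGGG or its reverse complement CCCTAA.
TELOMERE_RE = re.compile(
    r"(?:TTAGGG){%d,}|(?:CCCTAA){%d,}" % (MIN_REPEATS, MIN_REPEATS)
)


def is_telomeric(seq):
    """True if seq contains at least MIN_REPEATS tandem telomeric hexamers."""
    return TELOMERE_RE.search(seq.upper()) is not None


def count_telomeric(sam_lines):
    """Return (total_reads, telomeric_reads) for an iterable of SAM text lines."""
    total = hits = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:  # not a complete alignment record
            continue
        total += 1
        if is_telomeric(fields[9]):  # column 10 of a SAM record is the read sequence
            hits += 1
    return total, hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_telomeric.py unmapped.sam
    with open(sys.argv[1]) as handle:
        total, hits = count_telomeric(handle)
    print(f"{hits} of {total} reads contain >= {MIN_REPEATS} tandem telomeric hexamers")
```

Raising `MIN_REPEATS` makes the call stricter; a low value will also pick up short interstitial TTAGGG-like stretches that are not true telomere.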

Thank you WouterDeCoster and Devon Ryan.

7.9 years ago
jotan ★ 1.3k

I like Repeat Enrichment Estimator. It includes telomere TTAGGG as part of the standard database.

As an aside, telomeric transcripts are pretty low-abundance and not easy to detect. I've found only low levels of telomeric sequences even when using only rRNA-depleted preps. Depending on what you are trying to do, you might want to consider looking at other repeats.

OP is talking about exome sequencing, not transcripts ;)

Whoops. Sorry, my bad!
