How to create a pooled normal exome sample?
8.8 years ago
Dataman ▴ 380

Hi

I am performing copy number analysis on exome sequencing data in which some of the tumor samples do not have a matched normal (the tool I am using, ADTEx, requires a matched normal sample for copy number analysis). However, I have 3 normal samples (BAM files) that were prepared and sequenced the same way as the tumor samples lacking matched normals. What are the best practices for creating a pooled normal exome sample?

Currently, I am thinking of merging the 3 normal BAM files using 'samtools merge', computing the coverage for each exon using 'coverageBed', and then dividing the coverage for each exon by 3. However, I am not sure whether this approach is correct or whether I need to do some kind of normalization. In addition, I have noticed that the 'ExomeCNV' R package has a function called 'pool.coverage()' which is meant for this purpose, but unfortunately this package has been removed from CRAN!
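
To make the "divide by 3" idea concrete, here is a minimal Python sketch of averaging per-exon read counts across the normals. It assumes each normal has already been reduced to tab-separated lines of chrom, start, end and read count; that four-column layout and the function names `parse_counts` and `average_counts` are illustrative assumptions, not the actual coverageBed output format or part of any tool mentioned here.

```python
from collections import defaultdict


def parse_counts(lines):
    """Parse 'chrom<TAB>start<TAB>end<TAB>count' lines into {exon: count}.

    The four-column layout is an assumed, simplified input format.
    """
    counts = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, count = line.rstrip("\n").split("\t")[:4]
        counts[(chrom, int(start), int(end))] = int(count)
    return counts


def average_counts(samples):
    """Average per-exon read counts over all samples."""
    totals = defaultdict(int)
    for counts in samples:
        for exon, n in counts.items():
            totals[exon] += n
    return {exon: total / len(samples) for exon, total in totals.items()}


# Toy data for three normals covering the same two exons.
normal_a = parse_counts(["chr1\t100\t200\t30", "chr1\t300\t400\t60"])
normal_b = parse_counts(["chr1\t100\t200\t36", "chr1\t300\t400\t54"])
normal_c = parse_counts(["chr1\t100\t200\t24", "chr1\t300\t400\t66"])

averaged = average_counts([normal_a, normal_b, normal_c])
```

Counting reads per exon in a merged BAM and dividing by 3 should give the same per-exon values as averaging the three per-sample counts like this, provided every read is counted once.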

I would like to thank you in advance for your thoughts and answers.

copy-number-analysis exome next-gen-sequencing • 3.2k views

Hi, I would like to know whether you found an alternative solution for your normalization, because I am trying to do a similar type of work. Thank you.

8.8 years ago
Dataman ▴ 380

I found the answer to my question in an article (EXCAVATOR). They use the following strategy:

In the pooling scheme, each test sample is compared with a pooled reference obtained by summing the total number of reads for each exon across all the control samples.

So, what I do is add up the number of reads for each exon across all the normal samples. This gives the pooled normal coverage file, which can be used as the tool's input for the normal sample. I do not need to worry about normalization, since ADTEx performs 'mean coverage normalization', meaning that the tool divides the number of reads at each exon by the mean number of reads before calculating the tumor/normal ratios.
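
As a rough illustration of why the pooled sum needs no extra scaling, the Python sketch below sums per-exon counts across normals and applies mean coverage normalization as described above. This is not ADTEx's actual code; the function names, exon labels and read counts are made up for the example, and it assumes every sample reports the same set of exons.

```python
def pool_counts(samples):
    """Sum per-exon read counts across all control samples."""
    exons = samples[0].keys()  # assumes identical exon sets in every sample
    return {exon: sum(s[exon] for s in samples) for exon in exons}


def mean_normalize(counts):
    """Divide each exon's count by the mean count over all exons."""
    mean = sum(counts.values()) / len(counts)
    return {exon: n / mean for exon, n in counts.items()}


# Hypothetical read counts for three normals and one tumor.
normals = [
    {"exon1": 30, "exon2": 60, "exon3": 90},
    {"exon1": 36, "exon2": 54, "exon3": 90},
    {"exon1": 24, "exon2": 66, "exon3": 90},
]
tumor = {"exon1": 50, "exon2": 100, "exon3": 300}

pooled = pool_counts(normals)
pooled_norm = mean_normalize(pooled)

# Dividing the pooled counts by 3 first gives the same normalized profile,
# because a constant factor cancels against the mean.
averaged_norm = mean_normalize({e: n / 3 for e, n in pooled.items()})

tumor_norm = mean_normalize(tumor)
ratios = {e: tumor_norm[e] / pooled_norm[e] for e in pooled}
```

Since the constant factor cancels, the summed counts and the averaged counts lead to identical tumor/normal ratios under this kind of normalization.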

What is the purpose of pooling? Couldn't you just choose any of the three normals as control and then filter germline CNVs by screening against public databases?