SVDetect: Structural Variant And CNV Detection
13.4 years ago
Hmm ▴ 500

I have patient tumor/normal HiSeq paired-end data: two BAM files (one for tumor, one for normal) that were cleaned, sorted, and had duplicates removed. Both BAMs are very large.

I am trying to use SVDetect to find structural variants and CNVs. I started SVDetect more than a day ago and it seems to have slowed down dramatically. I am not sure how long processing one BAM file should take.

One issue I came across concerns mu_length and sigma_length: the sigma seems very high. I get mu and sigma by running the script “BAM_preprocessingPairs.pl” provided with the software. The script outputs a BAM file, but before it converts the SAM to BAM I can download the SAM and view the mu and sigma.

///////////////////////////////////////////////////////////////////////////////////////////
//Tumor
///////////////////////////////////////////////////////////////////////////////////////////
<detection>
read1_length=104
read2_length=104
window_size=405
step_length=102
mates_file=tumor.ab.bam
cmap_file=....hs18.len
</detection>

<filtering>
strand_filtering=1
order_filtering=1
insert_size_filtering=1
nb_pairs_threshold=2
nb_pairs_order_threshold=2
indel_sigma_threshold=3
dup_sigma_threshold=2
singleton_sigma_threshold=4
final_score_threshold=0.8

***mu_length=141
sigma_length=132***

</filtering>
//////////////////////////////////////////////////////////////////////////////////////////

My read length for paired end is 104.

  1. Are the window size and step length correct?
  2. Is sigma supposed to be that large, and does it affect the analysis algorithm much? I also tried another tool (Breakway: http://sourceforge.net/apps/mediawiki/breakway/index.php?title=The_Breakway_Compendium#How_ReadClusters_works)

I ran their Perl script to find the mean and standard deviation of the paired-end distance, and I get a mean of ~140 and an sd of ~18.
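One possible reason for two estimates of the same library differing this much (sigma of 132 versus ~18) is that a small number of very distant pairs inflates a plain standard deviation. That is only an assumption here; the thread does not confirm how either script computes its values. As a rough cross-check, the sketch below reads SAM text, takes the template length (TLEN, column 9) once per properly paired read pair, and reports the mean and standard deviation with and without trimming outliers. The function names and the median-absolute-deviation trimming rule are illustrative choices, not part of SVDetect or Breakway.

```python
import statistics


def template_lengths(sam_lines):
    """Yield TLEN (SAM column 9) once per properly paired read pair."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, tlen = int(fields[1]), int(fields[8])
        # 0x2 = mapped in a proper pair; tlen > 0 keeps only the
        # leftmost mate, so each pair is counted once.
        if flag & 0x2 and tlen > 0:
            yield tlen


def mu_sigma(lengths, max_mads=None):
    """Return (mean, population sd); optionally drop outliers first.

    max_mads is a hypothetical trimming knob: values further than
    max_mads median absolute deviations from the median are discarded.
    """
    lengths = list(lengths)
    if max_mads is not None:
        med = statistics.median(lengths)
        mad = statistics.median([abs(x - med) for x in lengths]) or 1
        lengths = [x for x in lengths if abs(x - med) <= max_mads * mad]
    return statistics.mean(lengths), statistics.pstdev(lengths)
```

To try it on real data, one could pipe a sample of alignments into a small driver, for example `samtools view tumor.bam | head -n 1000000`, collect `template_lengths(sys.stdin)` into a list, and compare `mu_sigma(tlens)` with `mu_sigma(tlens, max_mads=10)`. Here `tumor.bam` is a placeholder file name. If the trimmed sigma drops to something near 18 while the raw one stays large, outlier pairs are the likely cause.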

The SVDetect manual says: "To detect large SVs, a window_size value of 2σ from the mean has to be set ("µ+2σ" for a confidence interval of ~95%). To identify balanced translocations, a window size equal to at least "2µ+2√2σ" should be set." By the way, how many reads are there in your processed *ab.bam file?
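To make the two formulas quoted above concrete, here is a small sketch that evaluates them for the two (µ, σ) pairs mentioned in the question. The function names are made up for illustration. Note that the posted window_size=405 is exactly µ+2σ for µ=141 and σ=132, so that value follows the first formula; with σ≈18 the same formula would give a much smaller window.

```python
import math


def window_large_sv(mu, sigma):
    """Window size for large SVs, per the quoted rule: mu + 2*sigma."""
    return mu + 2 * sigma


def window_balanced_translocation(mu, sigma):
    """Minimum window size for balanced translocations,
    per the quoted rule: 2*mu + 2*sqrt(2)*sigma."""
    return 2 * mu + 2 * math.sqrt(2) * sigma


# (mu, sigma) pairs taken from the question: the preprocessing script's
# values and the second, much tighter estimate.
for label, mu, sigma in [("mu=141, sigma=132", 141, 132),
                         ("mu=140, sigma=18", 140, 18)]:
    print(label,
          "large SVs:", window_large_sv(mu, sigma),
          "balanced translocations: %.0f" % window_balanced_translocation(mu, sigma))
```

Which σ to trust is the real question; the formulas only translate it into a window size.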

Yup, I looked at the manual and changed my window size to 1000 with a step of 500, but the program is still too slow. After four days it crashes on its own.

read 773192631 test reads
read 815356938 ref reads

Sorry, but may I ask how you obtained the insert size after running the BAM_preprocessingPairs.pl script? It only gave me some counts of mapped and unmapped reads and produced a BAM file at the end. It didn't output anything about the mu or sd length. I would really appreciate your help.

Thank you

12.5 years ago

Hi,

I have one question: where do you see the sigma and mu parameters in the SAM file? I cannot find them and it is driving me mad!

Thanks in advance.

11.6 years ago

What type of data is this? If it is mate-pair reads, the mu_length shouldn't be so small; in that case your library may contain a lot of paired-end fragments that should not be present in mate-pair data. With such a small window size, the genome is split into a very large number of fragments, so of course more time is needed.
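A back-of-the-envelope count shows how quickly the number of fragments grows as the window shrinks. This is a simplified sketch that assumes plain sliding windows over a single sequence; the helper name and the round genome length of 3.1 Gb are illustrative assumptions, and it does not model how SVDetect actually links pairs between fragments.

```python
def n_windows(seq_length, window_size, step_length):
    """Number of full sliding windows of window_size, advancing by
    step_length, that fit in a sequence of seq_length bases."""
    return max(0, (seq_length - window_size) // step_length + 1)


# Assumed approximate human genome length, treated as one sequence.
GENOME_LENGTH = 3_100_000_000

# Settings from the thread: the original config and the later retry.
for window, step in [(405, 102), (1000, 500)]:
    print("window=%d step=%d -> %d windows"
          % (window, step, n_windows(GENOME_LENGTH, window, step)))
```

Under these assumptions the original settings produce roughly five times as many windows as window=1000 with step=500, which fits the point that smaller windows mean more work.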

11.4 years ago

Hi,

To get the sigma and mu values for the data, run BAM_preprocessingPairs.pl on the BAM/SAM file. It will give the appropriate mu and sigma values.

11.4 years ago

Use threading as well.
