Question: How To Determine Structural Variant Type Or Reformat Breakpoint Data To Bam
5.5 years ago by
secretjess170 wrote:

I have some breakpoint data and I would like to determine type / class of each rearrangement. I believe BreakDancer can do this.

Could someone please show me how a BAM file is formatted so I know whether my data has enough information to be reformatted for use with this tool?


Example of typical data:

Germline/Somatic    Evidence    #Solexa reads    Chr L    Pos L    Strand L    Chr H    Pos H    Strand H    Microhomology length (bp)    Microhomology seq    Non-templated sequence length (bp)    Non-templated sequence
Somatic    Seq    1    18    19092052    +    18    30289323    +    0        0

I already have a script which pulls back the sequence observed across a breakpoint if this is helpful.

sv bam breakdancer • 1.6k views

Many BAM files are available from the 1000 Genomes repository. Typically, a BAM file is produced by an aligner after a next-generation sequencing experiment. I suspect that your "breakpoint data" are not from such a pipeline and that BreakDancer is not the right tool. You'll need to clarify what type of data you have if you have more questions.

Reply written 5.5 years ago by Sean Davis

What do you mean by "breakpoint data"?

Reply written 5.5 years ago by PoGibas

Thank you for the information. I'm looking at the sort of data published here: (e.g. Supplementary Table 3). In this case there is a "variantClass" column, but not all data sets have this.

Reply written 5.5 years ago by secretjess
5.5 years ago by
Irsan6.7k wrote:

A BAM file is the binary, compressed form of a SAM file. You can see the SAM format specification here. So if you want to make a BAM file, first make a SAM file, then use

samtools view -bS yourFile.sam > yourFile.bam
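To give a feel for what goes into that SAM file, here is a minimal sketch in Python that splits one alignment line into the 11 mandatory tab-separated columns defined by the SAM specification. The read itself is made up for illustration, and `parse_sam_line` is a hypothetical helper, not part of samtools. The point is that every record describes a single aligned read (name, sequence, base qualities, CIGAR), which a table of breakpoint coordinates does not contain.

```python
# The 11 mandatory SAM columns, in the order the specification defines them.
SAM_FIELDS = ("QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL")


def parse_sam_line(line):
    """Split one SAM alignment line into a dict of its mandatory fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs at least 11 columns")
    record = dict(zip(SAM_FIELDS, cols))
    # These five columns are integers in the specification.
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    # Bit 0x10 of FLAG is set when the read aligns to the reverse strand.
    record["is_reverse"] = bool(record["FLAG"] & 0x10)
    # Anything after column 11 is an optional TAG:TYPE:VALUE field.
    record["tags"] = cols[11:]
    return record


# ASSUMPTION: an invented read, only to show the layout of one record.
example = "\t".join([
    "read1", "97", "18", "19091950", "60", "10M",
    "=", "30289323", "11197383", "ACGTACGTAC", "IIIIIIIIII",
])
rec = parse_sam_line(example)
print(rec["RNAME"], rec["POS"], "-" if rec["is_reverse"] else "+")
```

A real SAM file also starts with header lines beginning with `@`, which this sketch does not handle.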

You can also download a BAM file from the 1000 Genomes (1000g) FTP server: pick one of the subjects, then the alignment directory, then a .bam file.

EDIT: Oh, I see Sean Davis also just pointed you to the 1000g repository.

5.5 years ago by
Chris Miller20k (Washington University in St. Louis, MO) wrote:

As others have said, a BAM file contains alignments of raw sequencing reads. BreakDancer takes a BAM file as input and discovers breakpoints of structural variants. The table you have is the result of running an algorithm like BreakDancer on sequencing data. That table already has a "variant class" column, which tells you whether the event is a duplication or a deletion.
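For tables that lack such a column, a rough class can sometimes be inferred from the chromosomes and strands of the two breakpoint ends. The Python sketch below is only an illustration under stated assumptions: `Breakpoint`, `parse_row`, `classify` and both mapping tables are hypothetical names, and the two strand conventions shown are assumptions about how a caller might report orientation, not a description of what BreakDancer or the published table actually does. The column positions follow the header of the example row in the question.

```python
from dataclasses import dataclass


@dataclass
class Breakpoint:
    chr_l: str
    pos_l: int
    strand_l: str
    chr_h: str
    pos_h: int
    strand_h: str


# ASSUMPTION: strands reported as the raw orientation of the two reads.
READ_ORIENTATION = {
    ("+", "-"): "deletion",
    ("-", "+"): "tandem duplication",
    ("+", "+"): "inversion",
    ("-", "-"): "inversion",
}

# ASSUMPTION: same idea, but with the strand of the higher end flipped.
MATE_FLIPPED = {
    ("+", "+"): "deletion",
    ("-", "-"): "tandem duplication",
    ("+", "-"): "inversion",
    ("-", "+"): "inversion",
}


def parse_row(line):
    """Read Chr L, Pos L, Strand L, Chr H, Pos H, Strand H (columns 4-9)."""
    cols = line.split("\t")
    return Breakpoint(cols[3], int(cols[4]), cols[5],
                      cols[6], int(cols[7]), cols[8])


def classify(bp, convention):
    """Return a coarse class for one rearrangement under a given convention."""
    if bp.chr_l != bp.chr_h:
        return "interchromosomal"
    return convention.get((bp.strand_l, bp.strand_h), "unknown")


# The example row from the question, written with explicit tabs.
row = "\t".join(["Somatic", "Seq", "1", "18", "19092052", "+",
                 "18", "30289323", "+", "0", "", "0", ""])
bp = parse_row(row)
for name, convention in (("read orientation", READ_ORIENTATION),
                         ("mate flipped", MATE_FLIPPED)):
    print(name, "->", classify(bp, convention))
```

The same row comes out as a different class under each convention, so the strand definition used by whoever produced the table has to be checked before trusting any label derived this way.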

Their raw data is probably deposited in dbGaP or elsewhere, and you could download it and run BreakDancer or another algorithm on the data to try to recreate their results, but those are large files and that's a significant undertaking.
