How To Determine Structural Variant Type Or Reformat Breakpoint Data To Bam
11.0 years ago
secretjess ▴ 210

I have some breakpoint data and I would like to determine the type/class of each rearrangement. I believe BreakDancer can do this.

Could someone please show me how a BAM file is formatted, so I can tell whether my data has enough information to be reformatted for use with this tool?

EDIT:

Example of typical data -

Germline/Somatic    Evidence    #Solexa reads    Chr L    Pos L    Strand L    Chr H    Pos H    Strand H    Microhomology length (bp)    Microhomology seq    Non-templated sequence length (bp)    Non-templated sequence
Somatic    Seq    1    18    19092052    +    18    30289323    +    0        0

I already have a script which pulls back the sequence observed across a breakpoint if this is helpful.
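For rows like the example above, a rough first pass at a rearrangement class can be sketched directly from the chromosome, position and strand columns. The sketch below assumes the table is tab-delimited, and the strand-to-class mapping in `ASSUMED_CLASS_BY_STRANDS` is only an assumed convention that is not stated anywhere in this thread; strand conventions differ between callers, so check it against the documentation of whatever produced the data. The function name `describe_breakpoint` is hypothetical.

```python
# Column names copied from the example header (the tab delimiter is an assumption).
HEADER = [
    "Germline/Somatic", "Evidence", "#Solexa reads",
    "Chr L", "Pos L", "Strand L",
    "Chr H", "Pos H", "Strand H",
    "Microhomology length (bp)", "Microhomology seq",
    "Non-templated sequence length (bp)", "Non-templated sequence",
]

# ASSUMPTION: this orientation convention is illustrative only.
# Replace it with the convention documented for your own data.
ASSUMED_CLASS_BY_STRANDS = {
    ("+", "+"): "deletion",
    ("-", "-"): "tandem duplication",
    ("+", "-"): "inversion",
    ("-", "+"): "inversion",
}


def describe_breakpoint(line, class_by_strands=ASSUMED_CLASS_BY_STRANDS):
    """Return a tentative class and span for one tab-delimited breakpoint row."""
    # Strip only the newline so that trailing empty columns are preserved.
    values = line.rstrip("\n").split("\t")
    rec = dict(zip(HEADER, values))

    # Breakpoints on different chromosomes cannot be del/dup/inv events.
    if rec["Chr L"] != rec["Chr H"]:
        return {"class": "interchromosomal", "span": None}

    span = int(rec["Pos H"]) - int(rec["Pos L"])
    strands = (rec["Strand L"], rec["Strand H"])
    return {"class": class_by_strands.get(strands, "unknown"), "span": span}


# The data row from the post, with its two empty sequence columns.
EXAMPLE_ROW = "Somatic\tSeq\t1\t18\t19092052\t+\t18\t30289323\t+\t0\t\t0\t"

result = describe_breakpoint(EXAMPLE_ROW)
print(result)  # span is 11197271; the class label depends on the assumed mapping
```

Only the intrachromosomal/interchromosomal split and the span follow from the data alone; the class label is exactly as reliable as the mapping passed in.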

bam breakdancer sv • 3.0k views

Many BAM files are available here: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/ . Typically, a BAM file is produced by an aligner after a next-generation sequencing experiment. I suspect that your "breakpoint data" are not from such a pipeline and that BreakDancer is not the right tool. You'll need to clarify what type of data you have if you have more questions.


What do you mean by "breakpoint data"?


Thank you for the information. I'm looking at the sort of data published here: http://genome.cshlp.org/content/early/2013/02/13/gr.143677.112/suppl/DC1 (e.g. Supplementary Table 3). In this case there is a "variantClass" column, but not all of my data has one.

11.0 years ago
Irsan ★ 7.8k

A BAM file is the binary (read: compressed) form of a SAM file. You can see the SAM format specification here. So if you want to make a BAM file, first make a SAM file, then use

samtools view -bS yourFile.sam > yourFile.bam
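To make the format concrete: each SAM line describes one aligned read, using 11 mandatory tab-separated fields followed by optional tags. The sketch below splits one such line into named fields. The field names come from the SAM specification; the read itself (name, coordinates, sequence, qualities) is made up for illustration, and `parse_sam_line` is a hypothetical helper, not part of samtools.

```python
# The 11 mandatory SAM fields, in order, as named in the SAM specification.
SAM_FIELDS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]
INT_FIELDS = ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN")


def parse_sam_line(line):
    """Split one SAM alignment line into a dict of named fields."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs at least 11 fields")
    rec = dict(zip(SAM_FIELDS, parts[:11]))
    for name in INT_FIELDS:
        rec[name] = int(rec[name])
    # Anything after the 11th column is an optional TAG:TYPE:VALUE field.
    rec["TAGS"] = parts[11:]
    return rec


# A made-up paired-end read aligned to chromosome 18 (illustrative values only).
example = (
    "read1\t99\t18\t19092000\t60\t10M\t=\t19092200\t210\t"
    "ACGTACGTAC\tIIIIIIIIII\tNM:i:0"
)

rec = parse_sam_line(example)
for name in SAM_FIELDS:
    print(name, rec[name], sep="\t")
```

Note that every record carries a read sequence, base qualities and a CIGAR string, so a table holding only breakpoint coordinates and strands does not contain what an alignment record needs.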

You can also download a BAM file from the 1000 Genomes (1000g) FTP server.

Pick one of the subjects, then the alignment directory, then a .bam file.

EDIT: Oh, I see Sean Davis also just pointed you to the 1000g repository.

11.0 years ago

As others have said, a BAM file is a file containing alignments of raw sequencing reads. BreakDancer takes a BAM file as input and discovers breakpoints of structural variants. The table you have is the result of running an algorithm like BreakDancer on sequencing data. The published table already has a "variant class" column, which tells you whether the event is, for example, a duplication or a deletion.

Their raw data is probably deposited in dbGaP or elsewhere, and you could download it and run BreakDancer or another algorithm on the data to try to recreate their results, but those are large files and that's a significant undertaking.
