Question: How to convert my files (CNV seg, RefSeq) to .bed format?
taegyunlee wrote (2.3 years ago):

Hi.

I downloaded TCGA CNV level 3 data (nocnv, hg19) and want to map the CNV segments to genes. I searched for information on how to do this and found some.

I was advised to use bedtools.

I downloaded a RefSeq file from the UCSC Table Browser. Its contents look like this:

[refseq]

bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts

3 NR_130130 chr1 + 150980866 151008189 151008189 151008189 4 150980866,150997990,150999708,151006281,

3 NR_130132 chr1 + 150980866 151008189 151008189 151008189 4 150980866,150990287,150999708,151006281,

The CNV seg file's contents look like this:

[cnv seg]

Sample Chromosome Start End Num_Probes Segment_Mean

BREAD_p_TCGAb_430_431_NSP_GenomeWideSNP_6_D11_1538030 1 3218610 247813706 128998 0.0014

BREAD_p_TCGAb_430_431_NSP_GenomeWideSNP_6_D11_1538030 2 484222 207696262 110158 0.0067

BREAD_p_TCGAb_430_431_NSP_GenomeWideSNP_6_D11_1538030 2 207696273 207701151 2 -1.5215

As far as I know, I have to convert both files (CNV seg, RefSeq) to .bed format, but I don't know how to do it. What should I do?

Can you give me a hand?


I am trying the same thing, but I am having difficulty converting the seg.txt file to a proper .bed format, so the files cannot be read in subsequent steps. Can you help me with how to proceed? How did you manage to get the conversion done?

(comment written 9 months ago by r.bhowmick)
Eric T. wrote (2.3 years ago):

Are you downloading gene annotations from UCSC? In the Table Browser, look for the option to export a table in another format, e.g. BED. You don't necessarily need to use the files as they are on the FTP site.

The general answer here is that these are all tabular formats, so you can extract the columns you need using standard Unix tools or a short script in R or Python. The required BED columns are chromosome, start, and end, and both the UCSC RefSeq table and the SEG format have these columns along with others. So you select the chromosome, start, and end columns from the input format using cut or awk, and subtract 1 from the 'start' position for SEG, because SEG uses 1-based indexing while BED and the RefSeq table use 0-based start positions.
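As a rough illustration of the column selection described above, here is a minimal Python sketch. It assumes both files are tab-separated with a header line matching the column names shown in the question (the UCSC header may begin with "#bin", which does not matter here since that column is unused). The function names `seg_to_bed` and `refseq_to_bed` are made up for this example. Two choices go beyond the answer and are assumptions: the SEG chromosome gets a "chr" prefix so that names match the RefSeq table ("1" vs. "chr1" in the samples above), and the sample ID and Segment_Mean are carried along as the fourth and fifth columns so they survive an overlap step.

```python
import csv
import io


def seg_to_bed(seg_lines, add_chr_prefix=True):
    """Yield (chrom, start, end, sample, segment_mean) tuples from SEG lines."""
    reader = csv.DictReader(seg_lines, delimiter="\t")
    for row in reader:
        chrom = row["Chromosome"]
        # Assumption: the gene table uses "chr1"-style names, SEG uses "1".
        if add_chr_prefix and not chrom.startswith("chr"):
            chrom = "chr" + chrom
        start = int(row["Start"]) - 1  # 1-based SEG start -> 0-based BED start
        end = int(row["End"])          # end coordinate is left unchanged
        yield (chrom, start, end, row["Sample"], row["Segment_Mean"])


def refseq_to_bed(refseq_lines):
    """Yield (chrom, txStart, txEnd, name, score, strand) from a UCSC table dump."""
    reader = csv.DictReader(refseq_lines, delimiter="\t")
    for row in reader:
        # UCSC table starts are already 0-based, so no adjustment is made.
        yield (row["chrom"], int(row["txStart"]), int(row["txEnd"]),
               row["name"], "0", row["strand"])


def write_bed(rows, out):
    """Write tuples as tab-separated lines to an open text file."""
    for row in rows:
        out.write("\t".join(str(field) for field in row) + "\n")


if __name__ == "__main__":
    # Tiny in-memory example built from the first SEG row in the question.
    seg_text = (
        "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
        "BREAD_p_TCGAb_430_431_NSP_GenomeWideSNP_6_D11_1538030"
        "\t1\t3218610\t247813706\t128998\t0.0014\n"
    )
    out = io.StringIO()
    write_bed(seg_to_bed(io.StringIO(seg_text)), out)
    print(out.getvalue(), end="")
```

With real files you would pass open file handles instead of `io.StringIO` objects and write to, say, cnv.bed and genes.bed (hypothetical names). Something like `bedtools intersect -a genes.bed -b cnv.bed -wa -wb` should then report each gene together with the segments that overlap it.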
