I have a txt file of genome assembly gaps, like below:
585    chr10    0    50000    1    N    50000    clone    no
78    chr10    5627110    5677110    51    N    50000    clone    yes
722    chr10    18014681    18064681    161    N    50000    clone    yes
881    chr10    38858841    38908841    337    N    50000    contig    no
884    chr10    39194941    39244941    340    N    50000    contig    no
13    chr10    39244941    41624941    341    N    2380000    centromere    no
902    chr10    41624941    41674941    342    N    50000    contig    no
904    chr10    41866693    41916693    344    N    50000    contig    no
116    chr10    45746970    45896970    375    N    150000    contig    no
The program said I should convert this to a BED file.
So do I just run cat XXX.txt > XXX.bed?
If so, why bother with BED at all? Why not just use the txt file?
What's the point of a BED file?
Thanks
BED is a simple text file. Tools such as BEDOPS will do all sorts of logic and other computations on it for you (which elements overlap between these N input files? What is the trimmed mean of all ChIP-seq scores falling in every 100 kb window across the genome? etc.).

The actual BED format has a fairly strict definition, but various tool suites allow a more relaxed set of constraints, such that only the first 3 fields (chrom, start, end) need to be specified for many operations, while all other columns are essentially free to be whatever you need. This lets a tool suite work together with standard unix commands to manipulate data on the fly without losing any information.

In fact, this very simple relaxation of the BED format can encode the information kept in any of the other 20 or so formats you'll commonly encounter in bioinformatics (VCF, GFF, GTF, SAM, WIG, BEDGRAPH, etc.). That is, a small extension to the usual BED format can represent anything these other formats offer with no loss of data (see the conversion scripts offered in BEDOPS). However, conversions in the other direction often do not exist in the general case. For example, SAM/BAM is unable to hold signal data.

The better question, imo, is why we have so many file formats, and tool suites to operate on each kind of format, when these formats are hardly more than shuffled-column versions of each other.
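As for your specific file: a plain cat would most likely not be enough, because BED-aware tools expect chrom, start and end in the first three columns, and your rows start with an extra number. Here is a minimal Python sketch of one way to convert it. I am assuming your columns follow the UCSC gap table layout (bin, chrom, chromStart, chromEnd, ix, n, size, type, bridge), which is only my guess from the sample you posted; the function name `gap_table_to_bed` is made up for illustration. In your sample, end minus start equals the size column on every row, which is consistent with BED's 0-based, half-open coordinates, so the sketch leaves the coordinates unchanged.

```python
def gap_table_to_bed(lines):
    """Convert gap-table rows to relaxed BED lines.

    Assumed input columns (not confirmed by the original post):
    bin, chrom, start, end, ix, n, size, type, bridge.
    The leading bin column is dropped so that chrom, start, end
    come first; all remaining columns are kept as extra fields.
    """
    bed_lines = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete rows
        chrom = fields[1]
        start = int(fields[2])
        end = int(fields[3])
        if start >= end:
            raise ValueError(f"start must be less than end: {line!r}")
        extra = fields[4:]  # ix, n, size, type, bridge (if present)
        bed_lines.append("\t".join([chrom, str(start), str(end)] + extra))
    return bed_lines


if __name__ == "__main__":
    # Two rows copied from the sample in the question.
    sample = [
        "585    chr10    0    50000    1    N    50000    clone    no",
        "13    chr10    39244941    41624941    341    N    2380000    centromere    no",
    ]
    for row in gap_table_to_bed(sample):
        print(row)
```

The output is tab-delimited with the original extra columns preserved after the first three, which is the relaxed form of BED described above. Some tools, BEDOPS included, also expect sorted input, so you may need to run the result through sort-bed before using it.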