How To Remove Nonstandard Lines In Bed File
12.9 years ago
Phuong ▴ 10

Hi

Does anybody know how to remove lines that are not in this format:

chr    5759    6184    ORF5759    1    -

I have a BED file that contains lines like

chr    5759    6184    ORF5759    1    -

sirna some number

How can I remove all the lines that do not follow the format so I can upload the file to UCSC? Thank you so much, I appreciate it.

This is not clear. What do the badly formatted lines look like? How did you come by this badly formatted BED file? What do you mean by "sirna some number"?

@phuong: could you be a little more descriptive in your question? It will help in answering it. Secondly, whoever down-voted this question should have given Phuong a chance to amend her question or to describe it more clearly. I am completely against down-voting questions unless they are exact copies.

It appears that "sirna some number" is the bad line.

Questions are down-voted when unclear or not useful, as explained in the mouse-over tooltip. This one was unclear. I am in fact helping out.

If "sirna some number" is the problem you can remove those with grep -v "sirna".
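For anyone who would rather stay in Python, a rough equivalent of that grep -v call might look like the sketch below. The function name drop_matching_lines and the sample lines are made up for illustration, and it assumes the bad lines really do all contain the literal text "sirna".

```python
def drop_matching_lines(lines, pattern="sirna"):
    """Return the lines that do NOT contain `pattern` (like grep -v)."""
    return [line for line in lines if pattern not in line]


# Hypothetical sample input; the second line stands in for the bad "sirna" lines.
sample = [
    "chr\t5759\t6184\tORF5759\t1\t-",
    "sirna 12345",
    "chr\t7000\t7500\tORF7000\t1\t+",
]

for line in drop_matching_lines(sample):
    print(line)
```

Like grep -v, the match is case-sensitive, so a line containing "siRNA" would be kept.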

12.9 years ago
brentp 24k

If all you want is to remove lines that

  1. Don't have at least 3 columns (BED must have at least chr, start, end)
  2. Don't have integers in the 2nd and 3rd columns

then you can use this Python script (filter_bed.py):

import sys
# Keep only lines with at least 3 tab-separated columns and integer start/end.
with open(sys.argv[1]) as handle:
    for line in handle:
        toks = line.rstrip("\r\n").split("\t")
        if len(toks) < 3:
            continue
        if not (toks[1].isdigit() and toks[2].isdigit()):
            continue
        print("\t".join(toks))

Called like:

$ python filter_bed.py my.bad.bed > my.good.bed

UCSC may have other requirements, I can't recall, but this should give you a start.
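If you want to try the two checks on a few lines before running them over a whole file, they can be wrapped in a small function, roughly as sketched here. The name is_bed_like and the sample lines are hypothetical, and this only tests the two conditions listed above, not everything UCSC might require.

```python
def is_bed_like(line):
    """True if the line has >= 3 tab-separated columns and integer start/end."""
    toks = line.rstrip("\r\n").split("\t")
    if len(toks) < 3:
        return False
    return toks[1].isdigit() and toks[2].isdigit()


# Made-up sample lines, only for illustration.
sample = [
    "chr\t5759\t6184\tORF5759\t1\t-",   # kept
    "sirna 12345",                      # dropped: fewer than 3 columns
    "chr\tstart\tend",                  # dropped: non-integer coordinates
]

good = [line for line in sample if is_bed_like(line)]
print("\n".join(good))
```

Note that this assumes the columns are separated by tabs; a file that uses spaces instead would have every line rejected.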
