Question: bedtools, linux, rnaseq
0
3.1 years ago by
Daniel James10
California
Daniel James10 wrote:

I have a huge file in BED format and I need to extract only the chr22 entries using bedtools. I tried the sort option, but I don't understand how to do it.

rna-seq bedtools • 1.0k views
modified 3.1 years ago • written 3.1 years ago by Daniel James10
2

Have you tried this in the terminal?

grep "chr22" fileA.bed > fileB.bed
written 3.1 years ago by Floris Brenk890

I tried this, but I have 6 files and I need to store the chr22 entries from all of them in one file.

modified 3.1 years ago • written 3.1 years ago by Daniel James10

cat fileB.bed fileC.bed fileD.bed > all_chr22_files.bed

but the options below work as well

written 3.1 years ago by Floris Brenk890
1
3.1 years ago by
James Ashmore2.6k
UK/Edinburgh/MRC Centre for Regenerative Medicine
James Ashmore2.6k wrote:

You can either explicitly list the files:

grep -h "chr22" A.bed B.bed C.bed > Result.bed

or, use a wildcard, which uses all the files ending with ".bed" in the current directory:

grep -h "chr22" *.bed > Result.bed

Don't forget to coordinate-sort the resulting BED file afterwards, as many programs require this:

sort -k1,1 -k2,2n Result.bed > Result.sorted.bed
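
One caveat with a plain grep "chr22": it matches the string anywhere on the line, so names such as chr22_random, or a chr22 mention in another column, are kept too. The wildcard *.bed will also pick up Result.bed itself if it is left over from an earlier run. A stricter variant, as a minimal sketch assuming tab-delimited input, compares the first column exactly with awk and writes the result outside the input directory. The scratch directory, file names and records below are hypothetical demo data, not files from this thread:

```shell
# Hypothetical demo data: two tiny tab-delimited BED files in a scratch directory.
workdir=$(mktemp -d)
mkdir "$workdir/in" "$workdir/out"
printf 'chr22\t300\t400\tgeneA\nchr2\t10\t20\tgeneB\nchr22_random\t5\t9\tgeneC\n' > "$workdir/in/A.bed"
printf 'chr1\t50\t60\tchr22_like\nchr22\t100\t200\tgeneD\n' > "$workdir/in/B.bed"

# Keep only records whose first column is exactly "chr22", then coordinate-sort.
# The output lives in a separate directory, so the *.bed glob never includes it.
awk -F'\t' '$1 == "chr22"' "$workdir"/in/*.bed \
    | sort -k1,1 -k2,2n > "$workdir/out/chr22.sorted.bed"

cat "$workdir/out/chr22.sorted.bed"
```

With real data, replace the two printf lines with your own files and point the glob at the directory that holds them.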
modified 3.1 years ago • written 3.1 years ago by James Ashmore2.6k

Thank you, it works fine now. How would you produce a tab-delimited file using the coverageBed options? Can we use -hist?

modified 3.1 years ago • written 3.1 years ago by Daniel James10
0
3.1 years ago by
Alex Reynolds28k
Seattle, WA USA
Alex Reynolds28k wrote:

If you're not averse to using BEDOPS, generate the sorted union of N BED files with sort-bed, and use bedextract to pull out elements of the chromosome-of-interest from the set union:

$ sort-bed A.bed B.bed ... N.bed > all.bed
$ bedextract chr22 all.bed > chr22.bed

Our BEDOPS bedextract application uses a binary search approach to jump to the start position of the chromosome-of-interest, and so extraction is much faster than grep or awk, which have to waste time reading through the entire file.

For multi-GB, whole-genome scale files, and especially for extraction of elements at the end of a file, using awk or grep to read through the entire file can be (is) a significant waste of time. Even more so if you have to repeat the extraction for other chromosomes.

The output of BEDOPS tools will be sorted, as well, so it will be ready to use for downstream set operations.
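
If the extraction has to be repeated for other chromosomes, a loop along these lines may help. This is only a sketch: it assumes BEDOPS is installed, that all.bed is the sort-bed output from the commands above, and that writing one file per chromosome into the current directory is acceptable. The per-chromosome output names are an illustrative choice.

```shell
# Assumption: all.bed was produced by sort-bed, so bedextract can binary-search it.
if command -v bedextract >/dev/null 2>&1 && [ -f all.bed ]; then
    # --list-chr prints each chromosome name present in the sorted file.
    for chrom in $(bedextract --list-chr all.bed); do
        # One output file per chromosome, e.g. chr22.bed (hypothetical naming).
        bedextract "$chrom" all.bed > "$chrom.bed"
    done
else
    echo "bedextract or all.bed not found; nothing extracted" >&2
fi
```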

modified 3.1 years ago • written 3.1 years ago by Alex Reynolds28k
0
3.1 years ago by
Daniel James10
California
Daniel James10 wrote:

How do I create a tab-delimited file using the coverageBed options?

written 3.1 years ago by Daniel James10