Question: Find co-occurrences of factors across multiple files
9 weeks ago by
tirichl20 wrote:


I have several hundred files that look like this:

#genomic positions
1, 3, 4, 9, 10
a, b, d, g

Each file holds multiple genomic positions (numbers) and factors (characters). I want to investigate whether there are genomic positions that frequently co-occur with particular factors across all files, but I have no idea how to approach this. Can someone point me in the right direction? Is there a tool or a library that might help? Thank you!

sequencing chip-seq gene • 113 views
9 weeks ago by
jordi.planells330 wrote:

bedtools intersect accepts multiple files to intersect against. Have you tried it? You can report the number of occurrences with the -c flag.

bedtools intersect -c -a your_file -b factor1 factor2 factorN

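Then you could pipe the output to awk and print the lines with more than X occurrences (replace X with your threshold; $4 assumes your_file has three columns, since -c appends the count as the last column).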

awk 'BEGIN{FS="\t";OFS="\t"}{if($4 > X) print $0}'
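If your files are not in BED format and look exactly like the example in the question, a plain counting script may be enough. Here is a minimal Python sketch, not a tested pipeline. It assumes each file has its positions on the first non-comment line and its factors on the second, both comma-separated; the function names parse_file and count_cooccurrences are made up for illustration.

```python
from collections import Counter
from itertools import product
from pathlib import Path


def parse_file(path):
    """Return (positions, factors) for one file.

    Assumed layout: lines starting with '#' are comments, the first
    remaining line lists positions, the second lists factors.
    """
    lines = [
        line.strip()
        for line in Path(path).read_text().splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    positions = {int(tok) for tok in lines[0].split(",") if tok.strip()}
    factors = {tok.strip() for tok in lines[1].split(",") if tok.strip()}
    return positions, factors


def count_cooccurrences(paths):
    """Count in how many files each (position, factor) pair appears together."""
    counts = Counter()
    for path in paths:
        positions, factors = parse_file(path)
        # Every position in a file co-occurs with every factor in that file.
        counts.update(product(positions, factors))
    return counts
```

You could then call something like count_cooccurrences(Path("data").glob("*.txt")).most_common(20) to list the pairs seen in the most files (the "data" directory is just a placeholder). Keep in mind that raw counts favour positions and factors that are common overall, so you may want to compare them against the individual frequencies before calling a pair "frequent".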

Hope it helps!
