Question: Find co-occurrences of factors across multiple files
tirichl (Würzburg) wrote, 9 weeks ago:

Hey,

I have several hundred files that look like this:

a.file
#genomic positions
1, 3, 4, 9, 10
#factors
a, b, d, g

Each file holds multiple genomic positions (numbers) and factors (characters). I want to investigate whether there are genomic positions that frequently co-occur with particular factors across all files, but I have no idea how to approach this. Can someone point me in the right direction? Is there a tool or a library that might help? Thank you!

sequencing chip-seq gene • 113 views
jordi.planells wrote, 9 weeks ago:

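bedtools intersect accepts multiple files to intersect against. Have you tried it? With the -c flag it reports, for each entry in your file, the number of overlapping entries found across the factor files. Note that the inputs need to be in a format bedtools can read, such as BED.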

bedtools intersect -c -a your_file -b factor1 factor2 factorN

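Then you could print the lines with more than X occurrences with awk. Replace X with your threshold; -c appends the count as the last column, which is column 4 if your_file is a 3-column BED file.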

awk 'BEGIN{FS="\t";OFS="\t"}{if($4 > X) print $0}'
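
If your files are not in BED format but look like the a.file example in the question, a small script may be enough to get started. Below is a minimal Python sketch, not a definitive implementation: it assumes every file has exactly the two headers "#genomic positions" and "#factors", each followed by comma-separated values. The function names parse_file and count_cooccurrences are made up for illustration, and the second and third example files are invented; only the first one comes from the question.

```python
from collections import Counter
from itertools import product


def parse_file(text):
    """Parse one file in the format shown in the question.

    Returns (positions, factors) as two sets.
    """
    positions, factors = set(), set()
    section = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line, e.g. "#genomic positions" or "#factors"
            section = line.lstrip("#").strip().lower()
            continue
        items = [item.strip() for item in line.split(",") if item.strip()]
        if section == "genomic positions":
            positions.update(int(item) for item in items)
        elif section == "factors":
            factors.update(items)
    return positions, factors


def count_cooccurrences(texts):
    """Count in how many files each (position, factor) pair appears together."""
    pair_counts = Counter()
    for text in texts:
        positions, factors = parse_file(text)
        # Every position in a file is paired with every factor in that file
        pair_counts.update(product(positions, factors))
    return pair_counts


# Example contents: the first entry is a.file from the question, the other
# two are invented. In practice, read each file from disk instead, for
# example with pathlib.Path(name).read_text().
example_files = [
    "#genomic positions\n1, 3, 4, 9, 10\n#factors\na, b, d, g\n",
    "#genomic positions\n1, 4, 7\n#factors\na, d\n",
    "#genomic positions\n4, 9\n#factors\na, c\n",
]

counts = count_cooccurrences(example_files)

# Print the pairs seen together in the most files
for (position, factor), n_files in counts.most_common(5):
    print(position, factor, n_files)
```

Keep in mind that raw counts do not correct for how common a position or a factor is on its own, so a pair can score high simply because the factor appears in most files.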

Hope it helps!
