Question: Identify Overlapping And Non Overlapping Regions For Paired-End Data
6.1 years ago, ancient_learner (620), India, wrote:
gene1            gene2
chr1    25    30    chr1    34    37
chr1    15    20    chr1    25    28
chr1    80    90    chr1    10    13

gene1            gene2
chr1    25    30    chr1    36    39
chr1    15    20    chr1    18    20
chr1    80    90    chr1    19    22

common gene1, unique gene2 (when we compare file1 with file2)
chr1    15    20    chr1    25    28
chr1    80    90    chr1    10    13

common gene1, unique gene2 (when we compare file2 with file1)
chr1    15    20    chr1    18    20
chr1    80    90    chr1    19    22

common gene1 common gene2 
chr1    25    30    chr1    34    37    chr1    25    30    chr1    36    39

I was able to get the pairs that are common in both gene1 and gene2 with bedtools pairToPair, but I have a problem with the pairs that are common in gene1 and unique in gene2.

bedtools perl awk • 1.8k views
written 6.1 years ago by ancient_learner (620)

It's really unclear what you're trying to do. What does any of this have to do with paired-end data? Why are your "genes" 5-10bp? What is the input and what is the goal?

written 6.1 years ago by Devon Ryan (94k)

It's an example, not the real data, so that shouldn't be a problem at all. I know genes cannot be 10 bp long. In the two files, gene1 and gene2 (for which dummy positions are given) are interacting partners. The gene1 positions are common to both files; only the gene2 positions vary.

written 6.1 years ago by ancient_learner (620)

Please learn how to actually ask a coherent question in the future.

Reading between the lines, it seems that the first and second sets of 3 coordinate lines you posted are from two different files, for which you want to look at various types of intersections. However, the coordinates specified by the first 3 columns of each file are the same between the two (but are they repeated?), so they should presumably be ignored other than in the output. It seems that coordinates intersect if they overlap by at least 1 bp.

If that's correct, this would seem to be a trivial perl/python/whatever program to write. Just parse things line by line for each file and print output dependent upon the comparison. If that's not sufficient for your needs, then you'll need to provide more information. We don't read minds here.
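A minimal Python sketch of that line-by-line comparison, applied to the example coordinates from the question. Several things here are assumptions rather than anything stated in the thread: gene1 records are matched by exact coordinates, intervals are treated as closed (so sharing a single base counts as an overlap), and the function names `parse_pairs`, `overlaps` and `split_by_gene2` are made up for illustration.

```python
from collections import defaultdict


def parse_pairs(lines):
    """Parse 'chrom start end chrom start end' lines into (gene1, gene2) tuples."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skips blank lines and the 'gene1 gene2' header
        gene1 = (fields[0], int(fields[1]), int(fields[2]))
        gene2 = (fields[3], int(fields[4]), int(fields[5]))
        pairs.append((gene1, gene2))
    return pairs


def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least 1 bp.

    Assumes closed coordinates; for half-open BED-style intervals
    the two comparisons would be strict (<) instead.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def split_by_gene2(query, reference):
    """Split query pairs whose gene1 is also in reference by gene2 overlap."""
    by_gene1 = defaultdict(list)
    for gene1, gene2 in reference:
        by_gene1[gene1].append(gene2)

    common, unique = [], []
    for gene1, gene2 in query:
        partners = by_gene1.get(gene1)
        if partners is None:
            continue  # gene1 absent from reference: outside the question's scope
        if any(overlaps(gene2, other) for other in partners):
            common.append((gene1, gene2))
        else:
            unique.append((gene1, gene2))
    return common, unique


file1 = parse_pairs("""\
gene1            gene2
chr1    25    30    chr1    34    37
chr1    15    20    chr1    25    28
chr1    80    90    chr1    10    13
""".splitlines())

file2 = parse_pairs("""\
gene1            gene2
chr1    25    30    chr1    36    39
chr1    15    20    chr1    18    20
chr1    80    90    chr1    19    22
""".splitlines())

common_1v2, unique_1v2 = split_by_gene2(file1, file2)
common_2v1, unique_2v1 = split_by_gene2(file2, file1)

for gene1, gene2 in unique_1v2:
    print(*gene1, *gene2, sep="\t")
```

On the example data, `unique_1v2` holds the two file1 rows listed under "common gene1 uniq gene2" in the question, and `unique_2v1` holds the corresponding file2 rows. Real data with repeated or only partially matching gene1 coordinates would need a different matching rule.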

written 6.1 years ago by Devon Ryan (94k)

"Please learn how to actually ask a coherent question in the future." ... "We don't read minds here."

I'm not sure what purpose these words serve. Why not respond with kinder, encouraging and respectful words--even if an OP's question may be inherently problematic? (It clearly goes without saying that my lack of understanding an OP's question doesn't strictly imply that the problem lies with the OP's question.)

I'm certainly guilty of uttering many obtuse statements--and will, most likely, continue to do so. Perhaps, however, I've just been lucky to have said them to knowledgeable individuals who have constructively and courteously replied with words which encouraged me to carefully refactor these statements.

modified 6.1 years ago • written 6.1 years ago by Kenosis (1.2k)

Yeah, I could have been much nicer in my reply. Having said that, a good bit of insolence can also help people along, since it decreases needless back-and-forth (though I used more insolence than I should have in this case).

written 6.1 years ago by Devon Ryan (94k)

Take a look at "Quick Programming Challenge: Calculate Common And Unique Regions From A List Of Chromosome Segments" on computing common and unique interval ranges. In particular, the answer there using the IRanges R package seems to do exactly what you want, if I understand correctly.

written 6.1 years ago by SES (8.3k)