Question: calculating distances between DNA markers
7 days ago, milton.andy wrote:

I am seeking help with what is possibly a trivial problem, but so far I have not been able to locate a suitable resource on the web, so any pointers would be much appreciated. I have two sorted numerical arrays of unequal length representing the chromosomal positions of two non-overlapping sets of markers (SNPs). They can be represented as [A1, A2, ..., An] and [B1, B2, ..., Bz]. I need to automate the calculation of the differences between each member of set A and each member of set B:

A1 - B1, A1 - B2 ... A1 - Bz
A2 - B1, A2 - B2 ... A2 - Bz
..........................................
An - B1, An - B2 ... An - Bz

Output needs to be a sorted list of absolute values.

Given the large number of markers involved, I cannot imagine doing it manually. As I am not familiar with programming, a Bash or Python script would be best.

If this has already been answered in the forum, please post a link. Many thanks, Andy

bash snp • 91 views
modified 7 days ago by finswimmer • written 7 days ago by milton.andy

Hello milton.andy ,

  • Could you please provide an example of what your input looks like exactly?

  • Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

modified 7 days ago • written 7 days ago by finswimmer

Hi finswimmer, thank you very much for your reply. A small example of the input can be found here:

https://drive.google.com/file/d/17DR6ZUtXP0jpGxf35E6PqAEERpNz7wpi/view?usp=sharing

The desired output should be as in this textfile: https://drive.google.com/file/d/1R_6duEybnIKn_3P9AI4ixPqlrzYcdI4l/view?usp=sharing

Being a novice, I am having trouble conforming to some of the requirements, such as using the appropriate formatting conventions (alluded to in your second comment). I apologize for this and will try to learn these things in the near future.

Thanks again for your help, Andy

written 6 days ago by milton.andy

Hello milton.andy ,

Thanks for providing the example. Unfortunately, it is not clear to me. Does set A contain 3 variants and set B 6? Or is this a formatting issue?

The distances shown in your example output seem to be the absolute values of the differences; otherwise A1 - B1 would be negative. Am I right?

fin swimmer

written 6 days ago by finswimmer

What does the input data actually look like?

written 7 days ago by jrj.healey

An example of the input can be found here:

https://drive.google.com/file/d/17DR6ZUtXP0jpGxf35E6PqAEERpNz7wpi/view?usp=sharing

The desired output should be as in this textfile: https://drive.google.com/file/d/1R_6duEybnIKn_3P9AI4ixPqlrzYcdI4l/view?usp=sharing

Thanks jrj.healey

written 6 days ago by milton.andy
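
The calculation described in the question (every A position minus every B position, reported as a sorted list of absolute values) could be sketched in Python roughly as follows. This is only a minimal illustration: the function name `pairwise_abs_distances` and the example positions are made up, and the code assumes the two marker sets have already been read into plain lists of integers, since the exact layout of the linked input files is not shown in the thread.

```python
from itertools import product


def pairwise_abs_distances(set_a, set_b):
    """Return the sorted absolute differences |a - b| for every pair (a, b)."""
    # product() yields every combination of one position from A and one from B
    return sorted(abs(a - b) for a, b in product(set_a, set_b))


# Hypothetical positions for illustration only (not taken from the linked files)
set_a = [100, 250, 900]
set_b = [120, 400]

print(pairwise_abs_distances(set_a, set_b))
# → [20, 130, 150, 300, 500, 780]
```

Note that this holds all n × z distances in memory at once, which should be fine for modest marker counts but may need a different approach for very large sets.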