Question: Summing Chromosome Sizes
4 weeks ago, selplat21 wrote:
File1_Col1 <- c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5")
File1_Col2 <- c(10000, 8000, 5000, 2000, 500)
File1 <- data.frame(File1_Col1, File1_Col2)
File2_Col1 <- c("Chr1", "Chr1", "Chr1", "Chr2", "Chr2", "Chr2","Chr3", "Chr3", "Chr3"
                ,"Chr4", "Chr4", "Chr4","Chr5", "Chr5", "Chr5")
File2_Col2 <- c(1,5,7,2,3,5,3,4,5,1,3,6,2,4,5)
File2 <- data.frame(File2_Col1, File2_Col2)

I have two files: File 1 contains chromosomes and their sizes and File 2 contains a list of SNPs for each chromosome.

I need the SNP positions to run consecutively across the whole genome, so for example:

Chr3 SNP 3 should actually be 3 + (the sizes of both preceding chromosomes) = 3 + 10000 + 8000 = 18003

Can someone help me write a loop in R that adds the summed sizes of the preceding chromosomes to each position in File2_Col2?


What have you tried? You should look at calculating cumsum for the first data frame followed by a merge; then you can create a derived field that adds the cumulative size of the preceding chromosomes to the File2_Col2 field.
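
The cumsum-then-merge idea can be sketched in base R roughly as follows. This is only a sketch on the toy data from the question; the names `offset`, `merged` and `genome_pos` are made up for illustration, and it assumes File1 lists the chromosomes in the order they should be concatenated.

```r
# Toy data from the question
File1 <- data.frame(File1_Col1 = c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5"),
                    File1_Col2 = c(10000, 8000, 5000, 2000, 500))
File2 <- data.frame(File2_Col1 = rep(c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5"), each = 3),
                    File2_Col2 = c(1, 5, 7, 2, 3, 5, 3, 4, 5, 1, 3, 6, 2, 4, 5))

# Offset = summed size of all preceding chromosomes (0 for the first one).
# cumsum() includes the current chromosome, so prepend 0 and drop the last value.
File1$offset <- c(0, head(cumsum(File1$File1_Col2), -1))

# Merge the offsets onto the SNP table; the key columns have different names.
merged <- merge(File2, File1[, c("File1_Col1", "offset")],
                by.x = "File2_Col1", by.y = "File1_Col1")

# Derived field: position counted from the start of the first chromosome.
merged$genome_pos <- merged$File2_Col2 + merged$offset
```

Note that merge() sorts by the key column by default, so the row order of the result may differ from File2 on real data (for example, "Chr10" sorts before "Chr2").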

written 4 weeks ago by _r_am

Thank you so much that answers my question!

written 4 weeks ago by selplat21

4 weeks ago, rpolicastro wrote:

I'm not completely sure I understand, but here is a tidyverse answer of what I thought you meant, based on @RAmRS's comment.

library("tidyverse")

result <- File1 %>%
  # offset = total size of all preceding chromosomes (0 for Chr1);
  # cumsum() alone would also include the chromosome's own size
  mutate(offset=cumsum(File1_Col2)-File1_Col2) %>%
  right_join(File2, by=c("File1_Col1"="File2_Col1")) %>%
  mutate(newcol=offset+File2_Col2) %>%
  select(!c(File1_Col2, offset))

> result
   File1_Col1 File2_Col2 newcol
1        Chr1          1      1
2        Chr1          5      5
3        Chr1          7      7
4        Chr2          2  10002
5        Chr2          3  10003
6        Chr2          5  10005
7        Chr3          3  18003
8        Chr3          4  18004
9        Chr3          5  18005
10       Chr4          1  23001
11       Chr4          3  23003
12       Chr4          6  23006
13       Chr5          2  25002
14       Chr5          4  25004
15       Chr5          5  25005
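
If you would rather avoid the join, a named lookup vector is another option. This is a rough sketch, not tested beyond the toy data: `offsets` is a name I made up, and it assumes every chromosome in File2 also appears in File1 and that File1 is in the intended chromosome order. It adds the summed size of the preceding chromosomes to each SNP, so Chr3 SNP 3 comes out as 18003 as in the question.

```r
# Toy data from the question
File1 <- data.frame(File1_Col1 = c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5"),
                    File1_Col2 = c(10000, 8000, 5000, 2000, 500))
File2 <- data.frame(File2_Col1 = rep(c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5"), each = 3),
                    File2_Col2 = c(1, 5, 7, 2, 3, 5, 3, 4, 5, 1, 3, 6, 2, 4, 5))

# Named vector: chromosome name -> summed size of all preceding chromosomes
offsets <- setNames(cumsum(File1$File1_Col2) - File1$File1_Col2,
                    File1$File1_Col1)

# Look up each SNP's offset by chromosome name.
# as.character() guards against File2_Col1 being a factor.
File2$newcol <- File2$File2_Col2 + unname(offsets[as.character(File2$File2_Col1)])
```

Because this only indexes into a vector, the rows of File2 stay in their original order.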