Question: Plotting Density Of Snps On Chromosomes
10
gravatar for Leszek
7.4 years ago by
Leszek4.0k
IIMCB, Poland
Leszek4.0k wrote:

Do you know any tools for plotting SNPs density on chromosomes?

snp plot • 27k views
ADD COMMENTlink modified 2.1 years ago by kaixian1100 • written 7.4 years ago by Leszek4.0k

HI, I just have a bed file ,not snp . I wanna view genes and bed region genome distribuition . my bed file is like this : chr1 222 333 Gene1 chr2 223 344 Gene2 .....

ADD REPLYlink written 2.1 years ago by kaixian1100
22
gravatar for Irsan
6.5 years ago by
Irsan6.9k
Amsterdam
Irsan6.9k wrote:

I prefer a ggplot2-solution:

Download snp-table from ucsc

And then in R:

snps<-read.table("snp132_ucsc_hg19.bed",sep="\t",header=F)
colnames(snps)<-c("chr","start","end","id","score","strand")

# put the chromosomes in the good order: chr1, chr2, chr22, chrX
goodChrOrder <- paste("chr",c(1:22,"X","Y"),sep="")
snps$chr <- factor(snps$chr,levels=goodChrOrder)

# Plot the densities of snps in the bed file for each chr seperately
library(ggplot2)
snpDensity<-ggplot(snps) + 
geom_histogram(aes(x=start),binwidth=1e6) + # pick a binwidth that is not too small 
facet_wrap(~ chr,ncol=2) + # seperate plots for each chr, x-scales can differ from chr to chr
ggtitle("Density of SNP-132 across hg19") +
xlab("Position in the genome") + 
ylab("SNP density") + 
theme_bw() # I prefer the black and white theme

# save the plot to .png file
png("snp132_density.png",500,750)
print(snpDensity)
dev.off()

enter image description here

ADD COMMENTlink modified 4.8 years ago • written 6.5 years ago by Irsan6.9k

Hi Irsan If I have 2 conditions normal and diseased and i want to plot the same way the densities chr(1-Y) normal on one side and chr(1-Y) diseased on other side will it be possible? (in your plot its chr1 on one side and chr10 on other side)

ADD REPLYlink written 5.4 years ago by ancient_learner610
2

Yes, prepare your data in such a way that you have a data frame where each row represents a feature (gene/transcript/variant) containing columns: chromosome, position, group. Group contains information whether it is normal or disease. Then do:

ggplot(yourData) + 
geom_histogram(aes(x=position),binwidth=1e6) +
facet_grid(group ~ chr)
ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Irsan6.9k

Wow! thank you. ill do that

ADD REPLYlink written 5.4 years ago by ancient_learner610
2

and if you want your chromosomes to be sorted in the right way make sure your chromosome column is a factor with levels in the right order:

data$chromosome <- factor(data$chromosome, levels = paste("chr",c(1:22,"X","Y"),sep=""))
ADD REPLYlink modified 5.4 years ago • written 5.4 years ago by Irsan6.9k

Hi Irsan,
I know that this is a old post but, could you think that could be possible to perform this kind of plot but having as input the coverage for each position for each chromosome? Or maybe the files are too big to upload to R? Also I would like to plot a using geom_line instead of histogram, what do you think?
Thanks in advance

ADD REPLYlink written 4.4 years ago by iraun3.6k
it doesnt make sense to make a histogram for each base pair in the genome because you cant see it. Before plotting and importing data in R/python or anything, first aggregate your coverage into bins of for example 100 kbps
ADD REPLYlink written 4.4 years ago by Irsan6.9k

Yes, you're right. I'm just trying to figure out how to plot the coverage across all bases for each chromosome using a line plot. And of course, I'm having problems of memory. So I tried to use your approach for SNPs but it does not work due to the "infinite" number of points in the data.

ADD REPLYlink written 4.4 years ago by iraun3.6k

Hello,

am wondering whether this way is useful also for my data. Basically I have a bed file with Genomic Regions. I gave a score to all of these regions and its value range from -10 to 10. So the bed file looks like this

Chr Start End Score
chr1 1 400 -3
chr1 5000 5400 +10
etc..

I would like to plot the score position for position. Can I use this script?

Thanks for any reply.

ADD REPLYlink modified 2.5 years ago • written 2.5 years ago by fusion.slope200

I know this is old and something could change in libraries but is there anybody to help with that error message when trying to recreate this: Error: StatBin requires a continuous x variable the x variable is discrete. Perhaps you want stat="count"? Thanks in advance

ADD REPLYlink written 15 months ago by Pawel Osipowski20

hello isran, i am vinay kumar reddy, i have vcf file of samples with different snp positions on different chromosomes, so now i want to plot this positions along chromosomes, i am trying to do it with synbreed r package and i could not succeed, can you please help me out. my vcf files consists several columns but first column is of different chromosomes and second column is of different positions on all chromosomes.

thanks, vinay kumar reddy

ADD REPLYlink written 13 months ago by vinayreddynannuru20
18
gravatar for Philippe
7.4 years ago by
Philippe1.9k
Barcelona, Spain.
Philippe1.9k wrote:

Hi,

I don't know if such a program exists but I created a function from a small R script I wrote to address such problems of visualization. I think this can cover your need. Here is the function with some examples to test it.

ChrDensity <- function(ChrLength, Positions, Values, ChromosomeBgColor="gray90", ChromosomeBorderColor="black", ValuesColors="black", PlottingGenes=F, GeneColors="black", ylab="Density", xlab="Chromosomal position (Mb)", ...) {

    PositionsMb <- Positions / 1000000
    ChrLengthMb <- ChrLength / 1000000

    plot(NA,NA, xlab=xlab, ylab=ylab , bty="n", xlim=c(0,ChrLengthMb), ylim=c(-1.1,max(Values)), ...)
    rect(0, -0.1, ChrLengthMb,-1, col=ChromosomeBgColor, border=ChromosomeBorderColor)
    nbG <- length(Positions)
    segments(PositionsMb, rep(0,nbG), PositionsMb, Values, col=ValuesColors)
    if(PlottingGenes) {
        segments(PositionsMb, rep(-0.1,nbG), PositionsMb, rep(-1,nbG), col=GeneColors)
    }

}


### Example
ChrLength <- 100000000 # Chromosome of 100 Mb
Positions <- sample(1:100000000, 1000, replace=F)
Values <- sample(1:100 / 10, 1000, replace=T)

# Simple Call
ChrDensity(ChrLength, Positions, Values, main="Test plot")


# Plotting genes and adding specific colors for subset of genes

Colors <- rep("black", length(Positions))
Colors[order(Positions)[200:400]] <- "red"
ChrDensity(ChrLength, Positions, Values, PlottingGenes=T, ValuesColors=Colors, GeneColors=Colors, main="Test plot", ylab="Nb of SNPs")

And here is an example of plots it generates.

Basic plot alt text

Enhanced plot, plotting gene positions and coloring a subset of those. alt text

This can be of course improved. Let me know if you have some suggestions or encounter some bugs.

ADD COMMENTlink written 7.4 years ago by Philippe1.9k

+1 that's useful. I'm python freak, so I will try to code similar stuff using pyplot. thanks for the idea!

ADD REPLYlink written 7.4 years ago by Leszek4.0k

The figures are gone!

ADD REPLYlink written 3.9 years ago by SmallChess490
9
gravatar for Pierre Lindenbaum
6.5 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum122k wrote:

ascii art and mysql :-)

   $ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19
    mysql> select chrom,chromStart,count(*),repeat('#',(count(*)/50000.0)*10) from snp137
        where chrom in("chr1","chr2","chr3","chr4","chr5","chr6")  group by chrom,ROUND(chromStart/10E6) ;
    +-------+------------+----------+------------------------------------------------------+
    | chrom | chromStart | count(*) | repeat('#',(count(*)/50000.0)*10)                    |
    +-------+------------+----------+------------------------------------------------------+
    | chr1  |      10144 |   115192 | #######################                              |
    | chr1  |    5000009 |   195026 | #######################################              |
    | chr1  |   15000014 |   203190 | #########################################            |
    | chr1  |   25000076 |   172068 | ##################################                   |
    | chr1  |   35000040 |   177274 | ###################################                  |
    | chr1  |   45000036 |   167085 | #################################                    |
    | chr1  |   55000027 |   182151 | ####################################                 |
    | chr1  |   65000016 |   183073 | #####################################                |
    | chr1  |   75000010 |   182679 | #####################################                |
    | chr1  |   85000178 |   179926 | ####################################                 |
    | chr1  |   95000113 |   176064 | ###################################                  |
    | chr1  |  105000003 |   182362 | ####################################                 |
    | chr1  |  115000059 |   132044 | ##########################                           |
    | chr1  |  142535438 |   115694 | #######################                              |
    | chr1  |  145000015 |   205771 | #########################################            |
    | chr1  |  155000030 |   188877 | ######################################               |
    | chr1  |  165000030 |   181378 | ####################################                 |
    | chr1  |  175000030 |   177862 | ####################################                 |
    | chr1  |  185000003 |   183398 | #####################################                |
    | chr1  |  195000018 |   182054 | ####################################                 |
    | chr1  |  205000033 |   209352 | ##########################################           |
    | chr1  |  215000047 |   179814 | ####################################                 |
    | chr1  |  225000055 |   187002 | #####################################                |
    | chr1  |  235000110 |   195962 | #######################################              |
    | chr1  |  245000033 |    93682 | ###################                                  |
    | chr2  |      10132 |   105475 | #####################                                |
    | chr2  |    5000051 |   190281 | ######################################               |
    | chr2  |   15000021 |   177880 | ####################################                 |
    | chr2  |   25000059 |   188206 | ######################################               |
    | chr2  |   35000011 |   212628 | ###########################################          |
    | chr2  |   45000016 |   206472 | #########################################            |
    | chr2  |   55000150 |   181265 | ####################################                 |
    | chr2  |   65000062 |   180104 | ####################################                 |
    | chr2  |   75000066 |   198265 | ########################################             |
    | chr2  |   85000017 |   168042 | ##################################                   |
    | chr2  |   95326292 |   167528 | ##################################                   |
    | chr2  |  105000006 |   185158 | #####################################                |
    | chr2  |  115000210 |   185350 | #####################################                |
    | chr2  |  125000054 |   235282 | ###############################################      |
    | chr2  |  135000003 |   182175 | ####################################                 |
    | chr2  |  145000126 |   172680 | ###################################                  |
    | chr2  |  155000007 |   167622 | ##################################                   |
    | chr2  |  165000065 |   192062 | ######################################               |
    | chr2  |  175000114 |   175360 | ###################################                  |
    | chr2  |  185000008 |   167461 | #################################                    |
    | chr2  |  195000029 |   162383 | ################################                     |
    | chr2  |  205000055 |   179149 | ####################################                 |
    | chr2  |  215000027 |   184243 | #####################################                |
    | chr2  |  225000030 |   188300 | ######################################               |
    | chr2  |  235000034 |   167767 | ##################################                   |
    | chr3  |      60112 |   120153 | ########################                             |
    | chr3  |    5000009 |   204211 | #########################################            |
   (...)
    | chr6  |   55000001 |   149141 | ##############################                       |
    | chr6  |   65000008 |   194800 | #######################################              |
    | chr6  |   75000078 |   198436 | ########################################             |
    | chr6  |   85000026 |   190050 | ######################################               |
    | chr6  |   95000043 |   174975 | ###################################                  |
    | chr6  |  105000058 |   175326 | ###################################                  |
    | chr6  |  115000032 |   182749 | #####################################                |
    | chr6  |  125000145 |   177503 | ####################################                 |
    | chr6  |  135000000 |   175307 | ###################################                  |
    | chr6  |  145000003 |   189212 | ######################################               |
    | chr6  |  155000006 |   195229 | #######################################              |
    | chr6  |  165000019 |   132148 | ##########################                           |
    +-------+------------+----------+------------------------------------------------------+
    128 rows in set (2 min 17.24 sec)
ADD COMMENTlink written 6.5 years ago by Pierre Lindenbaum122k
1
gravatar for Houkto
7.4 years ago by
Houkto210
Houkto210 wrote:

Hello there,

I am trying to plot SNP density using circos which can do the same thing. However, both circos and your plot are asking for an input called value i.e

ChrLength, Positions, Values and in circos (Chromosome) hsX (genomic position) 70700000 70799999 (value) 0.00156?

I am not sure what is this value and how to obtain it, any suggestion ?

Thanks

ADD COMMENTlink written 7.4 years ago by Houkto210
1
gravatar for Ric
3.9 years ago by
Ric250
Australia
Ric250 wrote:

Hi, I just wonder whether any updates about SNP density plots have been published.

I have few chromosome VCF files i.e. each chromosome is in a separately VCF file. I like @Irsan solution, but just wonder whether it would be possible:

  • to put real numbers on the x-axis
  • to stretch the y-axis so more details would be visible
  • overall to increase the width of the plots so that 2 plots fill an A4 (portrait) page
  • to replace BED file by VDF file?

Thank you in advance

Mic

ADD COMMENTlink written 3.9 years ago by Ric250
0
gravatar for Ulrich
6.5 years ago by
Ulrich0
Ulrich0 wrote:

Hej, does anyone know an R package that can plot SNP densities like in the attached paper (figure 2, but there used for FLcDNA density), i.e. as a kind of density heat-map with variable window sizes.

http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1000740

Cheers Uli

ADD COMMENTlink written 6.5 years ago by Ulrich0

Hi Uli, you can use the autoplot VCF option implemented in the ggbio [R] package. Check this out : section 6.2.15 : http://www.bioconductor.org/packages/2.11/bioc/vignettes/ggbio/inst/doc/ggbio.pdf Pretty easy to handle Best, C.

ADD REPLYlink written 6.5 years ago by Bioch'Ti1000
0
gravatar for Vitis
6.5 years ago by
Vitis2.2k
New York
Vitis2.2k wrote:

I used betools to intersect the SNP list and a file delimits the sliding windows, then plot the SNP density with simple line graph in R. It's not as sophisticated as the R code provided above but it works pretty well except you have to prepare the window interval files.

ADD COMMENTlink written 6.5 years ago by Vitis2.2k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 689 users visited in the last hour