Question: enhanced volcano plot
27 days ago, javanokendo (University of Cape Town) wrote:

I am trying to draw a volcano plot with the EnhancedVolcano package, but the plot looks strange and I do not know what is causing the problem. I sent my samples to the sequencing facility and they only gave me the BAM files, so I do not know which version of the mouse reference genome or which annotation file was used. I generated the count matrix from the BAM files using featureCounts.

The figure with the details can be found here: volcanoplot.png.

rna-seq enhancedvolcano R • 133 views
modified 26 days ago by zx8754 • written 27 days ago by javanokendo

You can use samtools to find the command the facility used to generate the BAM file. Look at the @PG lines in the header: they may show the genome or annotation file names, which would give you a clue about the reference used for alignment.

samtools view -H file.bam
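A minimal sketch of how the header could be narrowed down to the @PG records and their CL: (command line) fields. The helper names pg_records and pg_commands are hypothetical and not part of samtools; only `samtools view -H` comes from the command above, and not every aligner writes a CL: field.

```shell
# Keep only the @PG (program) records from SAM header text read on stdin.
# '|| true' avoids a non-zero exit status when no @PG record is present.
pg_records() {
    grep '^@PG' || true
}

# Reduce each @PG record to "program ID<TAB>command line".
# Header fields are tab-separated TAG:VALUE pairs; CL: may be missing.
pg_commands() {
    pg_records | awk -F'\t' '{
        id = ""; cl = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^ID:/) id = substr($i, 4)
            if ($i ~ /^CL:/) cl = substr($i, 4)
        }
        print id "\t" cl
    }'
}

# Usage (file.bam is the placeholder name used above):
#   samtools view -H file.bam | pg_commands
```

If the aligner recorded its command line, the path to the genome index or annotation file usually appears there.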

You generated a count matrix, and then what? What did you use to compute the differential expression statistics for the volcano plot? It is very hard to know what is going on with only this information. More information is needed to reproduce your error and understand the problem.

written 27 days ago by tiago211287

I have tried to extract the gene counts from the BAM files using featureCounts and I keep getting an error: "address (nil), cause 'memory not mapped'". I have tried to run this both on the HPC and on my local machine and I get the same error. I have a total of 20 BAM files and I used the following command:

setwd("/home/jokendo/sabeloproject/")
library(Rsubread)

bam.files <- list.files(path = "/home/jokendo/sabeloproject/", pattern = "\\.bam$", full.names = TRUE)  # list the BAM file paths

gtffile = "/home/jokendo/sabeloproject/annotations/Mus_musculus.GRCm38.100.chr.gtf"

fc <- featureCounts(files=bam.files,
                    annot.ext=gtffile,
                    isGTFAnnotationFile=TRUE,
                    isPairedEnd=T,
                    GTF.attrType="gene_id",
                    nthreads=3)

Details of the error:
*** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
 1: featureCounts(files = bam.files, annot.ext = gtffile, isGTFAnnotationFile = TRUE,     isPairedEnd = TRUE, GTF.attrType = "gene_id")
An irrecoverable exception occurred. R is aborting now ...
/var/tmp/slurmd.spool/job715581/slurm_script: line 12:  2350 Segmentation fault      Rscript /scratch/oknjav001/sabeloproject/soft/anar.R

Are there tools out there which I can use to extract the count matrix from these bam files?

modified 26 days ago by Kevin Blighe • written 26 days ago by javanokendo
  1. featureCounts will generate a table of read counts per gene per sample. That means 1 value per gene per sample.
  2. EnhancedVolcano typically expects exactly 1 logFC value and 1 p-value per gene in total, which is usually calculated by comparing the read counts of subsets of samples to each other, e.g. via DESeq2.
  3. It is not clear how you were ever able to produce the volcano plot that you link to in the original question, given that you don't even seem to get past step 1.
  4. If memory is an issue, perhaps try to run featureCounts on each BAM file individually. You can then combine the count matrices later on. (I have to admit that I've never run featureCounts from within R; I always use it directly on the command line, as detailed in the documentation.)
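Point 4 could look roughly like the sketch below on the command line. It assumes featureCounts (Subread) is on PATH and that every BAM is counted against the same GTF, so the genes come out in the same order in each output file. The function names count_each_bam and combine_counts are hypothetical helpers. The `-p` flag marks the data as paired-end; depending on the Subread version, `--countReadPairs` may also be needed to count fragments rather than reads.

```shell
# Count one BAM at a time so that a single problematic file is easy to spot.
# $1 is the GTF annotation; the remaining arguments are BAM files.
count_each_bam() {
    gtf=$1; shift
    for bam in "$@"; do
        featureCounts -p -a "$gtf" -g gene_id -T 3 \
            -o "${bam%.bam}.counts.txt" "$bam" \
            || echo "featureCounts failed on $bam" >&2
    done
}

# Merge per-sample outputs into one matrix printed on stdout.
# featureCounts writes a '#' comment line, then a header line, then rows of
# Geneid, Chr, Start, End, Strand, Length, with the count in column 7.
# Assumes identical gene order in every input file (same GTF for all runs).
combine_counts() {
    tmp=$(mktemp)
    grep -v '^#' "$1" | cut -f1 > "$tmp"          # gene IDs from the first file
    for f in "$@"; do
        next=$(mktemp)
        grep -v '^#' "$f" | cut -f7 | paste "$tmp" - > "$next"
        mv "$next" "$tmp"
    done
    cat "$tmp"
    rm -f "$tmp"
}

# Usage (paths are placeholders):
#   count_each_bam annotation.gtf *.bam
#   combine_counts *.counts.txt > count_matrix.tsv
```

The combined matrix still holds raw counts per gene per sample; the logFC and p-values from point 2 would have to come from a differential expression tool afterwards.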
written 26 days ago by Friederike