enhanced volcano plot
3.8 years ago
javanokendo ▴ 60

I am trying to draw a volcano plot with the EnhancedVolcano package, but the plot looks strange and I do not know what is causing the problem. I sent my data to the sequencing facility and they gave me only the BAM files, so I do not know which version of the mouse reference genome or which annotation file was used. I generated the count matrix from the BAM files using featureCounts.

The figure with the details can be found here: volcanoplot.png.

RNA-Seq EnhancedVolcano R • 1.8k views

You can use samtools to find the command the facility used to generate the BAM file. Look at the @PG lines in the header: they may contain the genome or annotation file names, which could give you a clue about the reference used for alignment.

samtools view -H file.bam
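
A minimal sketch of that header check, in case the full header is long: the helper below (`pg_records` is a made-up name, not part of samtools) keeps only the @PG records and prints each tag on its own line, so the `CL:` field holding the aligner command line is easier to spot. It assumes the facility's pipeline actually wrote @PG lines, which not every tool does.

```shell
# Hypothetical helper: read SAM header text on stdin, keep only the @PG
# records, and print every tab-separated tag on its own line.
pg_records() {
  grep '^@PG' | tr '\t' '\n'
}

# Typical use (assumes samtools is installed and file.bam exists):
#   samtools view -H file.bam | pg_records
```

If the output contains a `CL:` line, the paths passed to the aligner (genome index, GTF) are usually the quickest hint about the reference build.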

You generated a count matrix, and then what? What did you use to compute the differential expression statistics behind the volcano plot? It is very hard to know what is going on with only this information. More details are needed to reproduce your error and understand the problem.


I have tried to extract the gene counts from the BAM files using featureCounts, but I keep getting this error: "address (nil), cause 'memory not mapped'". I get it both on the HPC cluster and on my local machine. I have a total of 20 BAM files and I used the following commands:

setwd("/home/jokendo/sabeloproject/")
library(Rsubread)

# List the BAM files; the dot is escaped so the pattern matches only a literal ".bam" suffix
bam.files = list.files(path="/home/jokendo/sabeloproject/", pattern = "\\.bam$", full.names = TRUE)

gtffile = "/home/jokendo/sabeloproject/annotations/Mus_musculus.GRCm38.100.chr.gtf"

fc <- featureCounts(files=bam.files,
                    annot.ext=gtffile,
                    isGTFAnnotationFile=TRUE,
                    isPairedEnd=TRUE,
                    GTF.attrType="gene_id",
                    nthreads=3)

Details of the error:
*** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
 1: featureCounts(files = bam.files, annot.ext = gtffile, isGTFAnnotationFile = TRUE,     isPairedEnd = TRUE, GTF.attrType = "gene_id")
An irrecoverable exception occurred. R is aborting now ...
/var/tmp/slurmd.spool/job715581/slurm_script: line 12:  2350 Segmentation fault      Rscript /scratch/oknjav001/sabeloproject/soft/anar.R

Are there other tools I can use to extract the count matrix from these BAM files?

  1. featureCounts will generate a table of read counts per gene per sample. That means 1 value per gene per sample.
  2. EnhancedVolcano typically expects exactly 1 logFC value and 1 p-value per gene in total, which is usually calculated by comparing the read counts of subsets of samples to each other, e.g. via DESeq2.
  3. It is not clear how you were ever able to produce the volcano plot that you link to in the original question, given that you don't seem to get past step 1.
  4. If memory is an issue, perhaps try running featureCounts on each BAM file individually. You can then combine the count matrices later on. (I have to admit that I have never run featureCounts from within R; I always use it directly on the command line, as detailed in the documentation.)
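
A rough command-line sketch of point 4, under some assumptions: the function names `count_each_bam` and `merge_counts` and the `*.counts.txt` naming are made up for illustration, and the merge relies on every per-sample table listing the genes in the same order, which should hold when the same GTF is used for every run. The featureCounts options shown (`-p`, `-T`, `-t`, `-g`, `-a`, `-o`) are standard ones, but check `featureCounts` help for your installed version before relying on them.

```shell
# Hypothetical helper: run featureCounts once per BAM file.
# Usage: count_each_bam annotation.gtf sample1.bam sample2.bam ...
# Assumes the Subread command-line featureCounts is on PATH.
count_each_bam() {
  gtf=$1
  shift
  for bam in "$@"; do
    # -p: paired-end input; -T: threads; -t/-g: feature type and grouping attribute.
    # Newer Subread releases also need --countReadPairs to count fragments
    # instead of individual reads.
    featureCounts -p -T 3 -t exon -g gene_id \
      -a "$gtf" -o "${bam%.bam}.counts.txt" "$bam"
  done
}

# Hypothetical helper: merge per-sample tables into one matrix on stdout,
# with Geneid in column 1 and one count column per input file.
# Assumes all tables list the same genes in the same order.
merge_counts() {
  tmp=$(mktemp)
  # Gene IDs come from the first table; skip the leading '#' comment line.
  grep -v '^#' "$1" | cut -f1 > "$tmp"
  for f in "$@"; do
    next=$(mktemp)
    # Column 7 holds the counts when featureCounts was run on a single BAM.
    grep -v '^#' "$f" | cut -f7 | paste "$tmp" - > "$next"
    mv "$next" "$tmp"
  done
  cat "$tmp"
  rm -f "$tmp"
}

# Typical use:
#   count_each_bam annotation.gtf *.bam
#   merge_counts *.counts.txt > counts_matrix.txt
```

The merged table is still only step 1: it has to go through a differential expression tool such as DESeq2 to get the one logFC and one p-value per gene that EnhancedVolcano plots.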