Output file conversion from featureCounts to DESeq2 - no replicates for samples
22 months ago
Angelina_G ▴ 10

Hello, I ran a bulk RNA-seq experiment and now have a gene count file produced by: featureCounts -s 0 -p -P -d 0 -D 1000 -B --primary -t exon -g gene_name -a gtf -T 6 -o output bam1 bam2 bam3 (the pipeline was hisat2, then samtools sort, then featureCounts, all on the Linux command line).

The three BAM files come from three cell lines, and I want to run a differential expression analysis to see which genes each cell line expresses at a higher level.

The problem is that I did not include any replicates, so I have only one sample per cell line. When I run dds1 <- DESeq(dds1), I get:

The design matrix has the same number of samples and coefficients to fit,
  so estimation of dispersion is not possible. Treating samples
  as replicates was deprecated in v1.20 and no longer supported since v1.22.

What should I do if I want to compare them and find out which genes are expressed at a higher level in one cell line than in another?

My data look like this (gene name, then the raw counts for each of the three cell lines):

DDX11L1     0         0            0           
WASH7P      217       209          116         
MIR6859-1   1         0            0           
MIR1302-2HG 0         0            2           
MIR1302-2   0         0            0           

Currently I import the data into dds by splitting the count matrix into 3 files, each containing two cell lines, so that each cell line is compared with only one other. Is there a better way?

Thank you!

FeatureCount differential DESeq2 R analysis • 855 views
22 months ago
ATpoint 83k

Without replicates, no DEG analysis is possible in DESeq2, simple as that. Please google "differential analysis without replicates"; it has been asked many times before. By the way, I am afraid to say that this is a poor experimental design, which is why you should read up on the analysis or talk to an analyst before conducting an experiment.
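
For a purely exploratory look at pilot data like this, one option is to rank genes by library-size-normalized log2 fold change, without any p-values or significance calls. The sketch below is only an illustration in plain Python, not a DESeq2 workflow and not a substitute for replicates. The sample names (lineA, lineB, lineC) and the pseudocount of 1 are assumptions made for the example; the counts are the five rows shown in the question, which are far too few for the CPM values to be meaningful. It also shows that all three pairwise comparisons can be made from one matrix held in memory, so splitting it into three files is not needed.

```python
import itertools
import math

# Toy counts copied from the question; the sample names are hypothetical.
samples = ["lineA", "lineB", "lineC"]
counts = {
    "DDX11L1": [0, 0, 0],
    "WASH7P": [217, 209, 116],
    "MIR6859-1": [1, 0, 0],
    "MIR1302-2HG": [0, 0, 2],
    "MIR1302-2": [0, 0, 0],
}


def counts_per_million(counts):
    """Scale each sample's counts by its library size (column total)."""
    n = len(next(iter(counts.values())))
    totals = [sum(row[i] for row in counts.values()) for i in range(n)]
    return {
        gene: [row[i] * 1e6 / totals[i] for i in range(n)]
        for gene, row in counts.items()
    }


def pairwise_log2fc(cpm, samples, pseudocount=1.0):
    """Compute log2((CPM_b + pseudocount) / (CPM_a + pseudocount)) per sample pair."""
    result = {}
    for a, b in itertools.combinations(range(len(samples)), 2):
        key = f"{samples[b]}_vs_{samples[a]}"
        result[key] = {
            gene: math.log2((v[b] + pseudocount) / (v[a] + pseudocount))
            for gene, v in cpm.items()
        }
    return result


cpm = counts_per_million(counts)
lfc = pairwise_log2fc(cpm, samples)

# Print every pairwise comparison, rounded for readability.
for comparison, genes in lfc.items():
    print(comparison, {gene: round(x, 2) for gene, x in genes.items()})
```

The resulting values can only be used to rank genes for follow-up. With one sample per cell line there is no way to estimate variability, so a large fold change may simply reflect noise, especially for genes with low counts.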


Thank you so much! It was a pilot test, so we had only one sample per cell line. I am new to bioinformatics and was copying others' pipelines for our own dataset; I probably searched Google with the wrong keywords and did not consider the replicate part. Thank you so much!

