Question

trying to use DESeq2, how to set up the data files

7.1 years ago

susannehoward ▴ 90

I have RNA-seq data from 9 different plant samples (different cultivars of one species). I have a table of miRNA read counts for each sample and would like to compare the counts (expression) of these miRNAs. Can I use DESeq2 for that?

If yes:

I am essentially a non-programmer. I got as far as installing the DESeq2 package in R, but the instructions on how to set up the DESeqDataSet from my table (exported from Excel as .txt) leave me completely confused, in part because I do not have timepoints or conditions. I am at a tiny university outpost, so there is nobody I can ask about this. Any help that tells me specifically what steps are needed for my specific table would be very, very greatly appreciated.

            nor   AMP1  shv   shc
vvi-miR156  206   256   209   215
vvi-miR159  100   100   100   100
vvi-miR160    0     2   100   105

R RNA-Seq • 10k views
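
For readers following along, here is a minimal sketch of how a table like the one in the question could be read into R as a count matrix. It assumes a tab-delimited export whose header row has a label (here the made-up name `miRNA`) above the first column; the inline text is only a stand-in for the real file.

```r
# Hypothetical stand-in for the exported tab-delimited file.
# With a real file, pass its path as the first argument instead of text = ...
counts_txt <- "miRNA\tnor\tAMP1\tshv\tshc
vvi-miR156\t206\t256\t209\t215
vvi-miR159\t100\t100\t100\t100
vvi-miR160\t0\t2\t100\t105"

# row.names = 1 turns the first column (miRNA names) into row names,
# so only the sample columns remain as data.
# check.names = FALSE keeps the sample names exactly as written in the header.
counts <- read.delim(text = counts_txt, row.names = 1, check.names = FALSE)

# Convert the data frame to an integer matrix of raw counts.
counts <- as.matrix(counts)
storage.mode(counts) <- "integer"

dim(counts)        # number of miRNAs, number of samples
colnames(counts)   # sample names, later matched against the metadata
```

The column names of this matrix are what the sample metadata has to match.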

> in part because I do not have timepoints or conditions.

You want to compare different cultivars? Then those are your conditions.

Posted 7.1 years ago by WouterDeCoster 47k

Yes, but I don't know how to set up the condition. CONDITIONS expects some method, and since I do not know those, I was looking for examples, but the only ones I could find used either timepoints or repetitions, etc. There was nothing that simply used the column headers?

Posted 7.1 years ago by susannehoward ▴ 90
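
A minimal sketch of what "the cultivars are your conditions" could look like in R, assuming the sample names are the column headers of the count table (the four names below come from the question; the column name `cultivar` is made up for illustration). Note that with one sample per cultivar there are no replicates, and recent DESeq2 releases, to my knowledge, stop with an error when asked to estimate dispersion for such a design.

```r
# Sample names taken from the column headers in the question;
# in practice this would be colnames() of the count matrix.
samples <- c("nor", "AMP1", "shv", "shc")

# One row per sample. The only factor is the cultivar itself,
# so each sample gets its own level.
coldata <- data.frame(
  cultivar  = factor(samples, levels = samples),
  row.names = samples
)

coldata

# As many levels as samples means there are no replicates per cultivar.
nlevels(coldata$cultivar) == nrow(coldata)
```

If some of the 9 samples are in fact replicates of the same cultivar, giving them the same `cultivar` label would be the usual way to express that.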

score 1 · Answer 1 · 2018-01-29

7.1 years ago

igor 13k

If you are a non-programmer, it might be easier to just use something like https://gallery.shinyapps.io/DEApp/ . It uses DESeq2 on the backend and has a nice tutorial.

Thanks, I will look at that and see if it fits my non-standard analysis!

Posted 7.1 years ago by susannehoward ▴ 90

Thanks again for that link. I was able to run the comparison on my data as long as I have more than 2 data columns. With only 2 data columns (column A holding the row names, columns B and C the data), I get an error stating that the two files (data and metadata) do not match, and I have no idea why.

Posted 7.1 years ago by susannehoward ▴ 90

Check to make sure the sample names are exactly alike between the two files. I'd use nothing but letters, numbers, and underscores.

Posted 7.1 years ago by swbarnes2 14k
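
The name check suggested above can be sketched in a few lines of R. The two name vectors are hypothetical (one metadata name carries a trailing space, a common leftover from spreadsheets), and `clean()` is a made-up helper, not part of any package.

```r
# Hypothetical sample names as read from the counts file and the metadata file.
count_samples <- c("nor", "AMP1", "shv", "shc")
meta_samples  <- c("nor", "AMP1 ", "shv", "shc")   # note the trailing space

# Exact comparison: any difference in spelling, spacing or order gives FALSE.
identical(count_samples, meta_samples)

# Names that appear in the metadata but not in the counts.
setdiff(meta_samples, count_samples)

# Made-up helper: trim surrounding whitespace, then replace every character
# that is not a letter, digit or underscore with an underscore.
clean <- function(x) gsub("[^A-Za-z0-9_]", "_", trimws(x))

identical(clean(count_samples), clean(meta_samples))
```

Applying the same clean-up to both files keeps the two sets of names in step.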

score 0 · Answer 2 · 2018-01-29

7.1 years ago

swbarnes2 14k

It will be easier for you to do this in Excel.

Make an Excel sheet with your counts, every gene a row, every sample a column. Put the gene names in the first column and the sample names in the header row. That's the first file DESeq wants as input. The second is a file with the first column being sample names, and the other columns being factors, like "treatment" or "time point" or "date of sequencing" or "species".

susannehoward: After you follow these directions, be sure to save the files in some delimited (comma, tab, etc.) format for easy import into R/DESeq2.

Posted 7.1 years ago by GenoMax 149k
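
As a rough sketch of the two-file setup described in this answer, the following reads a counts table and a sample table from CSV and builds the DESeq2 object with `DESeqDataSetFromMatrix()`. Everything about the data is an assumption made for illustration: the replicate sample names (`nor_1`, `nor_2`, ...), the column name `cultivar`, and all count values are invented, and the inline text stands in for the real exported files.

```r
library(DESeq2)

# Hypothetical stand-ins for the two exported CSV files.
# With real files, pass the file path instead of text = ...
counts_csv <- "miRNA,nor_1,nor_2,AMP1_1,AMP1_2
vvi-miR156,206,215,256,249
vvi-miR159,100,98,100,104
vvi-miR160,0,1,2,3"

meta_csv <- "sample,cultivar
nor_1,nor
nor_2,nor
AMP1_1,AMP1
AMP1_2,AMP1"

# First column of each file becomes the row names.
counts  <- as.matrix(read.csv(text = counts_csv, row.names = 1,
                              check.names = FALSE))
coldata <- read.csv(text = meta_csv, row.names = 1, stringsAsFactors = TRUE)

# Sample names and their order must agree between the two tables.
stopifnot(identical(colnames(counts), rownames(coldata)))

# Build the dataset; the design formula names a column of coldata.
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ cultivar)
dds
```

On a full-size table the usual next steps would be `dds <- DESeq(dds)` followed by `results(dds)`; the three-row toy table here is likely too small for that step to run meaningfully.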

I do have the Excel sheet exported and opened in R, and I can see where the row names are assigned. It is from that point on that I get lost. I do not have se or column metadata. I tried this (for 2 of the samples): conditions=factor(c("norton","AMP1")) but I can't tell if that is OK, because I don't know what comes next. No error though :)

Posted 7.1 years ago by susannehoward ▴ 90

No one can evaluate that line of code by itself. (Though if I had to guess, I'd say that if you have 12 samples, your conditions vector needs 12 elements, not 2.) You do not have to construct the conditions file in R; you can make it in Excel and import it. I strongly recommend doing that, since you do not seem to understand the lines of code you are copying.

Posted 7.1 years ago by swbarnes2 14k

score 0 · Answer 3 · 2018-02-01

7.1 years ago

Arup Ghosh 3.3k

There are several ways to design a DEG analysis using DESeq2. You can use a matrix with feature (gene) names in the first column and subsequent columns with read counts from different conditions. In the conditions, group the columns as treatment and control. Or you can specify the count files separately and use the same strategy for the design. The following article has a detailed procedure for the analysis.

https://dwheelerau.com/2014/02/17/how-to-use-deseq2-to-analyse-rnaseq-data/

If you don't want all the hassle, you can use the Galaxy web server or SeqMonk.

PS: Convert the Excel files to .csv or .tsv format.

Hi

I have been trying to import my data set in .txt, .csv and Excel format. Every time I issue the coldata command, I get the same error.

> (coldata <- data.frame(row.names=colnames(analysis1), condition))
> Error in data.frame(row.names = colnames(analysis1), condition) :   
> row names supplied are of the wrong length

What am I missing here? How can I fix this? I am very new to R and RNAseq analysis and any help would be appreciated.

Posted 5.9 years ago by deshpande.neha2 • 0
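
A likely cause of this error, assuming `condition` was typed by hand, is that it does not have exactly one entry per column of `analysis1`. Another possibility is that the gene-name column was not turned into row names on import, which leaves one extra column in `colnames(analysis1)`. The sketch below reproduces the first case on a made-up table; the sample names and labels are hypothetical.

```r
# Hypothetical count table with four sample columns.
analysis1 <- matrix(1:12, nrow = 3,
                    dimnames = list(paste0("gene", 1:3),
                                    c("s1", "s2", "s3", "s4")))

# Too short: 2 labels for 4 samples, so data.frame() stops with an error.
condition_short <- factor(c("control", "treated"))
res <- try(data.frame(row.names = colnames(analysis1), condition_short),
           silent = TRUE)
inherits(res, "try-error")

# One label per column, in the same order as the columns, works.
condition <- factor(c("control", "control", "treated", "treated"))
stopifnot(length(condition) == ncol(analysis1))
coldata <- data.frame(row.names = colnames(analysis1), condition)
coldata
```

Comparing `length(condition)` with `ncol(analysis1)` before building `coldata` is a quick way to see which of the two is off.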