Question: Trying to use DESeq2: how do I set up the data files?
susannehoward wrote:

I have RNA-seq data from 9 different plant samples, each a different cultivar of one species. I have a table of miRNA read counts for each sample and would like to compare the counts (expression) of these miRNAs. Can I use DESeq2 for that?

If yes:

I am essentially a non-programmer. I got as far as installing the DESeq2 package in R, but the instructions on how to set up the DESeqDataSet from my table (exported from Excel as text) leave me completely confused, in part because I do not have timepoints or conditions. I am at a tiny university outpost, so there is nobody I can ask about this. Any help that tells me specifically what steps are needed for my specific table would be very, very greatly appreciated.

            nor   AMP1  shv   shc
vvi-miR156  206   256   209   215
vvi-miR159  100   100   100   100
vvi-miR160  0     2     100   105

Tags: rna-seq, R
Modified 22 months ago by arup • written 22 months ago by susannehoward

> in part because I do not have timepoints or conditions.

You want to compare different cultivars? Then those are your conditions.

Modified 22 months ago • written 22 months ago by WouterDeCoster

Yes, but I don't know how to set up the condition. The condition seems to expect some method, and since I do not know those, I was looking for examples, but the only ones I could find used timepoints, replicates, etc. None of them simply used the column headers.

Written 22 months ago by susannehoward
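
For what it's worth, here is a minimal sketch of how the column headers themselves could serve as the condition. It uses the toy numbers from the question; the object names `counts`, `coldata` and `cultivar` are only illustrative choices, not anything DESeq2 requires.

```r
# Toy count table from the question: rows are miRNAs, columns are samples.
counts <- matrix(
  c(206, 256, 209, 215,
    100, 100, 100, 100,
      0,   2, 100, 105),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("vvi-miR156", "vvi-miR159", "vvi-miR160"),
                  c("nor", "AMP1", "shv", "shc"))
)

# Sample table: one row per sample, row names equal to the column headers.
# The "condition" is simply the cultivar, taken from those same headers.
# (The column name "cultivar" is an assumption made for this example.)
coldata <- data.frame(
  cultivar  = factor(colnames(counts), levels = colnames(counts)),
  row.names = colnames(counts)
)

print(coldata)
```

The point is only that the sample table needs one row per count column, in the same order, and that a condition can be any label that distinguishes the samples.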
igor wrote:

If you are a non-programmer, it might be easier to just use something like https://gallery.shinyapps.io/DEApp/ . It uses DESeq2 on the backend and has a nice tutorial.

Modified 22 months ago • written 22 months ago by igor

Thanks, I will look at that and see if it fits my non-standard analysis!

Written 22 months ago by susannehoward

Thanks again for that link. I was able to run the comparison on my data, as long as I have more than 2 data columns. With only 2 data columns (column A holding the row names, columns B and C the data), I get an error stating that the two files (data and metadata) do not match. I have no idea why.

Written 22 months ago by susannehoward

Check to make sure the sample names are exactly alike between the two files. I'd use nothing but letters, numbers, and underscores.

Written 22 months ago by swbarnes2
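
A small sketch of how such a mismatch could be spotted in R, assuming the sample names from both files have already been read into character vectors. The vectors `count_names` and `meta_names` are made up for illustration, with a trailing space planted in one name.

```r
# Sample names as they might come from the count file and the metadata file.
# "AMP1 " carries a trailing space, which is easy to miss in a spreadsheet.
count_names <- c("nor", "AMP1", "shv", "shc")
meta_names  <- c("nor", "AMP1 ", "shv", "shc")

# Names present in the count table but not found in the metadata.
missing_in_meta <- setdiff(count_names, meta_names)
print(missing_in_meta)

# Strip leading/trailing whitespace and compare again.
meta_fixed  <- trimws(meta_names)
names_match <- identical(count_names, meta_fixed)
print(names_match)

# Flag any name that uses something other than letters, numbers, underscores.
only_safe_chars <- grepl("^[A-Za-z0-9_]+$", meta_fixed)
print(only_safe_chars)
```

Invisible whitespace is only one possible cause. Differences in capitalisation or in the order of the samples would show up in the same kind of comparison.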
swbarnes2 wrote:

It will be easier for you to do this in Excel.

Make an Excel sheet with your counts, every gene a row, every sample a column. Do put the gene names as the first column, and the sample names as the header row. That's the first file DESeq2 wants as input. The second is a file with the first column being sample names, and the other columns being factors, like "treatment" or "time point" or "date of sequencing" or "species".

Modified 22 months ago • written 22 months ago by swbarnes2

susannehoward: After you follow these directions, be sure to save the files in a delimited format (comma, tab, etc.) for easy import into R/DESeq2.

Modified 22 months ago • written 22 months ago by genomax
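
To make the two-file setup concrete, here is a hedged sketch of importing both tab-delimited files and handing them to DESeq2. The file contents are the toy numbers from the question, written to temporary files so the example is self-contained. The column labels `miRNA`, `sample` and `cultivar` are assumptions for illustration. The DESeq2 step only runs if the package is installed.

```r
# Stand-ins for the two files exported from Excel (tab-delimited text).
counts_file <- tempfile(fileext = ".txt")
meta_file   <- tempfile(fileext = ".txt")
writeLines(c("miRNA\tnor\tAMP1\tshv\tshc",
             "vvi-miR156\t206\t256\t209\t215",
             "vvi-miR159\t100\t100\t100\t100",
             "vvi-miR160\t0\t2\t100\t105"), counts_file)
writeLines(c("sample\tcultivar",
             "nor\tnor", "AMP1\tAMP1", "shv\tshv", "shc\tshc"), meta_file)

# row.names = 1 turns the first column into row names instead of data.
counts  <- as.matrix(read.delim(counts_file, row.names = 1,
                                check.names = FALSE))
coldata <- read.delim(meta_file, row.names = 1, stringsAsFactors = TRUE)

# Sample names and their order must agree between the two tables.
stopifnot(identical(colnames(counts), rownames(coldata)))

# Build the DESeqDataSet only if DESeq2 is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = ~ cultivar)
  message("Built a DESeqDataSet with ", ncol(counts), " samples")
}
# DESeq2::DESeq(dds) is left out on purpose: with one sample per cultivar
# there are no replicates, and as far as I know recent DESeq2 versions
# stop with an error when asked to estimate dispersions for such a design.
```

With real data, the two `writeLines()` calls would go away and `counts_file` and `meta_file` would point at the exported files.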

I do have the Excel sheet exported and opened in R, and I can see where the row names are assigned. It is from that point on that I get lost. I do not have se or column metadata. I tried this (for 2 of the samples): conditions=factor(c("norton","AMP1")) but I can't tell if that is OK, because I don't know what comes next. No error though :)

Written 22 months ago by susannehoward

No one can evaluate that line of code by itself. (Though if I had to guess, I'd say that if you have 12 samples, your conditions vector needs 12 elements, not 2.) You do not have to construct the conditions file in R; you can make it in Excel and import it. I strongly recommend doing that, since you do not seem to understand the lines of code you are copying.

Written 22 months ago by swbarnes2
arup wrote:

There are several ways to design a DEG analysis using DESeq2. You can use a matrix with feature (gene) names in the first column and read counts from the different conditions in the subsequent columns. In the conditions, group the columns as treatment and control. Or you can specify the count files separately and use the same strategy for the design. The following article has a detailed procedure for the analysis.

https://dwheelerau.com/2014/02/17/how-to-use-deseq2-to-analyse-rnaseq-data/

If you don't want all the hassle, you can use the Galaxy web server or SeqMonk.

PS: Convert the Excel files to .csv or .tsv format.

Modified 22 months ago • written 22 months ago by arup

Hi

I have been trying to import my data set in .txt, .csv, and Excel formats. Every time I run the coldata command, I get the same error.

> (coldata <- data.frame(row.names=colnames(analysis1), condition))
> Error in data.frame(row.names = colnames(analysis1), condition) :   
> row names supplied are of the wrong length

What am I missing here? How can I fix this? I am very new to R and RNAseq analysis and any help would be appreciated.

Written 8 months ago by deshpande.neha
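
A likely reading of this error, offered as a sketch and not a diagnosis of the actual data: `data.frame()` complains when the number of row names (here the column names of `analysis1`) differs from the length of `condition`. The small `analysis1` and `condition` objects below are invented to reproduce that situation.

```r
# Invented stand-in for the count table: 3 genes, 4 samples.
analysis1 <- matrix(1:12, nrow = 3,
                    dimnames = list(paste0("gene", 1:3),
                                    c("s1", "s2", "s3", "s4")))

# Only 2 labels for 4 samples: the lengths do not agree.
condition <- factor(c("control", "treated"))

# Capture the error message instead of stopping the script.
msg <- tryCatch({
  data.frame(row.names = colnames(analysis1), condition)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Compare the two lengths directly; they must be equal.
print(c(samples = ncol(analysis1), labels = length(condition)))

# One label per sample column resolves it.
condition <- factor(c("control", "control", "treated", "treated"))
stopifnot(length(condition) == ncol(analysis1))
coldata <- data.frame(row.names = colnames(analysis1), condition)
print(coldata)
```

Another possible cause is that the gene-name column was read in as a data column, so `colnames(analysis1)` has one more entry than there are samples. Reading the file with the first column as row names avoids that.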