Question: DESeq2 differential result
krushnach80 wrote:

I'm running the DESeq2 pipeline and my final result file contains about 18,000 genes. How do I extract the differentially expressed genes from this list? Do I sort it by p-value, or is there a way to do this inside DESeq2 before getting the final list of genes?

Any help or suggestions would be appreciated.

rna-seq R
written 13 months ago by krushnach80

This sounds like a basic R question on how to subset a dataframe to only contain the significant genes before saving the result to a file, am I right?

written 13 months ago by WouterDeCoster

No, not exactly. I was wondering whether I can get it done while running the pipeline.

written 13 months ago by krushnach80

And that pipeline is in R, right?

written 13 months ago by WouterDeCoster

Yes. I was wondering whether I can filter the results inside the pipeline, instead of getting the final values and then setting a threshold to filter them myself.

written 13 months ago by krushnach80

VHahaut wrote:

Usually people look at the adjusted p-value (padj) in the results and choose a threshold to decide which genes are differentially expressed.

written 13 months ago by VHahaut

Okay, so I have to decide which threshold I am going to use. Is that what you mean?

written 13 months ago by krushnach80

Yes. People commonly use padj < 0.05 or padj < 0.1. Look in the literature and decide which one is best suited to your experiment.
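As a minimal sketch of applying such a threshold in R: the toy data frame below stands in for a DESeq2 results table. The column names padj and log2FoldChange match DESeq2's output, but the gene names, the values, and the alpha variable are made up for illustration. DESeq2 can report NA for padj, so the sketch uses which() to drop those rows.

```r
# Toy stand-in for a DESeq2 results table (values are invented).
res <- data.frame(
  log2FoldChange = c(2.1, -0.3, 1.7, -2.4, 0.1),
  padj           = c(0.001, 0.60, 0.04, NA, 0.08),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

alpha <- 0.05  # assumed cutoff; 0.1 is also common

# which() keeps only rows where the test is TRUE, so rows with
# padj == NA are dropped instead of coming back as all-NA rows.
resSig <- res[which(res$padj < alpha), ]
print(rownames(resSig))
```

In a real analysis, res would come from results(dds). Passing alpha = 0.05 to results() tunes DESeq2's independent filtering for that cutoff, but the returned table still lists every gene, so a subsetting step like the one above is still needed.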

written 13 months ago by VHahaut

Sreeraj Thamban wrote:

Hi, I would suggest sorting the results by adjusted p-value before extracting them.

resOrdered <- res[order(res$padj),]

where res is your results object. Thank you.

written 13 months ago by Sreeraj Thamban

Yes, I know, but as I said, I have about 18,000 genes in my list and most of them won't be of much use. So how do I filter the results? Is it based on some cutoff, or on other parameters?

written 13 months ago by krushnach80

You can extract the results and then sort them by log2 fold change (log2FC). I normally use 0.4 and above (a fold change of about 1.3) for upregulated genes and -0.4 and below for downregulated genes. You can do this simply in Excel. Thanks.
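The same split can be sketched in R rather than Excel. This is only an illustration: the data frame is invented, the names lfc_cut and padj_cut are hypothetical, and combining the log2FC cutoff with padj < 0.05 is an assumption added here, not something stated above.

```r
# Toy stand-in for a DESeq2 results table (values are invented).
res <- data.frame(
  log2FoldChange = c(1.2, -0.9, 0.2, 0.5, -0.45),
  padj           = c(0.01, 0.03, 0.02, 0.20, NA),
  row.names      = c("g1", "g2", "g3", "g4", "g5")
)

lfc_cut  <- 0.4   # log2 scale; 2^0.4 is roughly a 1.3-fold change
padj_cut <- 0.05  # assumed significance cutoff

# Treat NA padj as not significant.
sig <- !is.na(res$padj) & res$padj < padj_cut

up   <- res[sig & res$log2FoldChange >=  lfc_cut, ]
down <- res[sig & res$log2FoldChange <= -lfc_cut, ]

# Sort each table so the strongest changes come first.
up   <- up[order(-up$log2FoldChange), ]
down <- down[order(down$log2FoldChange), ]

print(rownames(up))
print(rownames(down))
```

Keeping the cutoffs in named variables makes it easy to rerun the filter with different thresholds.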

written 13 months ago by Sreeraj Thamban

I agree with Sreeraj, but stay away from Excel. It is much better to write a script, which can be reused quickly.

Pseudocode:

  • foreach file --> open the file

  • while open --> iterate through each line

  • split line by tab

  • if header --> next

  • else if ((FC > 1.5) OR (FC < -1.5)) AND (PADJ < 0.05) --> write that line to a new file

Now do a line count of the output file. Then check that it matches the count you get in Excel (but don't save anything in Excel; just use its handy filter and sort options as a check).
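A rough R version of the pseudocode might look like the following. It makes several assumptions: the FC column is taken to be DESeq2's log2FoldChange (a negative cutoff only makes sense on the log scale), the input is a tab-separated file with a header, and filter_file is a hypothetical helper name. Instead of looping over lines by hand, the sketch reads the whole table with read.delim, which is the more usual approach in R.

```r
# Write a small tab-separated example file (invented values).
in_file <- tempfile(fileext = ".tsv")
toy <- data.frame(
  gene           = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.0, -1.8, 0.3, 1.9),
  padj           = c(0.01, 0.002, 0.001, 0.30)
)
write.table(toy, in_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Hypothetical helper: keep rows passing both cutoffs, write them
# to out_path, and return how many rows were kept.
filter_file <- function(in_path, out_path, fc_cut = 1.5, padj_cut = 0.05) {
  tab  <- read.delim(in_path, stringsAsFactors = FALSE)
  # which() also drops rows where padj or the fold change is NA.
  keep <- which(abs(tab$log2FoldChange) > fc_cut & tab$padj < padj_cut)
  write.table(tab[keep, ], out_path, sep = "\t",
              quote = FALSE, row.names = FALSE)
  length(keep)
}

out_file <- tempfile(fileext = ".tsv")
n_kept   <- filter_file(in_file, out_file)

# Line count of the output file, minus the header line.
n_lines <- length(readLines(out_file)) - 1
print(c(kept = n_kept, lines = n_lines))
```

For several result files, the same function could be applied to each path in turn, for example with lapply.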

written 13 months ago by YaGalbi