Ballgown DE analysis
2.2 years ago
iamsmor • 0

Hi everyone

I want to do differential expression analysis on my SRA data. I trimmed the reads, aligned them with HISAT2 using a prebuilt index, and then started the expression analysis. I am following this tutorial.

But I am stuck on the first step, where the tutorial creates a file for the ballgown analysis in R. They use the code below for 6 replicates in total (3 tumor vs 3 normal), with one expression file per replicate:

printf "\"ids\",\"type\",\"path\"\n\"UHR_Rep1\",\"UHR\",\"$RNA_HOME/expression/stringtie/ref_only/UHR_Rep1\"\n\"UHR_Rep2\",\"UHR\",\"$RNA_HOME/expression/stringtie/ref_only/UHR_Rep2\"\n\"UHR_Rep3\",\"UHR\",\"$RNA_HOME/expression/stringtie/ref_only/UHR_Rep3\"\n\"HBR_Rep1\",\"HBR\",\"$RNA_HOME/expression/stringtie/ref_only/HBR_Rep1\"\n\"HBR_Rep2\",\"HBR\",\"$RNA_HOME/expression/stringtie/ref_only/HBR_Rep2\"\n\"HBR_Rep3\",\"HBR\",\"$RNA_HOME/expression/stringtie/ref_only/HBR_Rep3\"\n" > UHR_vs_HBR.csv

I tried to create my own file for 6 autistic SRR samples vs 6 normal SRR samples using the code below:

printf "\"ids\",\"type\",\"path\"\n\"SRR309133\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309133\"\n\"SRR309134\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309134\"\n\"SRR309135\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309135\"\n\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309135\"\n\"SRR309136\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309136\"\n\"SRR309137\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309137"\,\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309137\"\n\"SRR309138\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309138\"\n
    \"SRR309139\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309139\"\n\"SRR309140\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309140\"\n\"SRR309141",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309141\"\n"\ "NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309141\"\n\"SRR309142\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309142\"\n\"SRR309143\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309143"\,\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309143\"\n\"SRR309144\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309144\"\n > AUTISM_vs_NORMAL.csv

but I get the error "No such file or directory". I don't understand the main idea of how to create this file for my own data. Thank you for any help.

R ballgown Differential-expression • 1.0k views

Do yourself a favor and use DESeq2 (http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html) rather than ballgown, which was never developed for gene-level differential analysis.


You seem to have followed the tutorial as such.

$ printf "\"ids\",\"type\",\"path\"\n\"UHR_Rep1\",\"UHR\",\"$RNA_HOME/expression/stringtie/ref_only/UHR_Rep1\"\n\"UHR_Rep2\",\"UHR\",\"$RNA_HOME/expression/stringtie/ref_only/UHR_Rep2\"\n\"UHR_Rep3\",\"UHR\",\"$RNA_HOME/expression/stringtie/ref_only/UHR_Rep3\"\n\"HBR_Rep1\",\"HBR\",\"$RNA_HOME/expression/stringtie/ref_only/HBR_Rep1\"\n\"HBR_Rep2\",\"HBR\",\"$RNA_HOME/expression/stringtie/ref_only/HBR_Rep2\"\n\"HBR_Rep3\",\"HBR\",\"$RNA_HOME/expression/stringtie/ref_only/HBR_Rep3\"\n"                

would print this:

"ids","type","path"
"UHR_Rep1","UHR","/expression/stringtie/ref_only/UHR_Rep1"
"UHR_Rep2","UHR","/expression/stringtie/ref_only/UHR_Rep2"
"UHR_Rep3","UHR","/expression/stringtie/ref_only/UHR_Rep3"
"HBR_Rep1","HBR","/expression/stringtie/ref_only/HBR_Rep1"
"HBR_Rep2","HBR","/expression/stringtie/ref_only/HBR_Rep2"
"HBR_Rep3","HBR","/expression/stringtie/ref_only/HBR_Rep3"

This is a CSV file with every field in quotes. You could have created it with any text editor, LibreOffice Calc, or Excel instead of editing the original CLI text. Your code has not escaped the quotes for sample SRR309141, which is why it failed to print. Once you fix that, the command prints the output below, but it is still incorrect: some rows are partial duplicates and several backslashes are misplaced. The corrected command follows the output:

$ printf "\"ids\",\"type\",\"path\"\n\"SRR309133\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309133\"\n\"SRR309134\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309134\"\n\"SRR309135\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309135\"\n\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309135\"\n\"SRR309136\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309136\"\n\"SRR309137\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309137"\,\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309137\"\n\"SRR309138\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309138\"\n\"SRR309139\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309139\"\n\"SRR309140\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309140\"\n\"SRR309141\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309141\"\n"\ "NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309141\"\n\"SRR309142\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309142\"\n\"SRR309143\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309143"\,\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309143\"\n\"SRR309144\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309144\"\n"

"ids","type","path"
"SRR309133","AUTISM","/expression/stringtie/ref_only/SRR309133"
"SRR309134","AUTISM","/expression/stringtie/ref_only/SRR309134"
"SRR309135","AUTISM","/expression/stringtie/ref_only/SRR309135"
"AUTISM","/expression/stringtie/ref_only/SRR309135"
"SRR309136","AUTISM","/expression/stringtie/ref_only/SRR309136"
"SRR309137","AUTISM","/expression/stringtie/ref_only/SRR309137,"AUTISM","/expression/stringtie/ref_only/SRR309137"n"SRR309138","AUTISM","/expression/stringtie/ref_only/SRR309138"n"SRR309139","NORMAL","/expression/stringtie/ref_only/SRR309139"n"SRR309140","NORMAL","/expression/stringtie/ref_only/SRR309140"n"SRR309141","NORMAL","/expression/stringtie/ref_only/SRR309141"n\ NORMAL","/expression/stringtie/ref_only/SRR309141"n"SRR309142","NORMAL","/expression/stringtie/ref_only/SRR309142"n"SRR309143","NORMAL","/expression/stringtie/ref_only/SRR309143\,"NORMAL","/expression/stringtie/ref_only/SRR309143"
"SRR309144","NORMAL","/expression/stringtie/ref_only/SRR309144"

Your CLI text should be something like this:

$ printf "\"ids\",\"type\",\"path\"\n\"SRR309133\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309133\"\n\"SRR309134\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309134\"\n\"SRR309135\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309135\"\n\"SRR309136\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309136\"\n\"SRR309137\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309137\"\n\"SRR309138\",\"AUTISM\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309138\"\n\"SRR309139\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309139\"\n\"SRR309140\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309140\"\n\"SRR309141\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309141\"\n\"SRR309142\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309142\"\n\"SRR309143\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309143\"\n\"SRR309144\",\"NORMAL\",\"$RNA_HOME/expression/stringtie/ref_only/SRR309144\"\n"

The output would be:

"ids","type","path"
"SRR309133","AUTISM","/expression/stringtie/ref_only/SRR309133"
"SRR309134","AUTISM","/expression/stringtie/ref_only/SRR309134"
"SRR309135","AUTISM","/expression/stringtie/ref_only/SRR309135"
"SRR309136","AUTISM","/expression/stringtie/ref_only/SRR309136"
"SRR309137","AUTISM","/expression/stringtie/ref_only/SRR309137"
"SRR309138","AUTISM","/expression/stringtie/ref_only/SRR309138"
"SRR309139","NORMAL","/expression/stringtie/ref_only/SRR309139"
"SRR309140","NORMAL","/expression/stringtie/ref_only/SRR309140"
"SRR309141","NORMAL","/expression/stringtie/ref_only/SRR309141"
"SRR309142","NORMAL","/expression/stringtie/ref_only/SRR309142"
"SRR309143","NORMAL","/expression/stringtie/ref_only/SRR309143"
"SRR309144","NORMAL","/expression/stringtie/ref_only/SRR309144"

Since you are reading this into R, you do not even need to create the file. Try the following in R:

> samples=paste0("SRR",seq(309133,309144,1))
> path="/home/user/data/"
> headers=c("ids","type","path")
> df=data.frame(samples, rep(c("Autism","Normal"),each=6), paste0(path,samples))
> names(df)=headers
> df
         ids   type                      path
1  SRR309133 Autism /home/user/data/SRR309133
2  SRR309134 Autism /home/user/data/SRR309134
3  SRR309135 Autism /home/user/data/SRR309135
4  SRR309136 Autism /home/user/data/SRR309136
5  SRR309137 Autism /home/user/data/SRR309137
6  SRR309138 Autism /home/user/data/SRR309138
7  SRR309139 Normal /home/user/data/SRR309139
8  SRR309140 Normal /home/user/data/SRR309140
9  SRR309141 Normal /home/user/data/SRR309141
10 SRR309142 Normal /home/user/data/SRR309142
11 SRR309143 Normal /home/user/data/SRR309143
12 SRR309144 Normal /home/user/data/SRR309144

Replace path with the location of your local data.
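
If you would rather have the CSV on disk, as the tutorial expects, a loop is less error-prone than one long escaped printf string. Below is a rough sketch, not the tutorial's method. It assumes the directory layout from the tutorial and that SRR309133 to SRR309138 are the autism samples and SRR309139 to SRR309144 the normal ones. The /path/to/rnaseq fallback for RNA_HOME is only a placeholder. It also warns about sample directories that do not exist, which may help track down a "No such file or directory" error.

```shell
# Sketch: build the phenotype CSV row by row and flag missing directories.
# RNA_HOME falls back to a placeholder; point it at your own project directory.
RNA_HOME="${RNA_HOME:-/path/to/rnaseq}"
out="AUTISM_vs_NORMAL.csv"

# Header row (inside single quotes the double quotes need no escaping)
printf '"ids","type","path"\n' > "$out"

# Assumed grouping: SRR309133-SRR309138 autism, SRR309139-SRR309144 normal
for n in $(seq 309133 309144); do
    if [ "$n" -le 309138 ]; then group="AUTISM"; else group="NORMAL"; fi
    dir="$RNA_HOME/expression/stringtie/ref_only/SRR$n"
    printf '"SRR%s","%s","%s"\n' "$n" "$group" "$dir" >> "$out"
    # Warn on stderr if the directory for this sample is not there
    [ -d "$dir" ] || echo "warning: $dir does not exist" >&2
done

cat "$out"
```

Run it in the directory where you want AUTISM_vs_NORMAL.csv to be written. The warnings go to stderr, so the file itself stays clean.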


Thank you
