I will be processing a large amount of RNA-seq data in the coming months, but we do not have a server yet (it will arrive next March).
In the meantime, I have a small PC with 16 GB of RAM. I would like to test my scripts (cutadapt, STAR, STAR-Fusion, HTSeq, DESeq2, etc.). In other words, I want to work out beforehand how to install the programs, how to use them, and so on.
For this purpose, I was thinking that with a very small RNA-seq dataset I could test my scripts faster and without running into memory limitations, just to get used to all the programs.
Basically, what I would need is:
GTF file for splice junctions;
maybe adaptor sequences?
Is there any online dataset for learning purposes like this?
Please post this as an answer! By my calculations, it should be feasible to do everything I need with S. cerevisiae RNA-seq datasets. The genome is small enough that I can build indexes and align quickly, and the organism is so well studied that there are many accessible datasets and references. I will try this and write a more detailed answer afterwards!
Thank you for your suggestion
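One thing to watch when building a STAR index for a genome this small: the STAR manual recommends scaling `--genomeSAindexNbases` down to `min(14, log2(GenomeLength)/2 - 1)`. Below is a minimal sketch of that calculation. The function name is made up for illustration, and the genome length of about 12.1 Mb for S. cerevisiae is an approximation, so use the actual length of your FASTA file.

```python
import math


def star_sa_index_nbases(genome_length: int) -> int:
    """Suggested --genomeSAindexNbases, following the STAR manual's
    rule of thumb: min(14, log2(GenomeLength)/2 - 1), rounded down."""
    if genome_length <= 0:
        raise ValueError("genome_length must be a positive number of bases")
    return min(14, int(math.log2(genome_length) / 2 - 1))


# Assumption: approximate S. cerevisiae genome size in bases.
# Replace with the total sequence length of your own reference FASTA.
YEAST_GENOME_BP = 12_100_000

print(star_sa_index_nbases(YEAST_GENOME_BP))
```

The result would then be passed to STAR via `--genomeSAindexNbases` when running `--runMode genomeGenerate`. For a human-sized genome the formula gives the default of 14, so the adjustment only matters for small references like this one.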
Done. Please accept it (green check mark on the left) if it worked for you.