Hello, relatively inexperienced bioinformatician here, tasked with setting up an RNA-seq pipeline (tailored towards differential gene expression and pathway enrichment analysis for perturbed vs. unperturbed cancer samples). I am very new to cloud computing and have been looking at both Google Cloud and AWS. There are a lot of resources and it is a bit overwhelming, so I was wondering if anybody had some insight into the most efficient way to do this. My main issue is that, since the alignment step is usually run on the command line, I can't really put the entire pipeline in one script that I could simply run as a Jupyter notebook on AWS SageMaker or Google Datalab. Then there is the issue of where to keep the GRCh38 reference sequence and GTF file, and how to upload new sequences for analysis every time. Is there a commonly accepted best practice for a setup like this? Any advice would help. Thank you.