class: center, middle, title

# A cloud decoder ring

## Jonah M Duckles

---
layout: true

.logo[
![logo](/assets/abacusbio_logo.png)
]

---

.one-half[
### A Cloud Decoder Ring: Demystifying Cloud Environments
]

.one-half[
### Jonah M Duckles -
@jduckles

##### [jduckles@abacusbio.co.nz](mailto:jduckles@abacusbio.co.nz)

##### http://abacusbio.com

## Cloud Guide

https://genomicsaotearoa.github.io/cloudsforresearch/

.citation[https://jduck.net/presentations/QMB_datascience]
]

---

# Research Data Science Skills

.center[
![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/pb-hPg2w5uVnJ.png)
]

.citation[https://github.com/jduckles/dsskills]

---

# Just use the cloud they said

.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/giphy-2-.gif#090c9fff5cc09c89dcfd0bb7951e137caffd29d10da39d71b82ab0fadff08183)
]

---

# Why the cloud?

.one-half[
.center[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_cloud-hosting_2015988_000000.svg)
]
]

.one-half[
For **$1.36/hour** you can have:

* 64GB RAM machine with
* 16 CPUs and
* 600GB of fast local NVMe storage

For **$5.42/hour** you can have:

* 384GB RAM
* 96 CPUs and
* 3.6TB of fast local NVMe storage
]

---

# It isn't for every workload

* $1.36 x 365 x 24 = **$11,913**
* $5.42 x 365 x 24 = **$47,479**

Still less than a computer + an IT professional

You can have incredible capabilities for incremental costs. This can be great for testing ideas and for exploring research and business opportunities that justify capital expenditure on IT professionals + hardware.

It gets VERY expensive at 100% utilisation, but you get the ICT expertise of billion-dollar companies for $x/hour.

---

# There will be haters

.center[
]

---

# The cloud is confusing!

.one-third[
Amazon
![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2019-09-02-08-23-06.89.png#131dbb11f4702b9b068da059e572c8c93253383c075945fa74bf4a4e3b4d3526)
]

.one-third[
Google
![](https://f002.backblazeb2.com/file/jduck-dropshare/Screen-Recording-2019-09-02-08-34-03.gif#e3537d1b0324c16373540f06de25605d557134cadbdbcbe975112f75a236ca51)
]

.one-third[
Microsoft
![](https://f002.backblazeb2.com/file/jduck-dropshare/Screen-Recording-2019-09-02-08-39-42.gif#37aac3e32c3d825ade68ed19aa8b7636a9706b69b05a550fcbd5fdfb1d53d023)
]

.citation[Screen captures by the author on September 2, 2019]

---

# Cloud Types

.citation[[Image from bmc.com](https://blogs.bmc.com/wp-content/uploads/2017/09/saas-vs-paas-vs-iaas.png)]

.one-half[
![](https://blogs.bmc.com/wp-content/uploads/2017/09/saas-vs-paas-vs-iaas.png)
]

.one-half[
* **IaaS - Infrastructure as a Service**
* **PaaS - Platform as a Service**
* **SaaS - Software as a Service**
We're going to be talking largely about IaaS in this talk.
]

---

# Infrastructure as a Service - IaaS

.one-half[
![](https://f002.backblazeb2.com/file/jduck-dropshare/pb-kJ6GDeepfS.png#5093a0ca9cfb043d0d29960fd69aad929fa437fdace46052c74504ed77e64659)

.citation[
[Image Source](https://www.datacenterknowledge.com/archives/2012/03/14/estimate-amazon-cloud-backed-by-450000-servers)
]
]

.one-half[
* Largely about running virtual machines.
* Virtual machines allow cloud providers to take large multi-core systems and sell fractional parts of that system.
* A virtual machine is a full computer on which you have administrator access, so you can install anything you want.
]

---

## What is a Virtual Machine?

.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/pb-E7hykrMhwu.png#8fd34f27d78a214b62c2e375812d9969bb9cd7414804f75065d931dfe6e473a9)
]

.citation[By John Aplessed - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=12351968]

---

## What is a Container?

.one-half[
* Docker - The most famous container environment
* Singularity - HPC Container Environment
]

.one-half[
.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/pb-Q0hhOGpixD.png#d1dfd023cc5d210d32d2b75acc69e826ed16f81a06974feb60fa026ea39dd88d)
]
]

.citation[Accessed from [Docker.com](https://www.docker.com/sites/default/files/d8/styles/large/public/2018-11/container-what-is-container.png?itok=vle7kjDj) September 2, 2019]

---

# Say What!?

.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/sawat.gif#fefa90b3916dd2c4e2ec4710ed3e2b53163a20d86e8ce9b516d6695984cc1daa)
]

---

# Recap

.one-half[
![](https://f002.backblazeb2.com/file/jduck-dropshare/pb-FkfbPK0O4r.png#e9e2fefc3222af0bad0640d9fca47337c8d6c1d96f93c59b6a896f615b251da0)
]

.one-half[
* **Virtual Machines** - Allow you to partition full "computers" into virtual computers (machines). These are each some small fraction of the larger **host** computer.
* **Containers** - Allow you to package an application and all of its dependencies so it can run on other computers that have that container environment (Docker/Singularity for example)
]

---

## I just wanted to do biology?

.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/giphy.gif#cee597bd9af78c3e0ced3b911e0db2df586686b1e9f2db5fd01b27a29647abec)
]

---

## #!%*

.middle[
.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/giphy-1-.gif#1dc9322635193ced20ef217aac82602a64b39845869e3d5b5bf5e057422a7128)
]]

---

# Virtual Machines at Cloud Providers
## AWS

* EC2 - Elastic Compute Cloud

## Google

* GCE - Google Compute Engine

## Microsoft Azure

* Virtual Machines

## Catalyst Cloud

* Compute Instances

---

# Starting a Virtual Machine

.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/Screen-Shot-2019-09-02-15-06-21.01.png#66064053af8a37bf85fb8f6b35cc1200bac72bc8ab84fa2292556867a031c99f)
]

---

## #!%*

.middle[
.center[
![](https://f002.backblazeb2.com/file/jduck-dropshare/giphy-1-.gif#1dc9322635193ced20ef217aac82602a64b39845869e3d5b5bf5e057422a7128)
]]

---

## Major components of all computers

.medium[
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_cpu_2340472_000000.svg)] Processor (CPU) - This is where the work gets done
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_ram_32078_000000.svg)] Memory (RAM) - This is where running programs and data are stored
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_disk_1954873_000000.svg)] Storage (Hard disk) - Where we save data or output persistently
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_ethernet-port_417768_000000.svg)] Network - How we get data on/off the computer
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_security_1207188_000000.svg)] Access/Security
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_globe_1707556_000000.svg)] Location

Virtual machines let you configure all of this in software.

THERE ARE A LOT OF OPTIONS!
]

---

## Cloud terminology

.medium[
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_disk_1954873_000000.svg)] Image (AMI, Boot Disk)
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_cpu_2340472_000000.svg)] .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_ram_32078_000000.svg)] Instance Type - CPU / Memory (RAM) $$
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_globe_1707556_000000.svg)] Location (Availability zone, region, data center)
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_ethernet-port_417768_000000.svg)] Firewall - Security Groups
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_security_1207188_000000.svg)] SSH Keys
* .svg-icon[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/np_disk_1954873_000000.svg)] Storage - Block Storage
]

---

# Block Storage

.one-half[
Accessed over the network, but mountable as a "live filesystem" for your Virtual Machine.

When programs are reading/writing data, this is where you'll want them to read/write.
]

.one-half[
Think of this as an infinite (for a price) storage system you can connect as a live filesystem for your Virtual Machine.
]

---

# "Bucket" Storage

.one-half[
* Amazon - S3 - Simple Storage Service
* Google - GCS - Google Cloud Storage
* Microsoft - Azure Blob Storage
]

.one-half[
Bucket storage allows you to "GET" or "PUT" files into named buckets. These files are not generally accessible without downloading the whole file. This is not a "live" filesystem like you experience on your laptop/server.
]

---

# Some tricks

* Most cloud environments let you keep block devices around even if the Virtual Machine is destroyed; you just have to pay for the storage.
  * This lets you bring that computer back at a later date.
  * $0.10/GB/**month**: 100GB costs $10/month
* Bucket storage can be a good place to store your large files.
Transfers to VMs in the same region are usually free.
* Spot instance pricing - lets you keep your workloads ready to run; they run at times of light demand for less $

---

# Towards DevOps

* Use tools like cloud-init to describe the compute environment.
* Think of your configuration as a piece of code that should be version controlled.
* Containerize your VMs? (advanced)

---

### I don't have money for the cloud

All major cloud providers have grant programs. Catalyst Cloud is receptive to interesting conversations about how researchers could use their cloud for the benefit of NZ and NZ-based research.

.one-half[
#### Amazon

* [Grants for Research](https://aws.amazon.com/grants/)

#### Microsoft Azure

* [AI for Humanitarian Action](https://www.microsoft.com/en-us/ai/ai-for-humanitarian-action)
* [AI for Earth](https://www.microsoft.com/en-us/aiforearth)
* [AI for Accessibility](https://www.microsoft.com/en-us/ai-for-accessibility)

#### Google Cloud

* [Grants for Research](https://lp.google-mkto.com/gcp-research-credits-FAQ.html)
]

---

# Simpler Clouds

.medium[
These clouds have no-nonsense "flat" pricing and are much easier to get started with.

* DigitalOcean.com
* Vultr.com
* Packet.com
]

---

# Are you using cloud environments?

.medium[
* What are the challenges to using the cloud in your research?
* What stops you from trying if you haven't yet?
* Would you like training on the cloud?
]

---

# Questions

## Jonah Duckles

### jduckles@abacusbio.co.nz
@jduckles

## Cloud Guide

https://genomicsaotearoa.github.io/cloudsforresearch/

.citation[https://jduck.net/presentations/QMB_datascience]
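
---

# Appendix: checking the cost arithmetic

The annual figures on the "It isn't for every workload" slide and the block storage price on the "Some tricks" slide can be reproduced with a minimal Python sketch. The rates are the ones quoted in the slides. The function names and the 10% utilisation scenario are illustrative assumptions, not provider pricing or a provider API.

```python
# Minimal sketch: reproduce the cost arithmetic quoted in the slides.
# Function names and the utilisation parameter are hypothetical,
# introduced here only for illustration.

HOURS_PER_YEAR = 365 * 24  # 8760 hours, as on the slide


def annual_cost(hourly_rate, utilisation=1.0):
    """Yearly cost of a VM billed per hour.

    utilisation is the fraction of the year the VM is running
    (1.0 means it is never switched off).
    """
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return hourly_rate * HOURS_PER_YEAR * utilisation


def monthly_storage_cost(gigabytes, rate_per_gb_month=0.10):
    """Monthly cost of keeping a block device around."""
    return gigabytes * rate_per_gb_month


if __name__ == "__main__":
    # The slides show these figures truncated to whole dollars.
    print(f"16 CPU / 64GB, always on:  ${annual_cost(1.36):,.2f}")
    print(f"96 CPU / 384GB, always on: ${annual_cost(5.42):,.2f}")
    # Assumed scenario: the large machine runs only 10% of the time.
    print(f"96 CPU / 384GB, 10% used:  ${annual_cost(5.42, 0.10):,.2f}")
    print(f"100GB block storage:       ${monthly_storage_cost(100):,.2f}/month")
```

Lowering `utilisation` shows why short, bursty workloads suit the cloud better than machines that run all year.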