{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "
The Google Cloud SDK can be installed through the package manager `dnf`, shipped with the recent releases of Fedora. The corresponding package, however, is not contained in the repositories `dnf` uses by default. The best option is therefore to add a new file `/etc/yum.repos.d/google-cloud-sdk.repo` (note that root permissions are required, thus this is an action likely to be performed through `sudo`), having the following content:
"\n",
"[google-cloud-sdk]\n", "name=Google Cloud SDK\n", "baseurl=https://packages.cloud.google.com/yum/repos/cloud-sdk-el7-x86_64\n", "enabled=1\n", "gpgcheck=1\n", "repo_gpgcheck=1\n", "gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg\n", " https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg\n", "\n", "\n", "Now the installation can be carried out as usual:\n", "\n", "
local> sudo dnf install google-cloud-sdk
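As a quick sanity check (an added step, not part of the original notes), the installation can be verified by printing the version of the SDK components:

local> gcloud version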
The next step consists in initializing the SDK and associating it with the project we have created. This is done through the following command:

local> gcloud init

The initialization process is composed of several steps, detailed below.

1. Selection of a Google account: you should already find the account you registered with listed as a possible choice, although it is also possible to add another account (note that an authentication step will be triggered, for instance through a Web browser).
2. Selection of a project: you should be prompted with a list of your existing projects, from which you can select the one you have created for the course.
3. Selection of region and zone: answer «Y» when asked if you want to select the compute region and zone. This allows you to specify a default region and zone for the commands you will issue using the SDK. Although several architectural patterns might apply here, we will keep it simple and refer to the one called «single-region deployment», which consists in placing everything in the same zone, selecting a region close to our ISP. We will select «europe-west6-a», corresponding to one of the zones in the Zurich datacenter (one of the nearest to Milano at the time of writing). Note that only some of the available zones are listed, and it might be required to type `list` in order to show all options (a way to display regions and zones from the terminal is sketched right after this list).
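As mentioned in step 3 above, the available regions and zones can also be displayed from the terminal once the initialization has completed; here is a minimal sketch (an addition to the original notes), where the filter value simply corresponds to the region we chose:

local> gcloud compute regions list
local> gcloud compute zones list --filter="region:europe-west6"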
"\n",
"\n",
"local> export IMAGE_FAMILY=\"tf2-latest-cpu\"\n", "local> export ZONE=\"europe-west6-a\"\n", "local> export INSTANCE_NAME=\"amd-instance\"\n", "local> export INSTANCE_TYPE=\"n1-standard-1\"\n", "\n", "local> gcloud compute instances create $INSTANCE_NAME \\\n", " --zone=$ZONE \\\n", " --image-family=$IMAGE_FAMILY \\\n", " --image-project=deeplearning-platform-release \\\n", " --maintenance-policy=TERMINATE \\\n", " --machine-type=$INSTANCE_TYPE \\\n", " --boot-disk-size=200GB \\\n", " --preemptible\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
local> gcloud compute ssh --zone=$ZONE \
         jupyter@$INSTANCE_NAME -- -L 8080:localhost:8080

Connecting via terminal could require adding your SSH keys to the Google console. Note however that, although a terminal might be handy for several reasons, most of the time we will interact with the VM only via the Web, using one or more jupyter notebooks (which have been installed according to the specification of `IMAGE_FAMILY` during the VM creation). Indeed, the above ssh invocation also sets up a forwarding between port 8080 on the local machine and the analogous one on the VM, as everything following `--` in the previous command is automatically dispatched to the ssh client. This allows us to open a browser, connect to http://localhost:8080/tree and directly use jupyter notebooks on the VM. Note also that a directory `tutorials` is automatically created and populated with some tutorials on the use of GCP within jupyter.

In order to verify that everything has been installed, create a new notebook based on Python 3 and execute the following code.

import numpy as np
import scipy as sp
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf

tf.__version__

If you obtain any import error, some of the libraries have not been correctly installed. Moreover, if you don't get as output a string starting with `'2.'`, the latest TensorFlow release has not been installed. In both cases, check that you followed the described procedure.

When the VM is no longer needed, it should be disposed of in order to avoid paying for unused resources.
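A possible intermediate option, not covered in the original notes, is to stop the instance rather than deleting it: this releases the computing resources (and the corresponding charges) while preserving the boot disk, which keeps being billed at a much lower rate.

local> gcloud compute instances stop $INSTANCE_NAME

To get rid of the VM altogether, issue instead: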
local> gcloud compute instances delete $INSTANCE_NAME

which deletes (upon explicit confirmation) all the allocated resources. Note that this also includes the disk containing the notebook we previously created. We will deal with this issue in a while.

If you now get back to the Google console, you should see a peak in the graph displaying the number of API requests, as well as notice that there are no more resources allocated.

### Decoupling VM and storage

As we have seen in the previous section, it is recommended to shut down a VM once it is no longer needed to perform a computation, although this also has the effect of permanently deleting its attached storage, thus including the results of the computation itself, typically notebooks or textual/binary data stored as one or more files. An obvious solution to this problem is to keep VMs alive at least for the time needed to transfer such results to a safe place, such as our local machine. This can be done through the `gcloud` utility as follows:
\n", "local> export LOCAL_PATH='/home/local'\n", "local> export INSTANCE_FILE='~/data.txt'\n", "local> gcloud compute scp $LOCAL_PATH jupyter@$INSTANCE_NAME:$INSTANCE_FILE\n", "\n", "\n", "Here\n", "- `INSTANCE_NAME` contains the name of a running VM instance,\n", "- `LOCAL_PATH` is the pathname of the local directory where data should be saved (replace «/home/local» with an existing path),\n", "- `INSTANCE_FILE` is the pathname of the file to be copied from the VM (again, replace the contents of this variable with an appropriate value)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Directory-recursive copies are possible, via using the `--recurse` option. Of course it is possible to use the same tool in order to copy something from the local machine to the VM, for instance configuration files or data to be processed.\n", "\n", "Another long-term storage place is a cloud-based resource which can live independently of a VM. Such storage is organized in terms of file-system-like objects called _buckets_. Buckets can be managed, as well as VMs, either via the Web console or using a terminal, although in this case the command to be used is `gsutil` (automatically installed alongside the Google Cloud SDK). For instance\n", "\n", "
Another option for long-term storage is a cloud-based resource which can live independently of a VM. Such storage is organized in terms of file-system-like objects called _buckets_. Buckets, like VMs, can be managed either via the Web console or using a terminal, although in this case the command to be used is `gsutil` (automatically installed alongside the Google Cloud SDK). For instance

local> gsutil mb -l europe-west6 gs://amd-bucket/

has the effect of creating a bucket identified by the name «amd-bucket». The URI-like reference «gs://amd-bucket» underlines the fact that, exactly as with projects, we are here dealing with global identifiers, thus it is not possible to create buckets having names already chosen by someone else. Note that in this case we specified only a region («europe-west6») rather than a specific zone.
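As a side note (an addition to the original text), the buckets available in the current project, as well as their contents, can be listed as follows:

local> gsutil ls
local> gsutil ls gs://amd-bucket/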
The `gsutil` tool can be straightforwardly used in order to move data between a bucket and a local machine, with exactly the same semantics as `cp` in a bash shell. For instance

local> export LOCAL_PATH="$HOME"
local> export BUCKET_NAME='amd-bucket'
local> export BUCKET_PATH='data.txt'
local> gsutil cp gs://$BUCKET_NAME/$BUCKET_PATH $LOCAL_PATH

copies a file named «data.txt» from a bucket to the home directory of the local machine.
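When whole directories or many files are involved, `gsutil` also provides a synchronization command; a minimal sketch (an added example, with a hypothetical local directory) is:

local> gsutil rsync -r ./results gs://$BUCKET_NAME/results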
The tool is also available once we ssh into a VM, and thus can be used in order to transfer data between the ephemeral disk of the VM and the persistent storage of a bucket, using the URI-like namespace for objects inside a bucket, such as in

VM> export VM_PATH="$HOME/notebooks/result.csv"
VM> export BUCKET_NAME='amd-bucket'
VM> export BUCKET_PATH='result.csv'
VM> gsutil cp $VM_PATH gs://$BUCKET_NAME/$BUCKET_PATH

which exports the results of a computation done on the VM (the file «result.csv») to a bucket. Finally, recall that, although cheaper than VMs, buckets have themselves a cost, and thus it is recommended to dispose of them when they are not needed for long-term storage (which could be a good option in case of massive data which cannot be saved on local HDs/SSDs). This can be done again via `gsutil`:
local> gsutil rm -r gs://amd-bucket

### Setup a small cloud

If needed, we can adapt the procedure shown in the previous sections to the case of creating a cloud of machines, for instance in order to build up a distributed environment based on Hadoop and/or Spark (don't worry if these names do not sound familiar to you, we'll learn about them during classes). Just to give an example, issuing the following command in the local terminal has the effect of creating a small cloud with one master machine and two workers.
local> gcloud dataproc clusters create amd-cluster \
         --num-workers=2 \
         --scopes=cloud-platform \
         --worker-machine-type=n1-standard-1 \
         --master-machine-type=n1-standard-1 \
         --zone=$ZONE

Besides `zone`, having the same meaning as before, the options are:
- `num-workers`, setting the number of worker machines in the cloud,
- `scopes`, declaring one or more scopes (each referring to a set of APIs which can be accessed by the cloud),
- `worker-machine-type` and `master-machine-type`, describing the machine types to be used for the workers and the master, respectively.

Also in this case it will take some time, but if you check your Google console you'll notice under the tab «Resources» that you are running three Compute Engine instances and using one bucket. Here _instance_ and _bucket_ mean "computing resource" and "disk space", respectively. Clicking on that section of the tab will show you some more details, and in particular you'll notice that the VMs are named «amd-cluster-m», «amd-cluster-w0», and «amd-cluster-w1» in order to differentiate between master and workers.
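As an additional check (not in the original notes), the cluster and its VMs can also be listed from the terminal; note that Dataproc commands refer to a region rather than a zone:

local> gcloud dataproc clusters list --region=europe-west6
local> gcloud compute instances list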
You can ssh into these instances, again clicking on the «SSH» button or using a terminal as follows.

local> export USER='amd'
local> gcloud compute ssh --zone=$ZONE $USER@amd-cluster-m

Here, the `USER` variable needs to be set to the username for the master machine, which is equal to the Google username of your account (typically, the part on the left of the 'at' sign in the related @gmail.com address).

Also in this case, an explicit cleaning phase needs to be executed once the cluster has been used.
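Note (an added, hedged remark): depending on the SDK version, Dataproc commands may require the region to be specified explicitly, either by appending `--region=europe-west6` to the command below or by setting a default once:

local> gcloud config set dataproc/region europe-west6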
local> gcloud dataproc clusters delete amd-cluster

Coming back to the Google console, you should see that there are no more computing resources allocated, although the «Resources» tab should report that some storage is still in use. Clicking on the related section shows up a list of buckets, likely containing only one item whose name starts with "dataproc". For now, just select this item and click the «DELETE» button, confirming the action.

### Double-check unused resources and keep an eye on your billing costs

If you have followed the instructions described in this document, you should have deleted all VMs and buckets, and thus you should not be paying for unused, yet still running, resources. However, it is always better to perform a double-check: connect to the Google Cloud console and verify that you have not left anything alive.

Finally, still in the console, have a look at the «Billing» tab, where you should see an estimate of the costs for the services we have used so far. It is likely that, unless it took you more than half an hour (well, probably quite a bit more than that) to reproduce all the described steps, you will see an estimated charge of 0 USD. If you check the detailed charges, however, you will notice that the resources we used have actually been accounted for, although they were used for such a short time that we did not incur any charge. Note nonetheless that the shown charges only represent an estimate, thus get into the habit of periodically checking the actual costs charged to your account.

### Connecting Colab and GCP

Colab provides limited resources, that is, the provided virtual machine is automatically shut down after a fixed period (at the time of writing, 12 hours). Moreover, although this virtual machine has a remarkable computational power (it can be boosted using GPUs or TPUs), it could represent an insufficient resource for dealing with complex problems. When a more reliable computing environment is needed, the Colab frontend can be hosted on a machine provided by us, for instance a VM spun up using GCP, as previously explained. We repeat here the command to be executed from the local machine in order to ssh to the VM when the latter is up and running:
local> gcloud compute ssh --zone=europe-west6-a \
         jupyter@amd-instance -- -L 8080:localhost:8080

Recall that this command also puts in place an ssh-based forwarding of the local port 8080 to the same one on the VM: we have seen that this is important in order to access the remote jupyter installation, and using the same technique it will be possible to tie that installation to a Colab frontend. This is done as follows:

1. point to https://colab.research.google.com/ using a browser,
2. click on the arrow right after the «Connect» button (in the upper right part of the page),
3. select «Connect to local runtime...»,
4. insert «http://localhost:8080/» in the shown form.

This will execute all the following cell computations within the virtual machine.

# Challenge!

- Choose a problem.
- Set up a solution for that problem which:
  - spins up a virtual machine on GCP,
  - transfers software and data to be processed,
  - executes the software,
  - saves the results in a persistent location.

(A minimal command-line sketch of such a workflow is given at the end of this document, after the example below.)

## Example

[Adversarial examples](https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/generative/adversarial_fgsm.ipynb)
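To conclude, here is a minimal, hedged sketch of the challenge workflow referenced above: names, paths, and the script to be run (`main.py`) are purely hypothetical, the environment variables are those defined earlier in this document, and the destination bucket is assumed to already exist.

local> gcloud compute instances create $INSTANCE_NAME --zone=$ZONE \
         --image-family=$IMAGE_FAMILY --image-project=deeplearning-platform-release \
         --machine-type=$INSTANCE_TYPE --preemptible
local> gcloud compute scp --recurse ./project jupyter@$INSTANCE_NAME:~/project --zone=$ZONE
local> gcloud compute ssh jupyter@$INSTANCE_NAME --zone=$ZONE -- 'python3 ~/project/main.py'
local> gcloud compute ssh jupyter@$INSTANCE_NAME --zone=$ZONE -- \
         'gsutil cp ~/project/results.csv gs://amd-bucket/results.csv'
local> gcloud compute instances delete $INSTANCE_NAME --zone=$ZONE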