Kubeflow and MLOps Workshop Recap – Oct 2021

Kubeflow and MLOps Workshop

Last week we hosted a free Kubeflow and MLOps workshop presented by Kubeflow Community Product Manager Josh Bottum. In this blog post we’ll recap some highlights from the workshop, plus give a summary of the Q&A. Ok, let’s dig in.

First, thanks for voting for your favorite charity!

With the unprecedented circumstances facing our global community, Arrikto is looking for even more ways to contribute. With this in mind, we thought that in lieu of swag we could give attendees the opportunity to vote for their favorite charity and help guide our monthly donation to charitable causes. The charity that won this month’s workshop voting was Action Against Hunger. We are pleased to be making a donation of $250 to them on behalf of the Kubeflow community. Again, thanks to all of you who attended and voted!

What topics were covered in the workshop?

  • How to install Kubeflow via MiniKF locally or on a public cloud
  • Take a snapshot of your notebook
  • Clone the snapshot to recreate the exact same environment
  • Create a pipeline starting from a Jupyter notebook
  • Go back in time using Rok. Reproduce a step of the pipeline and view it from inside your notebook
  • Create an HP Tuning (Katib) experiment starting from your notebook
  • Serve a model from inside your notebook by creating a KFServing server (see the serving sketch after this list)
  • An overview of what’s new in Kubeflow 1.4
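
If you want a feel for that serving step before watching the replay, here is a minimal sketch using the KFServing Python SDK (the v1beta1 API that shipped with Kubeflow 1.3/1.4). The namespace, service name, and model storage URI are hypothetical placeholders; in the workshop, Kale handles this for you from the notebook.

```python
# A minimal sketch (not from the workshop) of creating a KFServing
# InferenceService from Python, e.g. from a notebook cell. The namespace,
# service name, and storage URI are hypothetical placeholders.
from kubernetes import client as k8s_client
from kfserving import (
    KFServingClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
)

isvc = V1beta1InferenceService(
    api_version="serving.kubeflow.org/v1beta1",
    kind="InferenceService",
    metadata=k8s_client.V1ObjectMeta(
        name="sklearn-demo", namespace="kubeflow-user"
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                storage_uri="pvc://my-notebook-snapshot/models/model.joblib"
            )
        )
    ),
)

KFServingClient().create(isvc)  # creates the InferenceService in the cluster
```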

What did I miss?

Here’s a short teaser from the 45 minute workshop where Josh walks us through some of the steps required to turn models into pipelines using Kubeflow.

Install MiniKF

In the workshop Josh discussed how MiniKF is the easiest way to get started with Kubeflow on the platform of your choice (AWS, GCP, or locally) and covered the basic mechanics of each install.

Here are the links:

Hands-on Tutorials

Although Josh focused primarily on the examples shown in tutorial #3 (which makes heavy use of the Open Vaccine COVID-19 example), make sure to also try out tutorial #4, which does a great job of walking you through all the steps you’ll need to master when bringing together all the Kubeflow components to turn your models into pipelines. Get started with these hands-on, practical tutorials.

Need help?

Join the Kubeflow Community on Slack and make sure to add the #minikf channel to your workspace. The #minikf channel is your best resource for immediate technical assistance regarding all things MiniKF!

Missed the Oct 28 workshop?

If you were unable to join us last week but would still like to attend a workshop in the future you can sign up for the next workshop happening on Nov 18.

FREE Kubeflow courses and certifications

We are excited to announce the first of several free instructor-led and on-demand Kubeflow courses! The “Introduction to Kubeflow” series of courses will start with the fundamentals, then go on to deeper dives of various Kubeflow components. Each course will be delivered over Zoom with the opportunity to earn a certificate upon successful completion of an exam. To learn more, sign up for the first course.

Q&A from the workshop

Below is a summary of some of the questions that popped into the Q&A box during the workshop. [Edited for readability and brevity.]

We are planning to migrate from AWS SageMaker to Kubeflow. Does Kubeflow support the importing of pre-built Docker images which contain the model?

Yes, you can run custom images on Kubeflow.
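
To make that concrete, a pipeline step can point directly at a pre-built image. Here is a minimal sketch using the KFP v1 SDK (`kfp.dsl.ContainerOp`); the image name and command are hypothetical placeholders for an image that already bundles your model code.

```python
# A minimal sketch, assuming the KFP v1 SDK. The image name and command are
# hypothetical placeholders for a pre-built image that contains your model.
from kfp import dsl

def predict_op():
    # Run your pre-built image as a pipeline step
    return dsl.ContainerOp(
        name="predict",
        image="registry.example.com/my-team/my-model:1.0",
        command=["python", "/app/predict.py"],
    )

@dsl.pipeline(
    name="custom-image-pipeline",
    description="Runs a step from a pre-built Docker image",
)
def custom_image_pipeline():
    predict_op()
```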

Does Kubeflow have Data Versioning like DVC?

Yes, MiniKF and Arrikto Enterprise Kubeflow (EKF) package Rok, which is a K8s-native data management layer that provides data versioning, packaging of the data, and shipping of the data across clusters.

Can we run and orchestrate Spark jobs using Kubeflow?

Yes! Check out this video and register for the Dec 2 virtual Kubeflow Meetup where we will discuss this topic in depth.

How can I propagate pipelines and notebooks across Kubeflow running in different environments?

We use Rok and Rok Registry for this. Rok Registry connects multiple clusters in a peer-to-peer network and allows you to efficiently ship your versioned snapshots from one cluster to another, independently of the cloud or region they run in.

How is Kubeflow different from AWS SageMaker?

Kubeflow is a complete open source MLOps platform that runs on top of Kubernetes. This means you can deploy it locally, on any cloud, or in any data center where Kubernetes is supported. Unlike SageMaker, Kubeflow provides end-to-end workflow functionality, including pipeline tooling that can be used by both data scientists and DevOps.

Will the Kale deployment create customized YAML for a Pipeline using Notebooks?

First, Kale creates a snapshot of all of the Notebook’s PVCs, which include all your libraries and data. It then auto-generates Kubeflow Pipelines DSL code for every step, along with code that clones the versioned snapshots and mounts them on every step of the pipeline. Finally, it runs the KFP compiler to produce the Argo YAML and then runs the pipeline.
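
As a rough, hand-rolled illustration of what Kale automates at the end (not Kale’s actual generated code), here is how a KFP v1 pipeline function gets compiled to Argo YAML. The step itself is a hypothetical toy function.

```python
# A hand-rolled sketch of what Kale automates: define KFP DSL steps and
# compile them to an Argo workflow YAML. Assumes the KFP v1 SDK; the
# training step is a hypothetical toy function.
import kfp
from kfp import dsl
from kfp.components import func_to_container_op

def train(lr: float) -> float:
    """Toy 'training' step."""
    print(f"training with lr={lr}")
    return lr * 2

# Turn the plain Python function into a lightweight pipeline component
train_op = func_to_container_op(train)

@dsl.pipeline(name="demo-pipeline")
def demo_pipeline(lr: float = 0.01):
    train_op(lr)

# Kale's final step: compile the DSL to Argo YAML, which it then runs
kfp.compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
```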

Can different Notebooks be combined in the same Pipeline?

This is currently not supported, but Kale is going to be able to combine different pipelines in the near future. This means you will be able to create two pipelines from two different Notebooks and then combine them to run together.

When using Kale does the code run on the same image and pod?

Kale is packaged in the JupyterLab image as an extension. When you run Kale it creates the Kubeflow pipeline, and the pipeline then runs on new pods via Argo. The pipeline steps use the same base image as the notebook, and Kale attaches a clone of the Notebook’s persistent volume (PV) to each pipeline step. The clone provides the pipeline step with the exact code from the notebook, along with all the downloaded packages/libraries, datasets, and metadata.

When using Kale do I need to manage Jupyter Notebook versions?

Kale automatically snapshots the whole environment, including the notebook itself, before pushing a pipeline. In the near future we will also support automatically pushing the code to git before running the pipeline. We also provide the Kale SDK, which allows you to do the same from any other IDE or directly from your git repository, by just decorating your existing Python functions (the same way you would annotate a Notebook cell).
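
As a rough illustration of the Kale SDK approach (based on Arrikto’s published examples; treat the exact decorator arguments as indicative rather than authoritative), plain Python functions decorated as steps are composed into a pipeline function:

```python
# A rough sketch of the Kale SDK style: plain Python functions decorated as
# pipeline steps. Function names and decorator arguments are illustrative.
from kale.sdk import pipeline, step

@step(name="load_data")
def load_data():
    return [1, 2, 3, 4]

@step(name="train_model")
def train_model(data):
    print(f"training on {len(data)} samples")

@pipeline(name="sdk-demo", experiment="kale-sdk")
def ml_pipeline():
    data = load_data()
    train_model(data)

if __name__ == "__main__":
    # Runs as plain Python locally; Kale's docs also describe a flag for
    # submitting the same function as a Kubeflow Pipelines run.
    ml_pipeline()
```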

Can I control the base image for each step when using Kale?

Not yet, but this is coming as a new feature in Kale. Stay tuned.

In Colab, uploaded files get deleted when the runtime ends. Do we have to upload the datasets here every day as well?

No. The Notebook Servers persist. You snapshot them and then have versions for everything.

Does Kubeflow support different cloud platforms?

Yes. Arrikto’s enterprise, multi-node version of MiniKF is called EKF, and it runs on all clouds. MiniKF is great for testing and is available in the AWS and GCP marketplaces.

After the Pipeline is set, can I run exactly the same pipeline with a different set of parameters using Katib?

Yes, you can.
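
Setting Katib aside for a moment, even the plain KFP v1 SDK lets you resubmit the same pipeline function with different arguments; a Katib experiment (which Kale can set up for you) then automates sweeping those same parameters. A minimal sketch, with a hypothetical toy pipeline:

```python
# A minimal sketch, assuming the KFP v1 SDK. Katib automates sweeping such
# parameters; this just shows two manual runs of the same pipeline with
# different argument values. The training step is a hypothetical toy function.
import kfp
from kfp import dsl
from kfp.components import func_to_container_op

def train(lr: float):
    print(f"training with lr={lr}")

train_op = func_to_container_op(train)

@dsl.pipeline(name="param-demo")
def param_demo(lr: float = 0.01):
    train_op(lr)

client = kfp.Client()  # in-cluster defaults work from a Kubeflow notebook

# Same pipeline, two runs with different parameter values
client.create_run_from_pipeline_func(param_demo, arguments={"lr": 0.01})
client.create_run_from_pipeline_func(param_demo, arguments={"lr": 0.1})
```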

Do you have to provision and automate the Kubeflow infrastructure using tools like Terraform, Puppet or Chef?

We provide a GitOps repository and automation tooling for installation, and we also provide live upgrades.

I am on an M1 chip and have difficulty starting MiniKF on VirtualBox. Do you guys have MiniKF for UTM?

You’ll want to make sure you have sufficient resources before attempting to install locally. The minimum requirements are 32 GB RAM, 2 CPUs, 50 GB disk space, Linux, macOS, or Windows, plus Vagrant and VirtualBox. Learn more about running MiniKF locally and some troubleshooting tips here.

I only have 8 GB RAM. Can I run Kubeflow?

Not locally. Your best bet is to install MiniKF via the AWS or GCP Marketplaces.

What’s Next?