Intro to Kubeflow: Fundamentals Training and Certification Recap – Feb 16, 2022

We recently hosted the third delivery of the “Intro to Kubeflow: Fundamentals Training and Certification” prep course. In this blog post we’ll recap some highlights from the class, plus give a summary of the Q&A. Ok, let’s dig in!

Congratulations to Indranil Dutta!

The first student to earn the “Fundamentals” certificate at the conclusion of the course was Indranil Dutta, who works at EY. A free MiniKF hoodie and shirt are on the way. Well done!

First, thanks for voting for your favorite charity!

With the unprecedented circumstances facing our global community, Arrikto is looking for even more ways to contribute. With this in mind, we thought that in lieu of swag we could give workshop attendees the opportunity to vote for their favorite charity and help guide our monthly donation to charitable causes. The charity that won this workshop’s vote was the Society of Women Engineers, a not-for-profit educational and service organization that empowers women to succeed and advance in engineering and to be recognized for their life-changing contributions as engineers and leaders. We are pleased to be making a donation of $250 to them on behalf of the Kubeflow community. Again, thanks to all of you who attended and voted!

What topics were covered in the course?

This initial course aimed to get data scientists and DevOps engineers with little or no prior experience familiar with the fundamentals of how Kubeflow works. It covered the following topics:

  • Kubeflow architecture
  • Overview of machine learning workflows
  • Kubeflow components
  • Tools and add-ons (Kale, Rok, Istio, etc.)
  • Distributions
  • Installing Kubeflow on AWS
  • Community overview

What did I miss?

Here’s a short teaser from the 90-minute training. In this video we demonstrate three things about Katib, the Kubeflow component that provides AutoML, hyperparameter tuning and early stopping:

  • How to view an experiment
  • How to set up experiments
  • How to identify the best run
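For readers new to hyperparameter tuning, the idea behind an experiment can be sketched in a few lines of plain Python. This is not Katib's API: the `validation_loss` objective, the parameter names and the search ranges below are all hypothetical stand-ins chosen for illustration. The sketch runs a fixed number of random-search trials and then picks the trial with the lowest loss, which is conceptually what "identifying the best run" means.

```python
import random


def validation_loss(learning_rate: float, batch_size: int) -> float:
    """Hypothetical stand-in for training a model and measuring validation loss.

    A toy bowl-shaped function whose minimum is at
    learning_rate=0.01 and batch_size=64.
    """
    return (learning_rate - 0.01) ** 2 * 1e4 + ((batch_size - 64) / 64) ** 2


def run_experiment(max_trials: int, seed: int = 0) -> list:
    """Run `max_trials` random-search trials and record each result."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    trials = []
    for trial_id in range(max_trials):
        # Sample one candidate set of hyperparameters (assumed search space).
        params = {
            "learning_rate": rng.uniform(0.001, 0.1),
            "batch_size": rng.choice([16, 32, 64, 128]),
        }
        loss = validation_loss(**params)
        trials.append({"id": trial_id, "params": params, "loss": loss})
    return trials


def best_trial(trials: list) -> dict:
    """Return the trial with the lowest recorded loss."""
    return min(trials, key=lambda t: t["loss"])


trials = run_experiment(max_trials=20)
best = best_trial(trials)
print(f"best trial: {best['id']} params={best['params']} loss={best['loss']:.4f}")
```

In Katib, each trial would instead run as its own workload on Kubernetes, and the sampling strategy would be supplied by a configurable search algorithm rather than the simple random search assumed here.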

Missed the Feb 16 Kubeflow Fundamentals training?

If you were unable to join us last week, but would still like to attend a future training, the next “Kubeflow Fundamentals” training is happening on Feb 16. You can sign up for this and the upcoming Notebooks, Pipelines and Kale/Katib courses here.

NEW: Advanced Kubeflow, Notebooks and Pipelines Workshops

We are excited to announce a new series of FREE workshops focused on taking popular Kaggle and Udacity machine learning examples from “Notebook to Pipeline.” Registration is now open for the following workshops:

Arrikto Academy

Are you ready to put what you’ve learned into practice with hands-on labs? Then check out Arrikto Academy! On this site you’ll find a variety of FREE skills-building exercises including:

  • Kale 101: Transform Jupyter Notebooks into Kubeflow Pipelines
  • Katib 101: Automated Hyperparameter Tuning for Models in Kubeflow Pipelines
  • Rok 101: Manage and Restore Kubeflow Pipeline Snapshots

Q&A from the training

Below is a summary of some of the questions that popped into the Q&A box during the course. [Edited for readability and brevity.]

Will a session recording be available for offline study & learning?

Yes, you can view the lectures and demos on this YouTube playlist.

Can I integrate my local resources with Kubeflow?

If by “local” you mean Kubernetes infrastructure, yes. Kubeflow can run wherever Kubernetes can run, assuming you have enough underlying CPU, RAM and storage to support your machine learning workloads. If by “local” you mean existing Notebooks, the answer is again yes: you can create a new Notebook Server on Kubeflow and import them.

What is the difference between Katib and Optuna?

While both are open source hyperparameter optimization engines, there are a few reasons to consider Katib:

  • Part of Kubeflow: If you are standardizing on Kubeflow for your MLOps platform, Katib is “baked-in” with no additional integration work required to get AutoML capabilities.
  • Multi-tenancy: Katib supports multi-tenancy.
  • Distributed training: Katib supports distributed training (e.g., parameter servers, RingAllReduce, etc.). Frameworks like Optuna lack support for distributed training.
  • Cloud-native: Katib is Kubernetes ready. That makes it an excellent fit for cloud-native deployments.
  • Extensibility: Katib is easily extensible, providing a modular interface that allows for search algorithms and data storage customization.
  • Support for NAS: Katib supports Neural Architecture Search.

Is it possible to set up Kubeflow + Kale + Rok in an existing Kubernetes cluster (1.21.9), for example with manifests?

Yes, you can accomplish this with Arrikto’s Enterprise Kubeflow distribution. You can learn more here: https://www.arrikto.com/enterprise-kubeflow/