Announcing Arrikto Enterprise Kubeflow Q3-2021 Release

Arrikto is excited to announce the Q3-2021 release of Arrikto Enterprise Kubeflow (EKF) featuring Kubeflow 1.4. Many thanks to our customers and community of users for helping inform our roadmap, designs and testing plans. If you are unfamiliar with Arrikto, our mission is to accelerate the MLOps potential of Kubeflow by enabling data scientists to build and deploy machine learning models faster, more efficiently, and securely.

What is Arrikto EKF?

EKF is a complete machine learning operations (MLOps) platform that simplifies, accelerates, and secures the machine learning model development life cycle with Kubeflow. It comes bundled with enterprise-grade Kubeflow and add-ons like Kale to accelerate model development and pipelines, plus Rok to handle all your data management tasks. To learn more about EKF, check out the product page.

Ok, so what’s new in this release? Let’s dive in…

Support for the Latest Kubeflow Release

Arrikto EKF supports the latest Kubeflow 1.4 release, which includes simplified operations and streamlined workflows.

Empowering Data Scientists

PyTorch Support in Kale

You can now run distributed PyTorch training jobs using Kale. This significantly reduces model-to-market time and takes advantage of parallel training and GPUs.

Environment Variables Supported in the Kale SDK

The Kale SDK now enables you to set environment variables for Kubernetes resources. You can also set CPU and RAM limits for pipeline steps and even instruct a pipeline step to run on a specific node.

Control Pipeline Workflows with Conditionals

We have introduced conditionals that enable you to control the flow of execution through a pipeline. With this feature, you can choose which branches of your pipeline execute depending on the results of upstream steps.
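
As a rough illustration of the idea, here is a plain-Python sketch of branching on an upstream result. It does not use Kale SDK syntax; the step names, the returned score, and the 0.9 threshold are all made up for this example.

```python
# Plain-Python sketch of conditional branching between pipeline steps.
# All names and values are hypothetical; this is not the Kale API.

def train():
    """Upstream step: pretend to train a model and return its accuracy."""
    return 0.93


def deploy(accuracy):
    """Branch taken when the upstream result is good enough."""
    return f"deployed model with accuracy {accuracy:.2f}"


def retrain(accuracy):
    """Branch taken when the upstream result is below the threshold."""
    return f"retraining, accuracy {accuracy:.2f} is below threshold"


def run_pipeline(threshold=0.9):
    accuracy = train()
    # Only one branch executes, depending on the upstream result.
    if accuracy >= threshold:
        return deploy(accuracy)
    return retrain(accuracy)


print(run_pipeline())
```

Calling run_pipeline(threshold=0.95) would take the other branch, since the made-up accuracy falls below that threshold.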

Specify Docker Images in Pipeline Steps

Kale now enables you to specify the Docker image to use for specific pipeline steps. This lets you reuse the standard images your enterprise already relies on for common tasks such as data cleaning.

More Deployment Automation

In this release, we focused heavily on developing a continuous, automated deployment solution for our customers that is in line with GitOps principles and methodologies. Specifically:

  • We are providing an automated end-to-end deployment process inside of EKF.
  • EKF uses a wizard to present the necessary details about your infrastructure.
  • EKF employs a deployment automation tool that generates Kustomize manifests automatically based on your infrastructure.
  • EKF then commits the manifests to a Git repo to ensure immutability and reproducibility.
  • Finally, EKF applies these manifests to Kubernetes to deploy updates and configuration changes.

In a nutshell, EKF generates, commits, and applies the manifests in a fully automated way. At Arrikto we will continue to increase the level of automation in our Q4-2021 release. Stay tuned!
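
The generate, commit, and apply flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not EKF's actual tooling: the function names, the directory path, the resource file names, and the commit message are hypothetical. The sketch builds the git and kubectl commands without running them.

```python
import json

# Hypothetical sketch of a generate -> commit -> apply flow.
# Paths, file names, and the commit message are assumptions, not EKF internals.


def generate_kustomization(namespace, resources):
    """Build a minimal Kustomize kustomization as a plain dict."""
    return {
        "apiVersion": "kustomize.config.k8s.io/v1beta1",
        "kind": "Kustomization",
        "namespace": namespace,
        "resources": list(resources),
    }


def plan_commands(manifest_dir, message):
    """Return the git and kubectl commands to run, without executing them."""
    return [
        ["git", "add", manifest_dir],
        ["git", "commit", "-m", message],
        ["kubectl", "apply", "-k", manifest_dir],
    ]


kustomization = generate_kustomization(
    "kubeflow", ["deployment.yaml", "service.yaml"]
)
# JSON is a subset of YAML, so this output could be saved as kustomization.yaml.
print(json.dumps(kustomization, indent=2))

for command in plan_commands("deploy/kubeflow", "Update kubeflow manifests"):
    print(" ".join(command))
```

Keeping the commit step between generation and application is what makes every applied change traceable to a Git revision.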

Improved Infrastructure Management

In this release, EKF’s infrastructure management updates were focused on improving not only the performance of EKF, but also the control users have over pipelines, enabling data scientists to build and deploy even faster. 

More Efficient Use of Resources

EKF now shuts down idle notebook servers to save resources and reduce costs.
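
To make the idea concrete, here is a minimal sketch of how idle servers could be selected from their last-activity timestamps. It is not EKF's implementation; the server names, the timestamps, and the one-hour idle threshold are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical sketch: select notebook servers whose last activity is older
# than an idle threshold. Names and the 1-hour default are assumptions.


def find_idle(last_activity, now, idle_after=timedelta(hours=1)):
    """Return the sorted names of servers idle for longer than idle_after."""
    return sorted(
        name
        for name, seen in last_activity.items()
        if now - seen > idle_after
    )


now = datetime(2021, 10, 1, 12, 0)
last_activity = {
    "nb-alice": datetime(2021, 10, 1, 11, 45),  # active 15 minutes ago
    "nb-bob": datetime(2021, 10, 1, 9, 30),     # idle for 2.5 hours
}
print(find_idle(last_activity, now))
```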

Support for ReadWriteMany (RWX) Volumes

This release introduces ReadWriteMany (RWX) volumes, which allow multiple pods, and therefore notebooks running on different nodes, to read from and write to the same volume.
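
For readers less familiar with access modes, the sketch below builds a standard Kubernetes PersistentVolumeClaim that requests ReadWriteMany. The claim name, namespace, and size are made-up example values, and whether such a claim can be bound depends on the storage available in your cluster.

```python
import json

# Sketch of a PersistentVolumeClaim requesting the ReadWriteMany access mode.
# The claim name, namespace, and size are made-up example values.


def rwx_claim(name, namespace, size):
    """Return a PVC manifest, as a dict, that asks for an RWX volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "resources": {"requests": {"storage": size}},
        },
    }


claim = rwx_claim("shared-data", "team-a", "10Gi")
print(json.dumps(claim, indent=2))
```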

Automated Application of Kubernetes Resources to Namespaces

EKF now provides an automated way to apply a skeleton of Kubernetes resources (Secrets, ConfigMaps, ServiceAccounts, RoleBindings, PodDefaults) to every namespace. This capability enables automated access to Rok or to external services, such as an on-prem object store or database, via PodDefaults. It also enables automated Rok snapshots so that users don’t accidentally lose data. Note that admins can configure the snapshot interval by editing a ConfigMap.
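
Conceptually, applying a skeleton means stamping the same set of resource templates into each namespace. The sketch below shows that idea only. The template contents are invented for illustration: in particular, the ConfigMap name and its `interval` key are hypothetical and do not reflect the actual EKF configuration.

```python
import copy

# Hypothetical sketch: stamp a fixed "skeleton" of resources into namespaces.
# Resource names and the "interval" key are invented for illustration.
SKELETON = [
    {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "snapshot-settings"},
        "data": {"interval": "24h"},
    },
    {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": "example-sa"},
    },
]


def render_skeleton(namespaces, skeleton=SKELETON):
    """Return one copy of every skeleton resource per namespace."""
    rendered = []
    for namespace in namespaces:
        for template in skeleton:
            # Deep-copy so the shared templates are never mutated.
            resource = copy.deepcopy(template)
            resource["metadata"]["namespace"] = namespace
            rendered.append(resource)
    return rendered


for resource in render_skeleton(["team-a", "team-b"]):
    meta = resource["metadata"]
    print(meta["namespace"], resource["kind"], meta["name"])
```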

Configurable Kale Marshal Volume Size

Kale’s marshal volume size is now configurable, giving users greater control over the infrastructure Kale uses.

Autoscaler Support for Scale-In

We’ve extended the Kubernetes autoscaler to support scale-in when using local volumes. An upstream contribution to the Kubernetes project will follow.

More User Management Options

The user management improvements in this release focus on making EKF easier to administer, so that machine learning models can be deployed more efficiently with the same level of security.

Support for PingID and Okta

We expanded our identity management options with PingID and Okta. Arrikto Enterprise Kubeflow already integrates with Google and GitLab.

Support for Istio Authorization Based on Groups

You can now authorize groups inherited from the identity provider to access EKF resources via Istio. Our implementation is based on a fork of Istio. An upstream contribution will follow.

Improvements to Interfaces

APIs

  • You can now set limits and requests on resources via the Kale API.
  • You can now make predictions using an existing KFServing inference service via the Kale API.

UI

  • You can now monitor the last activity in your Notebook Servers through the Notebooks UI.
  • Finally, we revamped the Kubeflow central dashboard to make it responsive and added a number of information panels that place tutorials, documentation, and other resources at your fingertips.

About Arrikto

At Arrikto, we are active members of the Kubeflow community, having made significant contributions to the latest 1.4 release. Our projects and products include:

  • Kubeflow as a Service is the easiest way to get started with Kubeflow in minutes! It comes with a free 7-day trial (no credit card required).
  • Enterprise Kubeflow (EKF) is a complete machine learning operations platform that simplifies, accelerates, and secures the machine learning model development life cycle with Kubeflow.
  • Rok is a data management solution for Kubeflow. Rok’s built-in Kubeflow integration simplifies operations and increases performance, while enabling data versioning, packaging, and secure sharing across teams and cloud boundaries.
  • Kale is a workflow tool for Kubeflow that orchestrates all of Kubeflow’s components seamlessly.