Kubeflow, an open-source machine learning platform built on Kubernetes, has become widely adopted by data scientists and machine learning engineers because it provides a standardized, integrated set of tools covering data preparation, model training, hyperparameter tuning, model deployment, and monitoring, significantly simplifying the process of bringing AI models into production.

However, many developers still run into a problem when using Kubeflow: GPU resources cannot be partitioned. Because Kubernetes allocates GPUs as whole devices, a single container occupies an entire GPU, so when a project needs less compute, part of that GPU's capacity sits idle and goes unused. Today's article provides a hands-on guide to using INFINITIX's ixGPU module to perform GPU partitioning on the Kubeflow platform.

Leveraging the ixGPU Module on the Kubeflow Platform for Flexible GPU Partitioning

Custom PodDefault YAML Configuration

We’re using a Tesla P4 GPU for this GPU partitioning demonstration. Before applying the ixGPU module to Kubeflow, you’ll need to prepare your YAML files. We’ve set up two YAML files for this: poddefault1.yaml and poddefault2.yaml.

As shown in the image, running the following two commands displays the detailed PodDefault configurations defined in these two YAML files:

cat poddefault1.yaml 
cat poddefault2.yaml

These two YAML files define distinct PodDefault configurations. They’re designed to allocate different sizes of Tesla P4 GPU memory resources—specifically, 1GB and 3GB—to Pods within the Kubeflow environment that are tagged with specific labels.
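
For reference, here is a minimal sketch of what one of these PodDefault definitions might look like. The profile namespace, the selector label, and especially the ixgpu/memory annotation key are placeholders for illustration only; the actual keys and values come from the ixGPU module's documentation.

apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: tesla-p4-1gb
  namespace: kubeflow-user-example-com    # your Kubeflow profile namespace
spec:
  desc: Tesla P4 1GB                      # text shown under Configurations in the notebook UI
  selector:
    matchLabels:
      tesla-p4-1gb: "true"                # pods carrying this label receive the settings below
  annotations:
    ixgpu/memory: "1GB"                   # hypothetical key; use the annotation the ixGPU module actually reads

poddefault2.yaml would follow the same pattern, with its own name, selector label, and a 3GB value.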

Next, you’ll use the kubectl apply command to read the defined PodDefault configurations (which contain the GPU partitioning information) from the YAML files and apply them to your Kubernetes cluster.

kubectl apply -f poddefault1.yaml
kubectl apply -f poddefault2.yaml

Afterward, run the kubectl get poddefault -A command to list the PodDefault resources across all namespaces in your current Kubernetes cluster and confirm that the two new configurations have been applied.

Kubeflow Notebooks GPU Partitioning

Once the preparatory steps are complete, we’ll head over to the Kubeflow platform to create a new notebook container.

Click on Advanced Options, and under the Configurations section, you’ll see the two new GPU resource specifications we just added: Tesla P4 1GB and Tesla P4 3GB.

Once both containers are created (notebook1 with 1GB and notebook2 with 3GB), we can check whether the GPU resources have truly been partitioned. Click CONNECT to open notebook1’s Jupyter Lab.

Next, we can run the !nvidia-smi command to verify the resources available inside the container. As shown in the image, notebook1 sees 1GB of GPU memory.

Follow the same steps for notebook2: click CONNECT to launch Jupyter Lab.

After entering the !nvidia-smi command, you’ll see that this container has 3GB of GPU memory.

There you have it! Flexible GPU partitioning on Kubeflow is complete! Pretty straightforward, right?

Next up, we’ll walk you through how to apply the ixGPU module within Kubeflow Pipelines. This lets you allocate exactly the GPU memory each task in your workflow needs, achieving more precise and efficient allocation that maximizes your compute utilization.

Integrating the ixGPU Module into Kubeflow Pipelines for High-Efficiency GPU Resource Allocation

In this example, we’ve set up a simple Pipeline with two stages: Training and Testing.

Pre-configuring GPU Resource Allocation for the Machine Learning Pipeline

In pipeline.py, the Python script used to define our Kubeflow Pipeline, we first define the GPU resource allocation for the Training and Testing steps. Since Training has higher compute demands, we’ve allocated 3GB of GPU memory; for Testing, which requires less compute, we’ve allocated 1GB. Please refer to the image below for the code.

Next, we’ll enter python pipeline.py in the TERMINAL to compile the entire workflow. After it runs, you’ll see mnist-pipeline.yaml appear in the left-hand file list. This indicates that our Python script ran successfully and that the machine learning workflow defined in the Python code has been compiled into a YAML file conforming to the Kubeflow Pipelines specification.
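
The exact pipeline.py used in this walkthrough is the one shown in the screenshots; purely as a rough sketch with the KFP v1 SDK, the step definitions and the compile call might look something like the following. The container images and the infinitix.com/vgpu-memory resource name are placeholders, not the real identifiers the ixGPU module uses on your cluster.

import kfp
from kfp import dsl

# Placeholder images; substitute the images that contain your training and testing code.
TRAIN_IMAGE = 'registry.example.com/mnist-train:latest'
TEST_IMAGE = 'registry.example.com/mnist-test:latest'


def training_op():
    # Training step: request 3GB of partitioned GPU memory.
    # 'infinitix.com/vgpu-memory' is a hypothetical extended-resource key;
    # use whatever resource name the ixGPU module actually exposes.
    op = dsl.ContainerOp(name='training', image=TRAIN_IMAGE)
    op.container.add_resource_limit('infinitix.com/vgpu-memory', '3')
    return op


def testing_op():
    # Testing step: request 1GB of partitioned GPU memory.
    op = dsl.ContainerOp(name='testing', image=TEST_IMAGE)
    op.container.add_resource_limit('infinitix.com/vgpu-memory', '1')
    return op


@dsl.pipeline(name='mnist-pipeline',
              description='MNIST training and testing with partitioned GPU memory')
def mnist_pipeline():
    train = training_op()
    test = testing_op()
    test.after(train)  # run the testing step only after training has finished


if __name__ == '__main__':
    # Compile the workflow into the YAML file that gets uploaded to Kubeflow Pipelines.
    kfp.compiler.Compiler().compile(mnist_pipeline, 'mnist-pipeline.yaml')

Running python pipeline.py against a script shaped like this is what produces the mnist-pipeline.yaml file that appears in the file list.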

Kubeflow Platform Pipeline Configuration

Next, head over to the Pipelines page on Kubeflow and click + Upload pipeline in the top-right corner to create a new pipeline version.

Select the “Upload a file” option and upload the mnist-pipeline.yaml file you just compiled. This is the crucial step that deploys the locally developed and defined machine learning workflow to the Kubeflow cluster for execution and management.

After a successful creation, the Kubeflow Pipelines user interface will display an overview page for the newly uploaded mnist-pipeline. You can then click + Create run to start executing your machine learning Pipeline.

Once the training step is complete, you can click into the run to view its logs.

You can see that in this machine learning model training workflow, we successfully completed the training task using 3GB of GPU memory resources.

Next, you can check the execution logs for the testing task.

We successfully partitioned the GPU and allocated 1GB of resources to the testing task.

Conclusion

That wraps up our complete tutorial on applying the ixGPU module within the Kubeflow platform!

INFINITIX’s ixGPU module can help enterprises utilize GPU compute resources more flexibly. After seeing the GPU partitioning capabilities of the ixGPU module combined with Kubeflow, are you also interested in this feature? Feel free to contact us to learn more! 

To better understand the core advantages of the ixGPU module, please refer to this article: Breaking Through Kubeflow Limitations: INFINITIX ixGPU Module Achieves Flexible GPU Partitioning