{"id":10592,"date":"2025-07-28T16:19:32","date_gmt":"2025-07-28T08:19:32","guid":{"rendered":"https:\/\/ai-stack.ai\/kubeflow-ixgpu-tutorial"},"modified":"2025-07-28T17:50:52","modified_gmt":"2025-07-28T09:50:52","slug":"kubeflow-ixgpu-tutorial","status":"publish","type":"post","link":"https:\/\/ai-stack.ai\/en\/kubeflow-ixgpu-tutorial","title":{"rendered":"Master Kubeflow GPU Partitioning: Hands-on with INFINITIX ixGPU Module for High-Efficiency Resource Utilization!"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.kubeflow.org\/\" data-type=\"link\" data-id=\"https:\/\/www.kubeflow.org\/\" target=\"_blank\" rel=\"noopener\">Kubeflow<\/a>, an open-source platform built on <a href=\"https:\/\/kubernetes.io\/\" data-type=\"link\" data-id=\"https:\/\/kubernetes.io\/\" target=\"_blank\" rel=\"noopener\">Kubernetes<\/a>, has become widely used by many data scientists and machine learning engineers. This is because it provides a standardized and integrated set of tools covering various stages, including data preparation, model training, hyperparameter tuning, model deployment, and monitoring, significantly simplifying the process of bringing AI models into production.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">However, many developers still encounter the problem of being unable to partition GPU resources when using Kubeflow. Due to its inherent characteristics, a single container will occupy an entire GPU&#8217;s resources. When a project requires less compute power, it leads to some GPU capacity sitting idle and not being effectively utilized. 
Today&#8217;s article will provide a hands-on guide on how to use INFINITIX&#8217;s ixGPU module to perform GPU partitioning on the Kubeflow platform.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Leveraging the ixGPU Module on the Kubeflow Platform for Flexible GPU Partitioning<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Custom YAML PodDefault Configuration<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">We&#8217;re using a <strong>Tesla P4 GPU<\/strong> for this GPU partitioning demonstration. Before applying the ixGPU module to Kubeflow, you&#8217;ll need to prepare your YAML files. We&#8217;ve set up two YAML files for this: poddefault1.yaml and poddefault2.yaml.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As shown in the image, by running the following two commands, we can see the detailed configurations of the PodDefault resources defined within these two YAML files:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat poddefault1.yaml<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>cat poddefault2.yaml<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">These two YAML files define distinct PodDefault configurations. 
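<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For reference, a PodDefault of this kind might look roughly like the sketch below. This is an illustrative outline only: the annotation key ixgpu.infinitix.ai\/memory, the resource name, and the namespace are placeholders we&#8217;ve assumed for the example, not necessarily the exact identifiers the ixGPU module uses.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>apiVersion: kubeflow.org\/v1alpha1\nkind: PodDefault\nmetadata:\n  name: tesla-p4-1gb\n  namespace: kubeflow-user-example-com\nspec:\n  desc: Tesla P4 1GB\n  selector:\n    matchLabels:\n      tesla-p4-1gb: \"true\"\n  annotations:\n    ixgpu.infinitix.ai\/memory: 1GB<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">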
They&#8217;re designed to allocate different sizes of Tesla P4 GPU memory resources\u2014specifically, 1GB and 3GB\u2014to Pods within the Kubeflow environment that are tagged with specific labels.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXedVB4s3dQU-iAEOoHgA7ZAsXE2lXU6NFZA_4stGFbnL1cAKEYPguj8ZHbD88B9byqfUXxo9NuS71CCCV0-A2F08qbKVnqUdASHw49z0OVNR5vAG0KhcOQKQD4EXI8w59mBkGzi0g?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Next, you&#8217;ll use the kubectl apply command to read the defined PodDefault configurations (which contain the GPU partitioning information) from the YAML files and apply them to your Kubernetes cluster.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f poddefault1.yaml<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl apply -f poddefault2.yaml<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Afterward, run the kubectl get poddefault -A command to query detailed information about all PodDefault resources across all namespaces in your current Kubernetes cluster.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXdJu1fgnq5d-r9-_pYfa_yy-E-aDsVT0ucgbyCdhLUaA_nH_NAyWUNmXywnvb7zStO1I9zAYjlTdnY4xLsToVGU5mqRUzWx6R18bqv4rSTSaUsJlpDeW6jaRG4ej12-Bef043TmCA?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Kubeflow Notebooks GPU Partitioning<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Once the preparatory steps are complete, we&#8217;ll head over to the Kubeflow platform to create a new notebook container.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" 
src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXe4XIh7cTeeSRzlUkSqhg1UCmIlc0kFDbtI5NGMRKAr031XVXug6M7uuPOaOzpoJePWTr0MlwfmA1lOJlUl0zTnhUJBDq8TgjsPBKjm8mnPFeHHsGv0Eiv5ZlRC27YRyCn_F1Ez?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Click on <strong>Advanced Options<\/strong>, and under the <strong>Configurations<\/strong> section, you&#8217;ll see the two new GPU resource specifications we just added: <strong>Tesla P4 1GB<\/strong> and <strong>Tesla P4 3GB<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXdrT_lX9Q29ua-5D5WbtyZuobs6-6T09miuFQk6_rX9WiiLEKbjUWl4j7MrLBd0D-GoJyGnp5EpsoGsP3v_qi-gqhVv9OGw12-apHD_7CJIAs7F_x1NspsJOSh6ULDNMpr7mo2xNQ?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once both containers are created (notebook1 with 1GB and notebook2 with 3GB), we can check if the container resources have truly been partitioned. Click <strong>CONNECT<\/strong> to open notebook1&#8217;s Jupyter Lab.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXeeK_mUXlx_l2Ixk-c-OfTdwFsFrrYilrV7OdR6VDLIihM1PYjSxzFarRB3NORLva5MrNY526A48t-X-m2b58MjhPQ9zDmQggLLfJxWT5aUc_zWb08JB2fdsJ6Y9rF1jHfxTDBX2Q?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we can use the !nvidia-smi command to verify the current resources within the container. 
As shown in the image, the memory resource in notebook1 is 1GB.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXeSXxefED759gZsHvgZ1qsIMvVP-yxuaZ6YDxAoHeo1ciSmqjyISNUKvdLeRlrztIcu-HMtMdqLYVNWzPicnlkS4S22bdmyHrvD32KobnpJ58QoqQZmxD-ElejLbV_qGitqekky3w?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Follow the same steps for notebook2: click <strong>CONNECT<\/strong> to launch Jupyter Lab.<img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXfJ2GTB_NwKGBSM-RfoChig3V1wHDIWks5wB1wgKvDDHoRZwGuWKzOeehhJ8alMx2msaWjtsS43t52Pt5HFstQTN3NtR25CBIQPWeSqsuHoHc3CaJEn_xsphjN7hFeznWWfWk_O?key=XtIKFmg3kQ0OaKlKQs7Qlg\" width=\"677\" height=\"237\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After entering the !nvidia-smi command, you&#8217;ll see that this container has 3GB of memory resources.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXdi8NwwKNnr4LLHLDMCzJySJHKL89fk4gEyt7KczTHcBEMRA-JTFKIuccawd7vWWk0TkcxDSkcXk7cCqT9GJFrFdF_OpnjWIcVQ8dtJq3S5ACT55Uu6BZtyesoK7ypdREn7lats3A?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">There you have it! Flexible GPU partitioning on Kubeflow is complete! Pretty straightforward, right?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next up, we&#8217;ll walk you through how to apply the ixGPU module within Kubeflow Pipelines. 
This will let us allocate the exact GPU memory resources needed for different tasks throughout your entire workflow, achieving even more precise and efficient flexible allocation to maximize your compute effectiveness.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Integrate the ixGPU Module into Kubeflow Pipeline for High-Efficiency GPU Resource Allocation<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In this example, we&#8217;ve set up a simple Pipeline with two stages: Training and Testing.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Machine Learning Pipeline GPU Resource Allocation Pre-configuration<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In pipeline.py, the Python script used to define our Kubeflow Pipeline, we first define the GPU resource allocation for the <strong>Training<\/strong> and <strong>Testing<\/strong> steps. Since Training has higher compute demands, we&#8217;ve allocated <strong>3GB of GPU memory<\/strong>; for Testing, which requires less compute, we&#8217;ve allocated <strong>1GB of resources<\/strong>. Please refer to the image below for the code.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXf3qf-RQnSsUUqsftjOKkIBOqkSyDYcFErCsfG3VmEX2m_MEQYkH6FaDey1FErcbVH4iZKyD999Olk0z630UzVwP3CZ4QJ6RCCivEFK2DmZ06ggLJ2qP7kUjIGzqdWKUZFfKe6a?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we&#8217;ll enter python pipeline.py in the TERMINAL to compile the entire workflow. After execution, you&#8217;ll see mnist-pipeline.yaml appear in the left-hand list. 
This indicates that our Python script has been successfully run, and the entire machine learning workflow we defined in the Python code has been compiled into a YAML file that conforms to Kubeflow Pipeline specifications.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXc_-Cvrjc6OW8Jcdyh2z-qckDMT5S6O63q8eCBfCCQbm1-OIx0v4TwY89UZ7oeCaMMjh3-9Ouyfqx3JtfyWQJA6brQCk0x58c2Tj2v2qcvHB6n5cokVB2Bwm-lnkDSm8xcKfD43?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Kubeflow Platform Pipeline Configuration<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Next, head over to the Pipelines page on Kubeflow and click + Upload pipeline in the top-right corner to create a new pipeline version.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXePvGSLReBcyrZo4M_LqVOk7H6jUVDYHLRtp10w-OFNoq5KbdIuJXElVjjHqIGKyvDKfdbsagiByh7HFIXIveFiF-VPG4-VLgO3zsp-UkwR3238cQWzxtrpSUhCEBEaAZI8wyw9nQ?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Select the <strong>&#8220;Upload a file&#8221;<\/strong> option and upload the mnist-pipeline.yaml file you just compiled. This is the crucial step for deploying the machine learning workflow, which was developed and defined locally, to the Kubeflow cluster for execution and management.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXd4HfuKtaCZSwMnrlUpLT3tjWEEmS5iYlxNmEiPHD--hEdjoaO2z7ZSJtA9gdkObv_yeNpN2agHc31wSihE82ISu7E7tuco5Y4hhgrWT86htLlK6Kee5dKi1pAGsN936ifIajmaTw?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">After a successful creation, the Kubeflow Pipelines user interface will display an overview page for the newly uploaded mnist-pipeline. 
You can then click + Create run to start executing your machine learning Pipeline.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXdNoKaClJzFIG9nrnG-AQ4kR3R1JoM-4BwIJqhllNzxqCshPi-k0Zd3_MhmFh3K2sDwlZKyst24zIkTND-PjtN63pYt8QtaOALCEn5CS309UehJ9xbYSSrC4pNfmXrDa1PEOMa_Yw?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once the training process is complete, you can click in to view the logs.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXfr5uFMK3aPyYU3d8nYpAaXtAhnVBNx40_vvvXbUlAal0v9vngYJwymLbslFzjOmNQsnLhuro3A4NDf6WlIbwFZSddLj_47O_qFdYj4aKjfySdYUJBI9OhS81RR3miA2Zb574YDLw?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">You can see that in this machine learning model training workflow, we successfully completed the training task using 3GB of GPU memory resources.<img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXeLI6hZJH_XHzyjCoW9Nq3h0H7rZ4Iluu06E38TJnH24CBZ7-6Fyh-q26st_6ykBDRkgwYMRf2Yu-msv-hiWPsRzP4ZJA1TnO_2-O_PoEn7FmRMMNNuKTEERyr44RW1vJ8NRTZSDQ?key=XtIKFmg3kQ0OaKlKQs7Qlg\" width=\"677\" height=\"380\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next, you can check the execution logs for the testing task.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXeQE2Fa66rjrg_z8veV0v7RxCaxBOplFAuTVlDL6nGqTrwcSRWL4-2IcJ_KxgVIpoqG7v0EOtrM3UiGs0Fv-uAwutldx_MXa2w6Co27-Ck5yvrpB88dqMsxFdTwtb64RFv0A4bFWg?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">We successfully partitioned the GPU and allocated 1GB of resources to the testing task.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" 
src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXeNWMZeaq1eAx5hVM2emH1pBrbemTcEi204QuE_YP-XzrX1btdexYwfn6nQ4uHokH-nQk6y22bSrzycWfEyOoX2WHqzUTMB7CGvvFcxTmhxjxbz3q2MSJdPlJHkYm8ZS2nKUW3T?key=XtIKFmg3kQ0OaKlKQs7Qlg\" alt=\"\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">That wraps up our complete tutorial on applying the ixGPU module within the Kubeflow platform!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">INFINITIX&#8217;s ixGPU module can help enterprises utilize GPU compute resources more flexibly. After seeing the GPU partitioning capabilities of the ixGPU module combined with Kubeflow, are you also interested in this feature? Feel free to contact us to learn more!&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To better understand the core advantages of the ixGPU module, please refer to this article: <strong><a href=\"https:\/\/ai-stack.ai\/en\/kubeflow-ixgpu-gpu-partitioning\" data-type=\"link\" data-id=\"https:\/\/ai-stack.ai\/en\/kubeflow-ixgpu-gpu-partitioning\">Breaking Through Kubeflow Limitations: INFINITIX ixGPU Module Achieves Flexible GPU Partitioning<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many developers still encounter issues with GPU resource partitioning when using Kubeflow. 
This article will guide you step-by-step on how to perform Kubeflow GPU partitioning using Infinitix&#8217;s ixGPU module.<\/p>\n","protected":false},"author":253372381,"featured_media":10593,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_crdt_document":"","jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[96987598,96987570],"tags":[96988171,96988172,96988174],"class_list":["post-10592","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-product-features","category-technical-support","tag-ixgpu-en","tag-kubeflow-en","tag-gpu-partitioning"],"blocksy_meta":[],"acf":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/ai-stack.ai\/wp-content\/uploads\/2025\/07\/AI%E8%B3%87%E6%96%99%E4%B8%AD%E5%BF%83%E6%98%AF%E4%BB%80%E9%BA%BC-%E8%B7%9F%E5%82%B3%E7%B5%B1%E8%B3%87%E6%96%99%E4%B8%AD%E5%BF%83%E6%9C%89%E4%BB%80%E9%BA%BC%E5%B7%AE%E5%88%A5-5.png?fit=1920%2C1080&quality=100&ct=202603031250&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/ph344V-2KQ","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/posts\/10592","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/users\/253372381"}],"replies":[{"embeddable":true,"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/comments?post=10592"}],"version-history":[{"count":0,"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/posts\/10592\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ai-stack.ai\/en\/wp-js
on\/wp\/v2\/media\/10593"}],"wp:attachment":[{"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/media?parent=10592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/categories?post=10592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ai-stack.ai\/en\/wp-json\/wp\/v2\/tags?post=10592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}