-Check running state of all pods and services using below command
+Check the running state of all the pods using the command below:
.. code:: bash
- kubectl get pods --all-namespaces
- kubectl get svc --all-namespaces
-
+ ~$ kubectl get pods --all-namespaces
+
+ kubeflow cache-deployer-deployment-cf9646b9c-jxlqc 1/1 Running 0 53m
+ kubeflow cache-server-56d4959c9-sz948 1/1 Running 0 53m
+ kubeflow leofs-bfc4794f5-7xfdn 1/1 Running 0 56m
+ kubeflow metadata-envoy-deployment-9c7db86d8-7rlkf 1/1 Running 0 53m
+ kubeflow metadata-grpc-deployment-d94cc8676-mhw4l 1/1 Running 5 (47m ago) 53m
+ kubeflow metadata-writer-cd5dd8f7-6qsx6 1/1 Running 1 (46m ago) 53m
+ kubeflow minio-5dc6ff5b96-4f9xd 1/1 Running 0 53m
+ kubeflow ml-pipeline-85b6bf5f67-5x9lq 1/1 Running 2 53m
+ kubeflow ml-pipeline-persistenceagent-fc7c944d4-bjz5n 1/1 Running 1 (46m ago) 53m
+ kubeflow ml-pipeline-scheduledworkflow-676478b778-h42kx 1/1 Running 0 53m
+ kubeflow ml-pipeline-ui-76bc4d6c99-8rw9x 1/1 Running 0 53m
+ kubeflow ml-pipeline-viewer-crd-8574556b89-g5xw7 1/1 Running 0 53m
+ kubeflow ml-pipeline-visualizationserver-5d7c54f495-mhdtj 1/1 Running 0 53m
+ kubeflow mysql-5b446b5744-mcqlw 1/1 Running 0 53m
+ kubeflow workflow-controller-679dcfdd4f-c64bj 1/1 Running 0 53m
+ traininghost aiml-dashboard-667c546669-rslbz 1/1 Running 0 38m
+ traininghost aiml-notebook-5689459959-hd8r4 1/1 Running 0 38m
+ traininghost cassandra-0 1/1 Running 0 41m
+ traininghost data-extraction-bd7dc6747-98ddq 1/1 Running 0 39m
+ traininghost kfadapter-75c88574d5-ww7qb 1/1 Running 0 38m
+ traininghost modelmgmtservice-56874bfc67-ct6lk 1/1 Running 0 38m
+ traininghost tm-757bf57cb-rlx7v 1/1 Running 0 39m
+ traininghost tm-db-postgresql-0 1/1 Running 0 53m
+
+
+
+**Note: In the K Release, the dashboard is not supported. We recommend using cURL to interact with AIMLFW components.
+Details are provided in the following sections for each operation required for model training.**
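+
+For example, the registered models can later be listed with a cURL call to the Model Discovery endpoint described further below (the endpoint and port are taken from that section):
+
+.. code:: bash
+
+   curl --location 'http://<VM IP where AIMLFW is installed>:32006/ai-ml-model-discovery/v1/models'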
+
+
+Software Uninstallation & Upgrade
+---------------------------------
-Check the AIMLFW dashboard by using the following url
-Note: In K Release, dashboard is not supported. We recomment to use cURL to interact with AIMLFW components.
-Details are provided in further section for each operation required for model training.
+Run the following script to uninstall the ``traininghost``:
.. code:: bash
- http://localhost:32005/
+ bin/uninstall_traininghost.sh
-In case of any change required in the RECIPE_EXAMPLE/example_recipe_latest_stable.yaml file after installation,
-the following steps can be followed to reinstall with new changes.
+To update the AIMLFW deployment (for example, after changing the recipe file), follow the steps below so that the new changes are properly installed and integrated:
.. code:: bash
+ # Step 1: Uninstall the existing AIMLFW component
bin/uninstall.sh
- bin/install.sh -f RECIPE_EXAMPLE/example_recipe_latest_stable.yaml
-
-Software Uninstallation
------------------------
+ # Step 2: Update the RECIPE_EXAMPLE/example_recipe_latest_stable.yaml file
+ # Make necessary changes to the recipe file here
-.. code:: bash
+ # Step 3: Reinstall the AIMLFW component with the updated recipe
+ bin/install.sh -f RECIPE_EXAMPLE/example_recipe_latest_stable.yaml
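+
+    # Step 4 (optional): verify that all pods return to the Running state, as shown in the section above
+    kubectl get pods --all-namespaces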
- bin/uninstall_traininghost.sh
.. _install-influx-db-as-datalake:
.. _reference2:
-Install Influx DB as datalake (Optional)
-----------------------------------------
+DataLake Installation
+----------------------
+
+In the context of AIMLFW, a datalake can be used to store and manage large amounts of data generated by various sources.
+
+This section provides a detailed guide on how to install and configure a datalake for AIMLFW. Currently, the following methods are supported to ingest data for model training: a standalone InfluxDB installation, or preparing the Non-RT RIC DME as a data source for AIMLFW.
+
-Standalone Influx DB installation can be used if DME is not used as a data source.
+1. Install Influx DB as datalake (Optional)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Standalone Influx DB can be installed using the following commands:
.. code:: bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-release bitnami/influxdb --version 5.13.5
- kubectl exec -it <pod name> bash
+
-From below command we can get username, org name, org id and access token
+ ~$ kubectl get pods
-.. code:: bash
+ NAME READY STATUS RESTARTS AGE
+ my-release-influxdb-85888dfd97-77dwg 1/1 Running 0 15m
- cat bitnami/influxdb/influxd.bolt | tr -cd "[:print:]"
+Use the following command to get the ``INFLUX_DB_TOKEN``, which is required when creating a feature group.
-eg: {"id":"0a576f4ba82db000","token":"xJVlOom1GRUxDNkldo1v","status":"active","description":"admin's Token","orgID":"783d5882c44b34f0","userID":"0a576f4b91edb000","permissions" ...
+.. code:: bash
-Use the tokens further in the below configurations and in the recipe file.
+ kubectl get secret my-release-influxdb -o jsonpath="{.data.admin-user-token}" | base64 --decode
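+
+   # Optional sketch: store the token in a shell variable for reuse in the following steps
+   INFLUX_DB_TOKEN=$(kubectl get secret my-release-influxdb -o jsonpath="{.data.admin-user-token}" | base64 --decode)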
-Following are the steps to add qoe data to Influx DB.
+**The following steps provide a detailed guide to onboard test data that can be used for model training.**
Execute the command below to create a bucket in InfluxDB:
.. code:: bash
- influx bucket create -n UEData -o primary -t <token>
+   # INFLUX_DB_TOKEN refers to the InfluxDB token collected in the previous step
+ kubectl exec -it <influxdb-pod-name> -- influx bucket create -n UEData -o primary -t <INFLUX_DB_TOKEN>
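+
+   # Optional: verify that the bucket was created (a hypothetical check, not required)
+   kubectl exec -it <influxdb-pod-name> -- influx bucket list -o primary -t <INFLUX_DB_TOKEN>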
+Note: This bucket name ``UEData`` will be referred to when creating the featureGroup in further steps.
-Install the following dependencies
+
+Install the following dependencies, which are required for parsing and onboarding data from the ``.csv`` file:
.. code:: bash
git clone -b f-release https://gerrit.o-ran-sc.org/r/ric-app/qp
cd qp/qp
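+
+   # Assumption: insert.py needs the pandas and influxdb-client Python packages;
+   # install them if they are not already present
+   pip3 install pandas influxdb-client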
-Update :file:`insert.py` file with the following content:
+Overwrite the :file:`insert.py` file with the following content:
.. code-block:: python
class INSERTDATA:
def __init__(self):
- self.client = InfluxDBClient(url = "http://localhost:8086", token="<token>")
+ self.client = InfluxDBClient(url = "http://localhost:8086", token="<INFLUX_DB_TOKEN>")
def explode(df):
populatedb()
-Update ``<token>`` in :file:`insert.py` file
+Update ``<INFLUX_DB_TOKEN>`` in :file:`insert.py` with the InfluxDB token collected in the previous step.
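+
+A minimal sketch of how this substitution could be done from the shell, assuming the token was captured in the ``INFLUX_DB_TOKEN`` variable as in the earlier sketch:
+
+.. code:: bash
+
+   sed -i "s|<INFLUX_DB_TOKEN>|${INFLUX_DB_TOKEN}|" insert.py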
-Follow below command to port forward to access Influx DB
+Run the command below to port-forward, so that the script can access Influx DB (no NodePort is exposed for InfluxDB):
.. code:: bash
kubectl port-forward svc/my-release-influxdb 8086:8086
-To insert data:
+Run the updated :file:`insert.py` script to onboard the test data to the local InfluxDB:
.. code:: bash
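+
+   # Assumption: run from the qp/qp directory where insert.py was updated
+   python3 insert.py
+
+To verify that the data was inserted, query the bucket:
+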
.. code:: bash
- influx query 'from(bucket: "UEData") |> range(start: -1000d)' -o primary -t <token>
-
-
+   # INFLUX_DB_TOKEN refers to the InfluxDB token collected in the previous step
+ kubectl exec -it <influxdb-pod-name> -- influx query 'from(bucket: "UEData") |> range(start: -1000d)' -o primary -t <INFLUX_DB_TOKEN>
-.. _reference3:
-
-Prepare Non-RT RIC DME as data source for AIMLFW (optional)
------------------------------------------------------------
-Bring up the RANPM setup by following the steps mentioned in the file install/README.md present in the repository `RANPM repository <https://gerrit.o-ran-sc.org/r/admin/repos/nonrtric/plt/ranpm>`__
+ Result: _result
+ Table: keys: [_start, _stop, _field, _measurement]
+ _start:time _stop:time _field:string _measurement:string _time:time _value:int
+ ------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ --------------------------
+ 2022-05-18T12:52:18.008858111Z 2025-02-11T12:52:18.008858111Z availPrbDl liveCell 2025-01-23T17:01:22.563381000Z 45
+ 2022-05-18T12:52:18.008858111Z 2025-02-11T12:52:18.008858111Z availPrbDl liveCell 2025-01-23T17:01:22.573381000Z 91
+ 2022-05-18T12:52:18.008858111Z 2025-02-11T12:52:18.008858111Z availPrbDl liveCell 2025-01-23T17:01:22.583381000Z 273
+ 2022-05-18T12:52:18.008858111Z 2025-02-11T12:52:18.008858111Z availPrbDl liveCell 2025-01-23T17:01:22.593381000Z 53
-Once all the pods are in running state, follow the below steps to prepare ranpm setup for AIMLFW qoe usecase data access
-The scripts files are present in the folder demos/hrelease/scripts of repository `AIMLFW repository <https://gerrit.o-ran-sc.org/r/admin/repos/aiml-fw/aimlfw-dep>`__
-
-Note: The following steps need to be performed in the VM where the ranpm setup is installed.
-
-.. code:: bash
- git clone "https://gerrit.o-ran-sc.org/r/aiml-fw/aimlfw-dep"
- cd aimlfw-dep/demos/hrelease/scripts
- ./get_access_tokens.sh
-Output of ./get_access_tokens.sh can be used during feature group creation step.
+.. _reference3:
-Execute the below script
-
-.. code:: bash
-
- ./prepare_env_aimlfw_access.sh
-
-Add feature group from AIMLFW dashboard, example on how to create a feature group is shown in this demo video: `Feature group creation demo <https://lf-o-ran-sc.atlassian.net/wiki/download/attachments/13697168/feature_group_create_final_lowres.mp4?api=v2>`__
-
-Execute below script to push qoe data into ranpm setup
-
-.. code:: bash
-
- ./push_qoe_data.sh <source name mentioned when creating feature group> <Number of rows> <Cell Identity>
-
-Example for executing above script
-
-.. code:: bash
-
- ./push_qoe_data.sh gnb300505 30 c4/B2
-
-Steps to check if data is upload correctly
-
-
-.. code:: bash
-
- kubectl exec -it influxdb2-0 -n nonrtric -- bash
- influx query 'from(bucket: "pm-logg-bucket") |> range(start: -1000000000000000000d)' |grep pdcpBytesDl
+Prepare Non-RT RIC DME as data source for AIMLFW (optional)
+-----------------------------------------------------------
-Steps to clear the data in InfluxDB
+Please refer to the `RANPM Installation Guide <https://docs.o-ran-sc.org/projects/o-ran-sc-aiml-fw-aimlfw-dep/en/latest/ranpm-installation.html>`__ to install the Non-RT RIC RANPM setup and prepare the DME as a data source for AIMLFW.
-.. code:: bash
-
- kubectl exec -it influxdb2-0 -n nonrtric -- bash
- influx delete --bucket pm-logg-bucket --start 1801-01-27T05:00:22.305309038Z --stop 2023-11-14T00:00:00Z
-Feature group creation
+Feature Group Creation
----------------------
-From AIMLFW dashboard create feature group (Training Jobs-> Create Feature Group ) Or curl
+A Feature Group is a logical entity that represents a structured dataset, often stored in a Feature Store, to ensure consistency and reusability across different ML models and pipelines.
+
-NOTE: Here is a curl request to create feature group using curl
+Following is the cURL request to create a feature group.
.. code:: bash
--data '{
"featuregroup_name": "<Name of the feature group>",
"feature_list": "<Features in a comma separated format>",
- "datalake_source": "InfluxSource",
+ "datalake_source": "<DATALAKE_SOURCE>",
"enable_dme": <True for DME use, False for Standalone Influx DB>,
"host": "<IP of VM where Influx DB is installed>",
"port": "<Port of Influx DB>",",
- "dme_port": "",
+   "dme_port": "<If enable_dme is True, the NodePort of the Information Service (in RANPM)>",
"bucket": "<Bucket Name>",
"token": "<INFLUX_DB_TOKEN>",
- "source_name": "<any source name. but same needs to be given when running push_qoe_data.sh>",
- "measured_obj_class": "",
- "measurement": "<Measurement of the db>",
+   "source_name": "<If enable_dme is True, any source name; the same name must be given when running push_qoe_data.sh>",
+ "measured_obj_class": "<Applicable in case of DME>",
+ "measurement": "<Measurement of the db that contains your features>",
"db_org": "<Org of the db>"
}'
-NOTE: Below are some example values to be used for the DME based feature group creation for qoe usecase
+
+Below are two examples covering the supported scenarios for data ingestion.
+
+**1. Non-RT RIC DME based feature group creation for the QoE use case**
.. code:: bash
"db_org": "est"
} '
-NOTE: Below are some example values to be used for the standalone influx DB creation for qoe usecase. Dme is not used in this example.
+**2. Standalone Influx DB based feature group creation for the QoE use case**
.. code:: bash
"db_org": "primary"
}'
-Register Model (compulsory)
----------------------------
+Register Model
+---------------
+
+A model MUST be registered with the Model Management Service (MME) before submitting any training request.
+A model is uniquely identified by modelName and modelVersion.
+Following is a sample cURL request for registering the model:
-Register the model using the below steps using Model management service for training.
.. code:: bash
}
}'
+   # inputDataType & outputDataType represent the input (features) and output of the trained model.
+   # Note: Currently, outputDataType is not functionally used in the implementation.
+
Model Discovery
---------------
-Model discovery can be done using the following API endpoint:
+This section describes model discovery and its various options.
-
-To fetch all registered models, use the following API endpoint:
+a. To fetch all registered models, use the following API endpoint:
.. code:: bash
curl --location 'http://<VM IP where AIMLFW is installed>:32006/ai-ml-model-discovery/v1/models'
-To fetch models with model name , use the following API endpoint:
+b. To fetch models with modelName, use the following API endpoint:
.. code:: bash
curl --location 'http://<VM IP where AIMLFW is installed>:32006/ai-ml-model-discovery/v1/models?model-name=<model_name>'
-To fetch specific model, use the following API endpoint:
+c. To fetch a specific model (by name and version), use the following API endpoint:
.. code:: bash
   curl --location 'http://<VM IP where AIMLFW is installed>:32006/ai-ml-model-discovery/v1/models?model-name=<model_name>&model-version=<model_version>'
+Onboarding Training/Re-Training Pipelines
+-------------------------------------------
+
+Training and retraining pipelines in AIMLFW (AI/ML Framework for O-RAN SC) are structured sequences of steps designed to train or retrain ML models. These pipelines automate the execution of data processing, model training, evaluation, and storage, ensuring a streamlined workflow.
+
+1. Onboard Pre-Existing Pipeline
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
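+
+The pre-existing QoE pipeline does not come pre-uploaded; as noted in the training job creation section below, it can be onboarded by creating a training function and running the qoe-pipeline notebook.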
+
+
+2. Onboard Custom Pipeline
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+| To use a training/retraining pipeline in AIMLFW, it needs to be onboarded into the system. This involves the following steps:
+| **a. Pipeline Definition**: A pipeline must be defined in code (e.g., Python) using Kubeflow Pipelines SDK. It includes all necessary ML steps, such as data ingestion, preprocessing, training, and model deployment.
+| **b. Pipeline Registration**: The pipeline must be registered in Kubeflow Pipelines so that AIMLFW can utilize it. This is done by compiling the pipeline code and uploading it to the pipeline repository.
+| **c. Pipeline Invocation**: During training instance creation, users must specify the required pipeline. AIMLFW uses the registered pipeline to launch a training/retraining job.
+
+Following is sample *pseudo-code* for a custom pipeline which a user can implement and onboard.
+
+.. code:: python
+
+ from kfp import dsl
+ from kfp.compiler import Compiler
+ from kubernetes import client as k8s_client
+
+ @dsl.pipeline(
+ name="Model Training Pipeline",
+ description="A sample pipeline for training a machine learning model"
+ )
+ def training_pipeline():
+        # Implement the training pipeline steps (e.g., data ingestion, preprocessing, training) here
+        pass
+
+
+ # Compile the pipeline to yaml-file
+ Compiler().compile(training_pipeline, "<OutputFile.yaml>")
+
+ # Upload Pipeline to AIMLFW
+ import requests
+ requests.post("http://<VM-Ip where AIMLFW is installed>:32002/pipelines/<Training_Pipeline_Name>/upload", files={'file':open("<OutputFile.yaml>",'rb')})
+
+
+
+One can refer to the `kubeflow documentation <https://www.kubeflow.org/docs/components/pipelines/>`__ when implementing a pipeline.
+
+
+
Training job creation with DME or Standalone InfluxDB as data source
--------------------------------------------------------------------
-#. AIMLFW should be installed by following steps in section :ref:`Software Installation and Deployment <reference1>`
-#. RANPM setup should be installed and configured as per steps mentioned in section :ref:`Prepare Non-RT RIC DME as data source for AIMLFW <reference3>`
-#. After training job is created and executed successfully, model can be deployed using steps mentioned in section :ref:`Deploy trained qoe prediction model on Kserve <reference4>` or
- :ref:`Steps to deploy model using Kserve adapter <reference6>`
-
NOTE: The QoE training function does not come pre-uploaded; you need to go to training functions, create a training function, and run the qoe-pipeline notebook.
+A training job for the registered model can then be created using the following cURL request:
+
+.. code:: bash
+
+ curl --location 'http://<VM IP where AIMLFW is installed>:32002/ai-ml-model-training/v1/training-jobs' \
+ --header 'Content-Type: application/json' \
+ --data '{
+ "modelId":{
+ "modelname": "modeltest15",
+ "modelversion": "1"
+ },
+ "model_location": "",
+ "training_config": {
+ "description": "trainingjob for testing",
+ "dataPipeline": {
+            "feature_group_name": "<Name of the FeatureGroup created>",
+ "query_filter": "<This query-filter will be used to filter/transform your features>",
+ "arguments": "{'epochs': 1}"
+ },
+ "trainingPipeline": {
+ "training_pipeline_name": "qoe_Pipeline_testing_1",
+ "training_pipeline_version": "qoe_Pipeline_testing_1",
+ "retraining_pipeline_name":"qoe_Pipeline_retrain",
+ "retraining_pipeline_version":"2"
+ }
+ },
+ "training_dataset": "",
+ "validation_dataset": "",
+ "notification_url": "",
+ "consumer_rapp_id": "",
+ "producer_rapp_id": ""
+ }'
+
+
.. code:: bash
curl --location 'http://<VM IP where AIMLFW is installed>:32002/ai-ml-model-training/v1/training-jobs' \