To use OpenVINO with your Intel integrated GPU, you typically follow a few main steps: install it, check your hardware, prepare a model, convert the model to OpenVINO format, and run inference on CPU/GPU.

Below is a simple beginner workflow.


1️⃣ Install OpenVINO

The easiest way is with Python + pip.

Install core OpenVINO

pip install openvino

Install extra tools (model conversion, optimization; note that openvino-dev is deprecated in recent releases, where conversion ships inside the core openvino package)

pip install openvino-dev

Verify installation:

python -c "import openvino as ov; print(ov.__version__)"

2️⃣ Check available hardware

OpenVINO can use:

  • CPU

  • Intel integrated GPU

  • Intel NPU (on recent Core Ultra chips)

Run this Python script:

import openvino as ov

core = ov.Core()
print(core.available_devices)

Example output:

['CPU', 'GPU']

If you see GPU in the list, your Intel iGPU is detected (this requires up-to-date Intel graphics drivers).


3️⃣ Get a model

OpenVINO works with models from:

  • Hugging Face

  • PyTorch

  • TensorFlow

  • ONNX

Example small models:

  • Phi-3 Mini

  • Qwen2.5 3B

These smaller models fit more comfortably within an integrated GPU's shared memory.
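For Hugging Face LLMs, the usual route is the optimum-intel exporter, which downloads a model and converts it to OpenVINO IR in one command. A sketch, assuming you have run pip install optimum[openvino]; the model ID and output folder here are just examples:

```shell
# Export Phi-3 Mini to OpenVINO IR with 4-bit weight compression
optimum-cli export openvino \
  --model microsoft/Phi-3-mini-4k-instruct \
  --weight-format int4 \
  phi3-mini-ov
```

The int4 weight format roughly quarters the memory footprint compared to fp16, which matters a lot on an iGPU.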


4️⃣ Convert the model to OpenVINO format

OpenVINO uses IR format (.xml + .bin).

Example conversion (Python):

import openvino as ov

ov_model = ov.convert_model("model.onnx")
ov.save_model(ov_model, "model.xml")

Or the legacy CLI (deprecated in recent releases):

mo --input_model model.onnx

Output:

model.xml
model.bin

These files are optimized for OpenVINO.


5️⃣ Run inference on GPU

Example Python code (the input name "input" and the shape below are placeholders; use your model's real input):

import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")

# Compile for the Intel iGPU ("CPU" also works)
compiled_model = core.compile_model(model, "GPU")

# Dummy input matching a typical image model's shape
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)

infer_request = compiled_model.create_infer_request()
result = infer_request.infer({"input": input_data})

print(result)

Device options:

"CPU"
"GPU"
"AUTO"

Best option:

compiled_model = core.compile_model(model, "AUTO")

This lets OpenVINO choose the fastest hardware.


6️⃣ Example architecture

Typical workflow:

HuggingFace model
        ↓
convert to ONNX
        ↓
convert to OpenVINO IR
        ↓
OpenVINO runtime
        ↓
CPU + Intel GPU acceleration

⚠️ Important for LLMs

Intel integrated GPUs have limited memory.

Typical limits:

Model size    Feasible on iGPU?
3B            good
7B            possible
13B           usually too large

So you should prefer small models.
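A quick back-of-the-envelope check: weight memory is roughly parameter count × bytes per weight, which shows why 3B models are comfortable and 13B models usually are not. A sketch; real usage adds KV-cache and activation overhead on top:

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB: params * bits / 8, in billions."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Compare common sizes at int4 and fp16 precision
for size in (3, 7, 13):
    print(f"{size}B @ int4 = {weight_memory_gb(size, 4):.1f} GB, "
          f"@ fp16 = {weight_memory_gb(size, 16):.1f} GB")
```

A 3B model at int4 needs about 1.5 GB of weights, while 13B at fp16 needs about 26 GB, well beyond what an iGPU's shared memory budget typically allows.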


✅ If your goal is OpenClaw replacement

A good stack is:

OpenClaw
   ↓
OpenVINO inference server
   ↓
Phi-3 / Qwen2
   ↓
Intel CPU + iGPU

