Use OpenVINO with your Intel integrated GPU
To use OpenVINO with your Intel integrated GPU, you typically follow a few main steps: install it, check your hardware, get a model, convert the model to OpenVINO format, and run inference on CPU/GPU.
Below is a simple beginner workflow.
1️⃣ Install OpenVINO
The easiest way is with Python + pip.
Install core OpenVINO
pip install openvino
Install extra tools (model conversion, optimization)
pip install openvino-dev
Verify installation:
python -c "import openvino as ov; print(ov.__version__)"
2️⃣ Check available hardware
OpenVINO can use:
- CPU
- Intel integrated GPU
- Intel NPU (on new chips)
Run this Python script:
import openvino as ov
core = ov.Core()
print(core.available_devices)
Example output:
['CPU', 'GPU']
If you see GPU, your Intel iGPU is supported.
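Since `core.available_devices` is just a list of strings, you can encode the "prefer GPU, fall back to CPU" choice in a small helper. `pick_device` below is a hypothetical convenience function, not part of the OpenVINO API; it only implements the selection logic described above.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first preferred device present in `available`.

    `available` is the list returned by core.available_devices.
    Falls back to "CPU", which OpenVINO always provides.
    """
    for dev in preferred:
        if dev in available:
            return dev
    return "CPU"

print(pick_device(["CPU", "GPU"]))  # GPU
print(pick_device(["CPU"]))         # CPU
```

You could then pass the result straight to `core.compile_model(model, pick_device(core.available_devices))`.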
3️⃣ Get a model
OpenVINO works with models from:
- Hugging Face
- PyTorch
- TensorFlow
- ONNX
Example small models:
- Phi-3 Mini
- Qwen2 3B
These smaller models are a better fit for the limited memory of integrated GPUs.
4️⃣ Convert the model to OpenVINO format
OpenVINO uses IR format (.xml + .bin).
Example conversion in Python (the mo tool ships with the openvino-dev package; newer releases also offer ov.convert_model):
from openvino.tools import mo
from openvino.runtime import serialize

ov_model = mo.convert_model("model.onnx")  # load and convert the ONNX model
serialize(ov_model, "model.xml")           # writes model.xml and model.bin
Or from the CLI:
mo --input_model model.onnx
Output:
model.xml
model.bin
These files are optimized for OpenVINO.
5️⃣ Run inference on GPU
Example Python code:
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "GPU")

# input_data must match the model's input shape and dtype, e.g.:
input_data = np.zeros(tuple(compiled_model.input(0).shape), dtype=np.float32)

infer_request = compiled_model.create_infer_request()
result = infer_request.infer({0: input_data})  # key is the input's index or name
print(result)
Device options:
"CPU"
"GPU"
"AUTO"
A convenient default:
compiled_model = core.compile_model(model, "AUTO")
With "AUTO", OpenVINO selects the best available device for you.
6️⃣ Example architecture
Typical workflow:
HuggingFace model
↓
convert to ONNX
↓
convert to OpenVINO IR
↓
OpenVINO runtime
↓
CPU + Intel GPU acceleration
⚠️ Important for LLMs
Intel integrated GPUs have limited memory.
Typical limits:
| Model size | Feasibility on an iGPU |
|---|---|
| 3B | good |
| 7B | possible |
| 13B | usually too large |
So you should prefer small models.
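A back-of-the-envelope estimate shows why: model weights take roughly parameters × bits-per-weight ÷ 8 bytes. The `weight_memory_gb` helper below is illustrative arithmetic for weights only (it ignores activations and the KV cache), not a measurement of any particular runtime.

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Rough weight footprint in GB: params × bits / 8 (decimal GB)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for size in (3, 7, 13):
    print(f"{size}B  FP16: {weight_memory_gb(size, 16):5.1f} GB"
          f"  INT4: {weight_memory_gb(size, 4):5.1f} GB")
```

By this estimate a 7B model needs about 14 GB at FP16 but only about 3.5 GB at INT4, which is why small quantized models are the practical choice on integrated GPUs.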
✅ If your goal is an OpenClaw replacement
A good stack is:
OpenClaw
↓
OpenVINO inference server
↓
Phi-3 / Qwen2
↓
Intel CPU + iGPU