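Overview

1.  What are Operating Performance Points?

Today's complex SoCs consist of multiple sub-modules working in parallel. In an operating system running a variety of use cases, not every module in the SoC needs to run at its highest frequency all the time. To make this possible, the sub-modules are grouped into domains, so that some domains can run at lower voltages and frequencies while others run at higher ones. The set of discrete {frequency, voltage} tuples supported by each device in a domain is called its Operating Performance Points (OPPs).

For example:
Suppose an MPU device supports the following frequency/voltage pairs: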
{300MHz at minimum voltage of 1V}
{800MHz at minimum voltage of 1.2V}
{1GHz at minimum voltage of 1.3V}

Expressed as OPPs in {Hz, uV} form, these become:
{300000000, 1000000} 
{800000000, 1200000}
{1000000000, 1300000}
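
As a minimal sketch in plain userspace C (not kernel code), the table above can be written as an array of {Hz, uV} pairs. The names example_opp, mpu_opps, mhz_to_hz and mv_to_uv are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical illustration type: one {frequency, voltage} tuple. */
struct example_opp {
	unsigned long freq_hz;  /* frequency in Hz */
	unsigned long u_volt;   /* minimum voltage in microvolts */
};

/* Unit helpers: OPPs are expressed in Hz and uV. */
static unsigned long mhz_to_hz(unsigned long mhz)
{
	assert(mhz <= ULONG_MAX / 1000000UL);  /* guard against overflow */
	return mhz * 1000000UL;
}

static unsigned long mv_to_uv(unsigned long mv)
{
	assert(mv <= ULONG_MAX / 1000UL);      /* guard against overflow */
	return mv * 1000UL;
}

/* The three MPU operating points from the example above. */
static const struct example_opp mpu_opps[] = {
	{  300000000UL, 1000000UL },
	{  800000000UL, 1200000UL },
	{ 1000000000UL, 1300000UL },
};

static const size_t mpu_opp_count = sizeof(mpu_opps) / sizeof(mpu_opps[0]);
```

Keeping both values as plain integers in the smallest practical unit avoids floating point, which is why 1.2V appears as 1200000.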

2.  Operating Performance Points Library

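The OPP library provides a set of helper functions for managing and querying a device's OPP information. Its source is in drivers/base/power/opp.c and its header is include/linux/pm_opp.h. The library is enabled through the kernel config option CONFIG_PM_OPP.

Typical usage of the OPP library:
a.  The user configures/registers a set of default OPPs for a device (for example, a CPU).
b.  Depending on runtime conditions, the SoC framework queries or modifies the device's OPP information through the OPP layer.

Data structures

Linux uses struct dev_pm_opp to describe a single OPP: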
struct dev_pm_opp {
	struct list_head node;

	bool available;
	unsigned long rate;
	unsigned long u_volt;

	struct device_opp *dev_opp;
	struct rcu_head head;
};
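node:        list node linking this OPP into its device's OPP list.
available:   whether this OPP is enabled and may be used.
rate:        frequency, in Hz.
u_volt:      voltage, in microvolts (uV).
dev_opp:     pointer to the struct device_opp that owns this OPP.
head:        RCU head, used to free the node with kfree_rcu.

Linux uses struct device_opp to represent a device that has OPPs: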
struct device_opp {
	struct list_head node;

	struct device *dev;
	struct srcu_notifier_head head;
	struct list_head opp_list;
};
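.node:      links every OPP device into the global dev_opp_list.
.dev:       pointer to the device.
.head:      notifier chain for this OPP device.
.opp_list:  list of the OPPs belonging to this device.

Data organization in the OPP layer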

Internal data structure organization with the OPP layer library is as follows:
dev_opp_list (root)
|- device 1 (represents voltage domain 1)
|	|- opp 1 (availability, freq, voltage)
|	|- opp 2 ..
...	...
|	`- opp n ..
|- device 2 (represents the next voltage domain)
...
`- device m (represents mth voltage domain)
device 1, 2.. are represented by dev_opp structure while each opp is represented by the opp structure.
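As shown, all OPP devices hang off dev_opp_list, and each OPP device (struct device_opp) holds its own set of OPPs, each represented by a struct dev_pm_opp. Together they form a tree, so looking up a suitable OPP starts at the root: find the device first, then walk its OPP list.
The OPP library maintains this list internally; the SoC framework only needs to populate and query OPP data. struct device_opp and struct dev_pm_opp are used only inside the OPP library, and callers do not need to know their internals.

API reference

  • Add an OPP to a device's OPP table (dev_pm_opp_add)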
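
The two-level lookup can be sketched in userspace C as below. This is a simplified model, not the kernel implementation: the model_* types and functions are hypothetical, plain next pointers stand in for the kernel's list_head, and there is no locking or RCU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_opp {
	unsigned long rate;      /* Hz */
	unsigned long u_volt;    /* uV */
	int available;
	struct model_opp *next;  /* next OPP of the same device */
};

struct model_device_opp {
	const void *dev;                /* identifies the owning device */
	struct model_opp *opp_list;     /* this device's OPPs */
	struct model_device_opp *next;  /* next device under the root */
};

/* Level 1: walk the root list to find the per-device table. */
static struct model_device_opp *
model_find_device(struct model_device_opp *root, const void *dev)
{
	for (; root; root = root->next)
		if (root->dev == dev)
			return root;
	return NULL;
}

/* Level 2: walk that device's OPP list for an exact frequency. */
static struct model_opp *
model_find_opp(struct model_device_opp *root, const void *dev,
	       unsigned long rate)
{
	struct model_device_opp *d;
	struct model_opp *o;

	assert(dev != NULL);
	d = model_find_device(root, dev);
	for (o = d ? d->opp_list : NULL; o; o = o->next)
		if (o->rate == rate)
			return o;
	return NULL;
}
```

Because each device owns its own list, the same frequency can exist for one device and be absent for another.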
int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
{
	struct device_opp *dev_opp = NULL;
	struct dev_pm_opp *opp, *new_opp;
	struct list_head *head;

	/* allocate new OPP node */
	new_opp = kzalloc(sizeof(*new_opp), GFP_KERNEL);
	if (!new_opp) {
		dev_warn(dev, "%s: Unable to create new OPP node\n", __func__);
		return -ENOMEM;
	}

	/* Hold our list modification lock here */
	mutex_lock(&dev_opp_list_lock);

	/* Check for existing list for 'dev' */
	dev_opp = find_device_opp(dev);
	if (IS_ERR(dev_opp)) {
		/*
		 * Allocate a new device OPP table. In the infrequent case
		 * where a new device is needed to be added, we pay this
		 * penalty.
		 */
		dev_opp = kzalloc(sizeof(struct device_opp), GFP_KERNEL);
		if (!dev_opp) {
			mutex_unlock(&dev_opp_list_lock);
			kfree(new_opp);
			dev_warn(dev,
				"%s: Unable to create device OPP structure\n",
				__func__);
			return -ENOMEM;
		}

		dev_opp->dev = dev;
		srcu_init_notifier_head(&dev_opp->head);
		INIT_LIST_HEAD(&dev_opp->opp_list);

		/* Secure the device list modification */
		list_add_rcu(&dev_opp->node, &dev_opp_list);
	}

	/* populate the opp table */
	new_opp->dev_opp = dev_opp;
	new_opp->rate = freq;
	new_opp->u_volt = u_volt;
	new_opp->available = true;

	/*
	 * Insert new OPP in order of increasing frequency
	 * and discard if already present
	 */
	head = &dev_opp->opp_list;
	list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
		if (new_opp->rate <= opp->rate)
			break;
		else
			head = &opp->node;
	}

	/* Duplicate OPPs ? */
	if (new_opp->rate == opp->rate) {
		int ret = opp->available && new_opp->u_volt == opp->u_volt ?
			0 : -EEXIST;

		dev_warn(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
			 __func__, opp->rate, opp->u_volt, opp->available,
			 new_opp->rate, new_opp->u_volt, new_opp->available);
		mutex_unlock(&dev_opp_list_lock);
		kfree(new_opp);
		return ret;
	}

	list_add_rcu(&new_opp->node, head);
	mutex_unlock(&dev_opp_list_lock);

	/*
	 * Notify the changes in the availability of the operable
	 * frequency/voltage list.
	 */
	srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_ADD, new_opp);
	return 0;
}
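The function takes three parameters: the device to add the OPP to, the frequency, and the voltage.
1.   Allocate a new struct dev_pm_opp to hold the frequency and voltage passed in.
2.   Call find_device_opp to check whether the device is already on dev_opp_list. If it is, the OPP is added to that device; if not, a new struct device_opp is allocated, initialized, and added to the list.
3.   Fill in the OPP: voltage, frequency, enabled state, and owning device.
4.   Insert the OPP into the table in order of increasing frequency. If an OPP with the same frequency already exists, the new one is discarded: the function returns 0 when the existing OPP is enabled and has the same voltage, and -EEXIST otherwise.
5.   Call the notifier chain with OPP_EVENT_ADD to announce the new OPP.

  • OPP lookup interfaces
  • dev_pm_opp_find_freq_exact (returns the OPP with exactly the given freq)
  • dev_pm_opp_find_freq_floor (returns the highest OPP whose frequency is less than or equal to freq; on return, *freq holds the frequency actually found)
  • dev_pm_opp_find_freq_ceil (returns the lowest OPP whose frequency is greater than or equal to freq; on return, *freq holds the frequency actually found)
The three functions are similar. The difference is that dev_pm_opp_find_freq_exact takes an available parameter and can therefore return a disabled OPP, whereas the other two only return enabled OPPs.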
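
The ordered insert with duplicate handling (step 4) can be sketched in userspace C as follows. This is a simplified, single-threaded model under stated assumptions: sketch_opp and sketch_opp_add are hypothetical names, a plain next pointer replaces the RCU-protected list, and there are no notifiers. The sketch explicitly checks that it has not reached the end of the list before comparing frequencies.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct dev_pm_opp. */
struct sketch_opp {
	unsigned long rate;     /* Hz */
	unsigned long u_volt;   /* uV */
	int available;
	struct sketch_opp *next;
};

/*
 * Insert {freq, u_volt} into *list, kept in increasing-frequency order.
 * Return convention modelled on the listing above: 0 on success or on an
 * identical, enabled duplicate; -EEXIST on a conflicting duplicate.
 */
static int sketch_opp_add(struct sketch_opp **list, unsigned long freq,
			  unsigned long u_volt)
{
	struct sketch_opp **link = list;
	struct sketch_opp *new_opp;

	assert(list != NULL);

	/* Skip every entry with a lower frequency. */
	while (*link && (*link)->rate < freq)
		link = &(*link)->next;

	/* Duplicate frequency: nothing is inserted. */
	if (*link && (*link)->rate == freq)
		return ((*link)->available && (*link)->u_volt == u_volt) ?
			0 : -EEXIST;

	new_opp = calloc(1, sizeof(*new_opp));
	if (!new_opp)
		return -ENOMEM;

	new_opp->rate = freq;
	new_opp->u_volt = u_volt;
	new_opp->available = 1;   /* new OPPs start out enabled */
	new_opp->next = *link;    /* splice in before the first higher entry */
	*link = new_opp;
	return 0;
}
```

Keeping the list sorted at insertion time is what lets the floor and ceil lookups stop at the first match instead of scanning every entry.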
struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
					     unsigned long *freq)
{
	struct device_opp *dev_opp;
	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);

	if (!dev || !freq) {
		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
		return ERR_PTR(-EINVAL);
	}

	dev_opp = find_device_opp(dev);
	if (IS_ERR(dev_opp))
		return ERR_CAST(dev_opp);

	list_for_each_entry_rcu(temp_opp, &dev_opp->opp_list, node) {
		if (temp_opp->available && temp_opp->rate >= *freq) {
			opp = temp_opp;
			*freq = opp->rate;
			break;
		}
	}

	return opp;
}
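1.   Look up the OPP device for dev; return an error if there is none.
2.   Walk the device's opp_list. The first enabled OPP whose frequency is greater than or equal to the requested freq is returned, and its rate is written back through the freq parameter. If no OPP matches, ERR_PTR(-ERANGE) is returned.

  • Enable/disable a device's OPP (dev_pm_opp_enable/dev_pm_opp_disable)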
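
To compare the three lookup flavours side by side, here is a minimal userspace sketch over a table sorted by increasing frequency. The lookup_* names are hypothetical, NULL replaces the kernel's ERR_PTR error values, and the array stands in for the RCU-protected list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry; the table must be sorted by rate. */
struct lookup_opp {
	unsigned long rate;   /* Hz */
	int available;
};

/* Lowest enabled OPP with rate >= *freq; updates *freq on success. */
static const struct lookup_opp *
lookup_ceil(const struct lookup_opp *tbl, size_t n, unsigned long *freq)
{
	size_t i;

	assert(tbl != NULL && freq != NULL);
	for (i = 0; i < n; i++) {
		if (tbl[i].available && tbl[i].rate >= *freq) {
			*freq = tbl[i].rate;
			return &tbl[i];
		}
	}
	return NULL;
}

/* Highest enabled OPP with rate <= *freq; updates *freq on success. */
static const struct lookup_opp *
lookup_floor(const struct lookup_opp *tbl, size_t n, unsigned long *freq)
{
	const struct lookup_opp *best = NULL;
	size_t i;

	assert(tbl != NULL && freq != NULL);
	/* Sorted table: stop at the first entry above the request. */
	for (i = 0; i < n && tbl[i].rate <= *freq; i++)
		if (tbl[i].available)
			best = &tbl[i];
	if (best)
		*freq = best->rate;
	return best;
}

/* OPP whose rate equals freq and whose state equals 'available'. */
static const struct lookup_opp *
lookup_exact(const struct lookup_opp *tbl, size_t n, unsigned long freq,
	     int available)
{
	size_t i;

	assert(tbl != NULL);
	for (i = 0; i < n; i++)
		if (tbl[i].rate == freq && tbl[i].available == available)
			return &tbl[i];
	return NULL;
}
```

With a table of 300MHz (enabled), 800MHz (disabled) and 1GHz (enabled), a ceil request for 500MHz skips the disabled entry and lands on 1GHz, while only the exact lookup with available set to 0 can reach the 800MHz entry.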
static int opp_set_availability(struct device *dev, unsigned long freq,
		bool availability_req)
{
	struct device_opp *tmp_dev_opp, *dev_opp = ERR_PTR(-ENODEV);
	struct dev_pm_opp *new_opp, *tmp_opp, *opp = ERR_PTR(-ENODEV);
	int r = 0;

	/* keep the node allocated */
	new_opp = kmalloc(sizeof(*new_opp), GFP_KERNEL);
	if (!new_opp) {
		dev_warn(dev, "%s: Unable to create OPP\n", __func__);
		return -ENOMEM;
	}

	mutex_lock(&dev_opp_list_lock);

	/* Find the device_opp */
	list_for_each_entry(tmp_dev_opp, &dev_opp_list, node) {
		if (dev == tmp_dev_opp->dev) {
			dev_opp = tmp_dev_opp;
			break;
		}
	}
	if (IS_ERR(dev_opp)) {
		r = PTR_ERR(dev_opp);
		dev_warn(dev, "%s: Device OPP not found (%d)\n", __func__, r);
		goto unlock;
	}

	/* Do we have the frequency? */
	list_for_each_entry(tmp_opp, &dev_opp->opp_list, node) {
		if (tmp_opp->rate == freq) {
			opp = tmp_opp;
			break;
		}
	}
	if (IS_ERR(opp)) {
		r = PTR_ERR(opp);
		goto unlock;
	}

	/* Is update really needed? */
	if (opp->available == availability_req)
		goto unlock;
	/* copy the old data over */
	*new_opp = *opp;

	/* plug in new node */
	new_opp->available = availability_req;

	list_replace_rcu(&opp->node, &new_opp->node);
	mutex_unlock(&dev_opp_list_lock);
	kfree_rcu(opp, head);

	/* Notify the change of the OPP availability */
	if (availability_req)
		srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_ENABLE,
					 new_opp);
	else
		srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_DISABLE,
					 new_opp);

	return 0;

unlock:
	mutex_unlock(&dev_opp_list_lock);
	kfree(new_opp);
	return r;
}
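Both functions are built on opp_set_availability, whose third parameter, availability_req, selects whether to enable or disable the OPP.
1.   Allocate a new struct dev_pm_opp.
2.   Find the OPP device on dev_opp_list; return an error if it is not found.
3.   Walk the device's opp_list and compare each OPP's rate with freq. If one matches, that is the current OPP; otherwise return an error.
4.   If the current OPP's state already equals the requested state, exit without changing anything.
5.   Copy the current OPP into the newly allocated one, then set the new state on the copy.
6.   Replace the old OPP with the new one in the list, then free the old OPP (via kfree_rcu).
7.   Send OPP_EVENT_ENABLE or OPP_EVENT_DISABLE on the notifier chain, depending on the parameter.

  • of_init_opp_table (initialize the OPP table from the device tree)
An example of OPPs in the device tree: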
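
The copy-and-replace update in steps 1 to 6 can be sketched in userspace C as follows. This is a single-threaded model under stated assumptions: cow_opp and cow_set_availability are hypothetical names, there is no mutex or notifier chain, and the old node is freed immediately where the kernel defers the free with kfree_rcu so that concurrent readers can finish.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct dev_pm_opp. */
struct cow_opp {
	unsigned long rate;   /* Hz */
	int available;
	struct cow_opp *next;
};

/*
 * Set the availability of the OPP at 'freq' by replacing its node with
 * an updated copy, rather than modifying the node in place.
 */
static int cow_set_availability(struct cow_opp **list, unsigned long freq,
				int availability_req)
{
	struct cow_opp **link;
	struct cow_opp *old, *new_opp;

	assert(list != NULL);

	/* Allocate up front, as the listing does before taking the lock. */
	new_opp = malloc(sizeof(*new_opp));
	if (!new_opp)
		return -ENOMEM;

	/* Find the link that points at the matching node. */
	for (link = list; *link; link = &(*link)->next)
		if ((*link)->rate == freq)
			break;
	old = *link;

	if (!old) {                                  /* no such frequency */
		free(new_opp);
		return -ENODEV;
	}
	if (old->available == availability_req) {    /* nothing to change */
		free(new_opp);
		return 0;
	}

	*new_opp = *old;                  /* copy the old data over */
	new_opp->available = availability_req;
	*link = new_opp;                  /* publish the replacement */
	free(old);                        /* kernel: deferred via kfree_rcu() */
	return 0;
}
```

Replacing the whole node means a reader sees either the old OPP or the new one in full, never a half-updated entry, which is the point of the pattern in the RCU-protected original.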
cpu@0 {
	compatible = "arm,cortex-a9";
	reg = <0>;
	next-level-cache = <&L2>;
	operating-points = <
		/* kHz    uV */
		792000  1100000
		396000  950000
		198000  850000
	>;
};
There are three frequency/voltage pairs, expressed with the operating-points property. The code that parses OPPs from the device tree is:
int of_init_opp_table(struct device *dev)
{
	const struct property *prop;
	const __be32 *val;
	int nr;

	prop = of_find_property(dev->of_node, "operating-points", NULL);
	if (!prop)
		return -ENODEV;
	if (!prop->value)
		return -ENODATA;

	/*
	 * Each OPP is a set of tuples consisting of frequency and
	 * voltage like <freq-kHz vol-uV>.
	 */
	nr = prop->length / sizeof(u32);
	if (nr % 2) {
		dev_err(dev, "%s: Invalid OPP list\n", __func__);
		return -EINVAL;
	}

	val = prop->value;
	while (nr) {
		unsigned long freq = be32_to_cpup(val++) * 1000;
		unsigned long volt = be32_to_cpup(val++);

		if (dev_pm_opp_add(dev, freq, volt))
			dev_warn(dev, "%s: Failed to add OPP %ld\n",
				 __func__, freq);
		nr -= 2;
	}

	return 0;
}
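The function locates the operating-points property, converts each frequency from kHz to Hz, and calls dev_pm_opp_add for every pair to add the OPP.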
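
The cell decoding can be tried outside the kernel with the sketch below. It is an illustration under stated assumptions: parsed_opp, cell_to_cpu and parse_operating_points are hypothetical names, the property value is passed in as a raw byte buffer, and results are stored in an array instead of being handed to dev_pm_opp_add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: one decoded pair. */
struct parsed_opp {
	unsigned long freq_hz;
	unsigned long u_volt;
};

/* Decode one big-endian 32-bit cell, independent of host endianness. */
static uint32_t cell_to_cpu(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Parse a raw property value of 'len' bytes into at most 'max' OPPs.
 * Returns the number of OPPs parsed, or -1 if the length is not a whole
 * number of cells or the cell count is odd.
 */
static int parse_operating_points(const unsigned char *value, size_t len,
				  struct parsed_opp *out, size_t max)
{
	size_t nr = len / 4;
	size_t i, n = 0;

	assert(value != NULL && out != NULL);
	if (len % 4 || nr % 2)
		return -1;

	for (i = 0; i + 1 < nr && n < max; i += 2, n++) {
		/* First cell is kHz, so scale to Hz; second is already uV. */
		out[n].freq_hz = (unsigned long)cell_to_cpu(value + 4 * i) * 1000UL;
		out[n].u_volt = cell_to_cpu(value + 4 * (i + 1));
	}
	return (int)n;
}
```

Device tree cells are stored big-endian, which is why the kernel listing uses be32_to_cpup and why the sketch assembles each value byte by byte.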

