The count() function in DataFrame, and common statistical methods

fly_Xiaoma · Published 2019-04-05 21:08:16 · 100788 views

The count() function

The official API reference, as it read when this post was written, is:

pandas.DataFrame.count

DataFrame.count(axis=0, level=None, numeric_only=False)

Count non-NA cells for each column or row.

The values None, NaN, NaT, and optionally numpy.inf (depending on pandas.options.mode.use_inf_as_na) are considered NA.
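To make the NA rule concrete, here is a minimal sketch. The column names and values are made up for illustration and are not from the original post. It assumes the default pandas settings, under which numpy.inf is treated as an ordinary value rather than as NA.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each column holds exactly one missing value.
values = pd.DataFrame({
    "obj": ["a", None, "c", "d"],          # None is NA
    "num": [1.0, np.nan, np.inf, 4.0],     # NaN is NA; inf is not NA by default
    "when": pd.to_datetime(["2019-04-05", None, "2019-04-07", "2019-04-08"]),  # None becomes NaT
})

na_counts = values.count()
print(na_counts)
```

With default options each column should report 3 non-NA entries, because the infinite value still counts as present.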


Parameters:

axis : {0 or ‘index’, 1 or ‘columns’}, default 0

If 0 or ‘index’ counts are generated for each column. If 1 or ‘columns’ counts are generated for each row.

level : int or str, optional

If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a DataFrame. A str specifies the level name.

numeric_only : boolean, default False

Include only float, int or boolean data.

Returns:

Series or DataFrame

For each column/row the number of non-NA/null entries. If level is specified returns a DataFrame.
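As a quick illustration of the numeric_only parameter described above, the sketch below rebuilds the small Person/Age/Single frame used in this post's examples and counts only the float, int and boolean columns. The variable names are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

# Default: every column is counted and the result is a Series.
all_counts = df.count()

# numeric_only=True drops the string column "Person" before counting.
numeric_counts = df.count(numeric_only=True)
print(numeric_counts)
```

The boolean column Single survives the filter because pandas treats booleans as numeric data for this purpose.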

See also

Series.count

Number of non-NA elements in a Series.

DataFrame.shape

Number of DataFrame rows and columns (including NA elements).

DataFrame.isna

Boolean same-sized DataFrame showing places of NA elements.
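The three related tools listed above answer slightly different questions. The following sketch, using the same sample data as the examples in this post, shows one way they fit together: count() gives the non-NA entries, shape gives the total size including NA, and isna() marks where the NA entries are.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

non_na_per_column = df.count()      # non-NA entries per column
n_rows, n_cols = df.shape           # total rows and columns, NA included
na_per_column = df.isna().sum()     # NA entries per column

# For every column, non-NA count plus NA count equals the number of rows.
consistent = bool(((non_na_per_column + na_per_column) == n_rows).all())
print(n_rows, n_cols, consistent)
```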

******************************************Examples****************************************************

1. Build a sample DataFrame

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"Person":
...                    ["John", "Myla", "Lewis", "John", "Myla"],
...                    "Age": [24., np.nan, 21., 33, 26],
...                    "Single": [False, True, True, True, False]})
>>> df
   Person   Age  Single
0    John  24.0   False
1    Myla   NaN    True
2   Lewis  21.0    True
3    John  33.0    True
4    Myla  26.0   False

2. Count the non-NA values in each column

>>> df.count()
Person    5
Age       4
Single    5
dtype: int64

3. Count the non-NA values in each row

>>> df.count(axis='columns')
0    3
1    2
2    3
3    3
4    3
dtype: int64

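Note: axis='columns' is equivalent to axis=1. The count runs across the columns, so one value is produced for each row. The default, axis=0 (or axis='index'), produces one value for each column.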
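The axis aliases are easy to mix up, so a short check may help. This sketch rebuilds the sample frame and compares the string and integer spellings. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

by_name = df.count(axis="columns")   # one count per row
by_number = df.count(axis=1)         # same thing, integer spelling
per_column = df.count(axis="index")  # same as the default axis=0

print(by_name.equals(by_number))
```

Row 1 is the only row with a missing Age, so it is the only row whose count is 2 instead of 3.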

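4. Count along one level of a MultiIndex (the level argument was deprecated in pandas 1.3 and removed in 2.0, so this call only runs on older versions)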

>>> df.set_index(["Person", "Single"]).count(level="Person")
        Age
Person
John      2
Lewis     1
Myla      1
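On pandas 2.0 and later, where count() no longer accepts level, grouping on the index level is one way to get the same per-person counts. This is a sketch of that alternative, not the call the original post used.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

indexed = df.set_index(["Person", "Single"])

# Group rows by the "Person" level of the MultiIndex, then count non-NA values.
per_person = indexed.groupby(level="Person").count()
print(per_person)
```

Myla has two rows but only one non-NA Age, so her count is 1.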

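Other common DataFrame methods:

df.count() # number of non-NA elements
df.min() # minimum
df.max() # maximum
df.idxmin() # index label of the minimum, similar to which.min in R
df.idxmax() # index label of the maximum, similar to which.max in R
df.quantile(0.1) # 10% quantile
df.sum() # sum
df.mean() # mean
df.median() # median
df.mode() # mode
df.var() # variance
df.std() # standard deviation
df.mad() # mean absolute deviation (deprecated in pandas 1.5, removed in 2.0)
df.skew() # skewness
df.kurt() # kurtosis
df.describe() # several descriptive statistics in one call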
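A few of these methods can be tried on the Age column of the sample frame from the examples. This is a minimal sketch: it works on a single numeric column, because recent pandas versions raise an error when statistics such as mean() are applied to a frame that still contains string columns, and it computes the mean absolute deviation by hand since mad() is no longer available in pandas 2.0 and later.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

age = df["Age"]  # NaN is skipped by default in the statistics below

age_mean = age.mean()
age_median = age.median()
age_var = age.var()                        # sample variance (ddof=1)
youngest, oldest = age.idxmin(), age.idxmax()

# Manual replacement for the removed mad(): mean of absolute deviations from the mean.
age_mad = (age - age_mean).abs().mean()

# For a whole frame, restrict to numeric columns explicitly.
frame_means = df.mean(numeric_only=True)

print(age.describe())
```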

Grouped statistics

df.groupby('Person').sum()
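As a slightly fuller sketch of grouped statistics on the same sample data, the code below sums and summarizes Age per person. Selecting the Age column first is a choice made here for clarity, not something the original post does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

# Sum of Age per person; the missing value for Myla is skipped.
age_sum = df.groupby("Person")["Age"].sum()

# Several statistics per group in one call.
summary = df.groupby("Person")["Age"].agg(["count", "mean", "max"])
print(summary)
```

Group keys are sorted by default, so the rows come out in the order John, Lewis, Myla.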
