pandas: how to get the sum and the length of a column

Source: web crawler (WebSpider)  Time: 2017-05-19 08:02  Tags: pandas dataframe length

In data analysis and machine learning tasks, it is very common to drop certain columns or rows of a dataset and to merge datasets with one another.
Part 1. Merge operations
Ⅰ.pandas.merge
pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False)
Purpose: merges DataFrame objects on columns or on indexes (row labels) by performing a database-style join. When joining columns on columns, the DataFrame indexes are ignored. Otherwise, when joining indexes on indexes, or indexes on one or more columns, the index is passed on to the result.
Parameters:
left : DataFrame
right : DataFrame
how : {'left', 'right', 'outer', 'inner'}, default 'inner'
    left: use only keys from left frame (SQL: left outer join)
    right: use only keys from right frame (SQL: right outer join)
    outer: use union of keys from both frames (SQL: full outer join)
    inner: use intersection of keys from both frames (SQL: inner join)
on : label or list
    Field names to join on. Must be found in both DataFrames. If on is None and not merging on indexes, then it merges on the intersection of the columns by default.
left_on : label or list, or array-like
    Field names to join on in left DataFrame. Can be a vector or list of vectors of the length of the DataFrame to use a particular vector as the join key instead of columns.
right_on : label or list, or array-like
    Field names to join on in right DataFrame or vector/list of vectors per left_on docs.
left_index : boolean, default False
    Use the index from the left DataFrame as the join key(s). If it is a MultiIndex, the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels.
right_index : boolean, default False
    Use the index from the right DataFrame as the join key. Same caveats as left_index.
sort : boolean, default False
    Sort the join keys lexicographically in the result DataFrame.
suffixes : 2-length sequence (tuple, list, …)
    Suffix to apply to overlapping column names in the left and right side, respectively.
copy : boolean, default True
    If False, do not copy data unnecessarily.
indicator : boolean or string, default False
    If True, adds a column to the output DataFrame called "_merge" with information on the source of each row. If a string, a column with information on the source of each row is added to the output DataFrame, and the column is named after the value of the string. The information column is Categorical-type and takes on a value of "left_only" for observations whose merge key only appears in the 'left' DataFrame, "right_only" for observations whose merge key only appears in the 'right' DataFrame, and "both" if the observation's merge key is found in both.
    New in version 0.17.0.
Returns:
merged : DataFrame
    The output type will be the same as 'left', if it is a subclass of DataFrame.
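The how and indicator parameters described above can be illustrated with a small sketch. The two frames and their column names (key, lval, rval) are hypothetical data made up for this example.

```python
import pandas as pd

# Hypothetical example data: two small tables that share a "key" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# Inner join (the default): only keys present in both frames survive.
inner = pd.merge(left, right, on="key", how="inner")

# Outer join with indicator=True: every key is kept, and the extra
# "_merge" column records which frame each row came from.
outer = pd.merge(left, right, on="key", how="outer", indicator=True)

print(inner)
print(outer)
```

Rows whose key exists on only one side get NaN in the columns coming from the other frame, which is why the outer result usually needs the missing-value handling discussed later.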
Ⅱ.pandas.concat
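As a minimal sketch of what pandas.concat does (the frames df1 and df2 are hypothetical, and only the axis and ignore_index arguments are shown), frames can be stacked row-wise or placed side by side:

```python
import pandas as pd

# Hypothetical example frames with identical columns.
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "B": [7, 8]})

# Stack the rows (axis=0 is the default); ignore_index=True renumbers
# the resulting index 0..n-1 instead of keeping 0, 1, 0, 1.
rows = pd.concat([df1, df2], ignore_index=True)

# Place the frames side by side (axis=1), aligned on the index.
cols = pd.concat([df1, df2], axis=1)

print(rows)
print(cols)
```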
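Part 2. Drop operations
Ⅲ.pandas.DataFrame.drop
DataFrame.drop(labels, axis=0, level=None, inplace=False, errors='raise')
Purpose: returns a new object with the given labels removed from the specified axis.
labels : a single label or a list of labels
axis : int or axis name. It works together with labels: with axis=0 the labels refer to rows, with axis=1 they refer to columns.
level : int or level name, default None
For MultiIndex
inplace : bool, default False. Whether to modify the original DataFrame. If True, the original DataFrame is changed in place and the call returns None.
errors : {'ignore', 'raise'}, default 'raise'. If 'ignore', errors are suppressed and only the labels that actually exist are dropped.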
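import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
                   'C': [1, 2, 3]})
print("original:\n", df)
# Drop the row labelled 0; df itself is left unchanged.
get1 = df.drop(labels=0)
print("df:\n", df)
print("get1:\n", get1)
# With inplace=True, df loses row 0 and the call returns None.
get2 = df.drop(labels=0, inplace=True)
print("df:\n", df)
print("get2:\n", get2)
# Drop column "A" (axis=1) from the already modified df; df keeps column A.
get3 = df.drop(labels="A", axis=1)
print("df:\n", df)
print("get3:\n", get3)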
Ⅳ.pandas.DataFrame.pop
DataFrame.pop(item)
Purpose: returns the item and removes it from the frame.
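A short sketch of pop, reusing the example frame from the drop section: the column is returned as a Series and removed from the frame in place.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "a"], "B": ["b", "a", "c"],
                   "C": [1, 2, 3]})

# pop returns column "C" as a Series and deletes it from df in place.
c = df.pop("C")

print(c)
print(df)  # only columns A and B remain
```

Unlike drop, pop always modifies the frame itself, so it is handy for separating a label column from the features.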
1.pandas.get_dummies()
Converts categorical variables into dummy/indicator variables (in other words, one-hot encoding).
pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False)
Parameters:
data : array-like, Series, or DataFrame
prefix : string, list of strings, or dict of strings, default None
    When a list is passed, its length must equal the number of columns that get_dummies is going to encode. Alternatively, prefix can be a dict mapping column names to prefixes.
prefix_sep : string, default '_'
    If appending prefix, separator/delimiter to use. Or pass a list or dictionary as with prefix.
dummy_na : bool, default False
    Add a column to indicate NaNs; if False, NaNs are ignored.
columns : list-like, default None
    Column names in the DataFrame to be encoded. If columns is None, then all the columns with object or category dtype will be converted.
sparse : bool, default False
    Whether the dummy columns should be sparse or not. Returns SparseDataFrame if data is a Series or if all columns are included. Otherwise returns a DataFrame with some SparseBlocks.
    New in version 0.16.1.
drop_first : bool, default False
    Whether to get k-1 dummies out of k categorical levels by removing the first level.
    New in version 0.18.0.
Returns:
dummies : DataFrame or SparseDataFrame
Example 1. Series
import pandas as pd
s = pd.Series(list('abca'))
print("original:")
print(s)
print("get dummy:")
# Each distinct value (a, b, c) becomes its own indicator column.
s_dummy = pd.get_dummies(data=s)
print(s_dummy)
print("type of s_dummy:", type(s_dummy))
Example 2. DataFrame
import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
                   'C': [1, 2, 3]})
print("original:")
print(df)
# Only the object-dtype columns A and B are encoded; C stays numeric.
df_dummy = pd.get_dummies(data=df, prefix=["A", "B"])
print("get dummy:")
print(df_dummy)
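The drop_first and dummy_na parameters are not exercised by the two examples above. A minimal sketch of their effect, using made-up Series data, might look like this:

```python
import numpy as np
import pandas as pd

s = pd.Series(list("abca"))

# drop_first=True keeps k-1 indicator columns: level "a" is removed,
# so only columns "b" and "c" remain.
k_minus_1 = pd.get_dummies(s, drop_first=True)

# dummy_na=True adds one extra column that flags missing values.
s_nan = pd.Series(["a", "b", np.nan])
with_nan = pd.get_dummies(s_nan, dummy_na=True)

print(k_minus_1)
print(with_nan)
```

Dropping the first level avoids perfectly collinear indicator columns, which matters for models such as linear regression.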
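Part 3. Handling missing values
pandas uses the floating-point value NaN (not a number) to represent missing data in both floating-point and non-floating-point arrays.
Both an explicitly passed np.nan and Python's built-in None are treated as missing values by pandas, as in the following example.
import numpy as np
import pandas as pd
s = pd.Series(data=["tom", "jack", "kate", np.nan])
print(s)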
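The original example only passes np.nan. As a small sketch with made-up values, the following shows that None is detected as missing in the same way, both in a numeric Series and in a Series of strings:

```python
import numpy as np
import pandas as pd

# In a numeric Series, None is converted to NaN and the dtype is float64.
nums = pd.Series([1.0, None, np.nan])

# In a Series of strings, None and np.nan are both reported as missing.
names = pd.Series(["tom", None, np.nan])

print(nums)
print(nums.isnull().tolist())
print(names.isnull().tolist())
```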
Ⅰ.Finding missing values
DataFrame.isnull()
Purpose: returns a DataFrame of boolean values with the same shape as the original DataFrame.
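import numpy as np
import pandas as pd
# isnull on a Series returns a boolean Series.
s = pd.Series(data=["tom", "jack", "kate", np.nan])
print(s.isnull())
print(type(s.isnull()))
df = pd.DataFrame({'A': ['a', 'b', np.nan], 'B': ['b', 'a', 'c'],
                   'C': [1, 2, np.nan]})
print("original:")
print(df)
# isnull on a DataFrame returns a boolean DataFrame of the same shape.
print(df.isnull())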
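Because the result of isnull is boolean, summing it counts the missing entries. A brief sketch on the same frame as above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", np.nan], "B": ["b", "a", "c"],
                   "C": [1, 2, np.nan]})

# True counts as 1, so summing column-wise gives the number of
# missing values in each column.
missing_per_column = df.isnull().sum()

# Select the rows that contain at least one missing value.
rows_with_missing = df[df.isnull().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```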
Ⅱ.Filling missing values
pandas.DataFrame.fillna
Fills missing values using the specified method and returns the filled DataFrame.
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
value : scalar, dict, Series, or DataFrame. The value(s) used to fill the missing entries.
method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None
Method to use for filling holes in reindexed Series.
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use NEXT valid observation to fill gap.
axis : {0 or 'index', 1 or 'columns'}
inplace : bool, default False. If True, the object is modified in place.
limit : (for forward and backward filling) the maximum number of consecutive values to fill.
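A minimal sketch of the value and limit parameters on made-up data. For forward filling it uses DataFrame.ffill rather than fillna(method='ffill'), on the assumption that a recent pandas version is installed, where passing method to fillna is deprecated.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1.0, np.nan, np.nan, 4.0],
                   "B": [np.nan, 2.0, np.nan, 8.0]})

# Fill every missing entry with one scalar.
filled_scalar = df.fillna(value=0)

# Fill per column with a dict: a constant for A, the column mean for B.
filled_by_column = df.fillna(value={"A": -1, "B": df["B"].mean()})

# Forward fill, but fill at most one consecutive missing value.
forward = df.ffill(limit=1)

print(filled_scalar)
print(filled_by_column)
print(forward)
```

With limit=1, the second consecutive NaN in column A stays missing, and the leading NaN in column B is untouched because there is no earlier value to propagate.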
Ⅲ.Dropping missing values
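As a hedged sketch of dropping missing values, DataFrame.dropna can be applied to the same example frame used for isnull. Only the axis, how, and subset arguments are shown here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", np.nan], "B": ["b", "a", "c"],
                   "C": [1, 2, np.nan]})

# Drop every row that contains at least one missing value (the default).
rows_dropped = df.dropna()

# Drop every column that contains at least one missing value.
cols_dropped = df.dropna(axis=1)

# Drop a row only if all of its values are missing.
all_missing_dropped = df.dropna(how="all")

# Look at column "C" only when deciding which rows to drop.
subset_dropped = df.dropna(subset=["C"])

print(rows_dropped)
print(cols_dropped)
```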
From: lokta (部落), Board: Python
Subject: How should rank in pandas be understood?
Posted on: 水木社区 (Tue Jun 21 10:55:37 2016), relayed

I could not follow the explanation in Python for Data Analysis.
Can any expert explain it?
--
※ Source: 水木社区 [FROM: 222.85.138.*]

From: NGYxYmQ, Board: Python
Subject: Re: How should rank in pandas be understood?
Posted on: 水木社区 (Tue Jun 21 11:18:58 2016), on-site

It is simply the ranking.

[lokta wrote:]
: I could not follow the explanation in Python for Data Analysis.
: Can any expert explain it?
Sent from xsmth (iOS)
--
※ Source: 水木社区 [FROM: 223.104.3.*]

From: CrTn (求！包！养！), Board: Python
Subject: Re: How should rank in pandas be understood?
Posted on: 水木社区 (Tue Jun 21 11:41:34 2016), relayed

Here is a question I used to quiz people with: assuming x has no duplicate elements, prove that order(order(x)) = rank(x).

[lokta (部落) wrote:]
: I could not follow the explanation in Python for Data Analysis.
: Can any expert explain it?
--
※ Source: 水木社区 [FROM: 24.47.99.*]
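The identity quoted in the last reply can be checked numerically. The sketch below uses np.argsort as a 0-based stand-in for order (that mapping is an assumption of this example, not something stated in the thread) and also shows how Series.rank treats ties.

```python
import numpy as np
import pandas as pd

# No duplicates: rank gives each value its 1-based position in sorted order.
x = pd.Series([30, 10, 20])
ranks = x.rank()  # 3.0, 1.0, 2.0

# argsort plays the role of order(x); applying it twice and adding 1
# (argsort is 0-based) reproduces the ranks.
order_of_order = np.argsort(np.argsort(x.to_numpy())) + 1

# With ties, the default method="average" gives tied values the mean of
# the ranks they span; method="min" gives them the lowest of those ranks.
ties = pd.Series([10, 20, 20, 30])
average_ranks = ties.rank()
min_ranks = ties.rank(method="min")

print(ranks.tolist(), order_of_order.tolist())
print(average_ranks.tolist(), min_ranks.tolist())
```

The identity only holds without duplicates: with ties, argsort still assigns distinct positions while rank shares one value among the tied entries.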