
10 Minutes to pandas — pandas 0.20.1 documentation
10 Minutes to pandas
This is a short introduction to pandas, geared mainly for new users.
You can see more complex recipes in the Cookbook.
Customarily, we import as follows:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: import matplotlib.pyplot as plt
Object Creation
Creating a Series by passing a list of values, letting pandas create
a default integer index:
In [4]: s = pd.Series([1,3,5,np.nan,6,8])
In [5]: s
Out[5]: 
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
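As an aside, the float64 dtype above follows directly from the NaN in the input list; a minimal self-contained sketch (variable names mirror the snippet above):

```python
import numpy as np
import pandas as pd

# NaN is a float, so a list containing np.nan is upcast to float64,
# and pandas supplies a default integer index 0..5.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

print(s.dtype)        # float64
print(list(s.index))  # [0, 1, 2, 3, 4, 5]
```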
Creating a DataFrame by passing a numpy array, with a datetime index
and labeled columns:
In [6]: dates = pd.date_range('20130101', periods=6)

In [7]: dates
Out[7]: 
DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')
In [8]: df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))
In [9]: df
Out[9]: 
                   A         B         C         D
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
2013-01-05 -0.424972  0.567020  0.276232 -1.087401
2013-01-06 -0.673690  0.113648 -1.478427  0.524988
Creating a DataFrame by passing a dict of objects that can be converted to series-like.
In [10]: df2 = pd.DataFrame({ 'A' : 1.,
   ....:                      'B' : pd.Timestamp('20130102'),
   ....:                      'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
   ....:                      'D' : np.array([3] * 4,dtype='int32'),
   ....:                      'E' : pd.Categorical(["test","train","test","train"]),
   ....:                      'F' : 'foo' })
In [11]: df2
Out[11]: 
     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo
Having specific dtypes
In [12]: df2.dtypes
Out[12]: 
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
If you’re using IPython, tab completion for column names (as well as public
attributes) is automatically enabled. Here’s a subset of the attributes that
will be completed:
In [13]: df2.<TAB>
df2.A
df2.B
df2.C
df2.D
df2.add_prefix
df2.add_suffix
df2.append
df2.applymap
df2.as_blocks
df2.as_matrix
df2.asfreq
df2.astype
df2.at_time
df2.between_time
df2.blocks
df2.boxplot
df2.clip_lower
df2.clip_upper
df2.columns
df2.combine_first
df2.consolidate
df2.convert_objects
df2.corrwith
df2.cummax
df2.cummin
df2.cumprod
df2.cumsum
As you can see, the columns A, B, C, and D are automatically
tab completed. E is there as well; the rest of the attributes have been
truncated for brevity.
Viewing Data
See the top & bottom rows of the frame
In [14]: df.head()
Out[14]: 
                   A         B         C         D
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
2013-01-05 -0.424972  0.567020  0.276232 -1.087401
In [15]: df.tail(3)
Out[15]: 
                   A         B         C         D
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
2013-01-05 -0.424972  0.567020  0.276232 -1.087401
2013-01-06 -0.673690  0.113648 -1.478427  0.524988
Display the index, columns, and the underlying numpy data
In [16]: df.index
Out[16]: 
DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')
In [17]: df.columns
Out[17]: Index(['A', 'B', 'C', 'D'], dtype='object')
In [18]: df.values
Out[18]: 
array([[ 0.4691, -0.2829, -1.5091, -1.1356],
       [ 1.2121, -0.1732,  0.1192, -1.0442],
       [-0.8618, -2.1046, -0.4949,  1.0718],
       [ 0.7216, -0.7068, -1.0396,  0.2719],
       [-0.425 ,  0.567 ,  0.2762, -1.0874],
       [-0.6737,  0.1136, -1.4784,  0.525 ]])
Describe shows a quick statistic summary of your data
In [19]: df.describe()
Out[19]: 
              A         B         C         D
count  6.000000  6.000000  6.000000  6.000000
mean   0.073711 -0.431125 -0.687758 -0.233103
std    0.843157  0.922818  0.779887  0.973118
min   -0.861849 -2.104569 -1.509059 -1.135632
25%   -0.611510 -0.600794 -1.368714 -1.076610
50%    0.022070 -0.228039 -0.767252 -0.386188
75%    0.658444  0.041933 -0.034326  0.461706
max    1.212112  0.567020  0.276232  1.071804
Transposing your data
In [20]: df.T
Sorting by an axis
In [21]: df.sort_index(axis=1, ascending=False)
Out[21]: 
                   D         C         B         A
2013-01-01 -1.135632 -1.509059 -0.282863  0.469112
2013-01-02 -1.044236  0.119209 -0.173215  1.212112
2013-01-03  1.071804 -0.494929 -2.104569 -0.861849
2013-01-04  0.271860 -1.039575 -0.706771  0.721555
2013-01-05 -1.087401  0.276232  0.567020 -0.424972
2013-01-06  0.524988 -1.478427  0.113648 -0.673690
Sorting by values
In [22]: df.sort_values(by='B')
Out[22]: 
                   A         B         C         D
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-06 -0.673690  0.113648 -1.478427  0.524988
2013-01-05 -0.424972  0.567020  0.276232 -1.087401
While standard Python / Numpy expressions for selecting and setting are
intuitive and come in handy for interactive work, for production code, we
recommend the optimized pandas data access methods, .at, .iat,
.loc, .iloc and .ix.
See the indexing documentation: Indexing and Selecting Data and MultiIndex / Advanced Indexing.
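A small runnable sketch contrasting the recommended access methods (the '20130101' start date is an assumption mirroring the frames built above):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

by_label = df.loc[dates[0], 'A']   # .loc selects by index label
by_pos   = df.iloc[0, 0]           # .iloc selects by integer position
fast     = df.at[dates[0], 'A']    # .at is the fast scalar version of .loc

# All three resolve to the same underlying cell.
assert by_label == by_pos == fast
```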
Selecting a single column, which yields a Series,
equivalent to df.A
In [23]: df['A']
Out[23]: 
2013-01-01    0.469112
2013-01-02    1.212112
2013-01-03   -0.861849
2013-01-04    0.721555
2013-01-05   -0.424972
2013-01-06   -0.673690
Freq: D, Name: A, dtype: float64
Selecting via [], which slices the rows.
In [24]: df[0:3]
Out[24]: 
                   A         B         C         D
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
In [25]: df['20130102':'20130104']
Out[25]: 
                   A         B         C         D
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
Selection by Label
See more in Selection by Label.
For getting a cross section using a label
In [26]: df.loc[dates[0]]
Out[26]: 
A    0.469112
B   -0.282863
C   -1.509059
D   -1.135632
Name: 2013-01-01 00:00:00, dtype: float64
Selecting on a multi-axis by label
In [27]: df.loc[:,['A','B']]
Out[27]: 
                   A         B
2013-01-01  0.469112 -0.282863
2013-01-02  1.212112 -0.173215
2013-01-03 -0.861849 -2.104569
2013-01-04  0.721555 -0.706771
2013-01-05 -0.424972  0.567020
2013-01-06 -0.673690  0.113648
Showing label slicing, both endpoints are included
In [28]: df.loc['20130102':'20130104',['A','B']]
Out[28]: 
                   A         B
2013-01-02  1.212112 -0.173215
2013-01-03 -0.861849 -2.104569
2013-01-04  0.721555 -0.706771
Reduction in the dimensions of the returned object
In [29]: df.loc['20130102',['A','B']]
Out[29]: 
A    1.212112
B   -0.173215
Name: 2013-01-02 00:00:00, dtype: float64
For getting a scalar value
In [30]: df.loc[dates[0],'A']
Out[30]: 0.46911229990718628
For getting fast access to a scalar (equiv to the prior method)
In [31]: df.at[dates[0],'A']
Out[31]: 0.46911229990718628
Selection by Position
See more in Selection by Position.
Select via the position of the passed integers
In [32]: df.iloc[3]
Out[32]: 
A    0.721555
B   -0.706771
C   -1.039575
D    0.271860
Name: 2013-01-04 00:00:00, dtype: float64
By integer slices, acting similar to numpy/python
In [33]: df.iloc[3:5,0:2]
Out[33]: 
                   A         B
2013-01-04  0.721555 -0.706771
2013-01-05 -0.424972  0.567020
By lists of integer position locations, similar to the numpy/python style
In [34]: df.iloc[[1,2,4],[0,2]]
Out[34]: 
                   A         C
2013-01-02  1.212112  0.119209
2013-01-03 -0.861849 -0.494929
2013-01-05 -0.424972  0.276232
For slicing rows explicitly
In [35]: df.iloc[1:3,:]
Out[35]: 
                   A         B         C         D
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
For slicing columns explicitly
In [36]: df.iloc[:,1:3]
Out[36]: 
                   B         C
2013-01-01 -0.282863 -1.509059
2013-01-02 -0.173215  0.119209
2013-01-03 -2.104569 -0.494929
2013-01-04 -0.706771 -1.039575
2013-01-05  0.567020  0.276232
2013-01-06  0.113648 -1.478427
For getting a value explicitly
In [37]: df.iloc[1,1]
Out[37]: -0.17321464905330858
For getting fast access to a scalar (equiv to the prior method)
In [38]: df.iat[1,1]
Out[38]: -0.17321464905330858
Boolean Indexing
Using a single column’s values to select data.
In [39]: df[df.A > 0]
Out[39]: 
                   A         B         C         D
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
Selecting values from a DataFrame where a boolean condition is met.
In [40]: df[df > 0]
Out[40]: 
                   A         B         C         D
2013-01-01  0.469112       NaN       NaN       NaN
2013-01-02  1.212112       NaN  0.119209       NaN
2013-01-03       NaN       NaN       NaN  1.071804
2013-01-04  0.721555       NaN       NaN  0.271860
2013-01-05       NaN  0.567020  0.276232       NaN
2013-01-06       NaN  0.113648       NaN  0.524988
Using the isin() method for filtering:
In [41]: df2 = df.copy()
In [42]: df2['E'] = ['one', 'one','two','three','four','three']
In [43]: df2
Out[43]: 
                   A         B         C         D      E
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632    one
2013-01-02  1.212112 -0.173215  0.119209 -1.044236    one
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804    two
2013-01-04  0.721555 -0.706771 -1.039575  0.271860  three
2013-01-05 -0.424972  0.567020  0.276232 -1.087401   four
2013-01-06 -0.673690  0.113648 -1.478427  0.524988  three
In [44]: df2[df2['E'].isin(['two','four'])]
Out[44]: 
                   A         B         C         D     E
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804   two
2013-01-05 -0.424972  0.567020  0.276232 -1.087401  four
Setting
Setting a new column automatically aligns the data
by the indexes.
In [45]: s1 = pd.Series([1,2,3,4,5,6], index=pd.date_range('20130102', periods=6))

In [46]: s1
Out[46]: 
2013-01-02    1
2013-01-03    2
2013-01-04    3
2013-01-05    4
2013-01-06    5
2013-01-07    6
Freq: D, dtype: int64
In [47]: df['F'] = s1
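The alignment behaviour can be checked directly; in this sketch s1 deliberately starts one day after df (the dates are assumptions mirroring the snippet above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(6, 4),
                  index=pd.date_range('20130101', periods=6),
                  columns=list('ABCD'))
# s1 covers 2013-01-02 .. 2013-01-07, one day later than df's index.
s1 = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range('20130102', periods=6))

df['F'] = s1
# Assignment aligns on index labels: df's first row has no match in s1
# and becomes NaN; s1's last value (for 2013-01-07) is silently dropped.
print(df['F'].head(3))
```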
Setting values by label
In [48]: df.at[dates[0],'A'] = 0
Setting values by position
In [49]: df.iat[0,1] = 0
Setting by assigning with a numpy array
In [50]: df.loc[:,'D'] = np.array([5] * len(df))
The result of the prior setting operations
In [51]: df
Out[51]: 
                   A         B         C  D    F
2013-01-01  0.000000  0.000000 -1.509059  5  NaN
2013-01-02  1.212112 -0.173215  0.119209  5  1.0
2013-01-03 -0.861849 -2.104569 -0.494929  5  2.0
2013-01-04  0.721555 -0.706771 -1.039575  5  3.0
2013-01-05 -0.424972  0.567020  0.276232  5  4.0
2013-01-06 -0.673690  0.113648 -1.478427  5  5.0
A where operation with setting.
In [52]: df2 = df.copy()
In [53]: df2[df2 > 0] = -df2

In [54]: df2
Out[54]: 
                   A         B         C  D    F
2013-01-01  0.000000  0.000000 -1.509059 -5  NaN
2013-01-02 -1.212112 -0.173215 -0.119209 -5 -1.0
2013-01-03 -0.861849 -2.104569 -0.494929 -5 -2.0
2013-01-04 -0.721555 -0.706771 -1.039575 -5 -3.0
2013-01-05 -0.424972 -0.567020 -0.276232 -5 -4.0
2013-01-06 -0.673690 -0.113648 -1.478427 -5 -5.0
Missing Data
pandas primarily uses the value np.nan to represent missing data. It is by
default not included in computations. See the Missing Data section.
Reindexing allows you to change/add/delete the index on a specified axis. This
returns a copy of the data.
In [55]: df1 = df.reindex(index=dates[0:4], columns=list(df.columns) + ['E'])
In [56]: df1.loc[dates[0]:dates[1],'E'] = 1
In [57]: df1
Out[57]: 
                   A         B         C  D    F    E
2013-01-01  0.000000  0.000000 -1.509059  5  NaN  1.0
2013-01-02  1.212112 -0.173215  0.119209  5  1.0  1.0
2013-01-03 -0.861849 -2.104569 -0.494929  5  2.0  NaN
2013-01-04  0.721555 -0.706771 -1.039575  5  3.0  NaN
To drop any rows that have missing data.
In [58]: df1.dropna(how='any')
Out[58]: 
                   A         B         C  D    F    E
2013-01-02  1.212112 -0.173215  0.119209  5  1.0  1.0
Filling missing data
In [59]: df1.fillna(value=5)
Out[59]: 
                   A         B         C  D    F    E
2013-01-01  0.000000  0.000000 -1.509059  5  5.0  1.0
2013-01-02  1.212112 -0.173215  0.119209  5  1.0  1.0
2013-01-03 -0.861849 -2.104569 -0.494929  5  2.0  5.0
2013-01-04  0.721555 -0.706771 -1.039575  5  3.0  5.0
To get the boolean mask where values are nan
In [60]: pd.isnull(df1)
Out[60]: 
                A      B      C      D      F      E
2013-01-01  False  False  False  False   True  False
2013-01-02  False  False  False  False  False  False
2013-01-03  False  False  False  False  False   True
2013-01-04  False  False  False  False  False   True
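The three calls above, condensed into a checkable sketch on a tiny hand-built frame (the column names are illustrative):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1.0, 2.0], 'E': [1.0, np.nan]})

assert len(df1.dropna(how='any')) == 1          # the row holding NaN is dropped
assert df1.fillna(value=5).loc[1, 'E'] == 5.0   # NaN replaced by the fill value
assert bool(pd.isnull(df1).loc[1, 'E'])         # boolean mask flags the NaN
assert np.isnan(df1.loc[1, 'E'])                # df1 itself is left unchanged
```

Note that dropna and fillna both return new objects; the original frame is untouched.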
Operations
Operations in general exclude missing data.
Performing a descriptive statistic
In [61]: df.mean()
Out[61]: 
A   -0.004474
B   -0.383981
C   -0.687758
D    5.000000
F    3.000000
dtype: float64
Same operation on the other axis
In [62]: df.mean(1)
Out[62]: 
2013-01-01    0.872735
2013-01-02    1.431621
2013-01-03    0.707731
2013-01-04    1.395042
2013-01-05    1.883656
2013-01-06    1.592306
Freq: D, dtype: float64
Operating with objects that have different dimensionality and need alignment.
In addition, pandas automatically broadcasts along the specified dimension.
In [63]: s = pd.Series([1,3,5,np.nan,6,8], index=dates).shift(2)
In [64]: s
Out[64]: 
2013-01-01    NaN
2013-01-02    NaN
2013-01-03    1.0
2013-01-04    3.0
2013-01-05    5.0
2013-01-06    NaN
Freq: D, dtype: float64
In [65]: df.sub(s, axis='index')
Out[65]: 
                   A         B         C    D    F
2013-01-01       NaN       NaN       NaN  NaN  NaN
2013-01-02       NaN       NaN       NaN  NaN  NaN
2013-01-03 -1.861849 -3.104569 -1.494929  4.0  1.0
2013-01-04 -2.278445 -3.706771 -4.039575  2.0  0.0
2013-01-05 -5.424972 -4.432980 -4.723768  0.0 -1.0
2013-01-06       NaN       NaN       NaN  NaN  NaN
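To make the broadcasting concrete, a sketch with the same shift: rows where the shifted Series is NaN come back entirely NaN, and matched rows subtract row-wise across every column (the dates are assumptions):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

# shift(2) pushes values forward two rows: [NaN, NaN, 1, 3, 5, NaN]
s = pd.Series([1, 3, 5, np.nan, 6, 8], index=dates).shift(2)

result = df.sub(s, axis='index')   # subtract s from every column, row by row

assert result.iloc[0].isnull().all()            # s is NaN there -> all NaN
assert result.iloc[2, 0] == df.iloc[2, 0] - 1   # s[2] == 1
```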
Apply
Applying functions to the data:
In [66]: df.apply(np.cumsum)
Out[66]: 
                   A         B         C     D     F
2013-01-01  0.000000  0.000000 -1.509059   5.0   NaN
2013-01-02  1.212112 -0.173215 -1.389850  10.0   1.0
2013-01-03  0.350263 -2.277784 -1.884779  15.0   3.0
2013-01-04  1.071818 -2.984555 -2.924354  20.0   6.0
2013-01-05  0.646846 -2.417535 -2.648122  25.0  10.0
2013-01-06 -0.026844 -2.303887 -4.126549  30.0  15.0
In [67]: df.apply(lambda x: x.max() - x.min())
Out[67]: 
A    2.073961
B    2.671590
C    1.785291
D    0.000000
F    4.000000
dtype: float64
Histogramming
See more at Histogramming and Discretization.
In [68]: s = pd.Series(np.random.randint(0, 7, size=10))
In [69]: s
dtype: int64
In [70]: s.value_counts()
dtype: int64
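value_counts() is the histogramming tool here; on a deterministic input the frequencies are easy to check:

```python
import pandas as pd

s = pd.Series([0, 1, 1, 2, 2, 2])
counts = s.value_counts()     # frequencies, sorted most-common first

assert counts[2] == 3
assert counts[1] == 2
assert counts[0] == 1
assert counts.index[0] == 2   # the most frequent value comes first
```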
String Methods
Series is equipped with a set of string processing methods in the str
attribute that make it easy to operate on each element of the array, as in the
code snippet below. Note that pattern-matching in str generally uses regular
expressions by default (and in some cases always uses them). See more at
Vectorized String Methods.
In [71]: s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])
In [72]: s.str.lower()
Out[72]: 
0       a
1       b
2       c
3    aaba
4    baca
5     NaN
6    caba
7     dog
8     cat
dtype: object
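A point worth verifying: the .str accessor skips missing values instead of raising, so NaN propagates through unchanged:

```python
import numpy as np
import pandas as pd

s = pd.Series(['A', 'Aaba', np.nan, 'CABA'])
lowered = s.str.lower()

assert lowered[0] == 'a'
assert lowered[1] == 'aaba'
assert pd.isnull(lowered[2])   # NaN passes through, no exception
```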
Merge
pandas provides various facilities for easily combining together Series,
DataFrame, and Panel objects with various kinds of set logic for the indexes
and relational algebra functionality in the case of join / merge-type
operations.
Concat
Concatenating pandas objects together with concat():
In [73]: df = pd.DataFrame(np.random.randn(10, 4))
In [74]: df
# break it into pieces
In [75]: pieces = [df[:3], df[3:7], df[7:]]
In [76]: pd.concat(pieces)
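The round trip above can be asserted directly: concatenating the row slices reproduces the original frame exactly, index and all:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))
pieces = [df[:3], df[3:7], df[7:]]    # three non-overlapping row slices

assert pd.concat(pieces).equals(df)   # identical values, index, and dtypes
```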
Join
SQL style merges. See the Database style joining section.
In [77]: left = pd.DataFrame({'key': ['foo', 'foo'], 'lval': [1, 2]})
In [78]: right = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})
In [79]: left
Out[79]: 
   key  lval
0  foo     1
1  foo     2

In [80]: right
Out[80]: 
   key  rval
0  foo     4
1  foo     5

In [81]: pd.merge(left, right, on='key')
Out[81]: 
   key  lval  rval
0  foo     1     4
1  foo     1     5
2  foo     2     4
3  foo     2     5
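Because both frames carry two 'foo' rows, the merge is a cartesian product on the key; a checkable sketch:

```python
import pandas as pd

left = pd.DataFrame({'key': ['foo', 'foo'], 'lval': [1, 2]})
right = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})

merged = pd.merge(left, right, on='key')

# 2 matching rows x 2 matching rows = 4 combinations.
assert len(merged) == 4
assert sorted(merged['rval']) == [4, 4, 5, 5]
```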
Another example that can be given is:
In [82]: left = pd.DataFrame({'key': ['foo', 'bar'], 'lval': [1, 2]})
In [83]: right = pd.DataFrame({'key': ['foo', 'bar'], 'rval': [4, 5]})
In [84]: left
Out[84]: 
   key  lval
0  foo     1
1  bar     2

In [85]: right
Out[85]: 
   key  rval
0  foo     4
1  bar     5

In [86]: pd.merge(left, right, on='key')
Out[86]: 
   key  lval  rval
0  foo     1     4
1  bar     2     5
Append
Append rows to a dataframe. See the Appending section.
In [87]: df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
In [88]: df
In [89]: s = df.iloc[3]
In [90]: df.append(s, ignore_index=True)
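Note that DataFrame.append was removed in pandas 2.0; a sketch of the same operation with pd.concat (not the original code above) is:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
s = df.iloc[3]

# s.to_frame().T turns the row Series back into a one-row frame,
# and ignore_index renumbers the result 0..8.
appended = pd.concat([df, s.to_frame().T], ignore_index=True)

assert len(appended) == 9
assert np.allclose(appended.iloc[8].values, df.iloc[3].values)
```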
Grouping
By “group by” we are referring to a process involving one or more of the
following steps:
Splitting the data into groups based on some criteria
Applying a function to each group independently
Combining the results into a data structure
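The three steps can be sketched on a tiny frame with deterministic numbers (the column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'],
                   'C': [1.0, 2.0, 3.0, 4.0]})

# split on column A, apply sum to each group's C values, combine the results
sums = df.groupby('A').sum()

assert sums.loc['bar', 'C'] == 6.0   # 2.0 + 4.0
assert sums.loc['foo', 'C'] == 4.0   # 1.0 + 3.0
```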
In [91]: df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
'foo', 'bar', 'foo', 'foo'],
'B' : ['one', 'one', 'two', 'three',
'two', 'two', 'one', 'three'],
'C' : np.random.randn(8),
'D' : np.random.randn(8)})
In [92]: df
Grouping and then applying the sum() function to the resulting groups.
In [93]: df.groupby('A').sum()
Grouping by multiple columns forms a hierarchical index, to which we can again
apply the sum() function.
In [94]: df.groupby(['A','B']).sum()
Reshaping
See the sections on Hierarchical Indexing and Reshaping.
Stack
In [95]: tuples = list(zip(*[['bar', 'bar', 'baz', 'baz',
'foo', 'foo', 'qux', 'qux'],
['one', 'two', 'one', 'two',
'one', 'two', 'one', 'two']]))
In [96]: index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
In [97]: df = pd.DataFrame(np.random.randn(8, 2), index=index, columns=['A', 'B'])
In [98]: df2 = df[:4]
In [99]: df2
The stack() method “compresses” a level in the DataFrame’s columns.
In [100]: stacked = df2.stack()
In [101]: stacked
dtype: float64
With a “stacked” DataFrame or Series (having a MultiIndex as the
index), the inverse operation of stack() is unstack(), which by default
unstacks the last level:
In [102]: stacked.unstack()
Pivot Tables
See the section on Pivot Tables.
In [105]: df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 3,
'B' : ['A', 'B', 'C'] * 4,
'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 2,
'D' : np.random.randn(12),
'E' : np.random.randn(12)})
In [106]: df
We can produce pivot tables from this data very easily:
In [107]: pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
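With deterministic values, the default aggregation (the mean) is easy to check; this is a small sketch, not the frame built above:

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two'],
                   'C': ['foo', 'bar', 'foo', 'bar'],
                   'D': [1.0, 2.0, 3.0, 4.0]})

table = pd.pivot_table(df, values='D', index=['A'], columns=['C'])

# Each cell holds the mean of D over that (A, C) combination.
assert table.loc['one', 'foo'] == 1.0
assert table.loc['two', 'bar'] == 4.0
```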
Time Series
Time zone representation
In [111]: rng = pd.date_range('3/6/2012 00:00', periods=5, freq='D')
In [112]: ts = pd.Series(np.random.randn(len(rng)), rng)
In [113]: ts
Freq: D, dtype: float64
In [114]: ts_utc = ts.tz_localize('UTC')
In [115]: ts_utc
2012-03-06 00:00:00+00:00
2012-03-07 00:00:00+00:00
2012-03-08 00:00:00+00:00
2012-03-09 00:00:00+00:00
2012-03-10 00:00:00+00:00
Freq: D, dtype: float64
Convert to another time zone
In [116]: ts_utc.tz_convert('US/Eastern')
2012-03-05 19:00:00-05:00
2012-03-06 19:00:00-05:00
2012-03-07 19:00:00-05:00
2012-03-08 19:00:00-05:00
2012-03-09 19:00:00-05:00
Freq: D, dtype: float64
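A sketch checking that tz_convert changes only the display zone, not the underlying instants (the start date is an assumption):

```python
import numpy as np
import pandas as pd

rng = pd.date_range('3/6/2012 00:00', periods=3, freq='D')
ts_utc = pd.Series(np.random.randn(len(rng)), rng).tz_localize('UTC')

eastern = ts_utc.tz_convert('US/Eastern')

# Same instants, different wall-clock labels (UTC midnight is 19:00 the
# previous day in US/Eastern at this time of year):
assert (eastern.index.tz_convert('UTC') == ts_utc.index).all()
assert (eastern.values == ts_utc.values).all()
```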
Converting between time span representations
In [117]: rng = pd.date_range('1/1/2012', periods=5, freq='M')
In [118]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [119]: ts
Freq: M, dtype: float64
In [120]: ps = ts.to_period()
In [121]: ps
Freq: M, dtype: float64
In [122]: ps.to_timestamp()
Freq: MS, dtype: float64
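The period/timestamp round trip can be checked directly; a daily range is used in this sketch so the round trip is exact:

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2012-01-01', periods=5, freq='D')
ts = pd.Series(np.random.randn(5), index=rng)

ps = ts.to_period()        # timestamps -> periods
back = ps.to_timestamp()   # periods -> timestamps (at the period start)

assert back.index.equals(rng)             # the round trip restores the index
assert (back.values == ts.values).all()   # and leaves the values untouched
```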
Converting between period and timestamp enables some convenient arithmetic
functions to be used. In the following example, we convert a quarterly
frequency with year ending in November to 9am of the end of the month following
the quarter end:
In [123]: prng = pd.period_range('1990Q1', '2000Q4', freq='Q-NOV')
In [124]: ts = pd.Series(np.random.randn(len(prng)), prng)
In [125]: ts.index = (prng.asfreq('M', 'e') + 1).asfreq('H', 's') + 9
In [126]: ts.head()
Freq: H, dtype: float64
Categoricals
Since version 0.15, pandas can include categorical data in a DataFrame. For full docs, see the categorical introduction and the API documentation.
In [127]: df = pd.DataFrame({"id":[1,2,3,4,5,6], "raw_grade":['a', 'b', 'b', 'a', 'a', 'e']})
Convert the raw grades to a categorical data type.
In [128]: df["grade"] = df["raw_grade"].astype("category")

In [129]: df["grade"]
Out[129]: 
0    a
1    b
2    b
3    a
4    a
5    e
Name: grade, dtype: category
Categories (3, object): [a, b, e]
Rename the categories to more meaningful names (assigning to Series.cat.categories is inplace!)
In [130]: df["grade"].cat.categories = ["very good", "good", "very bad"]
Reorder the categories and simultaneously add the missing categories (methods under Series
.cat return a new Series per default).
In [131]: df["grade"] = df["grade"].cat.set_categories(["very bad", "bad", "medium", "good", "very good"])

In [132]: df["grade"]
Out[132]: 
0    very good
1         good
2         good
3    very good
4    very good
5     very bad
Name: grade, dtype: category
Categories (5, object): [very bad, bad, medium, good, very good]
Sorting is per order in the categories, not lexical order.
In [133]: df.sort_values(by="grade")
Out[133]: 
   id raw_grade      grade
5   6         e   very bad
1   2         b       good
2   3         b       good
0   1         a  very good
3   4         a  very good
4   5         a  very good
Grouping by a categorical column shows also empty categories.
In [134]: df.groupby("grade").size()
Out[134]: 
grade
very bad     1
bad          0
medium       0
good         2
very good    3
dtype: int64
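The empty-category behaviour in a compact, checkable form (observed=False is passed explicitly, since newer pandas versions changed the default for categorical groupers):

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'grade': pd.Categorical(['a', 'b', 'a'],
                                           categories=['a', 'b', 'c'])})

# 'c' has no rows, but still appears in the group sizes with count 0.
sizes = df.groupby('grade', observed=False).size()

assert sizes['a'] == 2
assert sizes['b'] == 1
assert sizes['c'] == 0
```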
Plotting
In [135]: ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
In [136]: ts = ts.cumsum()
In [137]: ts.plot()
Out[137]: <matplotlib.axes._subplots.AxesSubplot at 0x10f776b70>
On DataFrame, the plot() method is a convenience to plot all of the
columns with labels:
In [138]: df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,
columns=['A', 'B', 'C', 'D'])
In [139]: df = df.cumsum()
In [140]: plt.figure(); df.plot(); plt.legend(loc='best')
Out[140]: <matplotlib.legend.Legend at 0x...>
Getting Data In/Out
CSV
Writing to a csv file
In [141]: df.to_csv('foo.csv')
Reading from a csv file
In [142]: pd.read_csv('foo.csv')
[1000 rows x 5 columns]
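A filesystem-free sketch of the CSV round trip, using an in-memory buffer; passing index_col=0 on read avoids the extra unnamed index column that a plain read_csv produces:

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))

buf = io.StringIO()
df.to_csv(buf)                         # write; the index becomes column 0
buf.seek(0)
back = pd.read_csv(buf, index_col=0)   # read the index back from column 0

assert back.shape == df.shape
assert np.allclose(back.values, df.values)
```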
HDF5
Reading and writing to HDFStores.
Writing to a HDF5 Store
In [143]: df.to_hdf('foo.h5','df')
Reading from a HDF5 Store
In [144]: pd.read_hdf('foo.h5','df')
[1000 rows x 4 columns]
Excel
Reading and writing to MS Excel.
Writing to an excel file
In [145]: df.to_excel('foo.xlsx', sheet_name='Sheet1')
Reading from an excel file
In [146]: pd.read_excel('foo.xlsx', 'Sheet1', index_col=None, na_values=['NA'])
[1000 rows x 4 columns]
Gotchas
If you are trying an operation and you see an exception like:
>>> if pd.Series([False, True, False]):
...     print("I was true")
ValueError: The truth value of an array is ambiguous. Use a.empty, a.any() or a.all().
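The fix, spelled out: reduce the Series to a single boolean first. A minimal sketch:

```python
import pandas as pd

s = pd.Series([False, True, False])

# A multi-element Series has no single truth value; reduce it explicitly:
assert bool(s.any()) is True    # at least one element is True
assert bool(s.all()) is False   # not every element is True
assert s.empty is False         # the Series has elements
```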
See Comparisons in the gotchas section for an explanation and what to do.