[Python Data Analysis] 60. Time Series: Periods and Period Arithmetic, Part 2

February 16, 2023


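The companion code for this series is available in three ways:

You can view it on the following website, a web-based Jupyter environment built with JupyterLite, so nothing needs to be installed locally. On first run the browser downloads some configuration files (about 20 MB):

https://returu.github.io/Python_Data_Analysis/lab/index.html

You can also get it from Baidu Netdisk. This requires a local environment for running the code; for setup instructions, see [Python Basics] 2. Setting Up a Python Development Environment:

Link: https://pan.baidu.com/s/1MYkeYeVAIRqbxezQECHwcA?pwd=mnsj Extraction code: mnsj

Or go to the GitHub repository page, click the Code button, and choose Download ZIP:

https://github.com/returu/Python_Data_Analysis

Translated and compiled from Python for Data Analysis, 3rd Edition.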

—————————————————–

3. Converting between timestamps and periods:

Series and DataFrame objects indexed by timestamps can be converted to periods with the to_period method:

>>> import numpy as np
>>> import pandas as pd

>>> dates = pd.date_range("2022-1-1", periods=3, freq="M")
>>> dates
DatetimeIndex(['2022-01-31', '2022-02-28', '2022-03-31'], dtype='datetime64[ns]', freq='M')

>>> ts = pd.Series(np.arange(len(dates)), index=dates)
>>> ts
2022-01-31    0
2022-02-28    1
2022-03-31    2
Freq: M, dtype: int32

>>> ts.to_period()
2022-01    0
2022-02    1
2022-03    2
Freq: M, dtype: int32

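Because periods are non-overlapping time spans, a timestamp can belong to only one period for a given frequency.

By default, the frequency of the new PeriodIndex is inferred from the timestamps, but you can specify any supported frequency. It is also fine for the result to contain duplicate periods: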
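
To make the "one timestamp, one period" point concrete, here is a minimal sketch. The three timestamps are made up for illustration and are not from the article's dataset. At monthly frequency they all land in the same period, while at daily frequency each lands in its own.

```python
import pandas as pd

# Three illustrative timestamps that all fall inside January 2022.
stamps = pd.DatetimeIndex(
    ["2022-01-01 00:00", "2022-01-15 12:30", "2022-01-31 23:59"]
)

# At monthly frequency every timestamp maps to the same single period.
monthly = stamps.to_period("M")

# At daily frequency each timestamp maps to the day that contains it.
daily = stamps.to_period("D")

print(monthly)
print(daily)
```

Whichever frequency is chosen, each timestamp maps to exactly one period, so the conversion is never ambiguous.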

>>> dates = pd.date_range("2022-01-29", periods=6)
>>> dates
DatetimeIndex(['2022-01-29', '2022-01-30', '2022-01-31', '2022-02-01',
               '2022-02-02', '2022-02-03'],
              dtype='datetime64[ns]', freq='D')

>>> ts2 = pd.Series(np.arange(6), index=dates)
>>> ts2
2022-01-29    0
2022-01-30    1
2022-01-31    2
2022-02-01    3
2022-02-02    4
2022-02-03    5
Freq: D, dtype: int32

>>> ts2.to_period("M")
2022-01    0
2022-01    1
2022-01    2
2022-02    3
2022-02    4
2022-02    5
Freq: M, dtype: int32
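
Duplicate period labels are allowed, but they do make the index non-unique. The sketch below rebuilds the same six-day series and shows one possible way to collapse the duplicates, by grouping on the index. The grouping step is an illustration added here, not something the original text does.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2022-01-29", periods=6)
ts2 = pd.Series(np.arange(6), index=dates)

# Converting daily timestamps to monthly periods repeats each month label.
monthly = ts2.to_period("M")

# The index now holds repeated labels, so it is no longer unique.
print(monthly.index.is_unique)

# Grouping on the index level collapses the duplicates to one row per month.
monthly_totals = monthly.groupby(level=0).sum()
print(monthly_totals)
```

Whether to sum, average, or keep the duplicates depends on what the values mean; sum is used here only to keep the example short.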

The to_timestamp method converts periods back to timestamps; the result is indexed by a DatetimeIndex:

>>> pts = ts2.to_period()
>>> pts
2022-01-29    0
2022-01-30    1
2022-01-31    2
2022-02-01    3
2022-02-02    4
2022-02-03    5
Freq: D, dtype: int32

>>> pts.to_timestamp(how="end")
2022-01-29 23:59:59.999999999    0
2022-01-30 23:59:59.999999999    1
2022-01-31 23:59:59.999999999    2
2022-02-01 23:59:59.999999999    3
2022-02-02 23:59:59.999999999    4
2022-02-03 23:59:59.999999999    5
Freq: D, dtype: int32
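
The how argument decides which edge of each period becomes the timestamp. As a small sketch on an illustrative monthly series (not the article's data), the following compares how="start" with how="end" and checks that converting back to periods recovers the original index.

```python
import pandas as pd

# Illustrative monthly series; the values are arbitrary.
periods = pd.period_range("2022-01", periods=3, freq="M")
pts = pd.Series([0, 1, 2], index=periods)

# how="start" places each timestamp at the first instant of its period.
starts = pts.to_timestamp(how="start")

# how="end" places each timestamp at the last nanosecond of its period.
ends = pts.to_timestamp(how="end")

# Either result can be converted back to the same monthly periods.
round_trip = ends.to_period("M")

print(starts.index)
print(ends.index)
```

Both variants carry the same values; only the position of each label within its period differs.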

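4. Creating a PeriodIndex from arrays:

Fixed-frequency datasets are sometimes stored with the time span information spread across multiple columns.

For example, in the dataset read below, the year and the quarter are in separate columns: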

>>> data = pd.read_csv("./data/Creating_PeriodIndex_from_Arrays.csv")
>>> data.head(5)
   Year  Quarter  data
0  2000        1     1
1  2000        2     2
2  2000        3     3
3  2000        4     4
4  2001        1     5

>>> data["Year"]
0     2000
1     2000
2     2000
3     2000
4     2001
5     2001
6     2001
7     2001
8     2002
9     2002
10    2002
11    2002
12    2003
13    2003
14    2003
15    2003
16    2004
17    2004
18    2004
19    2004
20    2005
21    2005
22    2005
23    2005
Name: Year, dtype: int64

>>> data["Quarter"]
0     1
1     2
2     3
3     4
4     1
5     2
6     3
7     4
8     1
9     2
10    3
11    4
12    1
13    2
14    3
15    4
16    1
17    2
18    3
19    4
20    1
21    2
22    3
23    4
Name: Quarter, dtype: int64

By passing these arrays to PeriodIndex along with a frequency, you can combine them to form an index for the DataFrame:

>>> index = pd.PeriodIndex(year=data["Year"], quarter=data["Quarter"], freq="Q-DEC")
>>> index
PeriodIndex(['2000Q1', '2000Q2', '2000Q3', '2000Q4', '2001Q1', '2001Q2',
             '2001Q3', '2001Q4', '2002Q1', '2002Q2', '2002Q3', '2002Q4',
             '2003Q1', '2003Q2', '2003Q3', '2003Q4', '2004Q1', '2004Q2',
             '2004Q3', '2004Q4', '2005Q1', '2005Q2', '2005Q3', '2005Q4'],
            dtype='period[Q-DEC]')

>>> data.index = index
>>> data
        Year  Quarter  data
2000Q1  2000        1     1
2000Q2  2000        2     2
2000Q3  2000        3     3
2000Q4  2000        4     4
2001Q1  2001        1     5
2001Q2  2001        2     6
2001Q3  2001        3     7
2001Q4  2001        4     8
2002Q1  2002        1     9
2002Q2  2002        2    10
2002Q3  2002        3    11
2002Q4  2002        4    12
2003Q1  2003        1    13
2003Q2  2003        2    14
2003Q3  2003        3    15
2003Q4  2003        4    16
2004Q1  2004        1    17
2004Q2  2004        2    18
2004Q3  2004        3    19
2004Q4  2004        4    20
2005Q1  2005        1    21
2005Q2  2005        2    22
2005Q3  2005        3    23
2005Q4  2005        4    24
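
To my knowledge, newer pandas releases (2.2 onward) deprecate the year= and quarter= keywords of the PeriodIndex constructor in favor of PeriodIndex.from_fields. The sketch below is one hedged way to write the same step so that it runs on either side of that change. The small DataFrame is a hypothetical stand-in for the first rows of the CSV file, and the string-label fallback is an alternative added here, not the article's method.

```python
import pandas as pd

# Hypothetical stand-in for the first five rows of the article's CSV file.
data = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2000, 2001],
    "Quarter": [1, 2, 3, 4, 1],
    "data": [1, 2, 3, 4, 5],
})

if hasattr(pd.PeriodIndex, "from_fields"):
    # Newer pandas: build the index from separate year and quarter arrays.
    index = pd.PeriodIndex.from_fields(
        year=data["Year"].to_numpy(),
        quarter=data["Quarter"].to_numpy(),
        freq="Q-DEC",
    )
else:
    # Older pandas: build labels such as "2000Q1" and let pandas parse them.
    labels = data["Year"].astype(str) + "Q" + data["Quarter"].astype(str)
    index = pd.PeriodIndex(labels, freq="Q-DEC")

data.index = index
print(data)
```

Both branches should produce a quarterly index with dtype period[Q-DEC], matching the result shown above.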

This article originally appeared on the WeChat official account 码农设计师.
