pandas Time Series
2017-10-15 00:32
From the book "Python for Data Analysis"
Date and Time Data Types and Tools
The Python datetime module
In [2]: from datetime import datetime

In [3]: now = datetime.now()

In [4]: now
Out[4]: datetime.datetime(2017, 5, 25, 13, 55, 30, 39000)
1: Use datetime's strftime and strptime methods to convert between strings and dates.
In [5]: stamp = datetime(2011, 1, 3)

In [6]: str(stamp)
Out[6]: '2011-01-03 00:00:00'

In [7]: stamp.strftime('%Y-%m-%d')
Out[7]: '2011-01-03'

In [10]: datetime.strptime('2011-01-03', '%Y-%m-%d')
Out[10]: datetime.datetime(2011, 1, 3, 0, 0)

In [11]: datestrs = ['7/6/2011', '8/6/2011']

In [12]: [datetime.strptime(x, '%m/%d/%Y') for x in datestrs]
Out[12]:
[datetime.datetime(2011, 7, 6, 0, 0),
 datetime.datetime(2011, 8, 6, 0, 0)]
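strptime needs to know the exact format in advance. When the input may arrive in more than one format, one option is to try each format in turn. The following is only a minimal sketch: the helper name parse_with_formats and its default format list are assumptions made for illustration and do not come from the book.

```python
from datetime import datetime

# Hypothetical helper: try each candidate format until one matches.
def parse_with_formats(text, formats=('%Y-%m-%d', '%m/%d/%Y')):
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f'no known format matches {text!r}')

stamp = parse_with_formats('7/6/2011')
print(stamp)                       # 2011-07-06 00:00:00
print(stamp.strftime('%Y-%m-%d'))  # 2011-07-06
```

The order of the formats matters for ambiguous inputs, so the list should go from the most specific format to the least.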
Third-party date parsing
2: dateutil can parse almost any date representation that a human can read (Chinese-language dates are not supported).

In [13]: from dateutil.parser import parse

In [14]: parse('2011-01-03')
Out[14]: datetime.datetime(2011, 1, 3, 0, 0)

# In international formats the day often comes before the month
In [15]: parse('6/12/2011', dayfirst=True)
Out[15]: datetime.datetime(2011, 12, 6, 0, 0)

In [16]: parse('6/12/2011')
Out[16]: datetime.datetime(2011, 6, 12, 0, 0)

In [17]: parse('Jan 31, 1998 10:45 PM')
Out[17]: datetime.datetime(1998, 1, 31, 22, 45)
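pandas date parsing
3: The pandas.to_datetime() method is typically used to parse a whole group of dates at once (the session assumes pandas has been imported as pd):

In [22]: datestrs = ['7/6/2011', '8/6/2011']

In [23]: pd.to_datetime(datestrs)
Out[23]: DatetimeIndex(['2011-07-06', '2011-08-06'], dtype='datetime64[ns]', freq=None)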
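Handling missing values: None is converted to NaT (Not a Time), the pandas null value for timestamps. The output below contains times of day, so datestrs here appears to hold '2011-07-06 12:00:00' and '2011-08-06 00:00:00' rather than the list defined above.

idx = pd.to_datetime(datestrs + [None])
print(idx)
print(idx[2])
print(pd.isnull(idx))

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)
NaT
[False False  True]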
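Besides None, real data often contains strings that are not dates at all. As a hedged sketch (the input list raw is made up for illustration), passing errors='coerce' to pd.to_datetime turns unparseable entries into NaT instead of raising an error, after which they can be detected or dropped like any other missing value:

```python
import pandas as pd

# Illustrative input (not from the original notes): one valid date,
# one unparseable string, and one missing value.
raw = ['2011-07-06', 'not a date', None]

# errors='coerce' converts anything that cannot be parsed into NaT
idx = pd.to_datetime(raw, format='%Y-%m-%d', errors='coerce')

print(idx.isna())    # boolean mask marking the NaT entries
print(idx.dropna())  # keeps only the successfully parsed dates
```

Coercing silently hides bad input, so it is usually worth counting the NaT entries before dropping them.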
Time Series Basics
A Series indexed by timestamps
In [11]: from datetime import datetime
    ...: import numpy as np
    ...: import pandas as pd
    ...: dates = [datetime(2011, 1, 2), datetime(2011, 1, 5),
    ...:          datetime(2011, 1, 7), datetime(2011, 1, 8),
    ...:          datetime(2011, 1, 10), datetime(2011, 1, 12)]
    ...: ts = pd.Series(np.random.randn(6), index=dates)
    ...: ts
    ...:
Out[11]:
2011-01-02    0.092908
2011-01-05    0.281746
2011-01-07    0.769023
2011-01-08    1.246435
2011-01-10    1.007189
2011-01-12   -1.296221
dtype: float64

In [12]: type(ts)
Out[12]: pandas.core.series.Series

In [13]: ts.index
Out[13]:
DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [14]: ts.index[0]
Out[14]: Timestamp('2011-01-02 00:00:00')
In [15]: ts[::2]
Out[15]:
2011-01-02    0.092908
2011-01-07    0.769023
2011-01-10    1.007189
dtype: float64

In [16]: ts + ts[::2]
Out[16]:
2011-01-02    0.185816
2011-01-05         NaN
2011-01-07    1.538045
2011-01-08         NaN
2011-01-10    2.014379
2011-01-12         NaN
dtype: float64
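The NaN values above come from index alignment: arithmetic between two time series matches values by date, and a date present on only one side yields NaN. The sketch below reproduces this with fixed values so the result is easy to check. The names ts_small, total and filled are illustrative, and Series.add with fill_value is shown as one possible way to treat a missing date as zero.

```python
import pandas as pd

dates = pd.to_datetime(['2011-01-02', '2011-01-05', '2011-01-07'])
ts_small = pd.Series([1.0, 2.0, 3.0], index=dates)

# ts_small[::2] keeps only the 1st and 3rd dates
total = ts_small + ts_small[::2]
print(total)   # 2011-01-05 is NaN because it is missing from ts_small[::2]

# Series.add with fill_value treats a date missing on one side as 0
filled = ts_small.add(ts_small[::2], fill_value=0)
print(filled)
```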
Indexing, selection, and subsetting
In [17]: stamp = ts.index[2]
    ...: ts[stamp]
    ...:
Out[17]: 0.76902256761183874

In [18]: ts['1/10/2011']
Out[18]: 1.0071893575830049

In [19]: ts['20110110']
Out[19]: 1.0071893575830049
In [20]: ts['1/6/2011':'1/11/2011']
Out[20]:
2011-01-07    0.769023
2011-01-08    1.246435
2011-01-10    1.007189
dtype: float64

In [21]: ts.truncate(after='1/9/2011')
Out[21]:
2011-01-02    0.092908
2011-01-05    0.281746
2011-01-07    0.769023
2011-01-08    1.246435
dtype: float64

In [22]: ts[datetime(2011, 1, 7):]
Out[22]:
2011-01-07    0.769023
2011-01-08    1.246435
2011-01-10    1.007189
2011-01-12   -1.296221
dtype: float64
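Two points in the session above are easy to miss: the slice endpoints do not have to exist in the index, and both ends of a date slice are inclusive. A small sketch with fixed integer values makes this checkable. The names ts_demo, window and same are illustrative, and .loc is used here as the explicit label-based form of the same slicing.

```python
from datetime import datetime
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts_demo = pd.Series(range(6), index=dates)

# Neither endpoint is in the index; the slice still selects everything
# that falls between them (inclusive on both ends)
window = ts_demo.loc['2011-01-06':'2011-01-11']
print(window.index.day.tolist())  # [7, 8, 10]

# truncate can cut both ends in one call and gives the same result here
same = ts_demo.truncate(before='2011-01-06', after='2011-01-11')
print(window.equals(same))  # True
```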
In [23]: longer_ts = pd.Series(np.random.randn(1000),
    ...:                       index=pd.date_range('1/1/2000', periods=1000))

In [24]: longer_ts
Out[24]:
2000-01-01    0.274992
...
2002-09-25    0.884111
2002-09-26   -0.608506
Freq: D, dtype: float64

In [25]: longer_ts['2001']
Out[25]:
2001-01-01   -1.308228
...
2001-12-31   -0.502678
Freq: D, dtype: float64

In [26]: longer_ts['2001-05']
Out[26]:
2001-05-01    1.489410
...
2001-05-31   -0.241235
Freq: D, dtype: float64
In [27]: dates = pd.date_range('1/1/2000', periods=100, freq='W-WED')
    ...: long_df = pd.DataFrame(np.random.randn(100, 4),
    ...:                        index=dates,
    ...:                        columns=['Colorado', 'Texas',
    ...:                                 'New York', 'Ohio'])
    ...: long_df.loc['5-2001']
    ...:
Out[27]:
            Colorado     Texas  New York      Ohio
2001-05-02  0.927335  1.513906  0.538600  1.273768
2001-05-09  0.667876 -0.969206  1.676091 -0.817649
2001-05-16  0.050188  1.951312  3.260383  0.963301
2001-05-23  1.201206 -1.852001  2.406778  0.841176
2001-05-30 -0.749181 -2.989741 -1.295289 -1.690195
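Passing only a year or a year and month selects every row in that period. The sketch below repeats the idea with deterministic values so the selection sizes can be verified. The names daily, weekly and may are illustrative, and .loc is used for both the Series and the DataFrame.

```python
import numpy as np
import pandas as pd

daily = pd.Series(np.arange(1000),
                  index=pd.date_range('2000-01-01', periods=1000))

print(len(daily.loc['2001']))     # 365 daily rows in 2001
print(len(daily.loc['2001-05']))  # 31 daily rows in May 2001

weekly = pd.DataFrame({'value': np.arange(100)},
                      index=pd.date_range('2000-01-01', periods=100,
                                          freq='W-WED'))
may = weekly.loc['2001-05']
print(may.index.day.tolist())     # [2, 9, 16, 23, 30]
```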