Function application with lambda + apply: Python data analysis 10

Tags: python data analysis

Vector operations: Python data analysis 10

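I have recently been studying the Barra multi-factor model and found that computing beta was extremely slow.
The data, `_pct`, is a return matrix for every stock in the market. As the code below uses it, each row is one date, each column except the last is one stock, and the last column is the market return.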

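import numpy as np
import pandas as pd
import statsmodels.api as sm

def halflife(half_life=63, length=252):
    # Exponentially weighted moving-average weights, half-life of 63 trading days.
    # t = 0 is the oldest observation, t = length - 1 the most recent one.
    t = np.arange(length)
    w = 2 ** (t / half_life)
    return w / w.sum()  # normalize so the weights sum to 1

W = halflife()
W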
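
As a quick sanity check on these weights, the sketch below repeats the same weight function and verifies two properties: the weights sum to 1, and an observation 63 rows older than the latest one carries half its weight. The variable names `total` and `ratio` are only illustrative.

```python
import numpy as np

def halflife(half_life=63, length=252):
    # Same weighting scheme as above: the newest observation gets the largest weight
    t = np.arange(length)
    w = 2 ** (t / half_life)
    return w / w.sum()

W = halflife()

total = W.sum()             # should be 1 after normalization
ratio = W[-1] / W[-1 - 63]  # newest weight vs. the weight 63 rows earlier
print(total, ratio)
```

Because the weights grow with `t`, the most recent return in each window has the greatest influence on the regression.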

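Computing beta this way is very slow, because it fits one regression per stock and per date inside two nested Python loops: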

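length = 252
# beta = pd.DataFrame(columns=_pct.columns[:-1], index=_pct.index)
residual = pd.DataFrame(columns=_pct.columns[:-1], index=_pct.index)
# One OLS fit per stock (outer loop) and per date (inner loop)
for j in range(len(_pct.columns) - 1):
    for i in range(length + 1, len(_pct)):
        # Weighted market and stock returns over the 252 rows before row i
        mkt_ret = _pct.iloc[i-length:i, -1] * W
        ret = _pct.iloc[i-length:i, j] * W
        model = sm.OLS(ret, mkt_ret)  # regression without an intercept
        results = model.fit()
        # Use .iloc for positional access; plain [-1] / [0] on a labeled Series is deprecated
        residual.iloc[i, j] = results.resid.iloc[-1]
        # beta.iloc[i, j] = results.params.iloc[0]
        print('calc...%s,%s,%.2f,%.4f' % (_pct.columns[j], str(_pct.index[i])[:10], results.params.iloc[0], results.resid.iloc[-1]))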

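Rewritten in lambda + apply form, so that each window handles all stock columns with a single `apply` call, the beta calculation becomes much faster: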

def get_beta(ret, mkt_ret):
    # ret: weighted stock returns for one window (one column per stock)
    # mkt_ret: weighted market returns for the same window
    y = ret
    x = mkt_ret
    # Fit a no-intercept OLS for one column and keep the slope (beta)
    olsFunc = lambda y: sm.OLS(y, x).fit().params.iloc[0]
    b = y.apply(olsFunc)
    return b

length = 252
beta = pd.DataFrame(columns=_pct.columns[:-1], index=_pct.index)
for i in range(length + 1, len(_pct)):
    print(_pct.index[i])
    mkt_ret = _pct.iloc[i-length:i, -1] * W
    ret = _pct.iloc[i-length:i, :-1].mul(W, axis=0)
    beta.iloc[i] = get_beta(ret, mkt_ret)
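
Since each regression has a single regressor and no intercept, the slope also has a closed form, beta = sum(x*y) / sum(x*x), which can be computed for every stock column in one matrix product. The following is a minimal sketch of that idea, not the method used in this post: `beta_closed_form` is a hypothetical helper, the data is synthetic, and the result is checked against `np.linalg.lstsq` rather than against statsmodels.

```python
import numpy as np
import pandas as pd

def beta_closed_form(ret, mkt_ret):
    """Slope of a no-intercept regression of each column of ret on mkt_ret."""
    x = mkt_ret.to_numpy()
    y = ret.to_numpy()
    # x @ y gives sum(x * y) per column; x @ x gives sum(x * x)
    return pd.Series(x @ y / (x @ x), index=ret.columns)

# Synthetic stand-in for one already-weighted window (hypothetical data)
rng = np.random.default_rng(0)
mkt = pd.Series(rng.normal(0, 0.01, 252))
stocks = pd.DataFrame({
    name: b * mkt + rng.normal(0, 0.01, 252)
    for name, b in [("A", 0.5), ("B", 1.0), ("C", 1.5)]
})

fast = beta_closed_form(stocks, mkt)

# Reference: least-squares solution with the market return as the only regressor
reference = np.linalg.lstsq(
    mkt.to_numpy()[:, None], stocks.to_numpy(), rcond=None
)[0].ravel()
print(fast)
```

Inside the loop above, `get_beta(ret, mkt_ret)` could in principle be swapped for `beta_closed_form(ret, mkt_ret)`, assuming the window contains no missing values. One related point: scaling both series by `W` before the fit minimizes the squared residuals weighted by `W**2`; if `W` itself is meant to be the regression weight, the series would be scaled by `np.sqrt(W)` instead.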
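Copyright notice: this is an original article by forest_sz, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/forest_sz/article/details/106998224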