【Posted】: 2017-11-03 08:00:51
【Question】:
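My problem is this:
I have one or more intervals, for example:
[0; 0.3]
[0.3; 0.8]
[0.8; 1]
Within each interval I have a normal distribution, which I sample using
truncnorm() and .rvs().
So I have several "normal distributions" along the x-axis.
However, truncnorm needs the mean and standard deviation of the distribution within the interval. How can I calculate the mean and standard deviation for a specific interval in Python?
numpy.mean(), for example, does not seem to work. I am also getting strange results, so I suspect that I compute the mean/standard deviation incorrectly before calling truncnorm.
Thanks, everyone.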
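*Edit: for other columns, where the intervals are not as small, it works fine. Is there a limit on the size of the interval? The error occurs, for example, for the interval
[0.12; 0.17] --> value 0.0937818650369 (out of range)*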
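For reference, a minimal sketch of sampling inside such a narrow interval. It relies on the documented behaviour of scipy.stats.truncnorm, whose first two arguments are the bounds expressed in standard deviations from the mean, (low - mean) / sd and (upp - mean) / sd, rather than the raw interval limits. The helper name sample_truncated_normal is hypothetical, and taking the midpoint as mean and one sixth of the width as sd is an assumption borrowed from the SQL expressions in the post.

```python
import numpy as np
from scipy.stats import truncnorm


def sample_truncated_normal(low, upp, mean, sd, size=1, random_state=None):
    """Draw samples from Normal(mean, sd) restricted to [low, upp].

    Hypothetical helper, not part of SciPy.
    """
    # truncnorm expects the bounds in units of sd, measured from the mean.
    a = (low - mean) / sd
    b = (upp - mean) / sd
    return truncnorm.rvs(a, b, loc=mean, scale=sd, size=size,
                         random_state=random_state)


# The narrow interval from the edit above.
low, upp = 0.12, 0.17
mean = (low + upp) / 2   # assumption: centre of the interval
sd = (upp - low) / 6     # assumption: bounds sit at +/- 3 sd

samples = sample_truncated_normal(low, upp, mean, sd, size=1000,
                                  random_state=np.random.RandomState(0))
print(samples.min(), samples.max())
```

With the bounds standardized this way, the width of the interval should not matter: every draw stays between low and upp.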
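Yes, of course. What I want to do is this: I have an interval, and I want to sample a value that lies between the interval's bounds, drawn according to a truncated normal distribution. I have an additional column into which the value sampled for the other column should be written. For example: interval [0.2; 0.6] --> sampled value 0.343433. I think I have found a solution:
truncnorm().stats()
But I don't know why: for the parameters I pass to the
truncnorm()
function, almost 50% of the values I obtain lie outside the bounds. What am I doing wrong?
Here is the code (a small excerpt):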
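As a hedged illustration of what truncnorm.stats() reports, the sketch below compares two calls for the interval [0.2; 0.6] mentioned above. Passing the raw bounds describes a standard normal (loc=0, scale=1) cut to that range, whereas standardizing the bounds and supplying loc and scale describes a normal with the chosen mean and sd cut to the same range. The midpoint mean and the width/6 sd are assumptions, not values confirmed by the post.

```python
from scipy.stats import truncnorm

low, upp = 0.2, 0.6
mean = (low + upp) / 2   # assumption: centre of the interval
sd = (upp - low) / 6     # assumption: bounds sit at +/- 3 sd

# Raw bounds, default loc=0 and scale=1: a standard normal cut to [0.2, 0.6].
raw_mean, raw_var = truncnorm.stats(low, upp, moments='mv')

# Standardized bounds plus loc/scale: Normal(mean, sd) cut to [0.2, 0.6].
a, b = (low - mean) / sd, (upp - mean) / sd
dist = truncnorm(a, b, loc=mean, scale=sd)
t_mean, t_var = dist.stats(moments='mv')

# The end points of the support can be checked with the quantile function.
support = dist.ppf([0.0, 1.0])

print(float(raw_mean), float(raw_var))
print(float(t_mean), float(t_var), support)
```

The two calls describe different distributions, so feeding the moments from the first into a sampler built like the second mixes two parameterizations.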
convert_cat=(name_convert_column,name_convert_column,_tabelle,name_convert_column,_tabelle,_tabelle,name_convert_column)
drop_view=(name_convert_column)
calculate=(name_convert_column,name_convert_column,name_convert_column,name_convert_column,name_convert_column,_tabelle,name_convert_column,name_convert_column)
cur.execute("CREATE VIEW convert_cat_%s (quotient, %s, rnum) AS SELECT (COUNT(*)/(SELECT COUNT(*) FROM %s ) ) as quotient, %s, row_number() over ( order by (COUNT(*)/(SELECT COUNT(*) FROM %s ) ) desc ) as rnum FROM %s GROUP BY %s ORDER BY quotient desc" %convert_cat)
cur.execute("Select b.ID,a.unten,a.oben, a.mean, a.sd FROM( SELECT t3.RNUM, t3.%s, lag(t3.com_Pr,1,0) OVER (order by rnum asc) as unten , t3.com_PR as oben, ((t3.com_PR +(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/2) as MEAN, ((t3.com_PR-(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/6) AS SD FROM( SELECT t1.rnum, t1.%s , SUM(t2.quotient) as com_Pr FROM CONVERT_CAT_%s t1 INNER JOIN CONVERT_CAT_%s t2 ON t1.rnum >= t2.rnum group by t1.rnum, t1.%s, t1.quotient ORDER BY RNUM asc ) t3) a INNER JOIN %s b ON b.%s = a.%s order by ID asc" %calculate)
# Each fetched row is (ID, unten, oben, mean, sd); unten/oben are the lower/upper bounds.
_content_category = cur.fetchall()
add_category_number_column = (_tabelle, name_convert_column)
cur.execute("ALTER TABLE %s ADD %s_category NUMBER(15,14)" % add_category_number_column)
x = 0
for ID in _content_category:
    id = _content_category[0]
    id_category = [j[0] for j in _content_category]
    unten_category = [j[1] for j in _content_category]
    oben_category = [j[2] for j in _content_category]
    #mean_category = [j[3] for j in _content_category]
    sd_category = [j[4] for j in _content_category]
    # Mean/variance as returned by truncnorm.stats for the raw interval bounds.
    mean, var = truncnorm.stats(unten_category[x], oben_category[x], moments='mv')
    # sd = np.sqrt(var)
    # get_truncated_normal is a helper defined elsewhere (not shown in this excerpt).
    X = get_truncated_normal(mean=mean, sd=sd_category[x], low=unten_category[x], upp=oben_category[x])
    update_cells_value = float(X.rvs(1))
    category = (_tabelle, name_convert_column, update_cells_value, id_category[x])
    cur.execute("UPDATE %s SET %s_category = %s WHERE ID=%s" % category)
    x += 1
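The per-row sampling step of the loop above can be sketched without a database. This is only an illustration under stated assumptions: the function name sample_rows and the seed parameter are hypothetical, the rows are assumed to have the same (ID, unten, oben, mean, sd) layout as the query result, and the mean and sd columns from the query are used directly with standardized bounds. Values are converted with float() in case the driver returns Decimal objects.

```python
import numpy as np
from scipy.stats import truncnorm


def sample_rows(rows, seed=None):
    """Return [(ID, sampled_value), ...] for rows of (ID, unten, oben, mean, sd).

    Hypothetical helper; rows are assumed to mirror the query result above.
    """
    rng = np.random.RandomState(seed)  # one generator shared by all rows
    results = []
    for row_id, unten, oben, mean, sd in rows:
        unten, oben, mean, sd = float(unten), float(oben), float(mean), float(sd)
        # Bounds in units of sd from the mean, as truncnorm expects.
        a = (unten - mean) / sd
        b = (oben - mean) / sd
        draw = truncnorm.rvs(a, b, loc=mean, scale=sd, size=1, random_state=rng)
        results.append((row_id, float(np.ravel(draw)[0])))
    return results


# Made-up rows for illustration only.
rows = [
    (1, 0.0, 0.3, 0.15, 0.05),
    (2, 0.3, 0.8, 0.55, 0.5 / 6),
    (3, 0.12, 0.17, 0.145, 0.05 / 6),
]
sampled = sample_rows(rows, seed=0)
for row_id, value in sampled:
    print(row_id, value)
```

Unpacking each row in the for statement avoids rebuilding the column lists and the manual counter on every iteration. A zero-width interval would give sd = 0 and would need separate handling.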
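I tried computing the mean and standard deviation in the SQL query:
1) ((t3.com_PR +(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/2) as MEAN
2) ((t3.com_PR-(lag(t3.com_Pr,1,0) OVER (order by rnum asc)))/6) AS SD
and compared the result with the
truncnorm().stats() function. With the stats function the results seem to get worse, and the values fall even further out of range than before...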
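The two SQL expressions can be mirrored in Python for a quick check. In this sketch the function name interval_params is hypothetical and the input proportions are made up. As in the query, each upper bound is the running sum of the category proportions, each lower bound is the previous upper bound (0 for the first row), the mean is the midpoint, and the sd is one sixth of the width, which places the bounds three standard deviations from the mean.

```python
import numpy as np


def interval_params(proportions):
    """Return (unten, oben, mean, sd) arrays for ordered category proportions.

    Hypothetical helper mirroring the MEAN and SD expressions in the SQL query.
    """
    props = np.asarray(proportions, dtype=float)
    oben = np.cumsum(props)                     # running sum = upper bounds
    unten = np.concatenate(([0.0], oben[:-1]))  # previous upper bound, 0 first
    mean = (oben + unten) / 2                   # midpoint of the interval
    sd = (oben - unten) / 6                     # width / 6
    return unten, oben, mean, sd


# Made-up proportions that reproduce the intervals [0; 0.3], [0.3; 0.8], [0.8; 1].
unten, oben, mean, sd = interval_params([0.3, 0.5, 0.2])
for row in zip(unten, oben, mean, sd):
    print(row)
```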
【Comments】:
-
Can you share some minimal code that reproduces your problem?
-
I did :) It is in the original post now...
Tags: python normal-distribution