【Question title】: Filtering a set of data based on indices in line
【Posted】: 2014-03-04 01:56:14
【Question description】:

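I have a Python script that pulls data from a SQL database on an external server and sums values by transaction number. I got some help cleaning up the result set, which helped a lot, but now I have run into another problem.

My original query: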

SELECT th.trans_ref_no, th.doc_no, th.folio_yr, th.folio_mo, th.transaction_date, tc.prod_id, tc.gr_gals FROM TransHeader th, TransComponents tc WHERE th.term_id="%s" and th.source="L" and th.folio_yr="%s" and th.folio_mo="%s" and (tc.prod_id="TEXLED" or tc.prod_id="103349" or tc.prod_id="103360" or tc.prod_id="103370" or tc.prod_id="113107" or tc.prod_id="113093")and th.trans_ref_no=tc.trans_ref_no;

It returns a set of data, a snippet of which I have copied here:

"0520227370","0001063257","2014","01","140101","113107","000002000"
"0520227370","0001063257","2014","01","140101","TEXLED","000002550"
"0520227378","0001063265","2014","01","140101","113107","000001980"
"0520227378","0001063265","2014","01","140101","TEXLED","000002521"
"0520227380","0001063267","2014","01","140101","113107","000001500"
"0520227380","0001063267","2014","01","140101","TEXLED","000001911"
"0520227384","0001063271","2014","01","140101","113107","000003501"
"0520227384","0001063271","2014","01","140101","TEXLED","000004463"
"0520227384","0001063271","2014","01","140101","113107","000004000"
"0520227384","0001063271","2014","01","140101","TEXLED","000005103"
"0520227385","0001063272","2014","01","140101","113107","000007500"
"0520227385","0001063272","2014","01","140101","TEXLED","000009565"
"0520227388","0001063275","2014","01","140101","113107","000002000"
"0520227388","0001063275","2014","01","140101","TEXLED","000002553"

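The updated query runs the select twice and joins the two results on trans_ref_no, which is the first position in the result set, so the first 6 rows are condensed to 3 rows and the last 4 rows are condensed to 2 rows. The problem I am running into is getting transaction number 0520227384 condensed to two rows.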

SELECT t1.trans_ref_no, t1.doc_no, t1.folio_yr, t1.folio_mo, t1.transaction_date, t1.prod_id, t1.gr_gals, t2.prod_id, t2.gr_gals FROM (SELECT th.trans_ref_no, th.doc_no, th.folio_yr, th.folio_mo, th.transaction_date, tc.prod_id, tc.gr_gals FROM Tms6Data.TransHeader th, Tms6Data.TransComponents tc WHERE th.term_id="00000MA" and th.source="L" and th.folio_yr="2014" and th.folio_mo="01" and (tc.prod_id="103349" or tc.prod_id="103360" or tc.prod_id="103370" or tc.prod_id="113107" or tc.prod_id="113093") and th.trans_ref_no=tc.trans_ref_no) t1 JOIN (SELECT th.trans_ref_no, th.doc_no, th.folio_yr, th.folio_mo, th.transaction_date, tc.prod_id, tc.gr_gals FROM Tms6Data.TransHeader th, Tms6Data.TransComponents tc WHERE th.term_id="00000MA" and th.source="L" and th.folio_yr="2014" and th.folio_mo="01" and tc.prod_id="TEXLED" and th.trans_ref_no=tc.trans_ref_no) t2 ON t1.trans_ref_no = t2.trans_ref_no;
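A minimal Python sketch of what this join does to a transaction that has two component rows on each side. Since the ON clause only compares trans_ref_no, every t1 row is matched with every t2 row, so 2 x 2 = 4 rows come back. The variable names are illustrative only, and the values are the component rows for 0520227384 from the original result set:

```python
from itertools import product

# (prod_id, gr_gals) component rows for transaction 0520227384,
# in the order the original query returned them.
components = [
    ("113107", "000003501"),
    ("TEXLED", "000004463"),
    ("113107", "000004000"),
    ("TEXLED", "000005103"),
]

# Stand-ins for the two subqueries of the updated query.
t1 = [row for row in components if row[0] != "TEXLED"]
t2 = [row for row in components if row[0] == "TEXLED"]

# Joining on trans_ref_no alone pairs every t1 row with every t2 row.
joined = [left + right for left, right in product(t1, t2)]

for row in joined:
    print(row)
```

Transactions with a single row on each side produce 1 x 1 = 1 joined row, which is why they look correct.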

Here is what the new query returns for transaction number 0520227384:

"0520227384","0001063271","2014","01","140101","113107","000003501","TEXLED","000004463"
"0520227384","0001063271","2014","01","140101","113107","000003501","TEXLED","000005103"
"0520227384","0001063271","2014","01","140101","113107","000004000","TEXLED","000004463"
"0520227384","0001063271","2014","01","140101","113107","000004000","TEXLED","000005103"

What I need from this is a condensed set of rows. In this group, the second and third rows need to be removed:

"0520227384","0001063271","2014","01","140101","113107","000003501","TEXLED","000004463"
"0520227384","0001063271","2014","01","140101","113107","000004000","TEXLED","000005103"

How can I filter these rows out of the updated query's result set?
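One possible way to do this filtering on the Python side, sketched under a loud assumption: within a transaction, the smallest t1 gr_gals belongs with the smallest t2 gr_gals, the next smallest with the next, and so on, and the values on each side are distinct. That holds for the sample rows but is not guaranteed by the data. The function name keep_matching_ranks is made up for this example:

```python
from collections import defaultdict

# The four joined rows for transaction 0520227384.
joined_rows = [
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000003501", "TEXLED", "000004463"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000003501", "TEXLED", "000005103"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000004000", "TEXLED", "000004463"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000004000", "TEXLED", "000005103"),
]


def keep_matching_ranks(rows):
    """Keep rows whose column 7 and column 9 values have the same rank
    within their transaction (ASSUMPTION: smallest pairs with smallest)."""
    by_transaction = defaultdict(list)
    for row in rows:
        by_transaction[row[0]].append(row)

    kept = []
    for trans_rows in by_transaction.values():
        # Zero-padded, equal-width strings sort the same way as the numbers.
        t1_values = sorted({row[6] for row in trans_rows})
        t2_values = sorted({row[8] for row in trans_rows})
        for row in trans_rows:
            if t1_values.index(row[6]) == t2_values.index(row[8]):
                kept.append(row)
    return kept


filtered = keep_matching_ranks(joined_rows)
for row in filtered:
    print(row)
```

On the sample this keeps the first and fourth rows. If the real pairing is not by size, the rank rule would silently pick the wrong rows.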

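【Comments】:

  • You have to select "000004000","TEXLED","000005103" (row 4) over "000004000","TEXLED","000004463" (row 3), but by the same token select "000003501","TEXLED","000004463" (row 1) over "000003501","TEXLED","000005103" (row 2)? One seems to need the max and the other the min.
  • @Twelfth It is based on the original query's result set. Rows 7-10 are for the same transaction, but rows 7 and 8 pair 3501 with 4463, while rows 9 and 10 pair 4000 with 5103. You are right that it appears to pick a max and a min, but that is just how the resulting data set is ordered: 3501 comes first with 4463 and 5103, then 4000 with 4463 and 5103.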
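The pairing described in that last comment (row 7 with row 8, row 9 with row 10) can be sketched in Python by working from the original, unjoined result set and matching rows by position within each transaction. This is only a sketch: pair_by_position is a hypothetical helper, and it assumes the rows arrive in the order shown in the question, which SQL does not guarantee without an ORDER BY:

```python
from collections import defaultdict

# A few rows from the original query's result set, in the order returned.
original_rows = [
    ("0520227370", "0001063257", "2014", "01", "140101", "113107", "000002000"),
    ("0520227370", "0001063257", "2014", "01", "140101", "TEXLED", "000002550"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000003501"),
    ("0520227384", "0001063271", "2014", "01", "140101", "TEXLED", "000004463"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000004000"),
    ("0520227384", "0001063271", "2014", "01", "140101", "TEXLED", "000005103"),
]


def pair_by_position(rows):
    """Pair the n-th non-TEXLED row of a transaction with its n-th TEXLED row."""
    other = defaultdict(list)
    texled = defaultdict(list)
    for row in rows:
        target = texled if row[5] == "TEXLED" else other
        target[row[0]].append(row)

    paired = []
    for trans_ref_no, left_rows in other.items():
        # zip stops at the shorter list, so unmatched rows are dropped.
        for left, right in zip(left_rows, texled[trans_ref_no]):
            # Append only prod_id and gr_gals from the TEXLED row.
            paired.append(left + right[5:])
    return paired


paired = pair_by_position(original_rows)
for row in paired:
    print(row)
```

This gives one 9-column row for 0520227370 and two for 0520227384, without ever building the four-row cross product.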

Tags: python mysql sql filter


【Solution 1】:

I guess the answer is:

(... your heavy sql ..) group by 7

(... your heavy sql ..) group by t1.gr_gals

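【Comments】:

  • Not quite sure that will do it (MySQL may run it rather than return an error. Yay MySQL). Both t1.gr_gals and t2.gr_gals differ, not just t1.gr_gals.
  • @Twelfth The main idea is to use grouping or window functions. I am not sure how it should be implemented in MySQL. I hope my answer helps you.
  • @SethKoberg The query will filter rows by value, e.g. "000003501" (I may have gotten the exact row wrong). What result do you get?
  • I tried this, and the query removes three of the four transactions in the result set instead of only 2. Is it possible to expand it to something like group by 7 where t1.trans_ref_no = t2.trans_ref_no?
  • @akaRem After further inspection, your suggestion did not remove three of the 0520227384 transaction results. Instead it produced two rows with different values in column 7, 000003501 and 000004000, and the same value in column 9. "0520227384","0001063271","2014","01","140101","113107","000003501","TEXLED","000004463" is the first instance of this transaction number, and "0520227384","0001063271","2014","01","140101","113107","000004000","TEXLED","000004463" is the second instance.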
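
The result reported in the last comment is consistent with grouping on column 7 alone. A rough Python imitation follows, assuming the first row seen for each group is the one returned; MySQL does not promise which row of a group supplies the non-grouped columns, so this is an illustration of the observed output rather than of MySQL's rule. group_keep_first is a made-up helper name:

```python
# The four joined rows for transaction 0520227384, as listed in the question.
joined_rows = [
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000003501", "TEXLED", "000004463"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000003501", "TEXLED", "000005103"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000004000", "TEXLED", "000004463"),
    ("0520227384", "0001063271", "2014", "01", "140101", "113107", "000004000", "TEXLED", "000005103"),
]


def group_keep_first(rows, key_index):
    """Keep the first row seen for each distinct value in column key_index."""
    first_seen = {}
    for row in rows:
        # setdefault only stores the row if the key is not present yet.
        first_seen.setdefault(row[key_index], row)
    return list(first_seen.values())


# Column 7 of the result set is index 6 (t1.gr_gals).
grouped = group_keep_first(joined_rows, 6)
for row in grouped:
    print(row)
```

Both surviving rows carry 000004463 in column 9, because the group key says nothing about which t2 row should go with each t1 row.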