[Posted]: 2017-05-12 22:47:03
[Question]:
I have implemented the Apriori algorithm on a dataset using Hadoop's MapReduce framework.
Can anyone guide me on how to optimize the Apriori algorithm (in Hadoop MapReduce)?
I would appreciate it.
Thanks!
Edit: code added below:
//MAPPER
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    Utils.count++;
    String line = value.toString();
    String[] items = line.split(" ");
    Arrays.sort(items); // sort so each itemset maps to one canonical key
    // Emit every non-empty itemset of this transaction with a count of 1.
    // Note: a transaction of n items yields 2^n - 1 itemsets, so this
    // explodes for long transactions.
    LinkedHashSet myPowerSet = powerset(items);
    for (Object itemset : myPowerSet) {
        // "[a, b]" -> "a,b": strip brackets and spaces from the set's toString()
        String _key = itemset.toString().replaceAll("\\[|\\]| +", "");
        context.write(new Text(_key), new IntWritable(1));
    }
}
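The powerset helper called above is not shown in the post; here is a minimal sketch of what it presumably does (enumerating every non-empty subset via bitmasks). The class name and the exact signature are assumptions, not the poster's code:

```java
import java.util.*;

public class PowersetSketch {
    // Hypothetical stand-in for the powerset(items) helper the mapper calls:
    // enumerates every non-empty subset of the (sorted) items via bitmasks.
    public static LinkedHashSet<List<String>> powerset(String[] items) {
        LinkedHashSet<List<String>> result = new LinkedHashSet<>();
        int n = items.length;
        // Each bitmask from 1 to 2^n - 1 selects one non-empty subset.
        for (int mask = 1; mask < (1 << n); mask++) {
            List<String> subset = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(items[i]);
                }
            }
            result.add(subset);
        }
        return result;
    }
}
```

With this helper, a transaction "a b" yields the subsets [a], [b], and [a, b], which the mapper then flattens to the keys a, b, and a,b. The 2^n - 1 growth per transaction is the main cost any optimization should attack.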
//COMBINER
// Pre-sums the 1s emitted by the mapper on each node to cut shuffle traffic.
// It must only sum: applying the support threshold here would be wrong,
// because the combiner sees only partial counts.
public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    int localSum = 0;
    for (IntWritable value : values) {
        localSum += value.get();
    }
    context.write(key, new IntWritable(localSum));
}
//REDUCER
public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
    final int minSupportCount = 3; // hard-coded; could be read from the job Configuration
    int supportCount = 0;
    for (IntWritable value : values) {
        supportCount += value.get();
    }
    // Keep only itemsets that meet the minimum support.
    if (supportCount >= minSupportCount) {
        context.write(key, new IntWritable(supportCount));
    }
}
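For reference, the end-to-end computation the three functions above perform can be simulated in plain Java. This is a sketch under the assumptions that the powerset helper enumerates non-empty subsets and that keys are comma-joined sorted items (matching the mapper's replaceAll):

```java
import java.util.*;

// Plain-Java simulation of what the mapper/combiner/reducer compute together:
// count every non-empty itemset across transactions, then keep those whose
// support count reaches minSupportCount (3 in the reducer above).
public class AprioriPassSim {
    static final int minSupportCount = 3;

    public static Map<String, Integer> frequentItemsets(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] items = line.split(" ");
            Arrays.sort(items);                            // mirrors the mapper
            int n = items.length;
            for (int mask = 1; mask < (1 << n); mask++) {  // powerset via bitmasks
                StringBuilder key = new StringBuilder();
                for (int i = 0; i < n; i++) {
                    if ((mask & (1 << i)) != 0) {
                        if (key.length() > 0) key.append(',');
                        key.append(items[i]);
                    }
                }
                // mapper's emit of 1 plus the combiner/reducer summation
                counts.merge(key.toString(), 1, Integer::sum);
            }
        }
        // reducer: drop itemsets below the support threshold
        counts.values().removeIf(c -> c < minSupportCount);
        return counts;
    }
}
```

On the transactions "a b", "a b", "a c", "a b c" this keeps a (support 4), b (3), and a,b (3), while c, a,c, b,c, and a,b,c fall below the threshold of 3.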
[Comments]:
-
Hard to say how to optimize code you haven't shown.
-
Hi, I've added the code now. Please take a look, and thanks for the quick reply.
Tags: algorithm hadoop mapreduce data-mining apriori