Java Cluster Class Code Examples


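This article collects typical usage examples of the Java class org.apache.mahout.clustering.Cluster. If you are wondering how the Cluster class is used in practice, or are looking for working examples of it, the selected code examples below may help.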



The Cluster class belongs to the org.apache.mahout.clustering package. The 20 code examples of the class shown below are sorted by popularity by default.

Example 1: buildClusters

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Iterate over the input vectors to produce cluster directories for each iteration
 * 
 * @param conf
 *          the hadoop configuration
 * @param input
 *          the directory Path for input points
 * @param output
 *          the directory Path for output points
 * @param description
 *          model distribution parameters
 * @param numClusters
 *          the number of models to iterate over
 * @param maxIterations
 *          the maximum number of iterations
 * @param alpha0
 *          the alpha_0 value for the DirichletDistribution
 * @param runSequential
 *          execute sequentially if true
 * 
 * @return the Path of the final clusters directory
 */
public static Path buildClusters(Configuration conf, Path input, Path output, DistributionDescription description,
    int numClusters, int maxIterations, double alpha0, boolean runSequential) throws IOException,
    ClassNotFoundException, InterruptedException {
  Path clustersIn = new Path(output, Cluster.INITIAL_CLUSTERS_DIR);
  ModelDistribution<VectorWritable> modelDist = description.createModelDistribution(conf);
  
  List<Cluster> models = Lists.newArrayList();
  for (Model<VectorWritable> cluster : modelDist.sampleFromPrior(numClusters)) {
    models.add((Cluster) cluster);
  }
  
  ClusterClassifier prior = new ClusterClassifier(models, new DirichletClusteringPolicy(numClusters, alpha0));
  prior.writeToSeqFiles(clustersIn);
  
  if (runSequential) {
    ClusterIterator.iterateSeq(conf, input, clustersIn, output, maxIterations);
  } else {
    ClusterIterator.iterateMR(conf, input, clustersIn, output, maxIterations);
  }
  return output;
  
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 45, Source file: DirichletDriver.java


Example 2: configureWithClusterInfo

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Create a list of SoftClusters from whatever type is passed in as the prior
 * 
 * @param conf
 *          the Configuration
 * @param clusterPath
 *          the path to the prior Clusters
 * @param clusters
 *          a List<Cluster> to put values into
 */
public static void configureWithClusterInfo(Configuration conf, Path clusterPath, List<Cluster> clusters) {
  for (Writable value : new SequenceFileDirValueIterable<Writable>(clusterPath, PathType.LIST,
      PathFilters.partFilter(), conf)) {
    Class<? extends Writable> valueClass = value.getClass();
    
    if (valueClass.equals(ClusterWritable.class)) {
      ClusterWritable clusterWritable = (ClusterWritable) value;
      value = clusterWritable.getValue();
      valueClass = value.getClass();
    }
    
    if (valueClass.equals(Kluster.class)) {
      // get the cluster info
      Kluster cluster = (Kluster) value;
      clusters.add(new SoftCluster(cluster.getCenter(), cluster.getId(), cluster.getMeasure()));
    } else if (valueClass.equals(SoftCluster.class)) {
      // get the cluster info
      clusters.add((SoftCluster) value);
    } else if (valueClass.equals(Canopy.class)) {
      // get the cluster info
      Canopy canopy = (Canopy) value;
      clusters.add(new SoftCluster(canopy.getCenter(), canopy.getId(), canopy.getMeasure()));
    } else {
      throw new IllegalStateException("Bad value class: " + valueClass);
    }
  }
  
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 39, Source file: FuzzyKMeansUtil.java


Example 3: configureWithClusterInfo

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Create a list of Klusters from whatever Cluster type is passed in as the prior
 * 
 * @param conf
 *          the Configuration
 * @param clusterPath
 *          the path to the prior Clusters
 * @param clusters
 *          a Collection<Cluster> to put values into
 */
public static void configureWithClusterInfo(Configuration conf, Path clusterPath, Collection<Cluster> clusters) {
  for (Writable value : new SequenceFileDirValueIterable<Writable>(clusterPath, PathType.LIST,
      PathFilters.partFilter(), conf)) {
    Class<? extends Writable> valueClass = value.getClass();
    if (valueClass.equals(ClusterWritable.class)) {
      ClusterWritable clusterWritable = (ClusterWritable) value;
      value = clusterWritable.getValue();
      valueClass = value.getClass();
    }
    log.debug("Read 1 Cluster from {}", clusterPath);
    
    if (valueClass.equals(Kluster.class)) {
      // get the cluster info
      clusters.add((Kluster) value);
    } else if (valueClass.equals(Canopy.class)) {
      // get the cluster info
      Canopy canopy = (Canopy) value;
      clusters.add(new Kluster(canopy.getCenter(), canopy.getId(), canopy.getMeasure()));
    } else {
      throw new IllegalStateException("Bad value class: " + valueClass);
    }
  }
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 34, Source file: KMeansUtil.java


Example 4: buildClusters

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Iterate over the input vectors to produce cluster directories for each iteration
 * 
 * @param conf
 *          the Configuration to use
 * @param input
 *          the directory pathname for input points
 * @param clustersIn
 *          the directory pathname for initial & computed clusters
 * @param output
 *          the directory pathname for output points
 * @param measure
 *          the classname of the DistanceMeasure
 * @param maxIterations
 *          the maximum number of iterations
 * @param delta
 *          the convergence delta value
 * @param runSequential
 *          if true execute sequential algorithm
 * 
 * @return the Path of the final clusters directory
 */
public static Path buildClusters(Configuration conf, Path input, Path clustersIn, Path output,
    DistanceMeasure measure, int maxIterations, String delta, boolean runSequential) throws IOException,
    InterruptedException, ClassNotFoundException {
  
  double convergenceDelta = Double.parseDouble(delta);
  List<Cluster> clusters = new ArrayList<Cluster>();
  KMeansUtil.configureWithClusterInfo(conf, clustersIn, clusters);
  
  if (clusters.isEmpty()) {
    throw new IllegalStateException("No input clusters found in " + clustersIn + ". Check your -c argument.");
  }
  
  Path priorClustersPath = new Path(output, Cluster.INITIAL_CLUSTERS_DIR);
  ClusteringPolicy policy = new KMeansClusteringPolicy(convergenceDelta);
  ClusterClassifier prior = new ClusterClassifier(clusters, policy);
  prior.writeToSeqFiles(priorClustersPath);
  
  if (runSequential) {
    ClusterIterator.iterateSeq(conf, input, priorClustersPath, output, maxIterations);
  } else {
    ClusterIterator.iterateMR(conf, input, priorClustersPath, output, maxIterations);
  }
  return output;
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 47, Source file: KMeansDriver.java


Example 5: setup

import org.apache.mahout.clustering.Cluster; // import the required package/class
@Override
protected void setup(Context context) throws IOException, InterruptedException {
  super.setup(context);
  
  Configuration conf = context.getConfiguration();
  String clustersIn = conf.get(ClusterClassificationConfigKeys.CLUSTERS_IN);
  threshold = conf.getFloat(ClusterClassificationConfigKeys.OUTLIER_REMOVAL_THRESHOLD, 0.0f);
  emitMostLikely = conf.getBoolean(ClusterClassificationConfigKeys.EMIT_MOST_LIKELY, false);
  
  clusterModels = new ArrayList<Cluster>();
  
  if (clustersIn != null && !clustersIn.isEmpty()) {
    Path clustersInPath = new Path(clustersIn);
    clusterModels = populateClusterModels(clustersInPath, conf);
    ClusteringPolicy policy = ClusterClassifier
        .readPolicy(finalClustersPath(clustersInPath));
    clusterClassifier = new ClusterClassifier(clusterModels, policy);
  }
  clusterId = new IntWritable();
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 21, Source file: ClusterClassificationMapper.java


Example 6: writeToSeqFiles

import org.apache.mahout.clustering.Cluster; // import the required package/class
public void writeToSeqFiles(Path path) throws IOException {
  writePolicy(policy, path);
  Configuration config = new Configuration();
  FileSystem fs = FileSystem.get(path.toUri(), config);
  SequenceFile.Writer writer = null;
  ClusterWritable cw = new ClusterWritable();
  for (int i = 0; i < models.size(); i++) {
    try {
      Cluster cluster = models.get(i);
      cw.setValue(cluster);
      writer = new SequenceFile.Writer(fs, config,
          new Path(path, "part-" + String.format(Locale.ENGLISH, "%05d", i)), IntWritable.class,
          ClusterWritable.class);
      Writable key = new IntWritable(i);
      writer.append(key, cw);
    } finally {
      Closeables.closeQuietly(writer);
    }
  }
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigData-Max, Lines of code: 21, Source file: ClusterClassifier.java
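
The part-file names in writeToSeqFiles come from zero-padding the model index to five digits. The naming step on its own can be sketched as below; the class name PartFileNames and the helper partName are hypothetical and are not part of Mahout.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only (not a Mahout class).
final class PartFileNames {

    // Pads the index to at least five digits. Locale.ENGLISH keeps the digits
    // ASCII even when the JVM default locale uses a different digit set.
    static String partName(int index) {
        return "part-" + String.format(Locale.ENGLISH, "%05d", index);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(partName(i));
        }
    }
}
```

Indices with more than five digits are not truncated; the format only sets a minimum width.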


Example 7: buildClustersSeq

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Build a directory of Canopy clusters from the input vectors and other
 * arguments. Run sequential execution
 * 
 * @param input
 *          the Path to the directory containing input vectors
 * @param output
 *          the Path for all output directories
 * @param measure
 *          the DistanceMeasure
 * @param t1
 *          the double T1 distance metric
 * @param t2
 *          the double T2 distance metric
 * @param clusterFilter
 *          the int minimum size of canopies produced
 * @return the canopy output directory Path
 */
private static Path buildClustersSeq(Path input, Path output,
    DistanceMeasure measure, double t1, double t2, int clusterFilter)
    throws IOException {
  CanopyClusterer clusterer = new CanopyClusterer(measure, t1, t2);
  Collection<Canopy> canopies = Lists.newArrayList();
  Configuration conf = new Configuration();
  FileSystem fs = FileSystem.get(input.toUri(), conf);

  for (VectorWritable vw : new SequenceFileDirValueIterable<VectorWritable>(
      input, PathType.LIST, PathFilters.logsCRCFilter(), conf)) {
    clusterer.addPointToCanopies(vw.get(), canopies);
  }

  Path canopyOutputDir = new Path(output, Cluster.CLUSTERS_DIR + '0'+ Cluster.FINAL_ITERATION_SUFFIX);
  Path path = new Path(canopyOutputDir, "part-r-00000");
  SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
      Text.class, ClusterWritable.class);
  ClusterWritable clusterWritable = new ClusterWritable();
  try {
    for (Canopy canopy : canopies) {
      canopy.computeParameters();
      if (log.isDebugEnabled()) {
        log.debug("Writing Canopy:{} center:{} numPoints:{} radius:{}",
            new Object[] { canopy.getIdentifier(),
                AbstractCluster.formatVector(canopy.getCenter(), null),
                canopy.getNumObservations(),
                AbstractCluster.formatVector(canopy.getRadius(), null) });
      }
      if (canopy.getNumObservations() > clusterFilter) {
        clusterWritable.setValue(canopy);
        writer.append(new Text(canopy.getIdentifier()), clusterWritable);
      }
    }
  } finally {
    Closeables.closeQuietly(writer);
  }
  return canopyOutputDir;
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 57, Source file: CanopyDriver.java
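
To make the roles of T1 and T2 in buildClustersSeq easier to follow, here is a simplified, self-contained sketch of the canopy rule as it is commonly described: a point within T1 of a canopy is counted toward that canopy, and a point within T2 of any canopy does not seed a new one. It assumes T1 > T2, Euclidean distance, and canopy centers that stay fixed at their seed point. CanopySketch and its members are hypothetical names, and this is not Mahout's CanopyClusterer.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of T1/T2 canopy assignment (not Mahout code).
final class CanopySketch {

    static final class Canopy {
        final double[] center;
        int numPoints;

        Canopy(double[] center) {
            this.center = center.clone();
            this.numPoints = 1; // the seed point counts as the first observation
        }
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Counts the point toward every canopy within t1; seeds a new canopy
    // unless the point lies within t2 of an existing one.
    static void addPointToCanopies(double[] point, List<Canopy> canopies, double t1, double t2) {
        boolean stronglyBound = false;
        for (Canopy canopy : canopies) {
            double dist = euclidean(canopy.center, point);
            if (dist < t1) {
                canopy.numPoints++;
            }
            stronglyBound = stronglyBound || dist < t2;
        }
        if (!stronglyBound) {
            canopies.add(new Canopy(point));
        }
    }

    static List<Canopy> cluster(double[][] points, double t1, double t2) {
        List<Canopy> canopies = new ArrayList<>();
        for (double[] point : points) {
            addPointToCanopies(point, canopies, t1, t2);
        }
        return canopies;
    }

    public static void main(String[] args) {
        double[][] points = {{0.0, 0.0}, {0.5, 0.0}, {10.0, 0.0}};
        System.out.println(cluster(points, 3.0, 1.0).size() + " canopies");
    }
}
```

A size filter like clusterFilter in the example above would be applied afterwards, keeping only canopies whose numPoints exceeds the filter value.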


Example 8: buildClustersMR

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Build a directory of Canopy clusters from the input vectors and other
 * arguments. Run mapreduce execution
 * 
 * @param conf
 *          the Configuration
 * @param input
 *          the Path to the directory containing input vectors
 * @param output
 *          the Path for all output directories
 * @param measure
 *          the DistanceMeasure
 * @param t1
 *          the double T1 distance metric
 * @param t2
 *          the double T2 distance metric
 * @param t3
 *          the reducer's double T1 distance metric
 * @param t4
 *          the reducer's double T2 distance metric
 * @param clusterFilter
 *          the int minimum size of canopies produced
 * @return the canopy output directory Path
 */
private static Path buildClustersMR(Configuration conf, Path input,
    Path output, DistanceMeasure measure, double t1, double t2, double t3,
    double t4, int clusterFilter) throws IOException, InterruptedException,
    ClassNotFoundException {
  conf.set(CanopyConfigKeys.DISTANCE_MEASURE_KEY, measure.getClass()
      .getName());
  conf.set(CanopyConfigKeys.T1_KEY, String.valueOf(t1));
  conf.set(CanopyConfigKeys.T2_KEY, String.valueOf(t2));
  conf.set(CanopyConfigKeys.T3_KEY, String.valueOf(t3));
  conf.set(CanopyConfigKeys.T4_KEY, String.valueOf(t4));
  conf.set(CanopyConfigKeys.CF_KEY, String.valueOf(clusterFilter));

  Job job = new Job(conf, "Canopy Driver running buildClusters over input: "
      + input);
  job.setInputFormatClass(SequenceFileInputFormat.class);
  job.setOutputFormatClass(SequenceFileOutputFormat.class);
  job.setMapperClass(CanopyMapper.class);
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(VectorWritable.class);
  job.setReducerClass(CanopyReducer.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(ClusterWritable.class);
  job.setNumReduceTasks(1);
  job.setJarByClass(CanopyDriver.class);

  FileInputFormat.addInputPath(job, input);
  Path canopyOutputDir = new Path(output, Cluster.CLUSTERS_DIR + '0' + Cluster.FINAL_ITERATION_SUFFIX);
  FileOutputFormat.setOutputPath(job, canopyOutputDir);
  if (!job.waitForCompletion(true)) {
    throw new InterruptedException("Canopy Job failed processing " + input);
  }
  return canopyOutputDir;
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 58, Source file: CanopyDriver.java


Example 9: buildClusters

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Iterate over the input vectors to produce cluster directories for each iteration
 * @param input
 *          the directory pathname for input points
 * @param clustersIn
 *          the file pathname for initial cluster centers
 * @param output
 *          the directory pathname for output points
 * @param measure
 *          the classname of the DistanceMeasure
 * @param convergenceDelta
 *          the convergence delta value
 * @param maxIterations
 *          the maximum number of iterations
 * @param m
 *          the fuzzification factor, see
 *          http://en.wikipedia.org/wiki/Data_clustering#Fuzzy_c-means_clustering
 * @param runSequential if true run in sequential execution mode
 * 
 * @return the Path of the final clusters directory
 */
public static Path buildClusters(Configuration conf,
                                 Path input,
                                 Path clustersIn,
                                 Path output,
                                 DistanceMeasure measure,
                                 double convergenceDelta,
                                 int maxIterations,
                                 float m,
                                 boolean runSequential)
  throws IOException, InterruptedException, ClassNotFoundException {
  
  // Fall back to a default Configuration before conf is first used
  if (conf == null) {
    conf = new Configuration();
  }
  
  List<Cluster> clusters = new ArrayList<Cluster>();
  FuzzyKMeansUtil.configureWithClusterInfo(conf, clustersIn, clusters);
  
  if (clusters.isEmpty()) {
    throw new IllegalStateException("No input clusters found in " + clustersIn + ". Check your -c argument.");
  }
  
  Path priorClustersPath = new Path(output, Cluster.INITIAL_CLUSTERS_DIR);   
  ClusteringPolicy policy = new FuzzyKMeansClusteringPolicy(m, convergenceDelta);
  ClusterClassifier prior = new ClusterClassifier(clusters, policy);
  prior.writeToSeqFiles(priorClustersPath);
  
  if (runSequential) {
    ClusterIterator.iterateSeq(conf, input, priorClustersPath, output, maxIterations);
  } else {
    ClusterIterator.iterateMR(conf, input, priorClustersPath, output, maxIterations);
  }
  return output;
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 56, Source file: FuzzyKMeansDriver.java


Example 10: classifyClusterSeq

import org.apache.mahout.clustering.Cluster; // import the required package/class
private static void classifyClusterSeq(Configuration conf, Path input, Path clusters, Path output,
    Double clusterClassificationThreshold, boolean emitMostLikely) throws IOException {
  List<Cluster> clusterModels = populateClusterModels(clusters, conf);
  ClusteringPolicy policy = ClusterClassifier.readPolicy(finalClustersPath(conf, clusters));
  ClusterClassifier clusterClassifier = new ClusterClassifier(clusterModels, policy);
  selectCluster(input, clusterModels, clusterClassifier, output, clusterClassificationThreshold, emitMostLikely);
  
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 9, Source file: ClusterClassificationDriver.java


Example 11: populateClusterModels

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Populates a list with clusters present in clusters-*-final directory.
 * 
 * @param clusterOutputPath
 *          The output path of the clustering.
 * @param conf
 *          The Hadoop Configuration
 * @return The list of clusters found by the clustering.
 * @throws IOException
 */
private static List<Cluster> populateClusterModels(Path clusterOutputPath, Configuration conf) throws IOException {
  List<Cluster> clusterModels = new ArrayList<Cluster>();
  Path finalClustersPath = finalClustersPath(conf, clusterOutputPath);
  Iterator<?> it = new SequenceFileDirValueIterator<Writable>(finalClustersPath, PathType.LIST,
      PathFilters.partFilter(), null, false, conf);
  while (it.hasNext()) {
    ClusterWritable next = (ClusterWritable) it.next();
    Cluster cluster = next.getValue();
    cluster.configure(conf);
    clusterModels.add(cluster);
  }
  return clusterModels;
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 24, Source file: ClusterClassificationDriver.java


Example 12: classifyAndWrite

import org.apache.mahout.clustering.Cluster; // import the required package/class
private static void classifyAndWrite(List<Cluster> clusterModels, Double clusterClassificationThreshold,
    boolean emitMostLikely, SequenceFile.Writer writer, VectorWritable vw, Vector pdfPerCluster) throws IOException {
  if (emitMostLikely) {
    int maxValueIndex = pdfPerCluster.maxValueIndex();
    WeightedVectorWritable wvw = new WeightedVectorWritable(pdfPerCluster.maxValue(), vw.get());
    write(clusterModels, writer, wvw, maxValueIndex);
  } else {
    writeAllAboveThreshold(clusterModels, clusterClassificationThreshold, writer, vw, pdfPerCluster);
  }
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 11, Source file: ClusterClassificationDriver.java
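
The two branches of classifyAndWrite choose between emitting only the most likely cluster and emitting every cluster whose probability clears the threshold. The sketch below shows that selection on a plain double array. EmitSketch and its methods are hypothetical names, and the inclusive comparison (>=) against the threshold is an assumption, since the snippet above does not show how writeAllAboveThreshold compares.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of the "most likely" vs. "all above threshold" choice (not Mahout code).
final class EmitSketch {

    // Index of the largest entry; assumes pdf is non-empty.
    static int maxValueIndex(double[] pdf) {
        int best = 0;
        for (int i = 1; i < pdf.length; i++) {
            if (pdf[i] > pdf[best]) {
                best = i;
            }
        }
        return best;
    }

    // Returns the cluster indices a point would be written to.
    static List<Integer> select(double[] pdf, double threshold, boolean emitMostLikely) {
        List<Integer> selected = new ArrayList<>();
        if (emitMostLikely) {
            selected.add(maxValueIndex(pdf));
        } else {
            for (int i = 0; i < pdf.length; i++) {
                if (pdf[i] >= threshold) { // assumed inclusive comparison
                    selected.add(i);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        double[] pdf = {0.1, 0.6, 0.3};
        System.out.println(select(pdf, 0.2, true));
        System.out.println(select(pdf, 0.2, false));
    }
}
```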


Example 13: close

import org.apache.mahout.clustering.Cluster; // import the required package/class
@Override
public void close(ClusterClassifier posterior) {
  for (Cluster cluster : posterior.getModels()) {
    cluster.computeParameters();
  }
  
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigData-Max, Lines of code: 8, Source file: AbstractClusteringPolicy.java


Example 14: write

import org.apache.mahout.clustering.Cluster; // import the required package/class
@Override
public void write(DataOutput out) throws IOException {
  out.writeInt(models.size());
  out.writeUTF(modelClass);
  new ClusteringPolicyWritable(policy).write(out);
  for (Cluster cluster : models) {
    cluster.write(out);
  }
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 10, Source file: ClusterClassifier.java


Example 15: readFields

import org.apache.mahout.clustering.Cluster; // import the required package/class
@Override
public void readFields(DataInput in) throws IOException {
  int size = in.readInt();
  modelClass = in.readUTF();
  models = Lists.newArrayList();
  ClusteringPolicyWritable clusteringPolicyWritable = new ClusteringPolicyWritable();
  clusteringPolicyWritable.readFields(in);
  policy = clusteringPolicyWritable.getValue();
  for (int i = 0; i < size; i++) {
    Cluster element = ClassUtils.instantiateAs(modelClass, Cluster.class);
    element.readFields(in);
    models.add(element);
  }
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 15, Source file: ClusterClassifier.java
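
The write and readFields examples rely on a length-prefixed layout: the model count first, then the model class name, then the models themselves, read back in exactly the same order. The round trip below sketches that pattern with only java.io streams. ModelListCodec is a hypothetical name, each model is reduced to a single double, and the policy record that the real methods place between the class name and the models is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Length-prefixed round trip in the style of write/readFields (not Mahout code).
final class ModelListCodec {

    static final class Decoded {
        final String modelClass;
        final double[] models;

        Decoded(String modelClass, double[] models) {
            this.modelClass = modelClass;
            this.models = models;
        }
    }

    // Writes the count, the class name, then each model value.
    static byte[] write(String modelClass, double[] models) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(models.length);
            out.writeUTF(modelClass);
            for (double model : models) {
                out.writeDouble(model);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in the same order they were written.
    static Decoded read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            String modelClass = in.readUTF();
            double[] models = new double[size];
            for (int i = 0; i < size; i++) {
                models[i] = in.readDouble();
            }
            return new Decoded(modelClass, models);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Decoded decoded = read(write("Kluster", new double[] {1.5, -2.0}));
        System.out.println(decoded.modelClass + " x " + decoded.models.length);
    }
}
```

Writing the count first is what lets the reader allocate and loop without any end-of-stream marker.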


Example 16: readFromSeqFiles

import org.apache.mahout.clustering.Cluster; // import the required package/class
public void readFromSeqFiles(Configuration conf, Path path) throws IOException {
  Configuration config = new Configuration();
  List<Cluster> clusters = Lists.newArrayList();
  for (ClusterWritable cw : new SequenceFileDirValueIterable<ClusterWritable>(path, PathType.LIST,
      PathFilters.logsCRCFilter(), config)) {
    Cluster cluster = cw.getValue();
    cluster.configure(conf);
    clusters.add(cluster);
  }
  this.models = clusters;
  modelClass = models.get(0).getClass().getName();
  this.policy = readPolicy(path);
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 14, Source file: ClusterClassifier.java


Example 17: populateClusterModels

import org.apache.mahout.clustering.Cluster; // import the required package/class
public static List<Cluster> populateClusterModels(Path clusterOutputPath, Configuration conf) throws IOException {
  List<Cluster> clusters = new ArrayList<Cluster>();
  FileSystem fileSystem = clusterOutputPath.getFileSystem(conf);
  FileStatus[] clusterFiles = fileSystem.listStatus(clusterOutputPath, PathFilters.finalPartFilter());
  Iterator<?> it = new SequenceFileDirValueIterator<Writable>(
      clusterFiles[0].getPath(), PathType.LIST, PathFilters.partFilter(),
      null, false, conf);
  while (it.hasNext()) {
    ClusterWritable next = (ClusterWritable) it.next();
    Cluster cluster = next.getValue();
    cluster.configure(conf);
    clusters.add(cluster);
  }
  return clusters;
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 16, Source file: ClusterClassificationMapper.java


Example 18: classify

import org.apache.mahout.clustering.Cluster; // import the required package/class
@Override
public Vector classify(Vector data, ClusterClassifier prior) {
  Collection<SoftCluster> clusters = Lists.newArrayList();
  List<Double> distances = Lists.newArrayList();
  for (Cluster model : prior.getModels()) {
    SoftCluster sc = (SoftCluster) model;
    clusters.add(sc);
    distances.add(sc.getMeasure().distance(data, sc.getCenter()));
  }
  FuzzyKMeansClusterer fuzzyKMeansClusterer = new FuzzyKMeansClusterer();
  fuzzyKMeansClusterer.setM(m);
  return fuzzyKMeansClusterer.computePi(clusters, distances);
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigData-Max, Lines of code: 14, Source file: FuzzyKMeansClusteringPolicy.java
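
The classify method above turns per-cluster distances into membership weights using the fuzzification factor m. The sketch below uses the textbook fuzzy c-means membership formula, u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)), to show what such a step produces. It assumes m > 1 and strictly positive distances; FuzzyMembership is a hypothetical name, and the code is not claimed to match computePi line for line.

```java
// Textbook fuzzy c-means membership weights (illustrative, not Mahout code).
final class FuzzyMembership {

    // distances[i] is the distance from one point to cluster i.
    // Assumes m > 1 and every distance > 0.
    static double[] memberships(double[] distances, double m) {
        double exponent = 2.0 / (m - 1.0);
        double[] u = new double[distances.length];
        for (int i = 0; i < distances.length; i++) {
            double denom = 0.0;
            for (int j = 0; j < distances.length; j++) {
                denom += Math.pow(distances[i] / distances[j], exponent);
            }
            u[i] = 1.0 / denom;
        }
        return u;
    }

    public static void main(String[] args) {
        double[] u = memberships(new double[] {1.0, 3.0}, 2.0);
        System.out.println(u[0] + " " + u[1]);
    }
}
```

The weights sum to 1, and a larger m flattens them toward equal membership across clusters.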


Example 19: close

import org.apache.mahout.clustering.Cluster; // import the required package/class
@Override
public void close(ClusterClassifier posterior) {
  boolean allConverged = true;
  for (Cluster cluster : posterior.getModels()) {
    org.apache.mahout.clustering.kmeans.Kluster kluster = (org.apache.mahout.clustering.kmeans.Kluster) cluster;
    boolean converged = kluster.calculateConvergence(convergenceDelta);
    allConverged = allConverged && converged;
    cluster.computeParameters();
  }
  
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 12, Source file: KMeansClusteringPolicy.java
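
The close method above accumulates a per-cluster convergence flag with a logical AND. A plain-Java sketch of that check follows, under the assumption that a cluster counts as converged when its center moves by no more than the convergence delta between iterations. ConvergenceSketch and its methods are hypothetical names, and Euclidean distance stands in for whatever DistanceMeasure the cluster carries.

```java
// Illustration of an "all clusters converged" check (not Mahout code).
final class ConvergenceSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // True only if every center moved by at most delta.
    static boolean allConverged(double[][] oldCenters, double[][] newCenters, double delta) {
        boolean allConverged = true;
        for (int k = 0; k < oldCenters.length; k++) {
            boolean converged = euclidean(oldCenters[k], newCenters[k]) <= delta;
            allConverged = allConverged && converged;
        }
        return allConverged;
    }

    public static void main(String[] args) {
        double[][] before = {{0.0, 0.0}, {5.0, 5.0}};
        double[][] after = {{0.0, 0.001}, {5.0, 5.0}};
        System.out.println(allConverged(before, after, 0.01));
    }
}
```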


Example 20: iterateSeq

import org.apache.mahout.clustering.Cluster; // import the required package/class
/**
 * Iterate over data using a prior-trained ClusterClassifier, for a number of iterations using a sequential
 * implementation
 * 
 * @param conf
 *          the Configuration
 * @param inPath
 *          a Path to input VectorWritables
 * @param priorPath
 *          a Path to the prior classifier
 * @param outPath
 *          a Path of output directory
 * @param numIterations
 *          the int number of iterations to perform
 */
public static void iterateSeq(Configuration conf, Path inPath, Path priorPath, Path outPath, int numIterations)
  throws IOException {
  ClusterClassifier classifier = new ClusterClassifier();
  classifier.readFromSeqFiles(conf, priorPath);
  Path clustersOut = null;
  int iteration = 1;
  while (iteration <= numIterations) {
    for (VectorWritable vw : new SequenceFileDirValueIterable<VectorWritable>(inPath, PathType.LIST,
        PathFilters.logsCRCFilter(), conf)) {
      Vector vector = vw.get();
      // classification yields probabilities
      Vector probabilities = classifier.classify(vector);
      // policy selects weights for models given those probabilities
      Vector weights = classifier.getPolicy().select(probabilities);
      // training causes all models to observe data
      for (Iterator<Vector.Element> it = weights.iterateNonZero(); it.hasNext();) {
        int index = it.next().index();
        classifier.train(index, vector, weights.get(index));
      }
    }
    // compute the posterior models
    classifier.close();
    // update the policy
    classifier.getPolicy().update(classifier);
    // output the classifier
    clustersOut = new Path(outPath, Cluster.CLUSTERS_DIR + iteration);
    classifier.writeToSeqFiles(clustersOut);
    FileSystem fs = FileSystem.get(outPath.toUri(), conf);
    iteration++;
    if (isConverged(clustersOut, conf, fs)) {
      break;
    }
  }
  Path finalClustersIn = new Path(outPath, Cluster.CLUSTERS_DIR + (iteration - 1) + Cluster.FINAL_ITERATION_SUFFIX);
  FileSystem.get(clustersOut.toUri(), conf).rename(clustersOut, finalClustersIn);
}
 
Developer: saradelrio, Project: Chi-FRBCS-BigDataCS, Lines of code: 52, Source file: ClusterIterator.java
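
The loop in iterateSeq writes one directory per iteration and then renames the last one with a final suffix. The sketch below reproduces only that counter and naming logic so the result is easy to predict. It assumes that Cluster.CLUSTERS_DIR is "clusters-" and Cluster.FINAL_ITERATION_SUFFIX is "-final", which matches the clusters-*-final pattern mentioned in Example 11; IterationDirs and finalDirName are hypothetical names.

```java
import java.util.function.IntPredicate;

// Mirrors the iteration counter and directory naming of the loop above (not Mahout code).
final class IterationDirs {

    static final String CLUSTERS_DIR = "clusters-";        // assumed value
    static final String FINAL_ITERATION_SUFFIX = "-final"; // assumed value

    // convergedAfter.test(i) says whether the clusters written by iteration i have converged.
    static String finalDirName(int numIterations, IntPredicate convergedAfter) {
        if (numIterations < 1) {
            throw new IllegalArgumentException("numIterations must be at least 1");
        }
        int iteration = 1;
        while (iteration <= numIterations) {
            int justWritten = iteration; // stands for writing clusters-<iteration>
            iteration++;
            if (convergedAfter.test(justWritten)) {
                break;
            }
        }
        // iteration was already incremented, so the last directory written is iteration - 1
        return CLUSTERS_DIR + (iteration - 1) + FINAL_ITERATION_SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(finalDirName(5, i -> i == 3));
    }
}
```

The guard on numIterations is an addition of this sketch: in the original loop, zero iterations would leave clustersOut as null when the rename runs.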



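Note: The org.apache.mahout.clustering.Cluster examples in this article were compiled from source-code and documentation hosting platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright remains with the original authors; refer to each project's License before distributing or using them. Do not reproduce without permission.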

