
Java MultiFileSplit Class Code Examples


This article collects typical usage examples of the Java class org.apache.hadoop.mapred.MultiFileSplit. If you have been wondering what MultiFileSplit is for, or how to use it in practice, the curated class examples below should help.



The MultiFileSplit class belongs to the org.apache.hadoop.mapred package. Six code examples of the class are shown below, sorted by popularity by default.
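Before the examples, it helps to recall what a MultiFileSplit carries: a list of file paths plus their combined byte length, so that one map task can process many small files at once. The data shape can be illustrated with a hypothetical plain-Java stand-in (this is not the Hadoop API, just a sketch of the same idea):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in (not the Hadoop API) illustrating the data a
// MultiFileSplit carries: several file paths plus their combined length.
class SimpleMultiFileSplit {
    private final List<String> paths;
    private final long[] lengths;

    SimpleMultiFileSplit(List<String> paths, long[] lengths) {
        this.paths = paths;
        this.lengths = lengths;
    }

    // Mirrors MultiFileSplit.getPaths()
    List<String> getPaths() { return paths; }

    // Mirrors MultiFileSplit.getLength(): total bytes across all files
    long getLength() { return Arrays.stream(lengths).sum(); }
}

public class Main {
    public static void main(String[] args) {
        SimpleMultiFileSplit split = new SimpleMultiFileSplit(
                List.of("part-00000", "part-00001"), new long[]{120L, 80L});
        System.out.println(split.getPaths().size()); // 2
        System.out.println(split.getLength());       // 200
    }
}
```

Every example below consumes exactly these two pieces of information: the path list and the total length.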

Example 1: MultiFileLineRecordReader

import org.apache.hadoop.mapred.MultiFileSplit; // import the required class
public MultiFileLineRecordReader(Configuration conf, MultiFileSplit split)
  throws IOException {
  
  this.split = split;
  fs = FileSystem.get(conf);
  this.paths = split.getPaths();
  this.totLength = split.getLength();
  this.offset = 0;
  
  //open the first file
  Path file = paths[count];
  currentStream = fs.open(file);
  currentReader = new BufferedReader(new InputStreamReader(currentStream));
}
 
Developer: rhli | Project: hadoop-EAR | Lines: 15 | Source: MultiFileWordCount.java
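The constructor above only opens the first file; the rest of the enclosing reader (not shown) advances through the remaining paths once the current file is exhausted. That read-across-files loop, sketched with plain java.io instead of Hadoop's FileSystem, looks roughly like:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustrative sketch: consume lines from several files as one logical
// stream, moving on to the next file whenever the current one runs out.
public class MultiFileLines {
    public static long countLines(List<Path> paths) throws IOException {
        long lines = 0;
        for (Path p : paths) {                       // same idea as advancing `count`
            try (BufferedReader r = Files.newBufferedReader(p)) {
                while (r.readLine() != null) {
                    lines++;                         // a real reader would emit the line here
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("a", ".txt");
        Path b = Files.createTempFile("b", ".txt");
        Files.writeString(a, "one\ntwo\n");
        Files.writeString(b, "three\n");
        System.out.println(countLines(List.of(a, b))); // 3
    }
}
```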


Example 2: WarcFileRecordReader

import org.apache.hadoop.mapred.MultiFileSplit; // import the required class
public WarcFileRecordReader(Configuration conf, InputSplit split) throws IOException {
  this.fs = FileSystem.get(conf);
  this.conf = conf;
  if (split instanceof FileSplit) {
    this.filePathList = new Path[1];
    this.filePathList[0] = ((FileSplit) split).getPath();
  } else if (split instanceof MultiFileSplit) {
    this.filePathList = ((MultiFileSplit) split).getPaths();
  } else {
    throw new IOException("InputSplit is not a file split or a multi-file split - aborting");
  }

  // get the total file sizes
  for (int i = 0; i < filePathList.length; i++) {
    totalFileSize += fs.getFileStatus(filePathList[i]).getLen();
  }

  Class<? extends CompressionCodec> codecClass = null;

  try {
    codecClass = conf.getClassByName("org.apache.hadoop.io.compress.GzipCodec")
        .asSubclass(CompressionCodec.class);
    compressionCodec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
  } catch (ClassNotFoundException cnfEx) {
    compressionCodec = null;
    LOG.info("!!! ClassNotFound Exception thrown setting Gzip codec");
  }

  openNextFile();
}
 
Developer: lucidworks | Project: solr-hadoop-common | Lines: 31 | Source: WarcFileRecordReader.java
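The codec lookup in this example resolves a class by name at runtime and falls back to null when it is absent, rather than failing the whole job. The same defensive pattern can be shown in plain Java (GZIPInputStream here stands in for the Hadoop GzipCodec):

```java
public class CodecLookup {
    // Resolve a class by name, returning null when it is not on the
    // classpath, mirroring how the reader falls back to no compression codec.
    static Class<?> resolveOrNull(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveOrNull("java.util.zip.GZIPInputStream") != null); // true
        System.out.println(resolveOrNull("com.example.NoSuchCodec") != null);       // false
    }
}
```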


Example 3: WarcFileRecordReader

import org.apache.hadoop.mapred.MultiFileSplit; // import the required class
public WarcFileRecordReader(Configuration conf, InputSplit split) throws IOException {
  if (split instanceof FileSplit) {
    this.filePathList = new Path[1];
    this.filePathList[0] = ((FileSplit) split).getPath();
  } else if (split instanceof MultiFileSplit) {
    this.filePathList = ((MultiFileSplit) split).getPaths();
  } else {
    throw new IOException("InputSplit is not a file split or a multi-file split - aborting");
  }

  // Use FileSystem.get to open Common Crawl URIs using the S3 protocol.
  URI uri = filePathList[0].toUri();
  this.fs = FileSystem.get(uri, conf);

  // get the total file sizes
  for (int i = 0; i < filePathList.length; i++) {
    totalFileSize += fs.getFileStatus(filePathList[i]).getLen();
  }

  Class<? extends CompressionCodec> codecClass = null;

  try {
    codecClass = conf.getClassByName("org.apache.hadoop.io.compress.GzipCodec")
        .asSubclass(CompressionCodec.class);
    compressionCodec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
  } catch (ClassNotFoundException cnfEx) {
    compressionCodec = null;
    LOG.info("!!! ClassNotFound Exception thrown setting Gzip codec");
  }

  openNextFile();
}
 
Developer: rossf7 | Project: wikireverse | Lines: 32 | Source: WarcFileRecordReader.java
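Example 3 differs from Example 2 in one detail: it derives the FileSystem from the first path's URI, so s3:// Common Crawl paths resolve to the S3 implementation instead of the cluster default. The scheme-driven dispatch can be illustrated with java.net.URI (the paths below are made-up examples):

```java
import java.net.URI;

public class SchemeDemo {
    // The filesystem implementation is chosen from the URI scheme; hdfs://
    // and s3:// paths would map to different FileSystem classes in Hadoop.
    static String schemeOf(String path) {
        String s = URI.create(path).getScheme();
        return s == null ? "default" : s;
    }

    public static void main(String[] args) {
        System.out.println(schemeOf("s3://commoncrawl/crawl-data/file.warc.gz")); // s3
        System.out.println(schemeOf("hdfs://namenode/data/file.warc.gz"));        // hdfs
        System.out.println(schemeOf("/local/path/file.warc.gz"));                 // default
    }
}
```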


Example 4: MultiFileLineRecordReader

import org.apache.hadoop.mapred.MultiFileSplit; // import the required class
public MultiFileLineRecordReader(Configuration conf, MultiFileSplit split)
        throws IOException {

    this.split = split;
    fs = FileSystem.get(conf);
    this.paths = split.getPaths();
    this.totLength = split.getLength();
    this.offset = 0;

    //open the first file
    Path file = paths[count];
    currentStream = fs.open(file);
    currentReader = new BufferedReader(new InputStreamReader(currentStream));
}
 
Developer: elephantscale | Project: hadoop-book | Lines: 15 | Source: MultiFileWordCount.java


Example 5: getRecordReader

import org.apache.hadoop.mapred.MultiFileSplit; // import the required class
@Override
public RecordReader<WordOffset, Text> getRecordReader(InputSplit split,
    JobConf job, Reporter reporter) throws IOException {
  return new MultiFileLineRecordReader(job, (MultiFileSplit) split);
}
 
Developer: rhli | Project: hadoop-EAR | Lines: 6 | Source: MultiFileWordCount.java


Example 6: getRecordReader

import org.apache.hadoop.mapred.MultiFileSplit; // import the required class
@Override
public RecordReader<WordOffset, Text> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new MultiFileLineRecordReader(job, (MultiFileSplit) split);
}
 
Developer: elephantscale | Project: hadoop-book | Lines: 5 | Source: MultiFileWordCount.java
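Both getRecordReader overrides cast the incoming InputSplit straight to MultiFileSplit, relying on the paired InputFormat to never hand them anything else; Examples 2 and 3 instead check with instanceof before casting. The defensive version of that dispatch, sketched with hypothetical stand-in types in plain Java:

```java
public class SplitDispatch {
    // Hypothetical stand-ins for Hadoop's split hierarchy.
    interface Split {}
    static class FileSplit implements Split {}
    static class MultiFileSplit implements Split {}

    // Mirrors the instanceof checks in the WarcFileRecordReader constructors:
    // accept the two known split types, reject anything else.
    static String describe(Split split) {
        if (split instanceof FileSplit) return "single file";
        if (split instanceof MultiFileSplit) return "multiple files";
        throw new IllegalArgumentException(
                "InputSplit is not a file split or a multi-file split");
    }

    public static void main(String[] args) {
        System.out.println(describe(new FileSplit()));      // single file
        System.out.println(describe(new MultiFileSplit())); // multiple files
    }
}
```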



Note: the org.apache.hadoop.mapred.MultiFileSplit examples in this article were collected from open-source projects hosted on GitHub, MSDocs, and similar platforms. Copyright in each snippet remains with its original authors; consult the corresponding project's license before redistributing or using the code. Do not reproduce without permission.

