Java ParquetMetadata Class Code Examples


This article collects typical usage examples of the Java class org.apache.parquet.hadoop.metadata.ParquetMetadata. If you are wondering what ParquetMetadata does, how to use it, or what working examples look like, the selected code samples below may help.



The ParquetMetadata class belongs to the org.apache.parquet.hadoop.metadata package. A total of 20 code examples of the ParquetMetadata class are shown below, sorted by popularity by default.
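
To set the stage, here is a minimal sketch of the pattern that nearly all of the examples below share: read a file's footer into a ParquetMetadata object with ParquetFileReader.readFooter (the older API used throughout these snippets), then inspect the schema and the per-row-group BlockMetaData. The FooterInspector class name, the file path handling, and the printed fields are illustrative only and are not taken from any of the projects cited below.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.format.converter.ParquetMetadataConverter;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.BlockMetaData;
import org.apache.parquet.hadoop.metadata.ParquetMetadata;
import org.apache.parquet.schema.MessageType;

public class FooterInspector {

  // Read the footer of a Parquet file and print its basic metadata.
  // Uses the same (now deprecated) readFooter API as the examples below.
  public static void inspect(String fileName) throws IOException {
    Configuration conf = new Configuration();
    Path path = new Path(fileName); // hypothetical input path

    ParquetMetadata footer =
        ParquetFileReader.readFooter(conf, path, ParquetMetadataConverter.NO_FILTER);

    // File-level metadata: schema and writer version string.
    MessageType schema = footer.getFileMetaData().getSchema();
    System.out.println("Schema: " + schema);
    System.out.println("Created by: " + footer.getFileMetaData().getCreatedBy());

    // One BlockMetaData entry per row group.
    for (BlockMetaData block : footer.getBlocks()) {
      System.out.println("Row group: " + block.getRowCount() + " rows, "
          + block.getColumns().size() + " column chunks, "
          + block.getTotalByteSize() + " total bytes");
    }
  }
}

Every example after this point is a variation on this footer-first workflow: the footer is either consumed directly (statistics validation, filter pushdown, metadata dumping) or handed to a reader that iterates the row groups.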

Example 1: test

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
@Override
public void test() throws IOException {
  Configuration configuration = new Configuration();
  ParquetMetadata metadata = ParquetFileReader.readFooter(configuration,
      super.fsPath, ParquetMetadataConverter.NO_FILTER);
  ParquetFileReader reader = new ParquetFileReader(configuration,
    metadata.getFileMetaData(),
    super.fsPath,
    metadata.getBlocks(),
    metadata.getFileMetaData().getSchema().getColumns());

  PageStatsValidator validator = new PageStatsValidator();

  PageReadStore pageReadStore;
  while ((pageReadStore = reader.readNextRowGroup()) != null) {
    validator.validate(metadata.getFileMetaData().getSchema(), pageReadStore);
  }
}
 
Developer: apache, Project: parquet-mr, Lines: 19, Source: TestStatistics.java


Example 2: readDictionaries

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
/**
 * Return dictionary per row group for all binary columns in given parquet file.
 * @param fs filesystem object.
 * @param filePath parquet file to scan
 * @return pair of dictionaries found for binary fields and list of binary fields which are not dictionary encoded.
 * @throws IOException
 */
public static Pair<Map<ColumnDescriptor, Dictionary>, Set<ColumnDescriptor>> readDictionaries(FileSystem fs, Path filePath, CodecFactory codecFactory) throws IOException {
  final ParquetMetadata parquetMetadata = ParquetFileReader.readFooter(fs.getConf(), filePath, ParquetMetadataConverter.NO_FILTER);
  if (parquetMetadata.getBlocks().size() > 1) {
    throw new IOException(
      format("Global dictionaries can only be built on a parquet file with a single row group, found %d row groups for file %s",
        parquetMetadata.getBlocks().size(), filePath));
  }
  final BlockMetaData rowGroupMetadata = parquetMetadata.getBlocks().get(0);
  final Map<ColumnPath, ColumnDescriptor> columnDescriptorMap = Maps.newHashMap();

  for (ColumnDescriptor columnDescriptor : parquetMetadata.getFileMetaData().getSchema().getColumns()) {
    columnDescriptorMap.put(ColumnPath.get(columnDescriptor.getPath()), columnDescriptor);
  }

  final Set<ColumnDescriptor> columnsToSkip = Sets.newHashSet(); // columns which are found in parquet file but are not dictionary encoded
  final Map<ColumnDescriptor, Dictionary> dictionaries = Maps.newHashMap();
  try(final FSDataInputStream in = fs.open(filePath)) {
    for (ColumnChunkMetaData columnChunkMetaData : rowGroupMetadata.getColumns()) {
      if (isBinaryType(columnChunkMetaData.getType())) {
        final ColumnDescriptor column = columnDescriptorMap.get(columnChunkMetaData.getPath());
        // if first page is dictionary encoded then load dictionary, otherwise skip this column.
        final PageHeaderWithOffset pageHeader = columnChunkMetaData.getPageHeaders().get(0);
        if (PageType.DICTIONARY_PAGE == pageHeader.getPageHeader().getType()) {
          dictionaries.put(column, readDictionary(in, column, pageHeader, codecFactory.getDecompressor(columnChunkMetaData.getCodec())));
        } else {
          columnsToSkip.add(column);
        }
      }
    }
  }
  return new ImmutablePair<>(dictionaries, columnsToSkip);
}
 
Developer: dremio, Project: dremio-oss, Lines: 40, Source: LocalDictionariesReader.java


Example 3: DeprecatedParquetVectorizedReader

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
public DeprecatedParquetVectorizedReader(
  OperatorContext operatorContext,
  String path,
  int rowGroupIndex,
  FileSystem fs,
  CodecFactory codecFactory,
  ParquetMetadata footer,
  List<SchemaPath> columns,
  ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus,
  boolean readInt96AsTimeStamp,
  Map<String, GlobalDictionaryFieldInfo> globalDictionaryColumns,
  GlobalDictionaries globalDictionaries) throws ExecutionSetupException {
  super(operatorContext, columns);
  this.hadoopPath = new Path(path);
  this.fileSystem = fs;
  this.codecFactory = codecFactory;
  this.rowGroupIndex = rowGroupIndex;
  this.footer = footer;
  this.dateCorruptionStatus = dateCorruptionStatus;
  this.readInt96AsTimeStamp = readInt96AsTimeStamp;
  this.globalDictionaryColumns = globalDictionaryColumns == null? Collections.<String, GlobalDictionaryFieldInfo>emptyMap() : globalDictionaryColumns;
  this.globalDictionaries = globalDictionaries;
  this.singleInputStream = null;
}
 
Developer: dremio, Project: dremio-oss, Lines: 25, Source: DeprecatedParquetVectorizedReader.java


Example 4: getReaders

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
@Override
public List<RecordReader> getReaders(UnifiedParquetReader unifiedReader) throws ExecutionSetupException {
  final ParquetMetadata footer = unifiedReader.getFooter();
  final DateCorruptionStatus containsCorruptDates = ParquetReaderUtility.detectCorruptDates(footer,
    unifiedReader.columnsInGroupScan, unifiedReader.autoCorrectCorruptDates);
  List<RecordReader> returnList = new ArrayList<>();
    returnList.add(unifiedReader.addFilterIfNecessary(
      new DeprecatedParquetVectorizedReader(
        unifiedReader.context,
        unifiedReader.readEntry.getPath(), unifiedReader.readEntry.getRowGroupIndex(), unifiedReader.fs,
        CodecFactory.createDirectCodecFactory(
          unifiedReader.fs.getConf(),
          new ParquetDirectByteBufferAllocator(unifiedReader.context.getAllocator()), 0),
        footer,
        unifiedReader.realFields,
        containsCorruptDates,
        unifiedReader.readInt96AsTimeStamp,
        unifiedReader.globalDictionaryFieldInfoMap,
        unifiedReader.dictionaries
      )
    ));
  return returnList;
}
 
Developer: dremio, Project: dremio-oss, Lines: 24, Source: UnifiedParquetReader.java


Example 5: FileSplitParquetRecordReader

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
public FileSplitParquetRecordReader(
    final OperatorContext oContext,
    final ParquetReaderFactory readerFactory,
    final List<SchemaPath> columnsToRead,
    final List<SchemaPath> groupScanColumns,
    final List<FilterCondition> conditions,
    final FileSplit fileSplit,
    final ParquetMetadata footer,
    final JobConf jobConf,
    final boolean vectorize,
    final boolean enableDetailedTracing
) {
  this.oContext = oContext;
  this.columnsToRead = columnsToRead;
  this.groupScanColumns = groupScanColumns;
  this.conditions = conditions;
  this.fileSplit = fileSplit;
  this.footer = footer;
  this.jobConf = jobConf;
  this.readerFactory = readerFactory;
  this.vectorize = vectorize;
  this.enableDetailedTracing = enableDetailedTracing;
}
 
Developer: dremio, Project: dremio-oss, Lines: 24, Source: FileSplitParquetRecordReader.java


Example 6: getRowGroupNumbersFromFileSplit

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
/**
 * Get the list of row group numbers for given file input split. Logic used here is same as how Hive's parquet input
 * format finds the row group numbers for input split.
 */
private static List<Integer> getRowGroupNumbersFromFileSplit(final FileSplit split,
    final ParquetMetadata footer) throws IOException {
  final List<BlockMetaData> blocks = footer.getBlocks();

  final long splitStart = split.getStart();
  final long splitLength = split.getLength();

  final List<Integer> rowGroupNums = Lists.newArrayList();

  int i = 0;
  for (final BlockMetaData block : blocks) {
    final long firstDataPage = block.getColumns().get(0).getFirstDataPageOffset();
    if (firstDataPage >= splitStart && firstDataPage < splitStart + splitLength) {
      rowGroupNums.add(i);
    }
    i++;
  }

  return rowGroupNums;
}
 
Developer: dremio, Project: dremio-oss, Lines: 25, Source: FileSplitParquetRecordReader.java


Example 7: read

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
@Test
public void read(String fileName) throws IOException
{
    Path path = new Path(fileName);
    Configuration conf = new Configuration();
    conf.set("fs.hdfs.impl", DistributedFileSystem.class.getName());

    ParquetMetadata metadata = ParquetFileReader.readFooter(conf, path, NO_FILTER);
    ParquetFileReader reader = new ParquetFileReader(conf, metadata.getFileMetaData(), path, metadata.getBlocks(), metadata.getFileMetaData().getSchema().getColumns());
    PageReadStore pageReadStore;
    PageReader pageReader;
    DataPage page;
    while ((pageReadStore = reader.readNextRowGroup()) != null) {
        for (ColumnDescriptor cd: metadata.getFileMetaData().getSchema().getColumns()) {
            pageReader = pageReadStore.getPageReader(cd);
            page = pageReader.readPage();
        }
    }
}
 
Developer: dbiir, Project: RealtimeAnalysis, Lines: 20, Source: ParquetFileReaderTest.java


Example 8: ParquetFooterStatCollector

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
public ParquetFooterStatCollector(ParquetMetadata footer, int rowGroupIndex, Map<String, String> implicitColValues,
    boolean autoCorrectCorruptDates, OptionManager options) {
  this.footer = footer;
  this.rowGroupIndex = rowGroupIndex;

  // Reasons to pass implicit columns and their values:
  // 1. Differentiate implicit columns from regular non-exist columns. Implicit columns do not
  //    exist in parquet metadata. Without such knowledge, implicit columns is treated as non-exist
  //    column.  A condition on non-exist column would lead to canDrop = true, which is not the
  //    right behavior for condition on implicit columns.

  // 2. Pass in the implicit column name with corresponding values, and wrap them in Statistics with
  //    min and max having same value. This expands the possibility of pruning.
  //    For example, regCol = 5 or dir0 = 1995. If regCol is not a partition column, we would not do
  //    any partition pruning in the current partition pruning logical. Pass the implicit column values
  //    may allow us to prune some row groups using condition regCol = 5 or dir0 = 1995.

  this.implicitColValues = implicitColValues;
  this.autoCorrectCorruptDates = autoCorrectCorruptDates;
  this.options = options;
}
 
Developer: axbaretto, Project: drill, Lines: 22, Source: ParquetFooterStatCollector.java


Example 9: getScanBatch

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
private RecordBatch getScanBatch() throws Exception {
  List<RecordReader> readers = Lists.newArrayList();

  for (String path : inputPaths) {
    ParquetMetadata footer = ParquetFileReader.readFooter(fs.getConf(), new Path(path));

    for (int i = 0; i < footer.getBlocks().size(); i++) {
      readers.add(new ParquetRecordReader(fragContext,
          path,
          i,
          fs,
          CodecFactory.createDirectCodecFactory(fs.getConf(),
              new ParquetDirectByteBufferAllocator(opContext.getAllocator()), 0),
          footer,
          columnsToRead,
          ParquetReaderUtility.DateCorruptionStatus.META_SHOWS_NO_CORRUPTION));
    }
  }

  RecordBatch scanBatch = new ScanBatch(null, fragContext, readers);
  return scanBatch;
}
 
Developer: axbaretto, Project: drill, Lines: 23, Source: MiniPlanUnitTestBase.java


Example 10: testIntPredicateAgainstAllNullColWithEval

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
@Test
public void testIntPredicateAgainstAllNullColWithEval() throws Exception {
  // intAllNull.parquet has only one int column with all values being NULL.
  // column values statistics: num_nulls: 25, min/max is not defined
  final File file = dirTestWatcher.getRootDir()
    .toPath()
    .resolve(Paths.get("parquetFilterPush", "intTbl", "intAllNull.parquet"))
    .toFile();
  ParquetMetadata footer = getParquetMetaData(file);

  testParquetRowGroupFilterEval(footer, "intCol = 100", true);
  testParquetRowGroupFilterEval(footer, "intCol = 0", true);
  testParquetRowGroupFilterEval(footer, "intCol = -100", true);

  testParquetRowGroupFilterEval(footer, "intCol > 10", true);
  testParquetRowGroupFilterEval(footer, "intCol >= 10", true);

  testParquetRowGroupFilterEval(footer, "intCol < 10", true);
  testParquetRowGroupFilterEval(footer, "intCol <= 10", true);
}
 
Developer: axbaretto, Project: drill, Lines: 21, Source: TestParquetFilterPushDown.java


Example 11: testDatePredicateAgainstDrillCTASHelper

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
private void testDatePredicateAgainstDrillCTASHelper(ParquetMetadata footer) throws Exception{
  testParquetRowGroupFilterEval(footer, "o_orderdate = cast('1992-01-01' as date)", false);
  testParquetRowGroupFilterEval(footer, "o_orderdate = cast('1991-12-31' as date)", true);

  testParquetRowGroupFilterEval(footer, "o_orderdate >= cast('1991-12-31' as date)", false);
  testParquetRowGroupFilterEval(footer, "o_orderdate >= cast('1992-01-03' as date)", false);
  testParquetRowGroupFilterEval(footer, "o_orderdate >= cast('1992-01-04' as date)", true);

  testParquetRowGroupFilterEval(footer, "o_orderdate > cast('1992-01-01' as date)", false);
  testParquetRowGroupFilterEval(footer, "o_orderdate > cast('1992-01-03' as date)", true);

  testParquetRowGroupFilterEval(footer, "o_orderdate <= cast('1992-01-01' as date)", false);
  testParquetRowGroupFilterEval(footer, "o_orderdate <= cast('1991-12-31' as date)", true);

  testParquetRowGroupFilterEval(footer, "o_orderdate < cast('1992-01-02' as date)", false);
  testParquetRowGroupFilterEval(footer, "o_orderdate < cast('1992-01-01' as date)", true);
}
 
Developer: axbaretto, Project: drill, Lines: 18, Source: TestParquetFilterPushDown.java


Example 12: getRowGroupNumbersFromFileSplit

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
/**
 * Get the list of row group numbers for given file input split. Logic used here is same as how Hive's parquet input
 * format finds the row group numbers for input split.
 */
private List<Integer> getRowGroupNumbersFromFileSplit(final FileSplit split,
    final ParquetMetadata footer) throws IOException {
  final List<BlockMetaData> blocks = footer.getBlocks();

  final long splitStart = split.getStart();
  final long splitLength = split.getLength();

  final List<Integer> rowGroupNums = Lists.newArrayList();

  int i = 0;
  for (final BlockMetaData block : blocks) {
    final long firstDataPage = block.getColumns().get(0).getFirstDataPageOffset();
    if (firstDataPage >= splitStart && firstDataPage < splitStart + splitLength) {
      rowGroupNums.add(i);
    }
    i++;
  }

  return rowGroupNums;
}
 
Developer: axbaretto, Project: drill, Lines: 25, Source: HiveDrillNativeScanBatchCreator.java


Example 13: checkCompatibility

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
private static void checkCompatibility(ParquetMetadata metadata) {
  // make sure we can map Parquet blocks to Chunks
  for (BlockMetaData block : metadata.getBlocks()) {
    if (block.getRowCount() > Integer.MAX_VALUE) {
      IcedHashMapGeneric.IcedHashMapStringObject dbg = new IcedHashMapGeneric.IcedHashMapStringObject();
      dbg.put("startingPos", block.getStartingPos());
      dbg.put("rowCount", block.getRowCount());
      throw new H2OUnsupportedDataFileException("Unsupported Parquet file (technical limitation).",
              "Current implementation doesn't support Parquet files with blocks larger than " +
              Integer.MAX_VALUE + " rows.", dbg); // because we map each block to a single H2O Chunk
    }
  }
  // check that file doesn't have nested structures
  MessageType schema = metadata.getFileMetaData().getSchema();
  for (String[] path : schema.getPaths())
    if (path.length != 1) {
      throw new H2OUnsupportedDataFileException("Parquet files with nested structures are not supported.",
              "Detected a column with a nested structure " + Arrays.asList(path));
    }
}
 
Developer: h2oai, Project: h2o-3, Lines: 21, Source: ParquetParser.java


Example 14: readFirstRecords

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
private static ParquetPreviewParseWriter readFirstRecords(ParquetParseSetup initSetup, ByteVec vec, int cnt) {
  ParquetMetadata metadata = VecParquetReader.readFooter(initSetup.parquetMetadata);
  List<BlockMetaData> blockMetaData;
  if (metadata.getBlocks().isEmpty()) {
    blockMetaData = Collections.<BlockMetaData>emptyList();
  } else {
    final BlockMetaData firstBlock = findFirstBlock(metadata);
    blockMetaData = Collections.singletonList(firstBlock);
  }
  ParquetMetadata startMetadata = new ParquetMetadata(metadata.getFileMetaData(), blockMetaData);
  ParquetPreviewParseWriter ppWriter = new ParquetPreviewParseWriter(initSetup);
  VecParquetReader reader = new VecParquetReader(vec, startMetadata, ppWriter, ppWriter._roughTypes);
  try {
    int recordCnt = 0;
    Integer recordNum;
    do {
      recordNum = reader.read();
    } while ((recordNum != null) && (++recordCnt < cnt));
    return ppWriter;
  } catch (IOException e) {
    throw new RuntimeException("Failed to read the first few records", e);
  }
}
 
Developer: h2oai, Project: h2o-3, Lines: 24, Source: ParquetParser.java


Example 15: execute

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
@Override
public void execute(CommandLine options) throws Exception {
    super.execute(options);

    String[] args = options.getArgs();
    String input = args[0];

    Configuration conf = new Configuration();
    Path inpath = new Path(input);

    ParquetMetadata metaData = ParquetFileReader.readFooter(conf, inpath, NO_FILTER);
    MessageType schema = metaData.getFileMetaData().getSchema();

    boolean showmd = !options.hasOption('m');
    boolean showdt = !options.hasOption('d');
    boolean cropoutput = !options.hasOption('n');

    Set<String> showColumns = null;
    if (options.hasOption('c')) {
        String[] cols = options.getOptionValues('c');
        showColumns = new HashSet<String>(Arrays.asList(cols));
    }

    PrettyPrintWriter out = prettyPrintWriter(cropoutput);
    dump(out, metaData, schema, inpath, showmd, showdt, showColumns);
}
 
Developer: apache, Project: parquet-mr, Lines: 27, Source: DumpCommand.java


Example 16: add

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
private static void add(ParquetMetadata footer) {
  for (BlockMetaData blockMetaData : footer.getBlocks()) {
    ++ blockCount;
    MessageType schema = footer.getFileMetaData().getSchema();
    recordCount += blockMetaData.getRowCount();
    List<ColumnChunkMetaData> columns = blockMetaData.getColumns();
    for (ColumnChunkMetaData columnMetaData : columns) {
      ColumnDescriptor desc = schema.getColumnDescription(columnMetaData.getPath().toArray());
      add(
          desc,
          columnMetaData.getValueCount(),
          columnMetaData.getTotalSize(),
          columnMetaData.getTotalUncompressedSize(),
          columnMetaData.getEncodings(),
          columnMetaData.getStatistics());
    }
  }
}
 
Developer: apache, Project: parquet-mr, Lines: 19, Source: PrintFooter.java


Example 17: writeMetadataFile

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
/**
 * writes _common_metadata file, and optionally a _metadata file depending on the {@link JobSummaryLevel} provided
 * @deprecated metadata files are not recommended and will be removed in 2.0.0
 */
@Deprecated
public static void writeMetadataFile(Configuration configuration, Path outputPath, List<Footer> footers, JobSummaryLevel level) throws IOException {
  Preconditions.checkArgument(level == JobSummaryLevel.ALL || level == JobSummaryLevel.COMMON_ONLY,
      "Unsupported level: " + level);

  FileSystem fs = outputPath.getFileSystem(configuration);
  outputPath = outputPath.makeQualified(fs);
  ParquetMetadata metadataFooter = mergeFooters(outputPath, footers);

  if (level == JobSummaryLevel.ALL) {
    writeMetadataFile(outputPath, metadataFooter, fs, PARQUET_METADATA_FILE);
  }

  metadataFooter.getBlocks().clear();
  writeMetadataFile(outputPath, metadataFooter, fs, PARQUET_COMMON_METADATA_FILE);
}
 
Developer: apache, Project: parquet-mr, Lines: 21, Source: ParquetFileWriter.java


Example 18: mergeFooters

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
static ParquetMetadata mergeFooters(Path root, List<Footer> footers) {
  String rootPath = root.toUri().getPath();
  GlobalMetaData fileMetaData = null;
  List<BlockMetaData> blocks = new ArrayList<BlockMetaData>();
  for (Footer footer : footers) {
      String footerPath = footer.getFile().toUri().getPath();
    if (!footerPath.startsWith(rootPath)) {
      throw new ParquetEncodingException(footerPath + " invalid: all the files must be contained in the root " + root);
    }
    footerPath = footerPath.substring(rootPath.length());
    while (footerPath.startsWith("/")) {
      footerPath = footerPath.substring(1);
    }
    fileMetaData = mergeInto(footer.getParquetMetadata().getFileMetaData(), fileMetaData);
    for (BlockMetaData block : footer.getParquetMetadata().getBlocks()) {
      block.setPath(footerPath);
      blocks.add(block);
    }
  }
  return new ParquetMetadata(fileMetaData.merge(), blocks);
}
 
Developer: apache, Project: parquet-mr, Lines: 22, Source: ParquetFileWriter.java


Example 19: toParquetMetadata

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
public FileMetaData toParquetMetadata(int currentVersion, ParquetMetadata parquetMetadata) {
  List<BlockMetaData> blocks = parquetMetadata.getBlocks();
  List<RowGroup> rowGroups = new ArrayList<RowGroup>();
  long numRows = 0;
  for (BlockMetaData block : blocks) {
    numRows += block.getRowCount();
    addRowGroup(parquetMetadata, rowGroups, block);
  }
  FileMetaData fileMetaData = new FileMetaData(
      currentVersion,
      toParquetSchema(parquetMetadata.getFileMetaData().getSchema()),
      numRows,
      rowGroups);

  Set<Entry<String, String>> keyValues = parquetMetadata.getFileMetaData().getKeyValueMetaData().entrySet();
  for (Entry<String, String> keyValue : keyValues) {
    addKeyValue(fileMetaData, keyValue.getKey(), keyValue.getValue());
  }

  fileMetaData.setCreated_by(parquetMetadata.getFileMetaData().getCreatedBy());

  fileMetaData.setColumn_orders(getColumnOrders(parquetMetadata.getFileMetaData().getSchema()));

  return fileMetaData;
}
 
Developer: apache, Project: parquet-mr, Lines: 26, Source: ParquetMetadataConverter.java


Example 20: testFailDroppingColumns

import org.apache.parquet.hadoop.metadata.ParquetMetadata; // import the required package/class
@Test
public void testFailDroppingColumns() throws IOException {
  MessageType droppedColumnSchema = Types.buildMessage()
      .required(BINARY).as(UTF8).named("string")
      .named("AppendTest");

  final ParquetMetadata footer = ParquetFileReader.readFooter(
      CONF, file1, NO_FILTER);
  final FSDataInputStream incoming = file1.getFileSystem(CONF).open(file1);

  Path droppedColumnFile = newTemp();
  final ParquetFileWriter writer = new ParquetFileWriter(
      CONF, droppedColumnSchema, droppedColumnFile);
  writer.start();

  TestUtils.assertThrows("Should complain that id column is dropped",
      IllegalArgumentException.class,
      new Callable<Void>() {
        @Override
        public Void call() throws Exception {
          writer.appendRowGroups(incoming, footer.getBlocks(), false);
          return null;
        }
      });
}
 
Developer: apache, Project: parquet-mr, Lines: 26, Source: TestParquetWriterAppendBlocks.java



Note: The org.apache.parquet.hadoop.metadata.ParquetMetadata examples in this article were collected from source-code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by various developers; copyright remains with the original authors. Refer to each project's license before distributing or using the code, and do not reproduce this material without permission.

