Implement ORC chunked reader (#15094)

This implements the ORC chunked reader, supporting reading ORC files such that (a usage sketch follows this list):
 * The output is multiple tables instead of a single one; each table is produced by a call to `read_chunk()` and its size stays within a given `output_limit` parameter.
 * Temporary device memory usage can be capped with a soft `data_read_limit` parameter, allowing very large ORC files to be read without OOM.
 * ORC files containing many billions of rows can be read chunk-by-chunk without hitting size overflow when the number of rows exceeds cudf's size limit (`2^31` rows).
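
A minimal usage sketch of the new public API (the input path and driver function are hypothetical, and the 64 MB/640 MB limits simply mirror a configuration benchmarked below; in the API, `output_limit` and `data_read_limit` correspond to the `chunk_read_limit` and `pass_read_limit` constructor parameters):
```
#include <cudf/io/orc.hpp>

// Hypothetical driver: reads an ORC file chunk-by-chunk with bounded memory.
void read_in_chunks()
{
  auto const opts =
    cudf::io::orc_reader_options::builder(cudf::io::source_info{"input.orc"}).build();

  // 64 MB cap on each output table; 640 MB soft cap on temporary device memory.
  auto reader = cudf::io::chunked_orc_reader(64 << 20, 640 << 20, opts);

  cudf::size_type num_rows = 0;
  do {
    auto const chunk = reader.read_chunk();  // one bounded-size table per call
    num_rows += chunk.tbl->num_rows();       // process/consume the chunk here
  } while (reader.has_next());
}
```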

Depends on:
 * #14911
 * #15008
 * #15169
 * #15252

Partially contributes to #12228.

---

## Benchmarks

Due to some small optimizations in the ORC reader, reading ORC files all at once (the entire file into a single output table) can be slightly faster. For example, with the benchmark `orc_read_io_compression` (nvbench flags a comparison as `FAIL` when the difference exceeds the noise threshold, so here `FAIL` indicates a statistically significant speedup):
```
## [0] Quadro RTX 6000

|      io       |  compression  |  cardinality  |  run_length  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |          Diff |   %Diff |  Status  |
|---------------|---------------|---------------|--------------|------------|-------------|------------|-------------|---------------|---------|----------|
|   FILEPATH    |    SNAPPY     |       0       |      1       | 183.027 ms |       7.45% | 157.293 ms |       4.72% | -25733.837 us | -14.06% |   FAIL   |
|   FILEPATH    |    SNAPPY     |     1000      |      1       | 198.228 ms |       6.43% | 164.395 ms |       4.14% | -33833.020 us | -17.07% |   FAIL   |
|   FILEPATH    |    SNAPPY     |       0       |      32      |  96.676 ms |       6.19% |  82.522 ms |       1.36% | -14153.945 us | -14.64% |   FAIL   |
|   FILEPATH    |    SNAPPY     |     1000      |      32      |  94.508 ms |       4.80% |  81.078 ms |       0.48% | -13429.672 us | -14.21% |   FAIL   |
|   FILEPATH    |     NONE      |       0       |      1       | 161.868 ms |       5.40% | 139.849 ms |       2.44% | -22018.910 us | -13.60% |   FAIL   |
|   FILEPATH    |     NONE      |     1000      |      1       | 164.902 ms |       5.80% | 142.041 ms |       3.43% | -22861.258 us | -13.86% |   FAIL   |
|   FILEPATH    |     NONE      |       0       |      32      |  88.298 ms |       5.15% |  74.924 ms |       1.97% | -13374.607 us | -15.15% |   FAIL   |
|   FILEPATH    |     NONE      |     1000      |      32      |  87.147 ms |       5.61% |  72.502 ms |       0.50% | -14645.122 us | -16.81% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |       0       |      1       | 124.990 ms |       0.39% | 111.670 ms |       2.13% | -13320.483 us | -10.66% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |     1000      |      1       | 149.858 ms |       4.10% | 126.266 ms |       0.48% | -23591.543 us | -15.74% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |       0       |      32      |  92.499 ms |       4.46% |  77.653 ms |       1.58% | -14846.471 us | -16.05% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |     1000      |      32      |  93.373 ms |       4.14% |  80.033 ms |       3.19% | -13340.002 us | -14.29% |   FAIL   |
|  HOST_BUFFER  |     NONE      |       0       |      1       | 111.792 ms |       0.50% |  97.083 ms |       0.50% | -14709.530 us | -13.16% |   FAIL   |
|  HOST_BUFFER  |     NONE      |     1000      |      1       | 117.646 ms |       5.60% |  97.634 ms |       0.44% | -20012.301 us | -17.01% |   FAIL   |
|  HOST_BUFFER  |     NONE      |       0       |      32      |  84.983 ms |       4.96% |  66.975 ms |       0.50% | -18007.403 us | -21.19% |   FAIL   |
|  HOST_BUFFER  |     NONE      |     1000      |      32      |  82.648 ms |       4.42% |  65.510 ms |       0.91% | -17137.910 us | -20.74% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |       0       |      1       |  65.538 ms |       4.02% |  59.399 ms |       2.54% |  -6138.560 us |  -9.37% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |     1000      |      1       | 101.427 ms |       4.10% |  92.276 ms |       3.30% |  -9150.278 us |  -9.02% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |       0       |      32      |  80.133 ms |       4.64% |  73.959 ms |       3.50% |  -6173.818 us |  -7.70% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |     1000      |      32      |  86.232 ms |       4.71% |  77.446 ms |       3.32% |  -8786.606 us | -10.19% |   FAIL   |
| DEVICE_BUFFER |     NONE      |       0       |      1       |  52.189 ms |       6.62% |  45.018 ms |       4.11% |  -7171.043 us | -13.74% |   FAIL   |
| DEVICE_BUFFER |     NONE      |     1000      |      1       |  54.664 ms |       6.76% |  46.855 ms |       3.35% |  -7809.803 us | -14.29% |   FAIL   |
| DEVICE_BUFFER |     NONE      |       0       |      32      |  67.975 ms |       5.12% |  60.553 ms |       4.22% |  -7422.279 us | -10.92% |   FAIL   |
| DEVICE_BUFFER |     NONE      |     1000      |      32      |  68.485 ms |       4.86% |  62.253 ms |       6.23% |  -6232.340 us |  -9.10% |   FAIL   |

```


When memory is limited, chunked reading can help avoid OOM, at some cost in performance. For example, reading a 500 MB table using a 64 MB output limit and a 640 MB data read limit:
```
|      io       |  compression  |  cardinality  |  run_length  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |       Diff |   %Diff |  Status  |
|---------------|---------------|---------------|--------------|------------|-------------|------------|-------------|------------|---------|----------|
|   FILEPATH    |    SNAPPY     |       0       |      1       | 183.027 ms |       7.45% | 350.824 ms |       2.74% | 167.796 ms |  91.68% |   FAIL   |
|   FILEPATH    |    SNAPPY     |     1000      |      1       | 198.228 ms |       6.43% | 322.414 ms |       3.46% | 124.186 ms |  62.65% |   FAIL   |
|   FILEPATH    |    SNAPPY     |       0       |      32      |  96.676 ms |       6.19% | 133.363 ms |       4.78% |  36.686 ms |  37.95% |   FAIL   |
|   FILEPATH    |    SNAPPY     |     1000      |      32      |  94.508 ms |       4.80% | 128.897 ms |       0.37% |  34.389 ms |  36.39% |   FAIL   |
|   FILEPATH    |     NONE      |       0       |      1       | 161.868 ms |       5.40% | 316.637 ms |       4.21% | 154.769 ms |  95.61% |   FAIL   |
|   FILEPATH    |     NONE      |     1000      |      1       | 164.902 ms |       5.80% | 326.043 ms |       3.06% | 161.141 ms |  97.72% |   FAIL   |
|   FILEPATH    |     NONE      |       0       |      32      |  88.298 ms |       5.15% | 124.819 ms |       5.17% |  36.520 ms |  41.36% |   FAIL   |
|   FILEPATH    |     NONE      |     1000      |      32      |  87.147 ms |       5.61% | 123.047 ms |       5.82% |  35.900 ms |  41.19% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |       0       |      1       | 124.990 ms |       0.39% | 285.718 ms |       0.78% | 160.728 ms | 128.59% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |     1000      |      1       | 149.858 ms |       4.10% | 263.491 ms |       2.89% | 113.633 ms |  75.83% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |       0       |      32      |  92.499 ms |       4.46% | 127.881 ms |       0.86% |  35.382 ms |  38.25% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |     1000      |      32      |  93.373 ms |       4.14% | 128.022 ms |       0.98% |  34.650 ms |  37.11% |   FAIL   |
|  HOST_BUFFER  |     NONE      |       0       |      1       | 111.792 ms |       0.50% | 241.064 ms |       1.89% | 129.271 ms | 115.64% |   FAIL   |
|  HOST_BUFFER  |     NONE      |     1000      |      1       | 117.646 ms |       5.60% | 248.134 ms |       3.08% | 130.488 ms | 110.92% |   FAIL   |
|  HOST_BUFFER  |     NONE      |       0       |      32      |  84.983 ms |       4.96% | 118.049 ms |       5.99% |  33.066 ms |  38.91% |   FAIL   |
|  HOST_BUFFER  |     NONE      |     1000      |      32      |  82.648 ms |       4.42% | 114.577 ms |       2.34% |  31.929 ms |  38.63% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |       0       |      1       |  65.538 ms |       4.02% | 232.466 ms |       3.28% | 166.928 ms | 254.71% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |     1000      |      1       | 101.427 ms |       4.10% | 221.578 ms |       1.43% | 120.152 ms | 118.46% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |       0       |      32      |  80.133 ms |       4.64% | 120.604 ms |       0.35% |  40.471 ms |  50.50% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |     1000      |      32      |  86.232 ms |       4.71% | 125.521 ms |       3.93% |  39.289 ms |  45.56% |   FAIL   |
| DEVICE_BUFFER |     NONE      |       0       |      1       |  52.189 ms |       6.62% | 182.943 ms |       0.29% | 130.754 ms | 250.54% |   FAIL   |
| DEVICE_BUFFER |     NONE      |     1000      |      1       |  54.664 ms |       6.76% | 190.501 ms |       0.49% | 135.836 ms | 248.49% |   FAIL   |
| DEVICE_BUFFER |     NONE      |       0       |      32      |  67.975 ms |       5.12% | 107.172 ms |       3.56% |  39.197 ms |  57.66% |   FAIL   |
| DEVICE_BUFFER |     NONE      |     1000      |      32      |  68.485 ms |       4.86% | 108.097 ms |       2.92% |  39.611 ms |  57.84% |   FAIL   |

```
And if memory is very limited, chunked reading with an 8 MB output limit and an 80 MB data read limit:
```
|      io       |  compression  |  cardinality  |  run_length  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |       Diff |   %Diff |  Status  |
|---------------|---------------|---------------|--------------|------------|-------------|------------|-------------|------------|---------|----------|
|   FILEPATH    |    SNAPPY     |       0       |      1       | 183.027 ms |       7.45% | 732.926 ms |       1.98% | 549.899 ms | 300.45% |   FAIL   |
|   FILEPATH    |    SNAPPY     |     1000      |      1       | 198.228 ms |       6.43% | 834.309 ms |       4.21% | 636.081 ms | 320.88% |   FAIL   |
|   FILEPATH    |    SNAPPY     |       0       |      32      |  96.676 ms |       6.19% | 363.033 ms |       1.66% | 266.356 ms | 275.51% |   FAIL   |
|   FILEPATH    |    SNAPPY     |     1000      |      32      |  94.508 ms |       4.80% | 313.813 ms |       1.28% | 219.305 ms | 232.05% |   FAIL   |
|   FILEPATH    |     NONE      |       0       |      1       | 161.868 ms |       5.40% | 607.700 ms |       2.90% | 445.832 ms | 275.43% |   FAIL   |
|   FILEPATH    |     NONE      |     1000      |      1       | 164.902 ms |       5.80% | 616.101 ms |       3.46% | 451.199 ms | 273.62% |   FAIL   |
|   FILEPATH    |     NONE      |       0       |      32      |  88.298 ms |       5.15% | 267.703 ms |       0.46% | 179.405 ms | 203.18% |   FAIL   |
|   FILEPATH    |     NONE      |     1000      |      32      |  87.147 ms |       5.61% | 250.528 ms |       0.43% | 163.381 ms | 187.48% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |       0       |      1       | 124.990 ms |       0.39% | 636.270 ms |       0.44% | 511.280 ms | 409.06% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |     1000      |      1       | 149.858 ms |       4.10% | 747.264 ms |       0.50% | 597.406 ms | 398.65% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |       0       |      32      |  92.499 ms |       4.46% | 359.660 ms |       0.19% | 267.161 ms | 288.82% |   FAIL   |
|  HOST_BUFFER  |    SNAPPY     |     1000      |      32      |  93.373 ms |       4.14% | 311.608 ms |       0.43% | 218.235 ms | 233.73% |   FAIL   |
|  HOST_BUFFER  |     NONE      |       0       |      1       | 111.792 ms |       0.50% | 493.797 ms |       0.13% | 382.005 ms | 341.71% |   FAIL   |
|  HOST_BUFFER  |     NONE      |     1000      |      1       | 117.646 ms |       5.60% | 516.706 ms |       0.12% | 399.060 ms | 339.20% |   FAIL   |
|  HOST_BUFFER  |     NONE      |       0       |      32      |  84.983 ms |       4.96% | 258.477 ms |       0.46% | 173.495 ms | 204.15% |   FAIL   |
|  HOST_BUFFER  |     NONE      |     1000      |      32      |  82.648 ms |       4.42% | 248.028 ms |       5.30% | 165.380 ms | 200.10% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |       0       |      1       |  65.538 ms |       4.02% | 606.010 ms |       3.76% | 540.472 ms | 824.68% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |     1000      |      1       | 101.427 ms |       4.10% | 742.774 ms |       4.64% | 641.347 ms | 632.33% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |       0       |      32      |  80.133 ms |       4.64% | 364.701 ms |       2.70% | 284.568 ms | 355.12% |   FAIL   |
| DEVICE_BUFFER |    SNAPPY     |     1000      |      32      |  86.232 ms |       4.71% | 320.387 ms |       2.80% | 234.155 ms | 271.54% |   FAIL   |
| DEVICE_BUFFER |     NONE      |       0       |      1       |  52.189 ms |       6.62% | 458.100 ms |       2.15% | 405.912 ms | 777.78% |   FAIL   |
| DEVICE_BUFFER |     NONE      |     1000      |      1       |  54.664 ms |       6.76% | 478.527 ms |       1.41% | 423.862 ms | 775.39% |   FAIL   |
| DEVICE_BUFFER |     NONE      |       0       |      32      |  67.975 ms |       5.12% | 260.009 ms |       3.71% | 192.034 ms | 282.51% |   FAIL   |
| DEVICE_BUFFER |     NONE      |     1000      |      32      |  68.485 ms |       4.86% | 243.705 ms |       2.09% | 175.220 ms | 255.85% |   FAIL   |

```

Authors:
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/nvdbaranec
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #15094
ttnghia committed May 2, 2024
1 parent e3ea523 commit 81f8cdf
Showing 22 changed files with 3,685 additions and 641 deletions.
3 changes: 2 additions & 1 deletion cpp/CMakeLists.txt
@@ -395,8 +395,9 @@ add_library(
src/io/orc/dict_enc.cu
src/io/orc/orc.cpp
src/io/orc/reader_impl.cu
src/io/orc/reader_impl_chunking.cu
src/io/orc/reader_impl_decode.cu
src/io/orc/reader_impl_helpers.cpp
src/io/orc/reader_impl_preprocess.cu
src/io/orc/stats_enc.cu
src/io/orc/stripe_data.cu
src/io/orc/stripe_enc.cu
106 changes: 83 additions & 23 deletions cpp/benchmarks/io/orc/orc_reader_input.cpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
* Copyright (c) 2022-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -24,31 +24,59 @@

#include <nvbench/nvbench.cuh>

namespace {

// Size of the data in the benchmark dataframe; chosen to be low enough to allow benchmarks to
// run on most GPUs, but large enough to allow highest throughput
constexpr int64_t data_size = 512 << 20;
constexpr cudf::size_type num_cols = 64;
constexpr std::size_t data_size = 512 << 20;
constexpr std::size_t Mbytes = 1024 * 1024;

template <bool is_chunked_read>
void orc_read_common(cudf::size_type num_rows_to_read,
cuio_source_sink_pair& source_sink,
nvbench::state& state)
{
cudf::io::orc_reader_options read_opts =
cudf::io::orc_reader_options::builder(source_sink.make_source_info());
auto const read_opts =
cudf::io::orc_reader_options::builder(source_sink.make_source_info()).build();

auto mem_stats_logger = cudf::memory_stats_logger(); // init stats logger
state.set_cuda_stream(nvbench::make_cuda_stream_view(cudf::get_default_stream().value()));
state.exec(
nvbench::exec_tag::sync | nvbench::exec_tag::timer, [&](nvbench::launch& launch, auto& timer) {
try_drop_l3_cache();

timer.start();
auto const result = cudf::io::read_orc(read_opts);
timer.stop();

CUDF_EXPECTS(result.tbl->num_columns() == num_cols, "Unexpected number of columns");
CUDF_EXPECTS(result.tbl->num_rows() == num_rows_to_read, "Unexpected number of rows");
});
if constexpr (is_chunked_read) {
state.exec(
nvbench::exec_tag::sync | nvbench::exec_tag::timer, [&](nvbench::launch&, auto& timer) {
try_drop_l3_cache();
auto const output_limit_MB =
static_cast<std::size_t>(state.get_int64("chunk_read_limit_MB"));
auto const read_limit_MB = static_cast<std::size_t>(state.get_int64("pass_read_limit_MB"));

auto reader =
cudf::io::chunked_orc_reader(output_limit_MB * Mbytes, read_limit_MB * Mbytes, read_opts);
cudf::size_type num_rows{0};

timer.start();
do {
auto chunk = reader.read_chunk();
num_rows += chunk.tbl->num_rows();
} while (reader.has_next());
timer.stop();

CUDF_EXPECTS(num_rows == num_rows_to_read, "Unexpected number of rows");
});
} else { // not is_chunked_read
state.exec(
nvbench::exec_tag::sync | nvbench::exec_tag::timer, [&](nvbench::launch&, auto& timer) {
try_drop_l3_cache();

timer.start();
auto const result = cudf::io::read_orc(read_opts);
timer.stop();

CUDF_EXPECTS(result.tbl->num_columns() == num_cols, "Unexpected number of columns");
CUDF_EXPECTS(result.tbl->num_rows() == num_rows_to_read, "Unexpected number of rows");
});
}

auto const time = state.get_summary("nv/cold/time/gpu/mean").get_float64("value");
state.add_element_count(static_cast<double>(data_size) / time, "bytes_per_second");
@@ -57,6 +85,8 @@ void orc_read_common(cudf::size_type num_rows_to_read,
state.add_buffer_size(source_sink.size(), "encoded_file_size", "encoded_file_size");
}

} // namespace

template <data_type DataType, cudf::io::io_type IOType>
void BM_orc_read_data(nvbench::state& state,
nvbench::type_list<nvbench::enum_type<DataType>, nvbench::enum_type<IOType>>)
@@ -79,13 +109,11 @@ void BM_orc_read_data(nvbench::state& state,
return view.num_rows();
}();

orc_read_common(num_rows_written, source_sink, state);
orc_read_common<false>(num_rows_written, source_sink, state);
}

template <cudf::io::io_type IOType, cudf::io::compression_type Compression>
void BM_orc_read_io_compression(
nvbench::state& state,
nvbench::type_list<nvbench::enum_type<IOType>, nvbench::enum_type<Compression>>)
template <cudf::io::io_type IOType, cudf::io::compression_type Compression, bool chunked_read>
void orc_read_io_compression(nvbench::state& state)
{
auto const d_type = get_type_or_group({static_cast<int32_t>(data_type::INTEGRAL_SIGNED),
static_cast<int32_t>(data_type::FLOAT),
@@ -95,15 +123,21 @@ void BM_orc_read_io_compression(
static_cast<int32_t>(data_type::LIST),
static_cast<int32_t>(data_type::STRUCT)});

cudf::size_type const cardinality = state.get_int64("cardinality");
cudf::size_type const run_length = state.get_int64("run_length");
auto const [cardinality, run_length] = [&]() -> std::pair<cudf::size_type, cudf::size_type> {
if constexpr (chunked_read) {
return {0, 4};
} else {
return {static_cast<cudf::size_type>(state.get_int64("cardinality")),
static_cast<cudf::size_type>(state.get_int64("run_length"))};
}
}();
cuio_source_sink_pair source_sink(IOType);

auto const num_rows_written = [&]() {
auto const tbl = create_random_table(
cycle_dtypes(d_type, num_cols),
table_size_bytes{data_size},
data_profile_builder().cardinality(cardinality).avg_run_length(run_length));
data_profile_builder{}.cardinality(cardinality).avg_run_length(run_length));
auto const view = tbl->view();

cudf::io::orc_writer_options opts =
@@ -113,7 +147,23 @@ void BM_orc_read_io_compression(
return view.num_rows();
}();

orc_read_common(num_rows_written, source_sink, state);
orc_read_common<chunked_read>(num_rows_written, source_sink, state);
}

template <cudf::io::io_type IOType, cudf::io::compression_type Compression>
void BM_orc_read_io_compression(
nvbench::state& state,
nvbench::type_list<nvbench::enum_type<IOType>, nvbench::enum_type<Compression>>)
{
return orc_read_io_compression<IOType, Compression, false>(state);
}

template <cudf::io::compression_type Compression>
void BM_orc_chunked_read_io_compression(nvbench::state& state,
nvbench::type_list<nvbench::enum_type<Compression>>)
{
// Only run benchmark using HOST_BUFFER IO.
return orc_read_io_compression<cudf::io::io_type::HOST_BUFFER, Compression, true>(state);
}

using d_type_list = nvbench::enum_type_list<data_type::INTEGRAL_SIGNED,
@@ -146,3 +196,13 @@ NVBENCH_BENCH_TYPES(BM_orc_read_io_compression, NVBENCH_TYPE_AXES(io_list, compr
.set_min_samples(4)
.add_int64_axis("cardinality", {0, 1000})
.add_int64_axis("run_length", {1, 32});

// Should have the same parameters as `BM_orc_read_io_compression` for comparison.
NVBENCH_BENCH_TYPES(BM_orc_chunked_read_io_compression, NVBENCH_TYPE_AXES(compression_list))
.set_name("orc_chunked_read_io_compression")
.set_type_axes_names({"compression"})
.set_min_samples(4)
// The input has approximately 520MB and 127K rows.
// The limits below are given in MBs.
.add_int64_axis("chunk_read_limit_MB", {50, 250, 700})
.add_int64_axis("pass_read_limit_MB", {50, 250, 700});
64 changes: 60 additions & 4 deletions cpp/include/cudf/io/detail/orc.hpp
@@ -38,13 +38,15 @@ class chunked_orc_writer_options;

namespace orc::detail {

// Forward declaration of the internal reader class
class reader_impl;

/**
* @brief Class to read ORC dataset data into columns.
*/
class reader {
private:
class impl;
std::unique_ptr<impl> _impl;
std::unique_ptr<reader_impl> _impl;

public:
/**
@@ -68,10 +70,63 @@ class reader {
/**
* @brief Reads the entire dataset.
*
* @param options Settings for controlling reading behavior
* @return The set of columns along with table metadata
*/
table_with_metadata read(orc_reader_options const& options);
table_with_metadata read();
};

/**
* @brief The reader class that supports iterative reading from an array of data sources.
*/
class chunked_reader {
private:
std::unique_ptr<reader_impl> _impl;

public:
/**
* @copydoc cudf::io::chunked_orc_reader::chunked_orc_reader(std::size_t, std::size_t, size_type,
* orc_reader_options const&, rmm::cuda_stream_view, rmm::device_async_resource_ref)
*
* @param sources Input `datasource` objects to read the dataset from
*/
explicit chunked_reader(std::size_t chunk_read_limit,
std::size_t pass_read_limit,
size_type output_row_granularity,
std::vector<std::unique_ptr<cudf::io::datasource>>&& sources,
orc_reader_options const& options,
rmm::cuda_stream_view stream,
rmm::device_async_resource_ref mr);
/**
* @copydoc cudf::io::chunked_orc_reader::chunked_orc_reader(std::size_t, std::size_t,
* orc_reader_options const&, rmm::cuda_stream_view, rmm::device_async_resource_ref)
*
* @param sources Input `datasource` objects to read the dataset from
*/
explicit chunked_reader(std::size_t chunk_read_limit,
std::size_t pass_read_limit,
std::vector<std::unique_ptr<cudf::io::datasource>>&& sources,
orc_reader_options const& options,
rmm::cuda_stream_view stream,
rmm::device_async_resource_ref mr);

/**
* @brief Destructor explicitly declared to avoid being inlined in the header.
*
* Since the declaration of the internal `_impl` object does not exist in this header, this
* destructor needs to be defined in a separate source file which can access that object's
* declaration.
*/
~chunked_reader();

/**
* @copydoc cudf::io::chunked_orc_reader::has_next
*/
[[nodiscard]] bool has_next() const;

/**
* @copydoc cudf::io::chunked_orc_reader::read_chunk
*/
[[nodiscard]] table_with_metadata read_chunk() const;
};

/**
@@ -126,5 +181,6 @@ class writer {
*/
void close();
};

} // namespace orc::detail
} // namespace cudf::io
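
For reference, a minimal sketch of constructing the public `cudf::io::chunked_orc_reader` through the overload that additionally takes `output_row_granularity` (mirroring the detail-layer constructor above); the input path and the 10,000-row granularity are illustrative assumptions:
```
#include <cudf/io/orc.hpp>

// Hypothetical example of the three-limit overload.
void read_with_granularity()
{
  auto const opts =
    cudf::io::orc_reader_options::builder(cudf::io::source_info{"input.orc"}).build();

  auto reader = cudf::io::chunked_orc_reader(
    64 << 20,   // chunk_read_limit: byte cap on each returned table
    640 << 20,  // pass_read_limit: soft byte cap on temporary device memory
    10'000,     // output_row_granularity: row-count granularity for splitting output
    opts);
  // Then drive reader.read_chunk() / reader.has_next() as usual.
}
```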
