Row-oriented reader needs to be able to skip fields #190

Open
GKrivosheev-rms opened this issue Apr 29, 2021 · 0 comments
Labels
enhancement New feature or request

GKrivosheev-rms commented Apr 29, 2021

Currently, every field and property of TRow must be present in the file for the row-oriented reader to work.
Row properties are often computed, left unfilled, or otherwise do not need to be read from the file. We need a way to mark the type so that those members are not deserialized from the file.

Proposal:
Add an IgnoreColumn attribute to mark members that must be skipped, for example:

struct MyRow
{
    [IgnoreColumn]
    public DateTime CurrentDate => DateTime.Now;

    [MapToColumn("ColumnB")]
    public string MyValue;
}

using var reader = ParquetFile.CreateRowReader<MyRow>("example.parquet");
...
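
To make the attribute proposal concrete, here is a minimal sketch of how a reader could decide which members to populate. It assumes the reader discovers members through reflection, which the issue does not state. `IgnoreColumnAttribute` is the attribute proposed above and does not exist yet, and `RowMembers.GetReadableMembers` is a hypothetical helper name used only for illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute mirroring the proposal; not an existing library type.
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public sealed class IgnoreColumnAttribute : Attribute { }

public struct MyRow
{
    [IgnoreColumn]
    public DateTime CurrentDate => DateTime.Now;

    public string MyValue;
}

public static class RowMembers
{
    // Returns the public instance fields and properties of rowType
    // that are not marked with [IgnoreColumn].
    public static MemberInfo[] GetReadableMembers(Type rowType)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;

        return rowType.GetFields(flags).Cast<MemberInfo>()
            .Concat(rowType.GetProperties(flags))
            .Where(m => !m.IsDefined(typeof(IgnoreColumnAttribute), inherit: false))
            .ToArray();
    }
}
```

With this sketch, `RowMembers.GetReadableMembers(typeof(MyRow))` yields only `MyValue`, so the computed `CurrentDate` property would never be looked up in the file.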

Alternatively, make the reader and writer symmetrical by allowing the reader to be customized with a list of columns, as below. Note that the column names are the names of members in the class, not of columns in the file. This would allow only a subset of the type's members to be set.

public static ParquetRowReader<TTuple> CreateRowReader<TTuple>(string path, string[] columnNames = null);
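
As a rough illustration of the second option, the sketch below resolves a `columnNames` array against the members of `TTuple`. It assumes, as the signature above suggests, that `null` means "read every member". `MemberSelection.Select` is a hypothetical helper and not part of any existing API.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class MemberSelection
{
    // Hypothetical helper: maps columnNames (member names, not file column names)
    // to members of TTuple. A null array selects every public field and property.
    public static MemberInfo[] Select<TTuple>(string[] columnNames = null)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;

        var all = typeof(TTuple).GetFields(flags).Cast<MemberInfo>()
            .Concat(typeof(TTuple).GetProperties(flags))
            .ToArray();

        if (columnNames == null) return all;

        // Fail early on a name that does not match any public member.
        return columnNames
            .Select(name => all.FirstOrDefault(m => m.Name == name)
                ?? throw new ArgumentException(
                    $"'{name}' is not a public member of {typeof(TTuple).Name}."))
            .ToArray();
    }
}
```

Members left out of `columnNames` would simply keep their default values, which covers the computed and unfilled cases without requiring any change to the row type itself.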

GPSnoopy added the enhancement label on May 23, 2021