[Python] 'pyarrow._parquet.SortingColumn' object has no attribute 'to_dict' #41699

Closed
djkapner opened this issue May 17, 2024 · 3 comments · Fixed by #41704

@djkapner

Describe the bug, including details regarding any error messages, version, and platform.

When a row group has a SortingColumn, the metadata of a ParquetFile cannot be serialized with to_dict(), because SortingColumn does not implement that method.

import polars as pl
import pyarrow.parquet as pq

df = pl.DataFrame({"a": [1, 2], "b": [10, 11]})
fname = "tmp.parquet"
pq.write_table(
    df.to_arrow(),
    fname,
    sorting_columns=[pq.SortingColumn(0),],
)

pqf = pq.ParquetFile(fname)
print(pqf.metadata.row_group(0).sorting_columns[0])
print(pqf.metadata.to_dict())

This results in:

SortingColumn(column_index=0, descending=False, nulls_first=False)
...

  File "pyarrow/_parquet.pyx", line 892, in pyarrow._parquet.FileMetaData.to_dict
  File "pyarrow/_parquet.pyx", line 790, in pyarrow._parquet.RowGroupMetaData.to_dict
AttributeError: 'pyarrow._parquet.SortingColumn' object has no attribute 'to_dict'
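
On a pyarrow version without the fix, one possible workaround is to build the dictionary by hand from the three fields shown in the repr above (`column_index`, `descending`, `nulls_first`). The sketch below is only an illustration: `sorting_column_to_dict` is a hypothetical helper, not part of pyarrow, and `FakeSortingColumn` is a stand-in so the snippet runs without a Parquet file. In real use you would pass it the objects from `pqf.metadata.row_group(0).sorting_columns`.

```python
from collections import namedtuple


def sorting_column_to_dict(sc):
    """Hypothetical helper: convert a SortingColumn-like object to a plain dict.

    Assumes the object exposes column_index, descending and nulls_first,
    the three fields printed in the SortingColumn repr.
    """
    return {
        "column_index": sc.column_index,
        "descending": sc.descending,
        "nulls_first": sc.nulls_first,
    }


# Stand-in for pq.SortingColumn(0), used only so the example is self-contained.
FakeSortingColumn = namedtuple(
    "FakeSortingColumn", ["column_index", "descending", "nulls_first"]
)

example = sorting_column_to_dict(FakeSortingColumn(0, False, False))
print(example)
```

This works around only the per-column conversion. `FileMetaData.to_dict()` itself still raises until pyarrow includes the fix.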

Component(s)

Parquet, Python

@mapleFU
Member

mapleFU commented May 17, 2024

@tlm365 Can you reply here? I don't know why this issue hasn't been assigned to you :-( Maybe you can "take" it or reply here first, and then I'll assign it to you.

@tlm365
Contributor

tlm365 commented May 17, 2024

take

@AlenkaF AlenkaF changed the title from 'pyarrow._parquet.SortingColumn' object has no attribute 'to_dict' to [Python] 'pyarrow._parquet.SortingColumn' object has no attribute 'to_dict' on May 20, 2024
AlenkaF pushed a commit that referenced this issue May 21, 2024
…#41704)

### Rationale for this change
Resolves #41699 .

### What changes are included in this PR?
Add `to_dict` method and test case

### Are these changes tested?
Yes

### Are there any user-facing changes?
No

* GitHub Issue: #41699

Authored-by: Tai Le Manh <manhtai.lmt@gmail.com>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
@AlenkaF AlenkaF added this to the 17.0.0 milestone May 21, 2024
@AlenkaF
Member

AlenkaF commented May 21, 2024

Issue resolved by pull request 41704
#41704

vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…Column (apache#41704)
