feat: named axis for ak.Array
#3238
Draft
pfackeldey wants to merge 24 commits into scikit-hep:main from pfackeldey:feat/named_axes
pfackeldey changed the title from "Feat: named axis for ak.Array" to "feat: named axis for ak.Array" on Sep 12, 2024
Progress
general
slicing
Unary and binary operations
high-level functions
New:
Can be used with named axis:
Independent of named axis: improvements / bugs found that are fixed by this PR as well:
And all the data types that can be passed into square brackets with
…xing named axis propagation
…x named axis propagation in indexing for type tracers
…ak.mean; remove inplace addition of arrays from test
…tible with branched structures;fix regularize_axis in all highlevel ops
…ranch_depth using inner_shape property
Proposal for named axis
This PR addresses #2596.
References for other named axis implementations:
Motivation
As argued at PyHEP.dev 2023 and by the Harvard NLP group in their "Tensor Considered Harmful" write-up, named axes can be a powerful tool to make code more readable and less error-prone.
Design
ak.Array with named axis
Named axes are implemented through a mapping from named axis to positional axis.
Named axes are hashables (mostly strings), except for integers, which are reserved for positional axes.
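To make the mapping idea concrete, here is a minimal sketch in plain Python. It is not the code of this PR: the helper names validate_axis_name and to_positional, and the example axis name "events", are hypothetical and chosen only for illustration. It assumes that an integer axis is passed through unchanged and that any other hashable is looked up in the mapping.

```python
from collections.abc import Hashable


def validate_axis_name(name):
    """Check that `name` can serve as a named axis (hypothetical helper)."""
    # Integers (including bools) are reserved for positional axes.
    if isinstance(name, int):
        raise TypeError(f"integers are reserved for positional axes: {name!r}")
    if not isinstance(name, Hashable):
        raise TypeError(f"a named axis must be hashable: {name!r}")
    return name


def to_positional(axis, named_axis):
    """Resolve `axis` (an int or a name) to a positional axis (hypothetical helper)."""
    if isinstance(axis, int):
        return axis  # already positional
    validate_axis_name(axis)
    try:
        return named_axis[axis]
    except KeyError:
        raise ValueError(f"unknown named axis: {axis!r}") from None


# Assumed example mapping from named axis to positional axis.
named_axis = {"events": 0, "jets": 1}
print(to_positional("jets", named_axis))  # 1
print(to_positional(-1, named_axis))      # -1
```

Keeping integers out of the set of valid names is what lets a single axis argument accept both forms without ambiguity.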
By default an ak.Array uses positional axes, but named axes can be added to the array in the following ways:
The named_axis argument of the constructor of an ak.Array is a tuple of AxisName, or a dict of AxisName to integers.
It is stored in the .attrs attribute of the array under the reserved key "__named_axis__", with type dict[AxisName, int].
The two types of axis can be accessed through the named_axis and positional_axis properties (always represented as a tuple):
Named axis in high-level functions
Named axes can be used by all high-level functions, e.g. ak.sum, ak.max, etc.:
There are different scenarios for how named axes are propagated to the resulting array:
ak.sum(array, axis="jets", keepdims=True) or array ** 2.
ak.sum(array, axis="jets").
ak.Array: Here, checks for matching named axes are possible; the rules are:
Named axis in indexing
In addition, named axis can be used to select data:
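One way to picture named-axis selection is as a translation step from names to an ordinary positional index tuple. The sketch below is a hypothetical illustration in plain Python, not the indexing code of this PR: the dict-based index form, the function name to_positional_index, and the axis name "events" are assumptions made for the example.

```python
def to_positional_index(index, named_axis, ndim):
    """Translate {axis name: indexer} into a positional index tuple.

    Axes that are not mentioned are selected in full (hypothetical helper).
    """
    where = [slice(None)] * ndim
    for name, item in index.items():
        where[named_axis[name]] = item
    return tuple(where)


# Assumed example mapping for a two-dimensional array.
named_axis = {"events": 0, "jets": 1}

# "First jet of every event", expressed by name instead of by position.
print(to_positional_index({"jets": 0}, named_axis, ndim=2))
# (slice(None, None, None), 0)

# "First two events", leaving the jets axis untouched.
print(to_positional_index({"events": slice(0, 2)}, named_axis, ndim=2))
# (slice(0, 2, None), slice(None, None, None))
```

The resulting tuple is what would be passed into square brackets in the positional case, which is why a selection written by name stays valid even if the order of the axes changes.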
For syntactic sugar, ak.slice is added:
This PR has to touch a lot of code and needs to add custom named axis propagation to each high-level operation. Thus, this PR is currently in draft mode.
Looking forward to ideas, thoughts, feedback on this effort!