Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolations for data operations #62

Draft
wants to merge 13 commits into
base: release_0.9.0
Choose a base branch
from

Conversation

caitwolf
Copy link
Contributor

Description

This PR adds interpolation to 1D data operations . Previously the q-values for each dataset had to match perfectly. Now the overlap range of the two datasets in q will be determined and the second dataset will be interpolated if points don't match within the specified tolerance. There is a paired PR in sasview (SasView/sasview#2782).

Fixes # (issue/issues) - I could not find the relevant issue; please comment below when found.

@caitwolf caitwolf marked this pull request as draft January 22, 2024 05:33
@caitwolf
Copy link
Contributor Author

There is a paired pull request in SasView SasView/sasview#2782

@caitwolf caitwolf marked this pull request as ready for review January 22, 2024 23:49
@butlerpd
Copy link
Member

Is the plan to also push this against 0.9 which gets bundled with SasView 6.0?

@caitwolf
Copy link
Contributor Author

Is the plan to also push this against 0.9 which gets bundled with SasView 6.0?

I think if the paired SasView pull request gets pushed into 6.0 then this would need to get pushed against 0.9.

@butlerpd
Copy link
Member

I will review as part of the review of reviewing SasView/sasview#2782.

@butlerpd
Copy link
Member

@butlerpd will do a functionality review on windows and check the code but this probably could use a better code review than that?

@krzywon krzywon changed the base branch from master to release_0.9.0 February 16, 2024 18:10
@butlerpd
Copy link
Member

I am also looking at the code as it turns out and it is unclear to me why data operations are being carried out in the dataloader\data_info module?! I realize that is where they were being done and moving everything is outside the scope of this PR. In fact the new interpolation module seems to be located in the place where all the operation classes should be IMO: in the data_util package?

Question (probably more for @krzywon and/or @lucas-wilkins or @rozyczko) : Changing that structure sounds like it would definitely be a breaking change? If so should we attempt to do it for 6.0? or do we wait for 7.0? Or am I completely misunderstanding the code structure here? If a change should eventually be made, a separate issue should be created.

Copy link
Member

@butlerpd butlerpd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new interpolation code looks good to me. The math seems right and the code seems clean to the level that I can judge such things. On the other hand the changes to data-info.py I found much harder to understand as noted in the code comments.

Also I have only skimmed the new unit tests though they seem reasonable. Somebody who understands that better than I might want to check

In any event, this PR should not be merged until we understand the issues reported in the paired sasview PR which are possibly caused by this code?

Comment on lines +921 to +922
if self.isSesans:
self._interpolation_operation(other, scale='linear')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be a second if here: if other.isSesans then the linear interpolation otherwise raise error with message: the two data types must be the same .. or some such. Maybe more specific about not mixing sas and sesans data?

Comment on lines +857 to +860
self_overlap_bool = np.abs((self.x[:, None] - other.x[None, :]) / self.x[:, None]).min(axis=1) <= tolerance
self_overlap_index = np.flatnonzero(self_overlap_bool)
if len(self_overlap_index) == self_overlap_index.max() - self_overlap_index.min() + 1:
# all the points in overlap region of self.x found close matches in overlap region of other.x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what exactly this is doing (I'm just a poor white belt) but it does seem that the goal here was to either do no interpolation if the two data sets have exactly the same number of points AND all the points match up in x within tolerance, otherwise ALL the data will be interpolated even if the x points match up? Given the funtionality review of the paired sasview PR is this really what is happening?

Comment on lines +861 to +863
x_op = self.x[self_overlap_bool]
other._operation = other.clone_without_data(x_op.size)
other._operation.copy_from_datainfo(other)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, if my understanding from the previous comment is correct, why do we need more than x_op = self.x? i.e. why is it instead x_op = self.x[self_overlap_bool]?

If it is because my understanding is incorrect, and two data sets with different sizes and q ranges can still make it through to this code if there are matching x (within tolerance), what happens when setting other._operation to be that length before copying the data from other and the size of the other data is different?

Sorry for my naive questions but I got a bit lost here.

@caitwolf caitwolf marked this pull request as draft April 9, 2024 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants