Interpolations for data operations #62

caitwolf · 2024-01-22T05:33:19Z

Description

This PR adds interpolation to 1D data operations. Previously, the q-values for each dataset had to match exactly. Now the overlap range of the two datasets in q is determined, and the second dataset is interpolated if its points don't match those of the first within the specified tolerance. There is a paired PR in sasview (SasView/sasview#2782).
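As a rough illustration of the idea described above (not the code in this PR), the overlap range and interpolation step might look like the sketch below. The function name `interpolate_on_overlap` is hypothetical, and the sketch assumes linear interpolation via `np.interp` with `x2` sorted in increasing order; the PR itself may use a different interpolation scale.

```python
import numpy as np


def interpolate_on_overlap(x1, x2, y2):
    """Hypothetical helper, not part of sasdata: evaluate dataset 2 on the
    points of dataset 1 that lie inside the shared q-range."""
    x1, x2, y2 = (np.asarray(a, dtype=float) for a in (x1, x2, y2))
    # Overlap range in q: from the larger minimum to the smaller maximum.
    q_lo = max(x1.min(), x2.min())
    q_hi = min(x1.max(), x2.max())
    if q_lo > q_hi:
        raise ValueError("The two datasets do not overlap in q.")
    # Keep only the points of the first dataset inside the overlap range.
    x_op = x1[(x1 >= q_lo) & (x1 <= q_hi)]
    # Linear interpolation of the second dataset (assumes x2 is increasing).
    return x_op, np.interp(x_op, x2, y2)
```

For example, with `x1 = [0.01, 0.02, 0.03, 0.04]` and `x2 = [0.015, 0.025, 0.035]`, only the points 0.02 and 0.03 of the first dataset fall inside the overlap, and the second dataset is evaluated there.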

Fixes # (issue/issues) - I could not find the relevant issue; please comment below when found.

…wo datasets do not match

caitwolf · 2024-01-22T23:49:13Z

There is a paired pull request in SasView: SasView/sasview#2782.

butlerpd · 2024-02-11T19:46:40Z

Is the plan to also push this against 0.9 which gets bundled with SasView 6.0?

caitwolf · 2024-02-12T20:38:18Z

> Is the plan to also push this against 0.9 which gets bundled with SasView 6.0?

I think if the paired SasView pull request gets pushed into 6.0 then this would need to get pushed against 0.9.

butlerpd · 2024-02-12T21:57:01Z

I will review this as part of reviewing SasView/sasview#2782.

butlerpd · 2024-02-13T13:47:59Z

@butlerpd will do a functionality review on Windows and check the code, but this probably could use a better code review than that?

butlerpd · 2024-02-18T23:47:34Z

I am also looking at the code, as it turns out, and it is unclear to me why data operations are being carried out in the dataloader/data_info module. I realize that is where they were already being done and that moving everything is outside the scope of this PR. In fact, the new interpolation module seems to be located where all the operation classes should be, IMO: in the data_util package?

Question (probably more for @krzywon and/or @lucas-wilkins or @rozyczko): changing that structure sounds like it would definitely be a breaking change? If so, should we attempt to do it for 6.0, or do we wait for 7.0? Or am I completely misunderstanding the code structure here? If a change should eventually be made, a separate issue should be created.

butlerpd

The new interpolation code looks good to me. The math seems right and the code seems clean, to the level that I can judge such things. On the other hand, I found the changes to data_info.py much harder to understand, as noted in the code comments.

Also, I have only skimmed the new unit tests, though they seem reasonable. Somebody who understands them better than I do might want to check.

In any event, this PR should not be merged until we understand the issues reported in the paired sasview PR which are possibly caused by this code?

butlerpd · 2024-02-21T00:54:04Z

sasdata/dataloader/data_info.py

+        if self.isSesans:
+            self._interpolation_operation(other, scale='linear')


I think there should be a second if here: if other.isSesans, then do the linear interpolation; otherwise raise an error with a message such as "the two data types must be the same". Maybe be more specific about not mixing SAS and SESANS data?
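A minimal sketch of the kind of guard suggested here, written as a standalone function so it can be run on its own. The name `check_same_data_type` and the wording of the message are hypothetical; in the PR the check would presumably use `self.isSesans` and `other.isSesans` directly.

```python
def check_same_data_type(self_is_sesans, other_is_sesans):
    """Hypothetical guard: refuse to combine SAS and SESANS datasets."""
    if bool(self_is_sesans) != bool(other_is_sesans):
        raise ValueError(
            "Cannot combine SAS and SESANS data: "
            "both datasets must be of the same type."
        )


# Same type on both sides: no error is raised.
check_same_data_type(True, True)
check_same_data_type(False, False)
```

Raising before any interpolation is attempted keeps a mixed SAS/SESANS operation from silently producing a result.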

butlerpd · 2024-02-21T01:01:20Z

sasdata/dataloader/data_info.py

+            self_overlap_bool = np.abs((self.x[:, None] - other.x[None, :]) / self.x[:, None]).min(axis=1) <= tolerance
+            self_overlap_index = np.flatnonzero(self_overlap_bool)
+            if len(self_overlap_index) == self_overlap_index.max() - self_overlap_index.min() + 1:
+                # all the points in overlap region of self.x found close matches in overlap region of other.x


I don't really understand exactly what this is doing (I'm just a poor white belt), but it does seem that the goal here was to do no interpolation if the two data sets have exactly the same number of points AND all the points match up in x within tolerance; otherwise ALL the data will be interpolated, even if the x points match up? Given the functionality review of the paired sasview PR, is this really what is happening?
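To make the quoted lines easier to reason about, here is a standalone sketch that reproduces only those four lines on plain arrays; it says nothing about the rest of the method. The function name `matched_run` is hypothetical, and the guard against an empty match is an addition, since calling `.max()` on an empty array raises an error in NumPy.

```python
import numpy as np


def matched_run(x_self, x_other, tolerance):
    """Hypothetical helper mirroring the quoted lines on plain arrays."""
    x_self = np.asarray(x_self, dtype=float)
    x_other = np.asarray(x_other, dtype=float)
    # Broadcasting builds an (n_self, n_other) matrix of relative differences.
    rel_diff = np.abs((x_self[:, None] - x_other[None, :]) / x_self[:, None])
    # For each point of x_self: is its nearest point in x_other within tolerance?
    overlap_bool = rel_diff.min(axis=1) <= tolerance
    overlap_index = np.flatnonzero(overlap_bool)
    # True when the matched indices form a single unbroken run.
    # The size check is an added guard, not part of the quoted code.
    contiguous = overlap_index.size > 0 and (
        overlap_index.size == overlap_index.max() - overlap_index.min() + 1
    )
    return overlap_bool, bool(contiguous)


x_self = [0.01, 0.02, 0.03, 0.04, 0.05]
# Fewer points than x_self, but every matched point is adjacent to the next.
mask_a, run_a = matched_run(x_self, [0.02, 0.03, 0.04], tolerance=0.01)
# Matches at 0.02 and 0.04 only, leaving a gap at 0.03.
mask_b, run_b = matched_run(x_self, [0.02, 0.04], tolerance=0.01)
```

Read in isolation, the condition tests whether the matched points of self.x form one contiguous block, not whether the two data sets have the same number of points: in the first case above the check passes even though the arrays differ in length, and in the second it fails because of the gap.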

butlerpd · 2024-02-21T01:11:50Z

sasdata/dataloader/data_info.py

+                x_op = self.x[self_overlap_bool]
+                other._operation = other.clone_without_data(x_op.size)
+                other._operation.copy_from_datainfo(other)


So, if my understanding from the previous comment is correct, why do we need more than x_op = self.x? I.e., why is it instead x_op = self.x[self_overlap_bool]?

If it is because my understanding is incorrect, and two data sets with different sizes and q ranges can still make it through to this code when there are matching x values (within tolerance), what happens when other._operation is set to that length before the data is copied from other, and the size of the other data is different?

Sorry for my naive questions but I got a bit lost here.
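One way to picture the size question is the sketch below, which works on plain arrays and is not the PR's implementation. It assumes that both operands are cut down to the matched points before the operation, so that they end up with the same length `x_op.size`; the function `align_on_matches` and its arguments are hypothetical.

```python
import numpy as np


def align_on_matches(x_self, y_self, x_other, y_other, tolerance):
    """Hypothetical helper: keep only the points of self that have a close
    partner in other, and pick that partner's y value."""
    x_self, y_self, x_other, y_other = (
        np.asarray(a, dtype=float) for a in (x_self, y_self, x_other, y_other)
    )
    rel_diff = np.abs((x_self[:, None] - x_other[None, :]) / x_self[:, None])
    mask = rel_diff.min(axis=1) <= tolerance  # plays the role of self_overlap_bool
    nearest = rel_diff.argmin(axis=1)         # index of the closest point in other
    x_op = x_self[mask]                       # boolean mask keeps matched points only
    return x_op, y_self[mask], y_other[nearest[mask]]


x_op, y_self_op, y_other_op = align_on_matches(
    [0.01, 0.02, 0.03, 0.04, 0.05], [1, 2, 3, 4, 5],
    [0.02, 0.03, 0.04], [20, 30, 40],
    tolerance=0.01,
)
```

Here the first dataset has five points and the second three, yet all three returned arrays have length three, which is why a boolean mask on self.x can give a different size from self.x itself.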

caitwolf added 3 commits January 21, 2024 21:45

enable interpolation during data operations if the q values between t…

b0626c2

…wo datasets do not match

adding test functions for interpolation of operations

0d6d8e6

fixed numpy version issue with dtype argument for zeros

5c090d5

caitwolf marked this pull request as draft January 22, 2024 05:33

caitwolf added 3 commits January 22, 2024 00:56

fixed failing unit test for the interpolation of data operations

b635bfd

cleaning up interpolation code and creating missing unit tests

f39d0ba

updating plot

142414f

caitwolf marked this pull request as ready for review January 22, 2024 23:49

caitwolf added 4 commits January 31, 2024 12:06

added functionality for sesans operations

5db8896

added logging message that indicates the operation was completed

2ec4bdb

updated handling of uncertainties and resolution during operations

70ca1a8

updating loggin messages

650f975

caitwolf mentioned this pull request Feb 12, 2024

Interpolation for operations SasView/sasview#2782

Draft

7 tasks

butlerpd self-requested a review February 12, 2024 21:52

butlerpd added the DiscussAtTheCall label Feb 13, 2024

Merge branch 'master' into interpolations_for_data_operations

a25e654

butlerpd removed the DiscussAtTheCall label Feb 13, 2024

krzywon changed the base branch from master to release_0.9.0 February 16, 2024 18:10

butlerpd reviewed Feb 21, 2024

butlerpd and others added 2 commits March 6, 2024 02:03

Merge branch 'release_0.9.0' into interpolations_for_data_operations

3c3999b

Merge branch 'release_0.9.0' into interpolations_for_data_operations

339a80f

caitwolf marked this pull request as draft April 9, 2024 13:11
