Hamming distance #3185
Comments
Likely also useful for text.
Good morning, I'm a Master's student and our professor taught us how to use Orange. I find it a very useful tool and I would like to help out with small things. If I understand correctly, distance([1,3,6,4], [2,3,6,9]) should, for example, return 2?
Yes. Similarly to the simple matching coefficient, increase the distance by 1 for each index at which the two vectors differ.
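For reference, the counting rule described here can be sketched in a few lines of plain Python. This is only an illustration of the definition, not Orange code: the function name `hamming_distance` is hypothetical, and how Orange should treat missing values or normalize the result is not covered.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ.

    Hypothetical helper for illustration; not part of Orange.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # Each mismatching index adds 1 to the distance.
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance([1, 3, 6, 4], [2, 3, 6, 9]))  # 2 (indices 0 and 3 differ)
print(hamming_distance("karolin", "kathrin"))        # 3
```

The same function works for strings, which is why the measure is likely also useful for text.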
OK, thanks. I'm going to explore the project a bit before doing this. Thanks a lot.
Hey @thocevar, I can see that no development has started on this issue, and I would like to contribute to it. I was looking through the code and found that the Euclidean distance normalizes with the mean as the offset and two standard deviations as the scale, whereas the Manhattan distance uses the median as the offset and two MADs as the scale. Can you please provide some insight into how these parameters are selected, and which ones we should use when implementing the Hamming distance? Thanks :)
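To make the two normalization schemes mentioned above concrete, here is a rough sketch using only the standard library. It is an assumption-laden illustration, not Orange's implementation: the function names are made up, the population standard deviation is an arbitrary choice, MAD is taken as the plain median absolute deviation with no consistency constant, and columns with zero spread are not handled.

```python
import statistics


def scale_mean_std(values):
    """Offset by the mean, scale by two standard deviations (hypothetical helper)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return [(v - mean) / (2 * std) for v in values]


def scale_median_mad(values):
    """Offset by the median, scale by two MADs (hypothetical helper)."""
    median = statistics.median(values)
    # Assumption: MAD = median of absolute deviations from the median.
    mad = statistics.median(abs(v - median) for v in values)
    return [(v - median) / (2 * mad) for v in values]


column = [1, 2, 3, 4, 5]
print(scale_mean_std(column))
print(scale_median_mad(column))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Whether either kind of scaling makes sense for discrete attributes, where Hamming distance only checks equality, is exactly the open question here.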
See https://github.com/biolab/orange3/files/1128190/distances.pdf, referenced in #2454. As I recall, some parts were changed later, but the reasoning for using two standard deviations is still as explained there.
There is no distance in Orange that can be used for distance calculations between columns in data with discrete attributes.
Implement Hamming distance in distance.py and add it to the OWDistances widget.