Remove sklearn dependency from CohenKappa score calculation logic, apply custom calculation, and update tests #3731
avishkarsonni wants to merge 10 commits into
Conversation
… applied custom calculation and updated tests
Thanks for the PR! In this metric we inherited from EpochMetric; instead of using raw tensor values for the final computation in compute(), we could avoid bringing them all into memory.
So there is no need to bring all the raw tensors into memory, because there is no scikit-learn dependency now. Got it.
@avishkarsonni yes, instead of storing raw tensors in memory, we can accumulate the confusion matrix incrementally in update().
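A hedged illustration of what that could look like, accumulating a running confusion matrix per batch instead of keeping every tensor until compute() (the class name and num_classes handling are assumptions for the sketch, not the PR's code):

```python
import torch

class _RunningConfusion:
    """Accumulates a confusion matrix batch by batch, so update() never
    stores raw y_pred / y tensors."""

    def __init__(self, num_classes: int, device: str = "cpu") -> None:
        self.num_classes = num_classes
        self.cm = torch.zeros(num_classes, num_classes, dtype=torch.long, device=device)

    def update(self, y_pred: torch.Tensor, y: torch.Tensor) -> None:
        # Same bincount trick the PR uses, applied incrementally.
        idx = y.long() * self.num_classes + y_pred.long()
        self.cm += torch.bincount(idx, minlength=self.num_classes ** 2).reshape(
            self.num_classes, self.num_classes
        )
```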
Nice implementation! Using torch.bincount for the confusion matrix is a good approach and keeps the computation on GPU. One small observation: it may be safer to guard against the case where p_e == 1, since (1 - p_e) would cause a division by zero. Overall the implementation looks clean and matches the expected Cohen's Kappa formulation.
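For concreteness, a hedged sketch of the suggested guard, assuming p_o and p_e are the observed and chance agreement from the standard Cohen's kappa formulation (the 0.0 fallback is an illustrative convention, not the PR's confirmed choice):

```python
import torch

def _kappa(p_o: torch.Tensor, p_e: torch.Tensor) -> float:
    # Cohen's kappa: (p_o - p_e) / (1 - p_e). If p_e == 1, observed and
    # chance agreement coincide and the ratio degenerates to 0/0.
    if p_e == 1:
        return 0.0
    return ((p_o - p_e) / (1 - p_e)).item()
```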
```python
n_classes = int(max(y_targets.max().item(), y_preds.max().item())) + 1
indices = y_targets * n_classes + y_preds
conf = torch.bincount(indices, minlength=n_classes * n_classes).reshape(n_classes, n_classes).double()
```
Why can't we use ignite's ConfusionMatrix class to compute this metric?
If we use ConfusionMatrix, we should provide num_classes as an input argument... In order to keep backward compatibility, we can add a num_classes arg to the constructor as an optional kwarg.
We can have two private Cohen's kappa implementations: one using EpochMetric (the current one) and a second using ConfusionMatrix. The public CohenKappa can route between them depending on the num_classes arg.
In the private implementation using EpochMetric, we should still use ConfusionMatrix to compute the confusion matrix in the compute() method, instead of doing that manually (as currently), to avoid bugs.
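A rough sketch of the proposed routing, with stub classes standing in for the two private implementations (all names here are hypothetical, not the PR's actual code):

```python
from typing import Optional

class _CohenKappaEpochMetric:
    """Stub for the EpochMetric-based implementation (backward compatible,
    infers the number of classes from the accumulated data)."""

class _CohenKappaConfusionMatrix:
    """Stub for the ConfusionMatrix-based implementation (constant memory,
    needs num_classes up front)."""
    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes

def make_cohen_kappa(num_classes: Optional[int] = None):
    # The public entry point routes on the optional num_classes kwarg.
    if num_classes is not None:
        return _CohenKappaConfusionMatrix(num_classes)
    return _CohenKappaEpochMetric()
```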
I used bincount because I was familiar with it and had planned it that way from the start, but as you mention, this is possible too. At the start of this PR it was not intuitive to me, since ConfusionMatrix needs num_classes at init, while the current implementation infers it dynamically from the data. But it is indeed possible.
Then again, this method is better only when the user knows num_classes; if not, we will need a fallback that infers it from the data. Both methods have the same time complexity; the advantage of ConfusionMatrix over the current approach is edge-case coverage. I would love to hear which would be the better approach!
The benefit of using ConfusionMatrix is that we do not store the lists of y_preds and y_true in RAM, as the current implementation using EpochMetric does. The drawback is that we would have to specify num_classes in the metric constructor, which also introduces a backward-compatibility break.
OK, the suggestion for this PR: let's use ConfusionMatrix in _cohen_kappa_score in order to avoid the manual confusion-matrix computation. We can instantiate a ConfusionMatrix object with the number of classes, do a single update, and call compute.
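A hedged sketch of that suggestion, mirroring the snippet that later appears in the PR (the p_e == 1 guard and the 0.0 fallback are illustrative choices, not confirmed behaviour):

```python
import torch
import torch.nn.functional as F
from ignite.metrics import ConfusionMatrix

def _cohen_kappa_score(y_pred: torch.Tensor, y: torch.Tensor, num_classes: int) -> float:
    # Build the confusion matrix via ignite instead of a manual bincount.
    cm_metric = ConfusionMatrix(num_classes=num_classes, device=y_pred.device)
    # ConfusionMatrix expects per-class scores, so one-hot encode the
    # predicted labels; its internal argmax then recovers them.
    cm_metric.update((F.one_hot(y_pred.long(), num_classes).float(), y.long()))
    cm = cm_metric.compute().double()

    n = cm.sum()
    p_o = cm.diagonal().sum() / n                         # observed agreement
    p_e = (cm.sum(dim=0) * cm.sum(dim=1)).sum() / n ** 2  # chance agreement
    if p_e == 1:  # degenerate case: (1 - p_e) would be zero
        return 0.0
    return ((p_o - p_e) / (1 - p_e)).item()
```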
…) method to better preserve precision for mps
vfdev-5 left a comment
@avishkarsonni thanks a lot for the update of the code using the suggested approach.
However, I don't fully agree with Aaishwarya about moving from an accelerator to CPU to perform the computation in double precision. If the device is CUDA, which supports double, moving to CPU is not optimal.
I think we can use _double_dtype for that: ignite/ignite/metrics/metric.py, lines 389 to 390 in c830a06.
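For context, those lines select a device-appropriate double dtype. A hedged, standalone illustration of the idea (the function name _select_double_dtype is hypothetical; it is not ignite's API):

```python
import torch

def _select_double_dtype(device: str) -> torch.dtype:
    # MPS does not support float64, so double-precision accumulators fall
    # back to float32 there; everywhere else true double precision is used.
    return torch.float32 if torch.device(device).type == "mps" else torch.float64
```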
I am not sure what you mean exactly.
The suggestion is to use what is done in ignite/ignite/metrics/multilabel_confusion_matrix.py, lines 137 to 138 in c830a06.
I agree that wherever we can use CUDA acceleration we should, but this again creates a situation where we have to treat MPS and CUDA differently and cannot make the code device-agnostic. I don't know which is the better approach.
It is just a precision difference; we do not essentially treat them differently, so it is fine to use my suggestion.
…e available device using _double_dtype.
@avishkarsonni tests are failing: https://app.netlify.com/projects/pytorch-ignite-preview/deploys/69fcbba33087ea0008ff7a24. You may follow the steps given in …
```python
cm = ConfusionMatrix(num_classes=num_classes, device=y_pred.device)
y_pred_oh = F.one_hot(y_pred.long(), num_classes).float()
cm.update((y_pred_oh, y.long()))
double_dtype = torch.float32 if y_pred.device.type == "mps" else torch.float64
```
Can't we fetch the double dtype from cm._double_dtype?
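If so, the manual dtype selection above could reduce to the attribute the metric instance already carries (hedged; assumes cm is the ConfusionMatrix from the snippet above):

```python
# Reuse the dtype ignite already selected for the device
# (float32 on MPS, float64 elsewhere) instead of re-deriving it by hand.
double_dtype = cm._double_dtype
```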
```python
ck.update((torch.randint(0, 2, size=(10,)).long(), torch.randint(0, 2, size=(10, 5)).long()))


def test_check_shape():
```
Why did you remove these tests?
I accidentally committed it. I was trying to do something with git and it pushed the committed changes.
```python
ck.compute()


def test_input_types():
```
Same question here: why did you remove these tests?
Fixes #3701
Description:
Remove scikit-learn dependency from CohenKappa by implementing a native PyTorch version.

- Replaces sklearn.metrics.cohen_kappa_score with a pure PyTorch implementation.
- No device transfers (no .cpu().numpy()): the metric now runs fully on the configured device.
- Builds the confusion matrix with torch.bincount (a single vectorised kernel) instead of a Python loop.
- Uses float64 throughout to match sklearn's numerical precision.

Check list:
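For reference, a hedged usage sketch of the public API, which this PR keeps unchanged (inputs are illustrative, patterned on the PR's tests):

```python
import torch
from ignite.metrics import CohenKappa

ck = CohenKappa()
# Binary labels, as in the PR's tests; real use would attach the metric
# to an ignite Engine via output_transform.
ck.update((torch.randint(0, 2, size=(10,)).long(), torch.randint(0, 2, size=(10,)).long()))
print(ck.compute())  # scalar kappa in [-1, 1]
```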