NMI and AMI use inconsistent definitions of mutual information #10308

Closed

kno10 opened this issue Dec 13, 2017 · 11 comments
Labels: help wanted, Moderate

Comments

@kno10 (Contributor) commented Dec 13, 2017

There exist many definitions of NMI and AMI.

Vinh, N. X., Epps, J., & Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11(Oct), 2837-2854.

mentions five different definitions of NMI and, based on those, gives four different AMIs.

The NMI implemented in sklearn uses sqrt(H(U) * H(V)) for normalization.
The AMI implemented in sklearn uses max(H(U), H(V)) for normalization.

There exists an NMI with the max normalization and an AMI with the sqrt normalization, so the choice in sklearn is inconsistent. Ideally, both would use the same definition by default and allow selecting any of the others via an option.
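
A minimal sketch of the mismatch (the labelings below are made up for illustration; the sqrt normalizer reproduces the NMI behavior described above, and scipy is used only to compute the entropies):

```python
import numpy as np
from scipy.stats import entropy
from sklearn.metrics import mutual_info_score, normalized_mutual_info_score

# Made-up labelings, just to illustrate.
labels_true = [0, 0, 0, 1, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 1, 2, 2, 2, 2]

mi = mutual_info_score(labels_true, labels_pred)  # MI in nats
h_u = entropy(np.bincount(labels_true))           # H(U)
h_v = entropy(np.bincount(labels_pred))           # H(V)

# NMI as currently implemented: geometric-mean (sqrt) normalization.
print(normalized_mutual_info_score(labels_true, labels_pred))
print(mi / np.sqrt(h_u * h_v))  # should print the same value

# AMI, by contrast, normalizes the chance-corrected MI by max(H(U), H(V)):
#     AMI = (MI - E[MI]) / (max(H(U), H(V)) - E[MI])
# so the two scores use different denominators for the same labelings.
```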

@amueller (Member):
This is indeed not very nice.
The first (and possibly easiest) step would be to add a normalization option to both that can be max or sqrt, and possibly also min and sum if we think they might be useful (see the sketch after this comment). We should check what is commonly used. I haven't read the paper, but it looks really nice.

Second step is to decide whether this is worth a deprecation cycle, and if so, what the new default should be. This is probably the hardest part (unless there is a clear consensus in the community, for example if that paper makes a clear recommendation that is widely accepted).

Third step is implementing the deprecation cycle.
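
For step one, a rough sketch of what a shared normalization helper could look like; the name `_generalized_average` and the option strings here are placeholders, not a settled API:

```python
import numpy as np

def _generalized_average(h_u, h_v, average_method):
    """Combine two entropies H(U), H(V) into one normalizer (sketch only)."""
    if average_method == "min":
        return min(h_u, h_v)
    elif average_method == "geometric":  # sqrt: what NMI currently does
        return np.sqrt(h_u * h_v)
    elif average_method == "arithmetic":  # the "sum" variant, (H(U) + H(V)) / 2
        return (h_u + h_v) / 2.0
    elif average_method == "max":  # what AMI currently does
        return max(h_u, h_v)
    raise ValueError("'average_method' must be 'min', 'geometric', "
                     "'arithmetic', or 'max'")
```

Both NMI and AMI could then share the option and its default, which would reduce the deprecation question to picking that default.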

@amueller (Member) commented Dec 14, 2017

Maybe @robertlayton can give us some insight into why these were chosen originally? See #402.

@amueller (Member):
Or maybe I should ask myself #776 ...

@amueller (Member):
OMG I just saw the paper I added by Andrew Rosenberg (who I now know personally) and Julia Hirschberg (the head of the CS department I'm in)... that's.. weird... anyhow, I digress...

I added the same paper you're referencing to the docs, I'm not sure why I did the sqrt... To make it identical to the V-measure? That seems strange.

@jnothman added the Documentation, help wanted, and Moderate labels and removed the Documentation label on Dec 15, 2017
@aryamccarthy (Contributor) commented May 23, 2018

Yang et al. claim not to have observed big differences between the measures. Vinh et al. don't make a recommendation either. Danon et al. were the first to use it for community detection, and they followed a line of work that actually used sum.

It'll come down to whoever decides to implement it. I'm happy to do it and make a note in the docs that the normalization constants are different but will converge in a later version. My vote is sum.

This could also be a good first step toward implementing multiple randomness models, like one-sided AMI. The sum-normalized AMI from Vinh et al. is written out below.
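
Written out (the other variants replace the arithmetic mean in the denominator with the min, the geometric mean, or the max of the two entropies):

```latex
\mathrm{AMI}_{\mathrm{sum}}(U, V) =
  \frac{\mathrm{MI}(U, V) - \mathbb{E}\left[\mathrm{MI}(U, V)\right]}
       {\tfrac{1}{2}\left(H(U) + H(V)\right) - \mathbb{E}\left[\mathrm{MI}(U, V)\right]}
```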

@amueller (Member):
I am against sum because that would require changing both, and it looks like it's less used in the clustering literature. I think I'm leaning toward max, but I really don't care that much ;)

@aryamccarthy (Contributor):
Ah, an instance of different preferences in different fields. We can do max. Do I have permission to implement this?

@amueller (Member):
Yes, please go ahead. The stronger argument for me is that either way we'll need to change the behavior of one of the metrics.

@aryamccarthy (Contributor) commented May 23, 2018 via email

@aryamccarthy (Contributor):
Ooh, a twist. Sum is actually what the V-measure uses, not sqrt. It seems we've covered the entire gamut. I'm going to take that as another argument in favor of sum; a quick check of the V-measure claim is below. << Thought I hit 'Comment' on this some time ago.
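
A hedged sanity check of that claim (same made-up labelings as above; V-measure should equal MI normalized by the arithmetic mean of the two entropies, i.e. 2 * MI / (H(U) + H(V))):

```python
import numpy as np
from scipy.stats import entropy
from sklearn.metrics import mutual_info_score, v_measure_score

labels_true = [0, 0, 0, 1, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 1, 2, 2, 2, 2]

mi = mutual_info_score(labels_true, labels_pred)
h_u = entropy(np.bincount(labels_true))
h_v = entropy(np.bincount(labels_pred))

# V-measure is the harmonic mean of homogeneity (MI / H(U)) and
# completeness (MI / H(V)), which simplifies to 2*MI / (H(U) + H(V)).
print(v_measure_score(labels_true, labels_pred))
print(2 * mi / (h_u + h_v))  # should match up to floating point
```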

@qinhanmin2014 (Member):
I think this one can be closed given #11124
