• 设为首页
  • 点击收藏
  • 手机版
    手机扫一扫访问
    迪恩网络手机版
  • 关注官方公众号
    微信扫一扫关注
    迪恩网络公众号

MATLAB聚类有效性评价指标(外部)MATLAB聚类有效性评价指标(外部成对度量)Mutualin ...

原作者: [db:作者] 来自: [db:来源] 收藏 邀请

作者:凯鲁嘎吉 - 博客园 http://www.cnblogs.com/kailugaji/

更多内容,请看:MATLAB聚类MATLAB: Clustering Algorithms

前提:数据的真实标签已知!

1. 归一化互信息(Normalized Mutual information)

定义

 

 

程序

function MIhat = nmi(A, B)
%NMI Normalized mutual information
% A, B: 1*N;
if length(A) ~= length(B)
    error('length( A ) must == length( B)');
end
N = length(A);
A_id = unique(A);
K_A = length(A_id);
B_id = unique(B);
K_B = length(B_id);
% Mutual information
A_occur = double (repmat( A, K_A, 1) == repmat( A_id', 1, N ));
B_occur = double (repmat( B, K_B, 1) == repmat( B_id', 1, N ));
AB_occur = A_occur * B_occur';
P_A= sum(A_occur') / N;
P_B = sum(B_occur') / N;
P_AB = AB_occur / N;
MImatrix = P_AB .* log(P_AB ./(P_A' * P_B)+eps);
MI = sum(MImatrix(:));
% Entropies
H_A = -sum(P_A .* log(P_A + eps),2);
H_B= -sum(P_B .* log(P_B + eps),2);
%Normalized Mutual information
MIhat = MI / sqrt(H_A*H_B); 

结果

>> A = [1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3];
>> B = [1 2 1 1 1 1 1 2 2 2 2 3 1 1 3 3 3];
>> MIhat = nmi(A, B)

MIhat =

    0.3646

2. Rand统计量(Rand index)

定义

程序

function [AR,RI,MI,HI]=RandIndex(c1,c2)
%RANDINDEX - calculates Rand Indices to compare two partitions
% ARI=RANDINDEX(c1,c2), where c1,c2 are vectors listing the 
% class membership, returns the "Hubert & Arabie adjusted Rand index".
% [AR,RI,MI,HI]=RANDINDEX(c1,c2) returns the adjusted Rand index, 
% the unadjusted Rand index, "Mirkin's" index and "Hubert's" index.

if nargin < 2 || min(size(c1)) > 1 || min(size(c2)) > 1
   error('RandIndex: Requires two vector arguments')
   return
end

C=Contingency(c1,c2);	%form contingency matrix

n=sum(sum(C));
nis=sum(sum(C,2).^2);		%sum of squares of sums of rows
njs=sum(sum(C,1).^2);		%sum of squares of sums of columns

t1=nchoosek(n,2);		%total number of pairs of entities
t2=sum(sum(C.^2));	%sum over rows & columnns of nij^2
t3=.5*(nis+njs);

%Expected index (for adjustment)
nc=(n*(n^2+1)-(n+1)*nis-(n+1)*njs+2*(nis*njs)/n)/(2*(n-1));

A=t1+t2-t3;		%no. agreements
D=  -t2+t3;		%no. disagreements

if t1==nc
   AR=0;			%avoid division by zero; if k=1, define Rand = 0
else
   AR=(A-nc)/(t1-nc);		%adjusted Rand - Hubert & Arabie 1985
end

RI=A/t1;			%Rand 1971		%Probability of agreement
MI=D/t1;			%Mirkin 1970	%p(disagreement)
HI=(A-D)/t1;	%Hubert 1977	%p(agree)-p(disagree)

function Cont=Contingency(Mem1,Mem2)

if nargin < 2 || min(size(Mem1)) > 1 || min(size(Mem2)) > 1
   error('Contingency: Requires two vector arguments')
   return
end

Cont=zeros(max(Mem1),max(Mem2));

for i = 1:length(Mem1)
   Cont(Mem1(i),Mem2(i))=Cont(Mem1(i),Mem2(i))+1;
end

程序中包含了四种聚类度量方法:Adjusted Rand index、Rand index、Mirkin index、Hubert index。

结果

>> A = [1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3];
>> B = [1 2 1 1 1 1 1 2 2 2 2 3 1 1 3 3 3];
>> [AR,RI,MI,HI]=RandIndex(A,B)

AR =

    0.2429


RI =

    0.6765


MI =

    0.3235


HI =

    0.3529

3. 参考文献

(simple) Tool for estimating the number of clusters

Mutual information and Normalized Mutual information 互信息和标准化互信息

Evaluation of clustering

 


鲜花

握手

雷人

路过

鸡蛋
该文章已有0人参与评论

请发表评论

全部评论

专题导读
上一篇:
Delphi判断某个类是否实现了某个接口发布时间:2022-07-18
下一篇:
Delphi10.1说明发布时间:2022-07-18
热门推荐
阅读排行榜

扫描微信二维码

查看手机版网站

随时了解更新最新资讯

139-2527-9053

在线客服(服务时间 9:00~18:00)

在线QQ客服
地址:深圳市南山区西丽大学城创智工业园
电邮:jeky_zhao#qq.com
移动电话:139-2527-9053

Powered by 互联科技 X3.4© 2001-2213 极客世界.|Sitemap