
k-means clustering

To reduce noise by clustering data, we can use the simple k-means clustering algorithm.
The idea of k-means is quite simple. Here are the steps of the k-means algorithm:

1. Randomly pick k samples (one for each group we want to cluster) as the initial references.
2. Compute the distance between each data point and each reference.
3. Assign each data point to the reference with the shortest distance.
4. Compute the centroid of each group and use it as the new reference.
5. Repeat steps 2-4 until the centroids no longer change from the previous result.
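The steps above can be sketched in a few lines of pure Python (an illustrative sketch only; the actual implementation below is in MATLAB, and the function name `kmeans` here is just a placeholder):

```python
import random
import math

def kmeans(points, k, seed=0):
    """Cluster `points` (lists of equal-length float lists) into k groups."""
    rng = random.Random(seed)
    # Step 1: randomly pick k samples as the initial references.
    cen = [list(p) for p in rng.sample(points, k)]
    while True:
        # Steps 2-3: assign each point to its nearest reference.
        labels = []
        for p in points:
            dists = [math.dist(p, c) for c in cen]
            labels.append(dists.index(min(dists)))
        # Step 4: recompute each centroid as the mean of its group.
        new_cen = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_cen.append([sum(v) / len(members) for v in zip(*members)])
            else:
                new_cen.append(cen[j])  # keep old centroid if a group is empty
        # Step 5: stop when the centroids no longer change.
        if new_cen == cen:
            return labels, cen
        cen = new_cen
```

For two well-separated groups, any random initialization converges to the obvious grouping in a couple of iterations.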

Here is the MATLAB code:
=======================================
% An example for k-means clustering
%
% Renfong 2018/12/19
%
% create test data
% There are 3 types of signal:
% rows 1-20, 36-70, 91-100 are group 1
% rows 21-35 are group 2
% rows 71-90 are group 3
xx=0:1:1024;
sp=zeros(100,length(xx));  % preallocate the spectra

cen=[120, 360, 780];
amp=[100, 60, 55];
sig=[50, 10, 30];

% peak 1+3
for i=1:20
    sp(i,:)=amp(1)*exp((-1*(xx-cen(1)).^2/(2*sig(1)^2))) + ...
        amp(3)*exp((-1*(xx-cen(3)).^2/(2*sig(3)^2))) + 10 * rand(size(xx));
end

% peak 2+3 
for i=21:35
    sp(i,:)=amp(2)*exp((-1*(xx-cen(2)).^2/(2*sig(2)^2))) + ...
        amp(3)*exp((-1*(xx-cen(3)).^2/(2*sig(3)^2))) + 10 * rand(size(xx));
end

% peak 1+3
for i=36:70
    sp(i,:)=amp(1)*exp((-1*(xx-cen(1)).^2/(2*sig(1)^2))) + ...
        amp(3)*exp((-1*(xx-cen(3)).^2/(2*sig(3)^2))) + 10 * rand(size(xx));
end

% peak 1+2
for i=71:90
    sp(i,:)=amp(1)*exp((-1*(xx-cen(1)).^2/(2*sig(1)^2))) + ...
        amp(2)*exp((-1*(xx-cen(2)).^2/(2*sig(2)^2))) + 10 * rand(size(xx));
end

% peak 1+3
for i=91:100
    sp(i,:)=amp(1)*exp((-1*(xx-cen(1)).^2/(2*sig(1)^2))) + ...
        amp(3)*exp((-1*(xx-cen(3)).^2/(2*sig(3)^2))) + 10 * rand(size(xx));
end


% k-means parameter
group=3;
[m,n]=size(sp);

% do k-means
distMap=zeros(m,group);
c=randsample(m,group);  % step 1: random initial references
cen=sp(c,:);            % note: 'cen' is reused here as the cluster centroids

h1=figure(1);
set(h1,'Position',[220 378 560 420]);
set(h1,'Color','white');
for i=1:group
    subplot(group,1,i);plot(cen(i,:));
end

temp=cen;
count=1;
while 1
    fprintf('step %i. \n',count);
    for i=1:m
        for j=1:group
            distMap(i,j)=sqrt(sum((sp(i,:)-cen(j,:)).^2));
        end
    end

    [minD,ind]=min(distMap,[],2);

    h1=figure(100+count);
    set(h1,'Position',[220 378 560 420]);
    set(h1,'Color','white');
    for i=1:group
        cen(i,:)=mean(sp(ind==i,:),1);  % new centroid (NaN if a group is empty)
        subplot(group,1,i);
        plot(cen(i,:));
        axis([0,n,0,inf]);
        if i==1
            title(['Iter=',num2str(count,'%03i')]);
        end
    end
    
    h2=figure(200+count);
    set(h2,'Position',[820 378 560 420]);
    set(h2,'Color','white');
    plot(ind);
    title(['Iter=',num2str(count,'%03i')]);
    axis([0,m,0,group+1])
    
    if isequal(temp,cen)  % step 5: stop when the centroids no longer change
        break;
    else
        temp=cen;
        count=count+1;
    end

end

===========
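The core of the loop above is the distance map: distMap(i,j) is the Euclidean distance between spectrum i and centroid j, and the row-wise minimum gives the group assignment. A rough Python equivalent of that step (using tiny hypothetical toy spectra, not the blog's Gaussian test data):

```python
import math

# Toy "spectra": each row is a signal; rows 0-1 share one peak shape,
# rows 2-3 another (hypothetical stand-ins for the Gaussian test data).
sp = [
    [1.0, 9.0, 1.0, 0.0],
    [1.2, 8.8, 0.9, 0.1],
    [0.0, 1.0, 9.0, 1.0],
    [0.1, 0.9, 9.1, 1.1],
]
cen = [sp[0], sp[2]]  # two references, one from each shape

# Mirrors distMap(i,j)=sqrt(sum((sp(i,:)-cen(j,:)).^2)) in the MATLAB code.
distMap = [[math.dist(row, c) for c in cen] for row in sp]

# Mirrors [minD,ind]=min(distMap,[],2): nearest centroid per spectrum.
ind = [row.index(min(row)) for row in distMap]
print(ind)  # prints [0, 0, 1, 1]
```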
The results: (figures omitted: the centroid spectra and the group-assignment plot at each iteration)