Smoothing - derivatives and variance - document layout analysis

Dear everyone, I have a problem and really need your help.
I am trying to implement, step by step, the method from this paper: Paper
Here is my MATLAB code.
I have copied it below; there is also a link to download everything.
(I'm sorry, I don't know why I can't upload it to this website.)
%main------------------------------------------------------------------
close all;
clc;
clear all;
f=imread('test2.png');
f=rgb2gray(f);
f=im2double(f);
f=im2bw(f,0.5);
f=~f;
[a b]=size(f);
%makequadtree(f);
[PH,t1,t2] = findPH(f); % t1, t2: number of all-zero rows trimmed from the top and bottom
figure,bar(PH);
s=floor(0.05*length(PH)); % kernel size s: I am not sure what value to use
MH = findMH(PH,s);
figure,bar(MH);
FH = findFH(MH);
figure,bar(FH);
Z1 = FindZ1(FH);
[D1,D]=distanceZ(Z1);
[m,V,P,DP] = varianceV(D);
%quadtree structure---------------------------------------------------
function bw1 = quadtree(bw)
[a, b]=size(bw); % initial size
if(rem(a,2)>0)
c=ones(1,b);
bw=[bw;c];
end
[u, b]=size(bw);
if(rem(b,2)>0)
d=ones(u,1);
bw=[bw d];
end
[m n]=size(bw);
bw1=ones(size(bw)/2);
for(i=2:2:m)
for(j=2:2:n)
if (bw(i,j)==1||bw(i-1,j)==1||bw(i,j-1)==1||bw(i-1,j-1)==1)
bw1(i/2,j/2)=0;
end
end
end
%bw1
end
%make quadtree---------------------------------------------------------
function [bw1,m,n] = makequadtree(f)
bw1=quadtree(f);
[m n]=size(bw1);
while (m>=100 && n>=100)%min size is 50 50
bw1=quadtree(bw1);
[m n]=size(bw1);
figure,imshow(bw1);
end
imwrite(bw1,'top level.jpg');
%figure,boundingbox(bw1);
end
%find PH------------------------------------------------------
function [PH,t1,t2] = findPH(f)
%projection horizontal
[m n]=size(f);
PH=zeros(m,1);
for i=1:m
for j=1:(n)
PH(i)=PH(i)+f(i,j);
end
end
t1=0; % clear the white upper border
while (1)
if PH(1)==0
PH = PH(2:length(PH));
t1=t1+1;
else break
end
end
n=length(PH);
t2=0; %clear the white lower border
while (1)
if PH(n)==0
t2=t2+1;
m=length(PH)-1;
PH = PH(1:m);
n=m;
else break
end
end
end
%find MH------------------------------------------------
function [MH] = findMH(PH,kernel_size)
[m, ~]=size(PH);
MH=zeros(size(PH));
s=kernel_size;
%s=25;
for y=1:m
for j=(y-floor(s/2)):1:(y+floor(s/2))
if (j<=m && j>0)
MH(y)=(MH(y)+PH(j));
end
end
end
MH=floor(MH/s);
end
%find FH--------------------------------------------------------
function [FH] = findFH(MH)
m=length(MH);
FH=zeros(1,m); % preallocate as a row vector; transposed to a column below
for i=2:m-1
FH(i)=MH(i)-MH(i-1);
end
FH(1)=MH(1);
FH(m)=-MH(m);
FH=transpose(FH);
%figure, bar(FH);
end
%find Z-------------------------------------------------------
function [Z] = FindZ1(FH)
[m n] = size(FH);
Z=zeros(size(FH));
for i=1:m-1
if ((FH(i)<0 && FH(i+1)>=0)||(FH(i)>0 && FH(i+1)<=0))
Z(i)=i;
end
end
end
%find distance d-------------------------------------------------
function [D1,D] = distanceZ(f)
[b,~]=size(f);
t=1;
D1=[];
for i=1:b
if f(i)~=0
D1(t)=f(i);
t=t+1;
end
end
D=[];
m=length(D1);
D(1)=D1(1);
t=2;
for i=1:m-1
D(t)=D1(i+1)-D1(i);
t=t+1;
end
D=transpose(D);
end
%find variance-----------------------------------------------------------
function [m,V,P,DP] = varianceV(D)
n = length(D);
m=sum(D)/n;
V1=0;
for i=1:n
V1 = V1+((D(i)-m)*(D(i)-m));
end
V=V1/n;
P = 2-(2/(1+exp(-V)));
DP=0;
if P > 0.5
DP=1;
end
end
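The last three functions above (FindZ1, distanceZ, varianceV) can also be sketched without loops. This is only a minimal sketch under stated assumptions, not the paper's method: FH below is a made-up toy signal, a zero crossing is taken to be a strict sign change between neighbouring samples (so exact zeros in FH are skipped, unlike in FindZ1), and D holds only the gaps between crossings (distanceZ also stores the position of the first crossing as D(1)).

```matlab
% Sketch: zero crossings of FH, their spacing, and the spacing variance.
% FH is a toy derivative signal (assumption), not output from test2.png.
FH = [3 1 -2 -4 -1 2 4 1 -2 -4 -1 2]';

sgn = sign(FH);
Z   = find(sgn(1:end-1) .* sgn(2:end) < 0); % i where FH(i) and FH(i+1) have opposite signs
D   = diff(Z);                              % gaps between consecutive crossings
V   = var(D, 1);                            % variance normalized by n, as in varianceV
P   = 2 - 2/(1 + exp(-V));                  % same mapping as in varianceV
```

Because the first element of D in distanceZ is a position rather than a gap, it may be worth checking whether it inflates V on a real profile.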
Run the code from main. My problem is this:
In Section 3.2 (Periodicity estimation) on page 3 of the paper, MH is computed with a kernel of size s, but it is not clear to me what the kernel size is, or what it is the kernel size of. I think this matters a lot: when I set s to 0.05 of the length of PH, something goes wrong and my variance is always large, even though I am using a very clean image (test2.png, no skew detection needed). I think the problem is in findMH or findFH. I have tried many values of s, but none gives the right result. If you have any comments, please tell me so that I can fix this problem.
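To make the question concrete, here is a minimal sketch of the smoothing as I understand it, assuming the kernel is a plain centered moving-average window of s rows. That reading of the paper is an assumption, and the paper may define the kernel differently. The toy profile and the value s = 3 are made up for illustration.

```matlab
% Sketch: centered moving average of a projection profile.
% Assumption: "kernel size" s is the window length in rows (odd here).
PH = [0 0 8 8 8 0 0 8 8 8 0 0]';   % toy profile: two text lines, 3 rows each
s  = 3;                            % hypothetical window length

kernel = ones(s,1);
total  = conv(PH, kernel, 'same');             % windowed sum, zero-padded at the ends
count  = conv(ones(size(PH)), kernel, 'same'); % number of real samples in each window
MH     = total ./ count;                       % mean over the samples actually present
```

Unlike findMH, the sketch divides by the number of samples actually inside each window and keeps fractional values instead of applying floor. A window longer than the spacing between text lines would average neighbouring lines together and flatten the peaks, which may matter when s is tied to the length of PH rather than to the line height.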
Thank you so much
  2 Comments
Star Strider on 5 May 2014
Use the ‘paperclip’ icon above the window in your original post to upload the paper and uncompressed code to the MathWorks website.
Tran Tuan Anh on 5 May 2014
I'm sorry, but the functions are in several different files and the image is in the same folder, so I didn't know how to post them. Next time I will post them file by file. Thank you so much.


Answers (1)

Tran Tuan Anh on 9 May 2014
Please help me.
