Getting coordinates of text areas in a screenshot

I have a screenshot that contains only text: white text on a black background, which can be converted to black text on a white background.
My task is to get an array containing the upper-left and bottom-right screen coordinates of each word in the text. The purpose is to check whether the mouse cursor, or a gaze point obtained from an eye-tracker device, is on a word in the text and, if so, which particular word it is pointing at.
I have read about the Image Processing Toolbox and some examples, but none of them helped me. I do not have much experience with MATLAB yet.
Could someone help me with this? I would appreciate it!

Answers (2)

Image Analyst on 28 Apr 2014
The Computer Vision System Toolbox has OCR in it. See this: http://www.mathworks.com/help/vision/ref/ocr.html#bt548t1-2_1
  3 Comments
Image Analyst on 29 Apr 2014
Take the vertical profile and threshold it:
verticalProfile = mean(grayImage, 2);
textLineLocations = verticalProfile > 0; % white text on black; for black text on a white uint8 image, use < 255
Now use diff on textLineLocations to find the rows where each text line starts and stops. Then, for each line, extract that band of rows and take the horizontal profile to find where the line starts and stops:
horizontalProfile = mean(subImage, 1) > 0; % or < 255 for a white background
leftColumn = find(horizontalProfile, 1, 'first');
rightColumn = find(horizontalProfile, 1, 'last');
You might want to take the mode of the image to see whether the background is black or white, so you know whether to use < or >.
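The profile steps above could be put together roughly as follows. This is only a sketch under assumptions: the image is synthetic stand-in data (white blobs on black), and minWordGap, runStarts, and runStops are hypothetical names introduced here. It also goes one step further than the outline by splitting each line into words wherever the horizontal profile has a gap of at least minWordGap columns.

```matlab
% Synthetic stand-in for the screenshot (made-up data): white blobs on black.
grayImage = zeros(40, 100, 'uint8');
grayImage(5:12, 10:30)  = 255;   % line 1, word 1
grayImage(5:12, 40:70)  = 255;   % line 1, word 2
grayImage(25:32, 10:20) = 255;   % line 2, first letter of a word
grayImage(25:32, 23:50) = 255;   % line 2, rest of the same word (2-pixel gap)

minWordGap = 5;   % assumed: gaps narrower than this are between letters

% Hypothetical helpers: first and last index of each run of true values.
runStarts = @(m) find(diff([0, double(m(:)'), 0]) == 1);
runStops  = @(m) find(diff([0, double(m(:)'), 0]) == -1) - 1;

% Vertical profile: rows that contain any text pixels.
isTextRow = mean(grayImage, 2) > 0;
rowStart  = runStarts(isTextRow);
rowStop   = runStops(isTextRow);

wordBoxes = zeros(0, 4);   % each row: [left top right bottom]
for k = 1:numel(rowStart)
    % Horizontal profile of this band of rows.
    subImage  = grayImage(rowStart(k):rowStop(k), :);
    isTextCol = mean(subImage, 1) > 0;

    % Bridge narrow interior gaps so the letters of one word form one run.
    gapStart = runStarts(~isTextCol);
    gapStop  = runStops(~isTextCol);
    for g = 1:numel(gapStart)
        isInterior = gapStart(g) > 1 && gapStop(g) < numel(isTextCol);
        if isInterior && (gapStop(g) - gapStart(g) + 1) < minWordGap
            isTextCol(gapStart(g):gapStop(g)) = true;
        end
    end

    colStart = runStarts(isTextCol);
    colStop  = runStops(isTextCol);
    for w = 1:numel(colStart)
        wordBoxes(end+1, :) = [colStart(w), rowStart(k), colStop(w), rowStop(k)];
    end
end
disp(wordBoxes)
```

For a real screenshot, grayImage would come from imread (plus rgb2gray if the file is RGB), and minWordGap would need tuning to the font size.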
Dima Lisin on 3 May 2014
Evgeny, the ocr() function returns an ocrText object, which contains lots of information, including the bounding box for every word. It is exactly what you need.
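A minimal sketch of how the word bounding boxes might be used for the mouse or gaze lookup. The file name screenshot.png, the helper hitIndex, the stand-in words and boxes, and the gaze point are all assumptions made for illustration. The ocrText properties Words and WordBoundingBoxes hold the recognized words and their boxes as [x y width height] rows. The point must be in image pixel coordinates, so a screen-to-image offset may be needed if the screenshot does not start at the screen origin.

```matlab
imageFile = 'screenshot.png';   % hypothetical file name
if exist('ocr', 'file') && exist(imageFile, 'file')
    results = ocr(imread(imageFile));     % returns an ocrText object
    words = results.Words;                % cell array of recognized words
    boxes = results.WordBoundingBoxes;    % M-by-4, rows are [x y width height]
else
    % Stand-in data in the same layout (made up), used when OCR is unavailable.
    words = {'Hello'; 'world'};
    boxes = [10 5 40 12; 60 5 45 12];
end

% Hypothetical helper: index of the first box containing point (px, py), or [].
hitIndex = @(px, py, b) find(px >= b(:,1) & px <= b(:,1) + b(:,3) & ...
                             py >= b(:,2) & py <= b(:,2) + b(:,4), 1);

gazePoint = [75 10];   % assumed to be in image pixel coordinates
k = hitIndex(gazePoint(1), gazePoint(2), boxes);
if isempty(k)
    disp('No word under the point.');
else
    fprintf('Pointing at "%s"\n', words{k});
end
```

If upper-left and bottom-right corners are preferred over [x y width height], they can be derived from each row as [x, y] and [x + width, y + height].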

Sean de Wolski on 28 Apr 2014
Post an image!
More than likely you'll want the 'BoundingBox' property of regionprops. Perhaps call bwconvhull before that, so that you get the bounding box of the convex hull of the text pieces. Just a guess though...
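A rough sketch of the regionprops idea, under stated assumptions: the binary image is synthetic stand-in data, maxLetterGap is a hypothetical tuning parameter, and a horizontal morphological closing (imclose) is swapped in for the bwconvhull step so that the letters of one word merge into a single blob before measuring. It needs the Image Processing Toolbox.

```matlab
% Synthetic binary image (made-up data): true = text pixels.
BW = false(40, 100);
BW(5:12, 10:14) = true;  BW(5:12, 17:21) = true;   % word 1: two "letters"
BW(5:12, 35:39) = true;  BW(5:12, 42:46) = true;   % word 2: two "letters"

maxLetterGap = 4;   % assumed: widest gap (in pixels) still inside one word

% Close horizontal gaps up to maxLetterGap so each word becomes one blob.
joined = imclose(BW, strel('rectangle', [1, maxLetterGap + 1]));

% One bounding box per blob; rows are [x y width height], with x and y
% on pixel edges (values ending in .5).
stats = regionprops(joined, 'BoundingBox');
boxes = vertcat(stats.BoundingBox);

% Convert to integer [left top right bottom] pixel coordinates.
left = ceil(boxes(:, 1));
top  = ceil(boxes(:, 2));
corners = [left, top, left + boxes(:, 3) - 1, top + boxes(:, 4) - 1];
disp(corners)
```

A real screenshot would first have to be converted to a binary image with text pixels set to true, and maxLetterGap would depend on the font and its size.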
