How to get data and images from a PDF?

Posted: Fri Jan 25, 2013 12:34 pm
by AutomationTester
Hi all,

How can I get data and images from a PDF? Is there a plugin available for it? If so, how do I add it to Ranorex and use it? Please help me with this issue.

Regards,

Amir aka AutomationTester

Re: How to get data and images from a PDF?

Posted: Fri Jan 25, 2013 3:49 pm
by Support Team
Hi,

You should be able to automate a PDF without making any changes in Ranorex; there is no specific Ranorex plugin for PDF.
Could you please describe in detail which problems you are facing and which Ranorex version you are using?

Thanks,
Markus

Re: How to get data and images from a PDF?

Posted: Mon Jan 28, 2013 7:19 am
by AutomationTester
Hi Markus,

Good day. After my application finishes its execution, it generates a PDF document containing all the details about it. The document contains both images and text. As a friend suggested, I downloaded the "iTextSharp" DLL, but I don't know how to add the DLL to Ranorex and use it.


I'm attaching a zip file that contains the DLL and a PDF file. Please help me with an example.

Thanks and Regards,
Amir aka AutomationTester

Re: How to get data and images from a PDF?

Posted: Tue Jan 29, 2013 4:18 pm
by Support Team
Hello,

Thank you for your files.

You do not need any plugin or DLL to identify elements in your PDF.
Please use the current Adobe Reader XI and set the recommended accessibility options.

These options can be set via the menu 'Edit/Accessibility/Setup Assistant'.
Please click on 'Use recommended settings' in the assistant.
Please also verify that 'Read the entire document' is selected in 'Edit/Accessibility/Change Reading Options'.

Regards,
Markus (T)

Re: How to get data and images from a PDF?

Posted: Tue May 06, 2014 6:54 pm
by strannik
I also need to compare values in a PDF, but in my case Edit/Accessibility/Change Reading Options is grayed out. What should I do?

Thanks

Re: How to get data and images from a PDF?

Posted: Wed May 07, 2014 7:28 am
by odklizec
Hi,

Check the properties of your PDF file...
pdf_settings.png
I guess the Accessibility option is "Not Allowed" in your case? If so, the only thing you can probably do is ask the PDF owner/creator to unlock this option. Then you should be able to access the content of PDF files from Ranorex.

Re: How to get data and images from a PDF?

Posted: Wed May 07, 2014 9:15 pm
by strannik
Content Copying for Accessibility is Allowed in my case, but when I want to validate data I get this dialog.
Please see the attachment.

Re: How to get data and images from a PDF?

Posted: Wed May 07, 2014 9:24 pm
by odklizec
That's OK. You just need to confirm this dialog and wait for the PDF processing to finish. After that, you should be able to read and evaluate the PDF content.

Re: How to get data and images from a PDF?

Posted: Thu May 08, 2014 2:22 pm
by strannik
Unfortunately, Ranorex doesn't recognize the value itself. Is it possible to get the value from this validation?
At the moment I'm using Ranorex Spy. In the top corner of the Ranorex Spy window it says "Ranorex Spy (32bit) - Live". Is that correct? I'm using a 64-bit Windows 7 system; maybe that's why Spy cannot recognize the value?

Please see the attachment.

Re: How to get data and images from a PDF?

Posted: Thu May 08, 2014 2:52 pm
by odklizec
I guess you started the Spy from Ranorex Studio? Because Ranorex Studio itself is a 32-bit application, it starts the 32-bit Spy. You can always start the 64-bit Spy outside the Studio: go to Start > Programs > Ranorex and select the 64-bit Spy (the one without the bit extension). Personally, though, I don't think this will help. What exactly do you see in Spy if you select that highlighted section? Is there a "text" or "value" attribute containing the content of the selection?

Re: How to get data and images from a PDF?

Posted: Thu May 08, 2014 4:34 pm
by strannik
Yes, if I select Text in the left panel, it displays the Policy Number on the right side, but with additional data. I need only the policy number for the comparison. See the attachment for details.

Re: How to get data and images from a PDF?

Posted: Thu May 08, 2014 6:38 pm
by odklizec
I think you will have to parse the text you obtain from the PDF. Maybe a clever regular expression could be useful here? Could you please post a snapshot file generated from the selected PDF element? See these instructions on how to create one...
http://www.ranorex.com/support/user-gui ... files.html

Re: How to get data and images from a PDF?

Posted: Thu May 08, 2014 8:42 pm
by strannik
Thanks a lot for helping. Here it is.

Re: How to get data and images from a PDF?

Posted: Fri May 09, 2014 9:17 am
by odklizec
OK, got it. The question now is what exactly you want to do with that text. Do you just want to compare the number with a number stored in an Excel/CSV file?

In the attached file, you can find an example project that shows how to validate the policy number in your selected PDF text using the AttributeContains action.
ValidateTextWithRegExp.zip
The next line in the recording shows how to get the policy number using the GetValue action and a regular expression that searches for any number found after the "#:" string. The last line simply writes the obtained number to the report. My knowledge of RegEx is far from ideal, so there could be a better/simpler way to do it ;) For example, if you know the policy number is always 6 digits, then instead of the (?<=#:\s)\d* pattern it should be enough to use just \d{6}.

If you prefer the coded way (instead of record-based actions), simply examine the code behind each action (right-click the action and select View Code). Hope this helps.
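
For anyone who wants to try the two patterns mentioned above outside Ranorex, here is a minimal sketch in plain Python (not Ranorex code). The sample strings and the helper name extract are made up for illustration; the real text would come from the selected PDF element. Both patterns use only constructs that Python's re module and .NET regular expressions share, so the behaviour should carry over, but treat that as an assumption to verify in your own project.

```python
import re

# Hypothetical text standing in for what the selected PDF element returns.
sample_text = "Policy #: 123456 Effective 05/09/2014 Account 9876543"
# Hypothetical variant where another, longer number comes before the label.
reordered_text = "Account 9876543 Policy #: 123456"

# Pattern from the post: digits that directly follow "#:" and one whitespace character.
LOOKBEHIND = re.compile(r"(?<=#:\s)\d*")
# Simpler alternative from the post: the first run of six digits anywhere in the text.
SIX_DIGITS = re.compile(r"\d{6}")


def extract(pattern, text):
    """Return the first match of pattern in text, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(0) if match else None


policy_by_label = extract(LOOKBEHIND, sample_text)
policy_by_length = extract(SIX_DIGITS, sample_text)

print(policy_by_label)
print(policy_by_length)
print(extract(LOOKBEHIND, reordered_text))
print(extract(SIX_DIGITS, reordered_text))
```

On the first sample both patterns return 123456. On the reordered sample the label-based pattern still returns 123456, while \d{6} returns 987654, the first six digits of the account number. So the shorter pattern is only safe when no other run of six or more digits can appear earlier in the text.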

Re: How to get data and images from a PDF?

Posted: Fri May 09, 2014 2:13 pm
by strannik
Thank you, Pavel. I will try to do it. I have just started to learn this tool.