How to get data and images from a PDF?

AutomationTester
Posts: 30
Joined: Mon Jan 21, 2013 1:31 pm

Post by AutomationTester » Fri Jan 25, 2013 12:34 pm

Hi all,

How can I get data and images from a PDF? Is there a plugin available for this? If so, how do I add it to Ranorex and use it? Please help me with this issue.

Regards,

Amir aka AutomationTester

Support Team
Site Admin
Posts: 11709
Joined: Fri Jul 07, 2006 4:30 pm
Location: Graz, Austria

Post by Support Team » Fri Jan 25, 2013 3:49 pm

Hi,

You should be able to automate a PDF without making any changes in Ranorex; there is no specific Ranorex plugin for PDF.
Could you please describe in detail which problems you are facing and which Ranorex version you are using?

Thanks,
Markus

Post by AutomationTester » Mon Jan 28, 2013 7:19 am

Hi Markus,

Good day. After my application finishes its execution, it generates a PDF document containing all the details about it. The document contains both images and text. As a friend suggested, I downloaded the "iTextSharp" DLL, but I don't know how to add the DLL to Ranorex and use it.

I'm attaching a zip file containing the DLL and a PDF file (inside the zip). Please help me with an example.

Thanks and Regards,
Amir aka AutomationTester
Attachments
itextsharp-all-5.3.5.zip (4.47 MiB)

Post by Support Team » Tue Jan 29, 2013 4:18 pm

Hello,

Thank you for your files.

You do not need any plugin or DLL to identify elements in your PDF.
Please use the current Adobe Reader XI and set the recommended accessibility options.

You can set these options via the menu 'Edit/Accessibility/Setup Assistant'.
Please click on 'Use recommended settings' in the assistant.
Please also verify that 'Read the entire document' is selected in 'Edit/Accessibility/Change Reading Options'.

Regards,
Markus (T)

strannik
Posts: 60
Joined: Tue Apr 29, 2014 3:00 pm

Post by strannik » Tue May 06, 2014 6:54 pm

I also need to compare values in a PDF, but in my case 'Edit/Accessibility/Change Reading Options' is grayed out. What should we do?

Thanks

odklizec
Ranorex Guru
Posts: 3921
Joined: Mon Aug 13, 2012 9:54 am
Location: Zilina, Slovakia

Post by odklizec » Wed May 07, 2014 7:28 am

Hi,

Check the properties of your PDF file...
pdf_settings.png (25.88 KiB)
I guess the Accessibility option is "Not Allowed" in your case? The only thing you can probably do is ask the PDF owner/creator to unlock this option. Then you should be able to access the content of PDF files from Ranorex.
Pavel Kudrys
Ranorex explorer at Descartes Systems

Please add these details to your questions:
  • Ranorex Snapshot. Learn how to create one >here<
  • Ranorex xPath of problematic element(s)
  • Ranorex version
  • OS version
  • HW configuration

Post by strannik » Wed May 07, 2014 9:15 pm

In my case Content Copying for Accessibility is Allowed, but when I try to validate data I get this.
Please see the attachment.

Post by odklizec » Wed May 07, 2014 9:24 pm

That's OK. You just need to confirm this dialog and wait for the PDF processing to finish. After that, you should be able to read and evaluate the PDF content.

Post by strannik » Thu May 08, 2014 2:22 pm

Unfortunately, Ranorex doesn't recognize the value itself. Is it possible to get the value from this validation?
At the moment I'm using Ranorex Spy. In the top corner of the Ranorex Spy window it says "Ranorex Spy(32bit) - Live". Is that correct? I'm using a 64-bit Windows 7 system; maybe that's why Spy cannot recognize the value?

Please see the attachment.

Post by odklizec » Thu May 08, 2014 2:52 pm

I guess you started Spy from Ranorex Studio? Because Ranorex Studio itself is a 32-bit application, it starts the 32-bit Spy. You can always start the 64-bit Spy outside of Studio: just go to Start > Programs > Ranorex and select the 64-bit Spy (the one without the bit extension). But I personally don't think this will help. What exactly do you see in Spy if you select that highlighted section? Is there a "text" or "value" attribute containing the content of the selection?

Post by strannik » Thu May 08, 2014 4:34 pm

Yes, if I select Text in the left panel, the Policy Number is displayed on the right side, but with additional data. I need only the policy number for the comparison. See the attachment for details.
Attachments
policynumber.PNG (6.84 KiB)

Post by odklizec » Thu May 08, 2014 6:38 pm

I think you will have to parse the text you obtain from the PDF. Maybe a clever regular expression could be useful here? Could you please post a snapshot file generated from the selected PDF element? See these instructions on how to create one...
http://www.ranorex.com/support/user-gui ... files.html

Post by strannik » Thu May 08, 2014 8:42 pm

Thanks a lot for helping. Here it is.

Post by odklizec » Fri May 09, 2014 9:17 am

OK, got it. The question now is, what exactly do you want to do with that text? Just compare the number with a number stored in an Excel/CSV file?

In the attached file you can find an example project that shows how to validate the policy number in your selected PDF text using the AttributeContains action.
ValidateTextWithRegExp.zip (35.36 KiB)
The next line in the recording shows how to get the policy number using the GetValue action and a regular expression that searches for any number found after the "#:" string. The last line simply writes the obtained number to the report. My knowledge of RegEx is far from ideal, so there could be a better/simpler way to do it ;) For example, if you know the policy number is always 6 digits, then instead of the (?<=#:\s)\d* pattern it should be enough to use just \d{6}.
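
For anyone who wants to try the two patterns outside Ranorex first, here is a minimal Python sketch. It is not the code from the attached project: the sample text and the helper name extract_policy_number are assumptions made up for illustration, since the real PDF text is only visible in the attached snapshot.

```python
import re

# Hypothetical sample of the text attribute read from the PDF element;
# the real text from the thread's snapshot is not reproduced here.
SAMPLE_TEXT = "Policy #: 123456 Effective Date: 05/08/2014"

# Pattern quoted in the post: digits directly after "#:" plus one whitespace character.
AFTER_HASH_COLON = re.compile(r"(?<=#:\s)\d*")

# Simpler alternative from the post: the first run of exactly six digits.
SIX_DIGITS = re.compile(r"\d{6}")


def extract_policy_number(text, pattern=AFTER_HASH_COLON):
    """Return the first non-empty match of pattern in text, or None."""
    match = pattern.search(text)
    if match is None or match.group(0) == "":
        # \d* may match an empty string, so treat that as "not found".
        return None
    return match.group(0)


print(extract_policy_number(SAMPLE_TEXT))              # 123456
print(extract_policy_number(SAMPLE_TEXT, SIX_DIGITS))  # 123456
```

Note that \d{6} picks up the first six consecutive digits anywhere in the text, so the lookbehind version is likely the safer choice if another number can appear before the policy number.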

If you prefer the coded way (instead of recording-based actions), simply examine the code behind each action (right-click the action and select View Code). Hope this helps.

Post by strannik » Fri May 09, 2014 2:13 pm

Thank you, Pavel. I will try to do it. I have just started to learn this tool.