Compare PDF reports

Posted: Sun Aug 16, 2015 2:28 pm
by Praveen597
Hi All,

Does Ranorex support comparing two PDF files? Is there a built-in API? (Paid or free, anything is fine.)

Thanks,
Praveen

Re: Compare PDF reports

Posted: Sun Aug 16, 2015 3:58 pm
by odklizec
Hi,

There is no built-in support for directly comparing two PDF files. By default, Ranorex can track and validate elements in a PDF file that has accessibility support enabled (search this forum for PDF validation topics). Any more advanced validation, such as comparing entire PDF files, would have to be done in user code, possibly with a third-party PDF comparison library.

Re: Compare PDF reports

Posted: Tue Oct 13, 2015 3:51 am
by jasoncleo
PDF comparison is tricky, and it can be complicated by how the new and baseline PDFs are created. I don't know the technical explanation for it, but PDF allows for "layers", and each layer can have differing properties and attributes based on the content it holds.

This matters because two PDFs can look the same and still fail a comparison because their layers differ. For example, a PDF may have been "flattened", so that all the layers are merged into a single image layer.

There are a few commercial PDF tools out there that support comparison and can be integrated into a .NET environment. Many of them are limited, though: they compare the basic layer structure and the text content of layers that contain text, but they don't handle graphics, tables, or charts well, and some skip those in the comparison altogether.

We ended up using image comparison. We used Spire .Net as the library to convert our PDF documents into page-by-page images, then used a simple algorithm that builds a 2-D array of pixel brightness/colour for each page and compares the arrays within a set tolerance, one page at a time.
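To illustrate the idea, here is a minimal sketch of the per-page comparison in Python (the logic should port to C# user code without much change). It is not our actual code. It assumes each page has already been rendered and loaded as rows of (r, g, b) tuples, and the function names, the brightness weights, and the default tolerance are all just illustrative choices.

```python
def brightness(pixel):
    """Approximate perceived brightness (0-255) of an (r, g, b) pixel."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_brightness_grid(rgb_rows):
    """Turn rows of (r, g, b) tuples into a 2-D list of brightness values."""
    return [[brightness(p) for p in row] for row in rgb_rows]


def differing_pixels(baseline, candidate, tolerance=10.0):
    """Return (x, y) positions whose brightness differs by more than tolerance."""
    same_shape = len(baseline) == len(candidate) and all(
        len(a) == len(b) for a, b in zip(baseline, candidate)
    )
    if not same_shape:
        raise ValueError("page images must have the same dimensions")
    return [
        (x, y)
        for y, (row_a, row_b) in enumerate(zip(baseline, candidate))
        for x, (a, b) in enumerate(zip(row_a, row_b))
        if abs(a - b) > tolerance
    ]


def pages_match(baseline, candidate, tolerance=10.0, max_diff_ratio=0.0):
    """True if the share of differing pixels is at most max_diff_ratio."""
    total = sum(len(row) for row in baseline)
    if total == 0:
        return len(candidate) == 0
    diffs = differing_pixels(baseline, candidate, tolerance)
    return len(diffs) / total <= max_diff_ratio
```

The tolerance absorbs small rendering differences such as anti-aliasing, and max_diff_ratio lets a page pass even if a handful of pixels are off. Both values need tuning against your own reports.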

It is a crude mechanism, but it works well enough for what we need. The benefit of the pixel array is that it allowed us to output additional images that circle the spots where pages differ from the baseline, for easy analysis.
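A rough sketch of how the differences could be located and marked is below, again in Python and again not our actual code. To keep it short it draws one rectangle around all differing pixels on a page rather than circling each spot separately. It assumes both pages are 2-D lists of brightness values (0-255) with the same dimensions, and the function names are made up for the example.

```python
def diff_bounding_box(baseline, candidate, tolerance=10):
    """Return (left, top, right, bottom) enclosing all differing pixels, or None."""
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(baseline, candidate)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > tolerance:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # pages are identical within the tolerance
    return (min(xs), min(ys), max(xs), max(ys))


def mark_box(grid, box, value=0):
    """Return a copy of grid with the outline of box set to value."""
    left, top, right, bottom = box
    marked = [row[:] for row in grid]  # leave the input untouched
    for x in range(left, right + 1):
        marked[top][x] = value
        marked[bottom][x] = value
    for y in range(top, bottom + 1):
        marked[y][left] = value
        marked[y][right] = value
    return marked
```

The marked grid can then be written back out as an image and attached to the test report next to the baseline page.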

Spire .Net isn't free (unless you want to use their hobbled 3-page version). It is possible to use Ghostscript, which is open source, but you'd need to do a bit of work, and I didn't have the time for that.
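For anyone who wants to try the Ghostscript route, the rendering step might look roughly like the sketch below. I haven't used this in our project, so treat it as a starting point only. It assumes the Ghostscript command-line executable is installed and on the PATH ("gs" on Linux/macOS; on Windows it is typically "gswin64c"), and the file names and the 150 dpi default are placeholders.

```python
import subprocess


def ghostscript_png_args(pdf_path, output_pattern, dpi=150, executable="gs"):
    """Build a Ghostscript command that renders each PDF page to a 24-bit PNG."""
    return [
        executable,
        "-dSAFER",      # restrict file access while interpreting the PDF
        "-dBATCH",      # exit when the input has been processed
        "-dNOPAUSE",    # do not wait for a key press between pages
        "-sDEVICE=png16m",
        f"-r{dpi}",     # render resolution in dots per inch
        f"-sOutputFile={output_pattern}",  # e.g. page-%03d.png, one file per page
        pdf_path,
    ]


def render_pdf_pages(pdf_path, output_pattern, dpi=150, executable="gs"):
    """Run Ghostscript; raises CalledProcessError if rendering fails."""
    args = ghostscript_png_args(pdf_path, output_pattern, dpi, executable)
    subprocess.run(args, check=True)
```

Render the baseline and the new PDF at the same resolution, otherwise the page images will have different dimensions and can't be compared pixel by pixel.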