Searching for words with specific format in Microsoft Word

dmg
Posts: 4
Joined: Tue Mar 01, 2016 5:06 pm

Post by dmg » Wed May 09, 2018 10:27 am

Hi! First of all, I apologize if this message belongs in another section or if the topic already exists. I searched the forum but did not find a solution. I have already read these topics (I can't link URLs because my account is new):
Forum: ...automation-microsoft-word-2013-t6550
Forum: ...automating-outlook-t1239
Ranorex website: help/latest/technology-instrumentation/testing-of-legacy-applications

I'm working with Windows 10 Enterprise, Ranorex 8.1.1 and Microsoft Word 2010. My objective is to find all the words in a Microsoft Word document that have a specific format: red and underlined, or red and struck through. See some examples in the picture I've attached.
First, I tried Ranorex Spy, adding the class name to the GDI capture list. However, Spy can't detect every word; it seems to detect text blocks whose pattern I can't work out. Please check the Ranorex Spy snapshot and compare it with the Word document appearance that I've attached. Besides not detecting word by word, Spy doesn't show any underline or strike-through properties, only the color property. Does Ranorex offer a way to do that?

My alternative has been to use C# with the Microsoft Office Interop libraries, as explained in msdn.microsoft.com/en-us/library/kw65a0we.aspx and docs.microsoft.com/en-us/dotnet/csharp/programming-guide/interop/how-to-access-office-interop-objects
I've managed to search by word text, but not by color, underline or strike-through properties. This is my attempt at the color search:
Word.Range auxrng = myDocument.Content;
auxrng.Find.ClearFormatting();
auxrng.Find.Forward = true;
auxrng.Find.Font.ColorIndex = Word.WdColorIndex.wdRed; // red color search
auxrng.Select(); // selects the whole previous range, no distinction for the first red word
auxrng.Find.Execute("");
auxrng.Find.Font.Underline = Word.WdUnderline.wdUnderlineSingle;
auxrng.Select(); // selects the whole previous range again
auxrng.Find.Execute("");
Before this code, I had written the code that connects Ranorex to Microsoft Word, and it works without problems:
Word.Application wordApp = new Word.ApplicationClass();
Word.Document myDocument = wordApp.Documents.Open(wordComparedFilePath);
wordApp.Visible = true;
As mentioned, I've even managed to define ranges that start at one searched word and end at another.
For strike-through words, I haven't found any search code, only this: auxrng.Find.Font.Underline.StrikeThrough
I'm afraid this returns a boolean but does not modify the range.
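For readers following the thread, here is a minimal sketch of a format-only search with Range.Find. It rests on several assumptions: a C# 4 or later compiler (so Find.Execute can be called without its optional COM arguments), an already opened Word.Document, and the helper names FormatSearch, FindRuns, RedUnderlined and RedStruckThrough, which are hypothetical and not part of Ranorex or the Word API. The key points are that Find.Format must be true for the font criteria to be used, and that a successful Execute redefines the range to the match.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class FormatSearch
{
    // Hypothetical helper: returns the text of every contiguous run
    // whose font matches the criteria set by `configure`.
    static List<string> FindRuns(Word.Document doc, Action<Word.Font> configure)
    {
        var hits = new List<string>();
        Word.Range rng = doc.Content;
        Word.Find find = rng.Find;

        find.ClearFormatting();
        configure(find.Font);                    // font criteria to search for
        find.Text = "";                          // empty text: match on formatting only
        find.Format = true;                      // without this the font criteria are ignored
        find.Forward = true;
        find.Wrap = Word.WdFindWrap.wdFindStop;  // stop at the end of the document

        object collapseEnd = Word.WdCollapseDirection.wdCollapseEnd;
        while (find.Execute())
        {
            // After a successful Execute, rng covers the matched run.
            hits.Add(rng.Text);
            if (rng.End >= doc.Content.End - 1)
                break;                           // guard against looping at the document end
            rng.Collapse(ref collapseEnd);       // continue searching after the match
        }
        return hits;
    }

    public static List<string> RedUnderlined(Word.Document doc)
    {
        return FindRuns(doc, f =>
        {
            f.ColorIndex = Word.WdColorIndex.wdRed;
            f.Underline = Word.WdUnderline.wdUnderlineSingle;
        });
    }

    public static List<string> RedStruckThrough(Word.Document doc)
    {
        return FindRuns(doc, f =>
        {
            f.ColorIndex = Word.WdColorIndex.wdRed;
            f.StrikeThrough = -1;                // Word represents True as -1
        });
    }
}
```

Calling FormatSearch.RedUnderlined(myDocument) would return one entry per formatted run, and a run may span several words, so split on whitespace if a word-by-word list is needed. Note that Font.StrikeThrough belongs to Font directly, not to Font.Underline.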
I would really appreciate any help with this. If you need more files or information, just ask ;)
Attachments
document_appearance-Word.jpg
Document_snapshot.rxsnp

RobinHood42
Posts: 238
Joined: Fri Jan 09, 2015 3:24 pm

Re: Searching for words with specific format in Microsoft Word

Post by RobinHood42 » Mon May 14, 2018 8:49 am

Hi dmg,
The words within a Word doc are drawn, so there is no object recognition available besides GDI. The only way to "check" the text formatting would be to compare images, which is not really reliable.

Anyhow, I found the following article, which might help you with the Interop test you created: https://bit.ly/2rEzu3S

Hope this helps.

Cheers,
Robin :mrgreen:


Re: Searching for words with specific format in Microsoft Word

Post by dmg » Tue May 15, 2018 1:23 pm

Thank you Robin, I'm going to try this way ;)


Re: Searching for words with specific format in Microsoft Word

Post by dmg » Tue May 15, 2018 4:35 pm

I can finally detect words with this format (red and underlined or struck through; the solution is at the bottom of this post), but I'm facing another problem. When I compare two documents with Word's Compare tool (Review --> Compare), the resulting document shows every differing word in red and underlined or struck through. However, when I select those words manually, the font color (right-click + font color + more colors) is reported as 'Automatic', not red, and I guess that's why my algorithm cannot detect them. How could I deal with this? I want Word to give me the 'real' color, and I guess the same applies to underline and strikethrough. My code detects the words I formatted manually, but not the ones marked by the Compare tool.

My solution to detect red and underlined or strikethrough is:
foreach (Word.Range rngword in rng.Words)
{
    count++;
    if (rngword.Font.ColorIndex != Word.WdColorIndex.wdRed)
    {
        continue; // not red, nothing to report
    }

    foundword = true;
    rngword.Select();
    lineRange.Select();

    // A word's Range.Text usually includes its trailing space, so trim it for the report.
    string message = rngword.Text.Trim() + " causes a difference, see word number " + count;

    if (rngword.Font.Underline == Word.WdUnderline.wdUnderlineSingle) // red and underlined word
    {
        Report.Log(ReportLevel.Failure, message);
    }
    else if (rngword.Font.StrikeThrough == -1) // red and strikethrough word (Word reports True as -1)
    {
        Report.Log(ReportLevel.Failure, message);
    }
    else // just a plain red word
    {
        Report.Info(message);
    }
}
if (!foundword)
{
    Report.Info("No difference was found");
}
Best regards
Attachments
automatic color.jpg
Color is shown as Automatic, despite being red


Re: Searching for words with specific format in Microsoft Word

Post by RobinHood42 » Wed May 16, 2018 10:29 am

Hi,

I'm pretty sure that the format is "not really" applied to the text or at least can't be externally accessed in case the compare tool is used. That's why there is a completely different context menu as well ("Apply/Reject" changes...). I'm afraid that there is not much you can do about that fact.

If your use case is to compare files in code, why do you actually use Word? I would definitely suggest using some .NET methods to do so instead.

Cheers,
Robin :mrgreen:


Re: Searching for words with specific format in Microsoft Word

Post by dmg » Wed May 16, 2018 2:02 pm

Hi Robin! Thank you for your support. I've finally found an Interop way to do it: Word.Revision. It lets me walk the tracked changes produced by the Compare tool. This is the code that works for me:
foreach (Word.Revision rngrevision in rng.Revisions)
{
    if (rngrevision.Type != Word.WdRevisionType.wdNoRevision)
    {
        foundword = true;
        Report.Log(ReportLevel.Failure, rngrevision.Range.Text + " causes a difference of the type: " + rngrevision.Type);
    }
}
Revision.Type is an enumeration of the kinds of change between the documents, which is very useful for me because it detects not only text changes but also format changes.
Thank you for your help.
Best regards
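As a follow-up to the loop above, the revision type can also be used to group or label the differences. This is only a sketch under assumptions: the class RevisionSummary and its methods are hypothetical names, the document is assumed to be the result of Word's Compare tool, and only three of the WdRevisionType members are given a friendly label here.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class RevisionSummary
{
    // Hypothetical helper: counts the revisions in a document by revision type.
    public static Dictionary<Word.WdRevisionType, int> CountByType(Word.Document doc)
    {
        var counts = new Dictionary<Word.WdRevisionType, int>();
        foreach (Word.Revision rev in doc.Revisions)
        {
            int current;
            counts.TryGetValue(rev.Type, out current); // current stays 0 if the type is new
            counts[rev.Type] = current + 1;
        }
        return counts;
    }

    // Hypothetical helper: maps a few common revision types to a readable label.
    public static string Describe(Word.WdRevisionType type)
    {
        switch (type)
        {
            case Word.WdRevisionType.wdRevisionInsert:
                return "inserted text";
            case Word.WdRevisionType.wdRevisionDelete:
                return "deleted text";
            case Word.WdRevisionType.wdRevisionProperty:
                return "formatting change";
            default:
                return type.ToString(); // fall back to the enum name
        }
    }
}
```

With these, a report line could read rev.Range.Text + ": " + RevisionSummary.Describe(rev.Type), and CountByType(myDocument) gives a quick overview before logging each difference.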


Re: Searching for words with specific format in Microsoft Word

Post by RobinHood42 » Thu May 17, 2018 7:52 am

Hi,

Great! I'm glad you found the correct API.

Cheers,
Robin :mrgreen: