Abbreviation near end of Table cell not found

Alan
Posts: 274
Joined: Wed Jul 24, 2002 11:57 am
Location: TN USA

Post by Alan »

I have developed a Find what string to locate most abbreviations (text ending in a period) that are immediately followed by a printing character. The string is:
|(.)([!^w^#^11^8203^8239^65279^p"])|[!^11^8203^p]
It works except when the period is followed by a single character at the end of a table cell, as in this example:
oz.)
When I completely remove the right conditional (|[!^11^8203^p]), the abbreviation is found in the table cell. That right conditional is there to weed out the many sentences that end in a period at the end of a paragraph, as well as marked text, which would otherwise give me an overwhelming number of results. What can I do to get this Find what string to work in normal text, as it currently does, and also in the table cell example above?
Atlantis 5
Windows 11 Pro for Workstations
admin
Site Admin
Posts: 2926
Joined: Wed Jun 05, 2002 10:48 pm

Post by admin »

I don't quite understand how to use your regex. Could you please compose a short sample document (with a table) to search for your pattern?
Alan
Posts: 274
Joined: Wed Jul 24, 2002 11:57 am
Location: TN USA

Post by Alan »

I have attached the file where I noticed this problem. Four cells in the ingredients table (the only table) end with the abbreviation oz., as in "oz.)"; apart from those, the file is fully marked up and formatted. As I understand it, the Find what string should find these 4 instances, so my understanding is obviously wrong. The abbreviation lb. in the table was found by this string and is now marked up, so it will no longer be found.

Alan
Attachments
Venison burger chow mein_BHM83_34.rtf
Robert
Posts: 1906
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

What about this wildcard search:

^32|^L{2}.|
Robert
Posts: 1906
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

or this one which will catch "lb." as well:

<|^L{2}.|
Alan
Posts: 274
Joined: Wed Jul 24, 2002 11:57 am
Location: TN USA

Post by Alan »

Hi Robert,
In general, I search for abbreviations daily in multiple files, along with (currently) 11 other conditions. That leaves 18 entries open in a Find what search. I catch abbreviations within sentences as long as they are followed by a space, and at the end of a sentence, but not at the end of a paragraph. There are not enough entries to search for every possible abbreviation. If I can figure out why my search string for abbreviations immediately followed by a printing character, with its right-hand condition, fails at the end of a table cell, then I will have a way to catch 95% of abbreviations and mark them as such. I need that right-hand condition to limit the results; without it, the manual review for this case would be overwhelming.

Alan
Robert
Posts: 1906
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

If I use the regex you posted:

|(.)([!^w^#^11^8203^8239^65279^p"])|[!^11^8203^p]
Atlantis does not find any matching string in “Venison burger chow mein_BHM83_34.rtf”.
If I convert the table in “Venison burger chow mein_BHM83_34.rtf” to text:
1 lb.&#65279; of venison burger (Actually, a pound or so of chopped venison steak will work too).
1 stalk of celery diced
1 medium carrot diced
1 small onion diced
1 can vegetable beef soup (10-3/4 oz.)
1 can cream of mushroom soup (10-3/4 oz.)
1–&#65279;1/2 soup cans of water
1 can water chestnuts (8 oz.)
1 can bean sprouts (14 oz.)
&#8197; chow mein noodles
And apply the regex you posted, Atlantis still does not find any matching string.
Are you sure that the regex you posted is working in your version of Atlantis?
admin
Site Admin
Posts: 2926
Joined: Wed Jun 05, 2002 10:48 pm

Post by admin »

Your regex will not match abbreviations at the end of paragraph/cell because the right conditional [!^11^8203^p] cannot match a paragraph end mark. You clearly instruct Atlantis not to report instances followed by a paragraph end mark.
admin
Site Admin
Posts: 2926
Joined: Wed Jun 05, 2002 10:48 pm

Post by admin »

I might suggest a regex if you posted a sample document with all possible types of abbreviations (including their "environment") that should be reported by the Find tool.
Alan
Posts: 274
Joined: Wed Jul 24, 2002 11:57 am
Location: TN USA

Post by Alan »

You did not find any of the abbreviations that were not the 4 I pointed out because they have all been marked as abbreviations. Is the character at the end of a Table cell a paragraph character (^p)? I had assumed it was not since it is not a pilcrow which shows at the end of paragraphs when Special Symbols is turned on. If ^p sees it as a paragraph mark then that explains why the search string does not work.
Robert
Posts: 1906
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

For technical (programmatic) reasons, neither Atlantis nor MS Word provides a code for finding end-of-cell markers. In a way, end-of-cell markers act as paragraph end marks, but they cannot be searched as such.
We could help if you created a glossary of the abbreviations you are searching for (and using) and posted it here.
Such a glossary would come in handy at the end of a recipe book.
HTH
Robert
admin
Site Admin
Posts: 2926
Joined: Wed Jun 05, 2002 10:48 pm

Post by admin »

Alan wrote: "You did not find any of the abbreviations that were not the 4 I pointed out because they have all been marked as abbreviations. Is the character at the end of a Table cell a paragraph character (^p)? I had assumed it was not since it is not a pilcrow which shows at the end of paragraphs when Special Symbols is turned on. If ^p sees it as a paragraph mark then that explains why the search string does not work."
^p doesn't see an "end of cell" character as a paragraph end mark.

Searching within tables has some limitations. The search scope of Atlantis never includes "end of cell" characters. This is because selecting an "end of cell" character automatically selects the entire cell. So Atlantis searches table cells excluding their "end of cell" characters.
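
For readers more used to conventional regex engines, the behaviour described above can be sketched in Python. This is only an analogy: it assumes the Atlantis right conditional behaves like a lookahead and that the search scope of a cell simply ends before the "end of cell" character. The patterns and names below are illustrative and are not Atlantis syntax. The point is that a negated set still has to match some character, so when the scope ends right after "oz.)" there is nothing left for it to match.

```python
import re

# Assumption: rough Python stand-ins, not Atlantis patterns.
# STRICT: period + one non-space, non-digit character, and the NEXT
# character must exist and must not be a paragraph end (here "\n").
STRICT = re.compile(r"\.[^\s\d](?=[^\n])")
# LENIENT: same match, but it only forbids a following "\n";
# "nothing follows at all" is accepted.
LENIENT = re.compile(r"\.[^\s\d](?!\n)")


def hits(pattern, scope):
    """Return every match of pattern inside the given search scope."""
    return [m.group(0) for m in pattern.finditer(scope)]


# A table cell as the search sees it: the scope stops after ")".
cell_scope = "1 can bean sprouts (14 oz.)"
# Ordinary body text: a sentence-final period ends the paragraph.
body_text = "Use a can (14 oz.) of sprouts.\n"

print(hits(STRICT, cell_scope))   # [] - no character left for the lookahead
print(hits(LENIENT, cell_scope))  # ['.)']
print(hits(STRICT, body_text))    # ['.)'] - the final period is skipped
```

The lenient variant is shown only for contrast; whether Atlantis offers an equivalent of a negative lookahead is not covered in this thread.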
Alan
Posts: 274
Joined: Wed Jul 24, 2002 11:57 am
Location: TN USA

Post by Alan »

Hi Robert,
I gave a recipe ingredient table as an example because it is the kind of table I make most often, so it was the first table in which I saw an abbreviation period as the next-to-last character in a cell. Given that a search cannot detect the end of a cell (thank you also, Admin), I will now check the end of cells for abbreviations myself and not expect a search to find candidates. I catch abbreviations by recognizing them as my various search patterns go through articles or books (mainly when putting double spaces between sentences). I know a lot of common abbreviations and have also corrected misspelt abbreviations. I will attach a sample of some of the abbreviations; military and non-food abbreviations far outweigh those found in recipes.
Attachments
Abbreviations sample.rtf
Robert
Posts: 1906
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

The following regex finds all abbreviations in “Venison burger chow mein_BHM83_34.rtf” and all but one in “Abbreviations sample.rtf” (“c.” being left out):

|[^$.]{2,7}.|[!^32^32^p]
But this is not a catch-all regex. It might not work in other documents (environments).
HTH
Robert
Alan
Posts: 274
Joined: Wed Jul 24, 2002 11:57 am
Location: TN USA

Post by Alan »

Thanks Robert,
As it is now, with the 13 searches I do, I catch all abbreviations except those at the end of a paragraph and the end-of-cell case discussed in this thread. Since I manually format the paragraphs of the ingredients list to form a table, I am working on forming the habit of looking at the end of each ingredient to see if there is an abbreviation. Normally the abbreviations are at the beginning.
It just bothered me that I did not understand why the search did not work in tables, which both you and Admin explained to my satisfaction.

Regards,
Alan
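
The manual end-of-cell check described in this thread could be approximated outside Atlantis once the cell texts are available as plain strings. What follows is a minimal Python sketch under that assumption; how to export the cells is not covered here. `cells_to_review` and `TRAILING_ABBREV` are hypothetical names, and the sample lines are taken from the converted table quoted earlier in the thread.

```python
import re

# Letters, a period, then exactly one non-space, non-digit character
# at the very end of the cell text (for example "oz.)").
TRAILING_ABBREV = re.compile(r"[A-Za-z]+\.[^\s\d]$")


def cells_to_review(cells):
    """Return the cells whose text ends with a likely abbreviation."""
    return [c for c in cells if TRAILING_ABBREV.search(c.rstrip())]


cells = [
    "1 stalk of celery diced",
    "1 can vegetable beef soup (10-3/4 oz.)",
    "1 can cream of mushroom soup (10-3/4 oz.)",
    "1 can water chestnuts (8 oz.)",
    "1 can bean sprouts (14 oz.)",
]

for cell in cells_to_review(cells):
    print(cell)  # prints the four cells ending in "oz.)"
```

This only narrows down which cells to look at; deciding whether a hit is really an abbreviation still takes a human eye.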