Autocorrect Replace Text

cjseasyaspie
Posts: 34
Joined: Tue Jul 03, 2012 9:10 am

Post by cjseasyaspie »

Can someone help me with this?

I can't get the replace text feature to remove extra paragraph returns or manual page breaks.

I see that both are there in the "replace this with that" list, but they don't get replaced in my test files.

I also need to know how to repair broken sentences, such as are common in a file that has come from a scanner or has been converted from one format to another.

Typically, a sentence just gets broken for no known reason, with the second half ending up on the line below. So one line ends without an "end of sentence" punctuation mark, and the line below starts without the usual capital letter.

As best I can figure out, the need is to find a pilcrow that is NOT preceded by an "end of sentence" punctuation mark... but I don't know how to write out the formula to do that.

Any help would be much appreciated.

Thanks!
Happy Kindling,

CJ, at CJ's Easy as Pie Kindle Tutorials
http://www.cjs-easy-as-pie.com/
Robert
Posts: 1906
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

Here is how to replace manual page breaks with a paragraph end mark (simply removing the manual page breaks might have undesirable consequences):

Image

Here is how to replace two paragraph end marks with only one:

Image
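
For anyone who prefers to clean an exported plain-text copy outside Atlantis, the two replacements above can be sketched in Python. This is only a rough illustration under stated assumptions: the text is assumed to use "\n" for each paragraph end mark and a form feed ("\f") for each manual page break, and `normalize_breaks` is a made-up helper name, not an Atlantis feature.

```python
import re


def normalize_breaks(text: str) -> str:
    """Turn page breaks into paragraph ends, then collapse repeated paragraph ends."""
    # Assumption: a manual page break was exported as a form feed ("\f").
    # It is replaced by a paragraph end rather than deleted, so the text
    # on either side of the break is not glued together.
    text = text.replace("\f", "\n")
    # Collapse any run of two or more paragraph ends into a single one.
    return re.sub(r"\n{2,}", "\n", text)


sample = "Chapter 1\fFirst paragraph.\n\n\nSecond paragraph.\n"
print(repr(normalize_breaks(sample)))
```

Because the regular expression matches a whole run of paragraph ends at once, a single pass is enough here; no repeated "replace two with one" rounds are needed.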

As you noticed, documents produced by an OCR application often have sentences broken over 2 lines for no apparent reason. Most of the time, the following patterns can be found:

1. pilcrow+space

In such cases, use this setup:
Find box: ^p(space)
Replace with box: (space)

2. space+pilcrow+space

Use this setup:
Find box: (space)^p(space)
Replace with box: (space)

3. comma+pilcrow

Use this setup:
Find box: ,^p
Replace with box: ,(space)

4. comma+pilcrow+space

Use this setup:
Find box: ,^p(space)
Replace with box: ,(space)
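
The same four patterns can also be tried on a plain-text export with a short Python script. This is a hedged sketch, not part of Atlantis: "\n" stands in for ^p, and the names `PATTERNS`, `join_broken_lines` and `join_unterminated` are invented for the example. The second function is one possible reading of the original question (a paragraph end that does not follow end-of-sentence punctuation and is followed by a lowercase letter); the set of punctuation marks it checks is an assumption and will need adjusting for real documents.

```python
import re

# (find, replace) pairs, most specific first so that a longer pattern
# is handled before a shorter one that overlaps it.
PATTERNS = [
    (" \n ", " "),   # 2. space + pilcrow + space
    (",\n ", ", "),  # 4. comma + pilcrow + space
    (",\n", ", "),   # 3. comma + pilcrow
    ("\n ", " "),    # 1. pilcrow + space
]


def join_broken_lines(text: str) -> str:
    """Apply the four literal find/replace patterns in order."""
    for find, replace in PATTERNS:
        text = text.replace(find, replace)
    return text


def join_unterminated(text: str) -> str:
    """Join a line break that does not follow end-of-sentence punctuation
    and is followed by a lowercase letter (assumed heuristic)."""
    return re.sub(r"(?<![.!?:;])\n(?=[a-z])", " ", text)


broken = "The quick brown \n fox jumped,\nthen ran,\n away\n from the dog.\nNext paragraph."
print(join_broken_lines(broken))

print(join_unterminated("a sentence will just get\nbroken for no reason.\nNext one."))
```

The heuristic will miss breaks that fall before a capitalized word (a name, for instance) and will wrongly join short lines such as headings that lack final punctuation, so the result still needs a read-through.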

Note that the Atlantis spellchecker (“Tools | Spellcheck…”) and the Atlantis AutoCorrect after-you-type feature (“Tools | AutoCorrect…”) will show you not only real misspellings and punctuation problems, but also typos created by the OCR application that you might have overlooked.

HTH.
Cheers,
Robert
cjseasyaspie

Post by cjseasyaspie »

Thank you!

I did try a simple search and replace, but maybe I did something wrong.

But I had no idea how to attack the broken sentence problem.

I'll go through your steps.

btw... I write tutorials for newbies, which is why I need to boil everything down to the easiest possible routines.

Thanks again!

CJ