PPTools/Ppgen/Tutorial/NbspNgram
Non-breaking spaces look like any other space. The difference is that Ppgen will not put a line break where there is a non-breaking space. Let's see this in action. In a recent book, I noticed this in the generated text version:
Source Code
“Say!” Maben turned on him in mock fierceness, “I’m of a mind to kick you for overstunting on that plane wing. No use being too risky—just plain foolishness, that. But, kid,” the aviator’s habitually tense face relaxed into a boyish grin. “I’ll say you made that come down O. K.,—all jake! An oldtimer couldn’t have done it prettier. Listen, I got a proposition I want to make you!” // 064.png
Look at that next-to-last line. The "K." is split from the "O." and that's certainly not what we want. The solution is to put a non-breaking space between the "O." and the "K.".
There are three ways to put a non-breaking space into the source code. One is to insert an actual UTF-8 non-breaking space character. This works, but I don't recommend it because in most editors it looks exactly like an ordinary space, so you can't tell it's there. I'd rather see it in the source code to make sure it's there. For that, Ppgen provides two visible forms: "\ " and "\_". (The second form is useful at the end of a line, where many editors would strip the trailing space if the first form were used.) So I'll change the source to use "\ " and regenerate.
Source Code
“Say!” Maben turned on him in mock fierceness, “I’m of a mind to kick you for overstunting on that plane wing. No use being too risky—just plain foolishness, that. But, kid,” the aviator’s habitually tense face relaxed into a boyish grin. “I’ll say you made that come down O.\ K.,—all jake! An oldtimer couldn’t have done it prettier. Listen, I got a proposition I want to make you!” // 064.png
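To make the idea concrete, here is a minimal Python sketch of how a wrapper can honor non-breaking spaces. It is not Ppgen's actual code; the names mark_nbsp and wrap and the width of 22 are made up for this illustration. It assumes the two markup forms are first turned into the real U+00A0 character, and that lines are then broken only at ordinary spaces.

```python
NBSP = "\u00a0"  # the actual non-breaking space character


def mark_nbsp(source: str) -> str:
    """Turn backslash-space and backslash-underscore into real non-breaking spaces."""
    return source.replace("\\ ", NBSP).replace("\\_", NBSP)


def wrap(text: str, width: int) -> list[str]:
    """Greedy wrap that breaks only at ordinary spaces (hypothetical helper)."""
    lines = []
    current = ""
    for word in text.split(" "):  # split(" ") does not split on U+00A0
        if not word:
            continue
        if current and len(current) + 1 + len(word) > width:
            lines.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        lines.append(current)
    # In the text output, a non-breaking space is shown as a plain space.
    return [line.replace(NBSP, " ") for line in lines]


plain = wrap("made that come down O. K.,", 22)
fixed = wrap(mark_nbsp("made that come down O.\\ K.,"), 22)
print(plain)
print(fixed)
```

With the narrow width used here, the unmarked text breaks right after "O." while the marked text carries "O. K.," to the next line as one unit.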
With that change, the "O. K." is correct, but now I notice that "oldtimer" doesn't look right. Might that be hyphenated? Google's Ngram Viewer is helpful in this situation. For example, in olden days, the word "tonight" was hyphenated. In modern spelling, it isn't. When did that change? Let's take a look. Red is "to-night," blue is "tonight," and the bottom axis is by decade:
Now let's try the Ngram Viewer on "oldtimer" and "old-timer".
Hmmm. This book was printed in 1930. Let's look at the original page image.
We see that in the source it was ambiguous. The proofers correctly flagged it as a maybe-hyphen with "old-*timer", but the flag somehow got lost in post-processing (for the sake of this example). The PPer makes the final call. The Ngram result for "old-timer" in 1930, together with finding the word hyphenated in another place in this same book, makes it an easy decision. It becomes "old-timer" and we're done.
Source Code
“Say!” Maben turned on him in mock fierceness, “I’m of a mind to kick you for overstunting on that plane wing. No use being too risky—just plain foolishness, that. But, kid,” the aviator’s habitually tense face relaxed into a boyish grin. “I’ll say you made that come down O.\ K.,—all jake! An old-timer couldn’t have done it prettier. Listen, I got a proposition I want to make you!” // 064.png
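The "is it hyphenated elsewhere in this book?" check can also be done with a short script. This is only a sketch under assumptions: count_variants is a made-up helper, not part of Ppgen or any other PPTools program, and it assumes the book text is available as a plain string in which unresolved proofers' flags still appear as "-*".

```python
import re
from collections import Counter


def count_variants(text: str, left: str, right: str) -> Counter:
    """Count how a compound such as old + timer is spelled in the text."""
    pattern = re.compile(
        r"\b" + re.escape(left) + r"(-\*|-| |)" + re.escape(right) + r"s?\b",
        re.IGNORECASE,
    )
    labels = {
        "-*": "unresolved",  # proofers' maybe-hyphen flag
        "-": "hyphenated",
        " ": "two words",
        "": "closed",
    }
    return Counter(labels[m.group(1)] for m in pattern.finditer(text))


sample = (
    "An old-*timer couldn't have done it prettier. "
    "The old-timer grinned. Old-timers know. An oldtimer."
)
counts = count_variants(sample, "old", "timer")
print(counts)
```

On the made-up sample string, the hyphenated form is the most common, which is the kind of evidence that backs up the Ngram result. The final call still belongs to the PPer.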