ePub - page breaks and font attributes

Request new features or suggest modifications to existing features of Atlantis.
gyuen
Posts: 34
Joined: Tue Jun 23, 2009 6:13 pm

Post by gyuen »

Hey,

I'm making a new ePub. It's a book of poetry and short paragraphs, each under an H3 heading. It seems Atlantis automatically creates a page break before each one. It would be nice if, for each heading, the paragraph style setting were used: break or no break. This particular book has hundreds of H3 sections, and it'd be nice for the page breaks to occur only where they're set, at H1 and H2.

I'm also trying some font attributes like small caps. I think the way Atlantis does it is a workaround so it'll show up correctly in ADE. Is it possible to have an option to use "font-variant: small-caps"? It works with iBooks and I think it's a nicer option than working around ADE's missing features.

Word's settings for uppercase and lowercase would also be nice. If I change the case of something and then change it back (re-apply the same setting, like uppercase, then uppercase again to turn it off), the original case is restored, so it seems Word keeps track of that info. Would it be possible for Atlantis to do the same? Like use "text-transform: uppercase" and the other supported CSS properties?

The main thing is the H3 page break. Hope you'll consider that and maybe even have a beta kinda soon (you guys are great when something comes up). Hoping to finish the ePub within the next month or so.

Gary
Robert
Posts: 1890
Joined: Fri Aug 15, 2003 8:27 pm

Post by Robert »

Hi Gary,
If I have a text in lowercase in Word 2007, then apply the “change case > uppercase” command to it, it is changed into uppercase. This is as expected. And if I reapply the same “change case > uppercase” command to that text which is now uppercase, it stays in uppercase. The same is happening in Atlantis.

Note that if you have applied some formatting to a text, and want to revert to the original text, you can always press “Ctrl+Z” to undo the last action.

“All Caps” and “Small Caps” are supported by the XHTML and EPUB formats. But currently very few eReaders have support for these font formatting attributes. Maybe iBooks has such support, but many eReaders simply ignore the EPUB “All Caps” and “Small Caps” code. They render both with the lowercase attribute. Atlantis emulates “All Caps” and “Small Caps” so that the vast majority of eReaders do display them as such.
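To make the two approaches concrete, here is a minimal Python sketch. The thread does not say how Atlantis actually emulates small caps, so the emulation below (uppercasing lowercase runs and shrinking them with an inline font size) is only an assumption about one common technique; the function names and the 80% scale are hypothetical.

```python
from html import escape


def emulate_small_caps(text: str, scale: int = 80) -> str:
    """Fake small caps without font-variant (assumed technique, not Atlantis's code).

    Runs of lowercase letters are uppercased and wrapped in a span with a
    reduced font size; everything else is passed through unchanged.
    """
    parts = []
    run = []  # pending lowercase characters

    def flush():
        if run:
            shrunk = escape("".join(run).upper())
            parts.append(f'<span style="font-size: {scale}%">{shrunk}</span>')
            run.clear()

    for ch in text:
        if ch.islower():
            run.append(ch)
        else:
            flush()
            parts.append(escape(ch))
    flush()
    return "".join(parts)


def css_small_caps(text: str) -> str:
    """The CSS route requested in the thread: keep the text, let the reader style it."""
    return f'<span style="font-variant: small-caps">{escape(text)}</span>'


print(emulate_small_caps("Example"))
print(css_small_caps("Example"))
```

The trade-off is visible in the output: the emulated version displays the same way on readers that ignore font-variant but changes the stored text to capitals, while the CSS version keeps the original text and depends on reader support.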

Now if I have a “Heading 3” paragraph whose properties do not have the “page break before” attribute in a source document, and I save that document to EPUB in Atlantis, I don’t get any page break before the corresponding EPUB paragraph. Are you sure that your “Heading 3” styles do not have the “page break before” attribute set? Or maybe you have inadvertently formatted these paragraphs with the “page break before” attribute found in the “Format | Paragraph…” dialog?

HTH.
Cheers,
Robert

Post by gyuen »

Robert,

About the uppercase, I'm using a keyboard shortcut (in Word) for it, and pressing it again does toggle the case. If I close and re-open the document and try the shortcut again, the text reverts to what it was before, so it seems Word does keep track of the original case.

For small caps, I understand most readers don't support it, but an ePub option to use font-variant would be nice. I suspect readers will eventually support it, and it'd be nice to have.

For the H3 page breaks, I definitely don't have that set. I can see it in the document, and when I check it in Atlantis, it's that way as well. Perhaps it's something in my document, which is about 800 pages and already has hundreds of H3 headings. I could send a link to it so you could check.

Gary

Post by Robert »

Gary,
It seems your document includes very long “chapters”. When the XHTML file for a single “chapter” would exceed 300k, Atlantis splits it up so that each resulting XHTML file stays within the 300k limit. Atlantis then uses any Heading style it finds as a “delimiter” to split up the book contents into manageable chunks. This is because ADE and a number of eReaders are known to choke on XHTML files bigger than 300k. Here is an excerpt from http://www.prepressure.com/library/file-formats/epub:
EPUB is based on three open standards:

Open Publication Structure (OPS) – An EPUB 2.0 file uses XHTML 1.1 to construct the content of a publication. In essence this means that an EPUB file consists of one or more web pages. Even though you could include the entire content of a book or newspaper in a single page, it is better that such a file doesn’t exceed 300K, both for performance and compatibility reasons. Just like with regular web pages, the styling and layout is defined using cascading style sheets (CSS). In EPUB files a subset (limited series of commands) of CSS2 needs to be used. Many of the new features of CSS3, such as rounded boxes or drop shadows, are not available yet.
And from https://blogs.adobe.com/digitaleditions ... al_ed.html:
File size limits in Digital Editions

While the Best Practices document (over in the Adobe Developer Connection) had the recommendation that books be broken up into chapters, it wasn’t clear when you needed to break a document up further, or when a book was small enough so that one XHTML file was sufficient.

With the recent update to the document (version 1.0.2) we’ve given you information on how big the chapters can be.

Those limits on the file sizes are as follows:

* XHTML/DTBook chapters: 300k uncompressed per chapter/100k compressed.
* Images: 10M uncompressed.

Note that these limits are for a single chapter document (or image) within the package, not a limitation on the size of the ePub document. The ePub can of course be much larger (limited by available storage space.) The limits are important for reading ePub documents on a mobile device.
If you want the Atlantis developers to have a closer look at your document and suggest solutions, please send it as attachment to support@AtlantisWordProcessor.com.
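For readers following along, the splitting idea described above can be sketched in Python. This is not Atlantis's actual algorithm, which the thread does not show; it is a minimal illustration under the assumption that a new file is started only at a heading, and only when the next heading-led group would push the current file past the limit. The function name and the block representation are hypothetical.

```python
LIMIT = 300 * 1024  # the 300k per-file guideline quoted above


def split_at_headings(blocks, limit=LIMIT):
    """Split a chapter into chunks that each stay within `limit` bytes.

    `blocks` is a list of (is_heading, markup) tuples in document order.
    A split can only happen right before a heading, and only when it is
    needed. A single group larger than the limit is kept whole.
    """
    # Group every heading with the blocks that follow it.
    groups = []
    for is_heading, markup in blocks:
        if is_heading or not groups:
            groups.append([])
        groups[-1].append(markup)

    chunks, current, size = [], [], 0
    for group in groups:
        group_size = sum(len(m.encode("utf-8")) for m in group)
        if current and size + group_size > limit:
            chunks.append(current)  # close the file before this heading
            current, size = [], 0
        current.extend(group)
        size += group_size
    if current:
        chunks.append(current)
    return chunks


# Three short H3 groups of 60 bytes each, with an artificially small limit.
demo = [
    (True, "<h3>a</h3>"), (False, "x" * 50),
    (True, "<h3>b</h3>"), (False, "y" * 50),
    (True, "<h3>c</h3>"), (False, "z" * 50),
]
print(len(split_at_headings(demo, limit=130)))
```

With this approach a run of short H3 groups shares one file until the limit is reached, so a heading does not automatically start a new file (and therefore a new page) in the reader.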

Now about toggling case pattern in Atlantis, have you tried “Shift+F3” for Uppercase, “Ctrl+F3” for Lowercase, and “Ctrl+Shift+F3” to “invert case”? Maybe “Ctrl+Shift+F3” is what you are looking for?

HTH.
Cheers,
Robert

Post by gyuen »

Hey Robert,

Thanks for the help. I'm aware that Atlantis splits up the files at 300k. The problem with this one is that, say, one H2 section has a hundred H3s. They vary in size from a few lines to a few pages, yet every H3 gets its own page. Perhaps the problem is that I created the document in a localized version of Word (French). When I look at the heading style in Atlantis, it's named something like >> Heading 3 <<. I forget exactly.

About the toggling case, not such a big deal. I'm pretty sure I was using uppercase. It's not a plain toggle, since if a word is capitalized, it'll go from that to uppercase and back, even between saves and closing and re-opening the document. So Word is keeping track of the info. Yes, it'd be nice if the original case was kept in the ePub and a CSS style was applied, but I can change it manually. Though it'd be nice to have.

The small-caps option would be great, though. Perhaps in the future, there could be some options for ePub where things like that could be set.

Have sent an email to support. Thanks for the help.

Post by Robert »

When you talk about “H2 section” or “H3 section”, do you use the word loosely, meaning “the paragraphs following the Heading3-styled-paragraph”, or do you mean that your document is split up into as many word processor “sections”? Proper word processor (document) “sections” are introduced by “section breaks” (“Insert | Break > Section Break”).

When you say “Every H3 has its own page”, do you mean to say that your Heading3-styled-paragraphs are preceded and followed by page breaks?

You are partly right about toggling case in Word. Word does not actually remember case pattern changes between sessions, but it has keyboard shortcuts like “Ctrl + Shift + A”, and “Shift + F3” that will toggle between several case patterns in succession. Atlantis does not have such shortcuts.

Post by gyuen »

Really Robert, I'm pretty good with Word, and haven't missed anything (I don't think), otherwise I wouldn't have asked this. :)

I'm using the default Word styles Heading 1, 2 and 3. In my case (French Word), titled Titre 1, 2 and 3. Heading 1 and 2 are set for page break before, and Heading 3 is not.

When I say each H3 has its own page, that's the resulting ePub. The Word file is not like that. It's supposed to just have some space before and after; that's it. So strange strange strange.

As for the case change, I went into customize keyboard shortcuts and found the command for uppercase; I applied a new shortcut to that. So it's not the default. Word is keeping track of a character style. When I make it uppercase, close and re-open the document, and then re-apply, it changes back (like from Example, to EXAMPLE, back to Example). So somewhere in the Word doc, it keeps track of that. Maybe Atlantis doesn't have that capability, but if I could read the style and apply it to the ePub (as a future feature), that'd be cool.

Post by Robert »

You might have changed the original shortcut in Word, but in a standard version of Word, if you have “Example” selected and press “Ctrl + Shift + A”, you get “EXAMPLE”. If you press “Ctrl + Shift + A” again, you go back to “Example”. There is no memory trick there!

Also in Word, if you have “Example” selected, press the Shift key and hold it down, then press F3 you get “EXAMPLE”. If you keep the Shift key down, and press F3 again, you get “example”. Still with the Shift key down, if you press F3 again, you get “Example”, and you are back to the starting point. There is no memory trick there either!
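The point that no stored history is needed can be shown with a small Python sketch. It only mimics the single-word cycle described above (Example, EXAMPLE, example, Example); it is not Word's implementation, and the function name is hypothetical. Each step is computed from the current text alone.

```python
def next_case(text: str) -> str:
    """Return the next case pattern in the cycle, based only on the current text."""
    if text.isupper():
        return text.lower()        # EXAMPLE -> example
    if text.islower():
        return text.capitalize()   # example -> Example
    return text.upper()            # Example (or mixed case) -> EXAMPLE


word = "Example"
for _ in range(3):
    word = next_case(word)
    print(word)
```

Because the function only inspects the text it is given, the cycle behaves the same after the document is closed and reopened.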

Post by gyuen »

I see. And Word does keep track of it as a character style. So back to my original request, hope someday Atlantis will use the original text and then apply a css style.

The main reason I want it is for the TOC. I'd prefer the TOC to be in normal case, while in the actual text each heading is all uppercase. For a big book with dozens (or hundreds) of headings, it's kind of a pain to change afterwards.

Post by Robert »

If you right-click a TOC in Atlantis, you can select the whole of it:

Image

When the whole TOC is selected, you can apply the “Sentence” or “Heading” case pattern to it:

Image

Post by gyuen »

Ah. Thanks.

Though sometimes headers have proper names and the case really needs to be looked at for each one. What I can do is just use normal case, and then modify the css afterwards to add the text-transform:uppercase.
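As a rough illustration of that post-processing step, here is a minimal Python sketch that appends a text-transform rule to a stylesheet's text. The selectors are an assumption: the CSS that Atlantis writes may use class names rather than plain h1/h2/h3, so the actual stylesheet should be checked first. The function name is hypothetical, and support for text-transform varies between eReaders.

```python
def add_uppercase_rule(css: str, selectors=("h1", "h2", "h3")) -> str:
    """Append a rule that displays the given selectors in uppercase.

    The heading text in the XHTML files (and so in the TOC) stays in its
    original case; only the rendering of the headings changes.
    """
    rule = ", ".join(selectors) + " { text-transform: uppercase; }"
    return css.rstrip() + "\n\n" + rule + "\n"


original = "p { margin: 0; }\n"
print(add_uppercase_rule(original))
```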

Post by Robert »

It seems you could do this directly in Atlantis. You would not need to tweak the CSS afterwards.

In Atlantis, you could format your headings with the “Sentence” case pattern. In so doing, you would preserve the capitalization of proper names.
Then you could build the TOC and have the TOC entries as you wish them to be.

When the TOC is built, you could use the Atlantis Control Board to select all the paragraphs associated with your Heading 1, Heading 2, Heading 3 styles, then apply the uppercase pattern to the selections.

Post by gyuen »

Sounds like I'll do something like that. Most of the ePubs I make are book scans for personal reading. Before, if I found errors, I would fix the Word file but now it's a bit of work and I just fix the ePub.

For the Word doc, I could go back and change the paragraph style to be uppercase and that'd be ok.

As for ever editing things in Atlantis and saving the document, I tried it once and it messed things up. I really forgot what happened, styles changed (in names and maybe other things), styles disappeared, paragraphs went back to the Normal style, character styles went away, a bunch of things happened that I forgot about (maybe not all of those, but stuff like that), so I really just use Atlantis for making ePubs.

Post by gyuen »

I hope it's ok to ask here in the same thread. I noticed that paragraph styles with italics don't come out the same way in the CSS. A span style is applied instead. Any idea if there's a reason for that? The other way seems cleaner.

One of the issues I had with Atlantis was a paragraph style with italics. Within the paragraph, sometimes I applied italics, to make the text normal. When I opened my Word doc in Atlantis and saved it, I forgot exactly what happened. It might have inverted or it did something weird. I had ~400 paragraphs I had to go thru and fix.

Post by Robert »

Note that, as a general rule it is not a good idea to edit a document in both Atlantis and another word processor, going back and forth between the 2. In doing so, you greatly increase the chances of mishap.

I don’t quite understand what you mean by your latest post.

By “a paragraph style with italics”, do you mean a paragraph style including “italics” among its font formatting properties?

When you say “Within the paragraph, sometimes I applied italics, to make the text normal”, is this something you did in Word or Atlantis? What do you mean by “normal”? Do you mean non-italics?

In Atlantis, if you have a paragraph associated with a style including “italics” among its font formatting properties, and you remove all the italic formatting from that paragraph in the document window, the italics will be restored if you reapply the style to the paragraph, or if you press “Ctrl+Spacebar” to reset the paragraph properties to those of the associated (italics) style.

Post by gyuen »

Robert,

Thanks for the continued help.

I don't normally go back and forth between Word and Atlantis. In one case, I had a 300 MB .doc file that, for some reason, didn't get shrunk (by removing unused objects) when saving. So I tried Atlantis. I did little to the document but open it and save it. When going back to Word, I found all the paragraph and character attributes were ok, but many of the paragraphs, instead of having the applied paragraph style, had gone back to "Normal" (while keeping all the original attributes). It happened again today when I tried a test, though it didn't happen every time. Maybe the .doc support isn't perfect and sometimes things happen.

Since I mainly just make ePubs and do everything in Word, it's not so bad.

btw, any idea about the previous thing? When a paragraph style includes italics (or some other attribute) in its definition and I save to ePub, the italics isn't included in the paragraph style but applied as a span, like <p class="something"><span class="italics">text</span></p>. I'd prefer the paragraph style to contain everything supported by ePub (3.0?) if possible.

I think there are also cases where unused styles are put in the CSS, but it's not such a big thing. I'll probably start renaming the styles in the finished ePub, so I can more easily keep track of things as I make corrections (like spell-check and formatting errors) directly to the ePub.

Post by Robert »

It seems you are mixing up character and paragraph styles, just like Word actually does.

This does not go well with Atlantis which has no support for character styles. So it is not surprising that the Word files that you convert to EPUB in Atlantis have glitches of all kinds.

To convert files to EPUB in Atlantis, it is most advisable to create documents in Atlantis, with the features that are supported in Atlantis.

If there are unused styles in an EPUB file, most likely it is because there are unused styles in the source document.

Post by gyuen »

Ah I see what you mean Robert.

I think I'm all good for now. Thanks for everything. Hopefully someone can check out the page break issue sometime.

Post by Robert »

Hi Gary,

I think you are still mixing up 3 things:

1. “Character” styles (for which Atlantis has no support),

2. “Paragraph” styles (which Atlantis supports),

3. and what is commonly referred to as “direct formatting” or “non-styled” formatting (which Atlantis supports too). “Non-styled” formatting is what happens when you select a fragment of text in the document window, and apply some formatting to it through the “Format | Font…” or “Format | Paragraph…” dialogs. This “direct” or “non-styled” formatting can be different from the formatting normally provided by the associated style. Sometimes “non-styled” formatting can even run contrary to the basic characteristics of the associated style.

Let’s take paragraphs of your document as examples.

The first paragraph after “INTRODUCTION BIOGRAPHIQUE” on page 10 is associated with the “Normal noindent” style. This “Normal noindent” style has only one distinctive characteristic, it is associated with the “Calibri” font face. This “Normal noindent” style is also based on the “Normal” style. Neither the “Normal” nor the “Normal noindent” styles include “bold” and “italic” among their properties. Despite this, the first paragraph after “INTRODUCTION BIOGRAPHIQUE” on page 10 is formatted with “bold” and “italic” attributes. This can only mean that “direct”, “non-styled” formatting was applied to it to make it “bold” and “italic”.

Still in your document. If you take the next paragraph associated with the same “Normal noindent” style on page 10, that paragraph is “pure” “Normal noindent” style. It hasn’t been reformatted with “bold” and “italic”.

I think I understand why the first of these 2 paragraphs is formatted with the HTML “SPAN” tag in the EPUB code. It is because this first paragraph actually, and literally, is a “non-styled” paragraph. Its “direct” formatting is the reason why Atlantis puts it between 2 “SPAN” tags.

Now, as a general rule, if you are going to reformat paragraphs in the document window, in a way that is very different, or even that runs contrary to the basic characteristics of the associated style, it is preferable to create a new style with appropriate characteristics. Here is how to go about this:

Suppose that you have a paragraph associated with your “Normal noindent” style, and you reformat it entirely with “bold” and “italic” using the "Format | Font" dialog.

1. Select the whole of that paragraph and type a name for a new style in the Atlantis “Style” box on the toolbars, “Normal bold italic” for example, like this:

Image

2. With the insertion cursor still within the “Style” box, press the “Enter” key.

3. Atlantis will ask you if you want to create a new style with similar characteristics:

Image

4. Answer “Yes”. Atlantis will create that new style and associate the current paragraph with it.

You can then apply the new style to any appropriate paragraph. The paragraph formatting in the document window will then be congruent with that of the associated style.

This is a rule you should follow when using styles. Do not overuse “direct” or “non-styled” formatting. This will make the management of formatting in documents all the more easy.

There is a further reason for this. In Atlantis, if you associate a paragraph with a style, then change its whole formatting to something different directly in the document window, that direct formatting will be completely removed if you reapply the style to the paragraph, or if you reset the paragraph to its font style with “Ctrl+Spacebar”. Let’s take an example. Suppose that you select the whole of your first “Normal noindent” paragraph on page 10, then press “Ctrl+Spacebar” or reapply the “Normal noindent” style to it. All the bold italic direct (non-styled) formatting will be removed, and your paragraph will display as a “normal” “Normal noindent” paragraph should look.

One final remark. I hope you are aware of this: your document is full of typos due to imperfect OCR work. You are in for a lot of extremely thankless proofreading! But Atlantis will help a lot: it will show you where there are potential "spelling" mistakes.

HTH.

Cheers,

Robert

Post by gyuen »

Robert,

Thank you so much for your continued help. I got an email yesterday with a bug fix for the page breaks. You guys are amazing.

Yes, I'm aware of the differences between paragraph and character styles, and direct formatting. In the case of this document, there are only a few paragraphs that are entirely in italics: the intro, and the first paragraph of the biographical intro that you note. Sometimes in essays, poems, and plays, there are also entire lines or paragraphs in italics. In some cases, if they are used repeatedly (like stage directions in a play), I'll make a new style, as I've done. In the cases where I haven't, I might go back and do as you say; I agree it is the best way.

The bold italics are really just italics. I make 'em bold in the process of proofing so it makes 'em easier to see ; the OCR is not entirely accurate in italicizing text.

I previously mentioned the unnecessary spans, because earlier I had made an ePub and noticed all the normal paragraphs of the biographical introduction had a span. In those cases, the entire paragraph was just 'Normal noindent' (Calibri regular) but had a span for that font style applied for some reason. I'm still not exactly sure why but the final ePub will be made with 'Don't save' fonts so it's ok.

As for all the OCR errors, the book uses a font with lots of ligatures ('s' and 'c' before t) so yeah, there are more errors than normal, and it's gonna be a chore going thru it. The ePubs I've made so far are for proofing. I'm not even done with scanning; the total work is nearly 4 times its current length, so yeah, it'll be quite some work to go thru the whole thing. But the author is considered the most important poet since Poe, so worth it for me. :)

btw, I'm curious if there's a reason for placing ePub files in an Ops folder rather than OEBPS. I thought the latter was the standard. I noticed that only with some ePubs ; like when I edit them in Sigil, it re-arranges the files and places them in OEBPS. A minor thing since I don't use Sigil that much anyway since it messes up some things.
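On the folder question: in the EPUB container format, reading systems find the package (.opf) file through the full-path attribute in META-INF/container.xml, so the folder name itself (OPS, OEBPS, or something else) is a convention rather than a requirement. A minimal Python sketch of that lookup follows; the helper that builds a tiny in-memory ePub and the content.opf file name are hypothetical and exist only for the demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_package_path(epub) -> str:
    """Return the path of the package document declared in container.xml."""
    with zipfile.ZipFile(epub) as zf:
        root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


def make_demo_epub(folder: str) -> io.BytesIO:
    """Build a skeletal in-memory ePub whose content lives in `folder` (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", f'''<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="{folder}/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>''')
        zf.writestr(f"{folder}/content.opf", "<package/>")
    buf.seek(0)
    return buf


for name in ("OPS", "OEBPS"):
    print(find_package_path(make_demo_epub(name)))
```

Both layouts resolve the same way, which is consistent with Sigil being free to move the files into OEBPS when it rewrites the package.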

Thanks again.

Cheers,

Gary