Friday, March 26, 2021

Advanced Comic Book Formatting

Peruse is a comic book reader application, made by the KDE Community, which also comes with a creation tool. For a while now, we've been working on support for the Advanced Comic Book Format, or ACBF. ACBF is a way of augmenting Comic Book Archives (commonly referred to by their suffix, usually cbr or cbz).
One of the things ACBF supports is what is referred to in the definition as Text Layers. In reality, these are translations: Each Textlayer is tied to a specific language, and each contains a number of Textareas, which in turn contain paragraphs of semi-rich text, all of which can be styled in a variety of ways, including with truetype fonts also shipped in the book archive.
Until recently, Peruse lacked support for this crucial functionality, which meant that ACBF books read in Peruse went untranslated. With translation being one of the core features of the format, it was just not reasonable to say that the tool properly supported it. For a little while now, our highly engaged newest member of KDE's Peruse subcommunity, Mahmoud Khalil, has been working on supporting the interactive-fiction focused Jump feature of ACBF. Meanwhile, i took it upon myself to finally get around to adding support for Textareas, and as i write these words, i pushed the merge button a few minutes ago, so it's all manner of exciting and new and shiny! :)
A frame from Pepper & Carrot Volume 1, Episode 1: The Potion of Flight (or get it from Peruse's built-in store!)

But Why?

To get the most basic question out of the way quickly, i'm taking as a given that the question isn't "why translate", which just seems obvious and i won't go into that. The main reason for having support for ACBF's particular implementation of translations is that this allows for a single archive to contain all the translations of a comic, without requiring there to be more than one piece of art per page. Without this approach, we'd end up having to bake the text into each page, or perhaps superimpose smaller bits of imagery on top. Using a simple set of paragraphs and a font means that the size of the archive is much smaller (no multiple piles of pictures, whether just the speech bubbles or the entire page), and consequently that there is room for a whole lot more translations.


To be able to really work out what's going on with this, i'll first describe the data that ACBF gives us for the purposes of actually deciding what text to draw where. As you might imagine, it's not just a question of a square box and a string - most speech bubbles are more involved than that, and comic books also tend to have some strong opinions on how the text should look. What that means is that we have the following to base our text painting decisions on:
  • Textarea: The text itself, which consists of an arbitrarily shaped polygon (a set of points), a set of paragraphs, as well as a type (basically a style class), what colour the background should be (including the option to make it transparent), and an angle given in degrees, describing how the text should be rotated (but not the polygon)
  • Textlayer: One of these per page per language the book says it supports. This defines a background colour that the Textareas it contains can fall back to if they don't set one (and aren't set to transparent). This in turn might fall back to the Page's background colour, which in turn again might fall back to the one set for the entire document.
  • Style: A set of style information, such as font family (a list, which might include a font filename, see below), font weight, font stretch, and font style, but importantly for what comes next, not font size.
  • Binaries: A lump of literally any binary data, which in the case of it being relevant to Textareas means a truetype font. If a Style lists a font by its filename, we should look in ACBF's Data section first, and then in the archive itself, for some file with the file title given there. This is commonly just the filename, without the path, which means we really just have to pick the first match we find, and hope there aren't any duplicate files in the archive.
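To make the background colour fallback chain described above a bit more concrete, here is a minimal sketch. The struct and function names are made up for illustration (they are not Peruse's actual classes), and colours are just kept as strings such as "#ffffff".

```cpp
#include <optional>
#include <string>

// Hypothetical, heavily simplified stand-ins for the ACBF elements described
// above. These are NOT Peruse's real classes.
struct Document  { std::string bgcolor; };
struct Page      { std::optional<std::string> bgcolor; };
struct Textlayer { std::string language; std::optional<std::string> bgcolor; };
struct Textarea  { std::optional<std::string> bgcolor; bool transparent = false; };

// Returns the colour to paint behind a Textarea, or std::nullopt when the
// area is marked transparent and nothing should be painted at all.
std::optional<std::string> effectiveBackground(const Textarea& area,
                                               const Textlayer& layer,
                                               const Page& page,
                                               const Document& doc)
{
    if (area.transparent) return std::nullopt; // transparency wins outright
    if (area.bgcolor) return area.bgcolor;     // the Textarea's own colour
    if (layer.bgcolor) return layer.bgcolor;   // fall back to the Textlayer
    if (page.bgcolor) return page.bgcolor;     // then to the Page
    return doc.bgcolor;                        // and finally to the document
}
```

The only design choice here is that "transparent" is checked before anything else, so a transparent Textarea never picks up a colour from further up the chain.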

And How?

Alright, so there's a couple of important things to consider here. Apart from the styling information above, what we in essence end up with is a number of paragraphs of text, using some known style (font family, weight, etc), which need to fit snugly into an arbitrarily shaped polygon.
As an example, the following cell of the Pepper & Carrot comic shows what is meant by arbitrarily shaped polygon: In this case, it is what's called a concave polygon - that is, a polygon which has bits of it that point into itself, basically meaning it has at least one interior angle of more than 180 degrees (you can see there are two of those here, one on either side of the text "everything for me.").
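For the curious, spotting those inward-pointing corners is a small bit of geometry. The sketch below is not taken from Peruse; it assumes a simple (non-self-intersecting) polygon given as a list of points, and the names are invented for the example. It counts the corners that turn against the polygon's overall winding direction, which are exactly the ones with an interior angle above 180 degrees.

```cpp
#include <cstddef>
#include <vector>

struct Point { double x; double y; };

// Twice the signed area (shoelace formula). Only the sign matters here: it
// tells us which way round the points of the polygon are listed.
double twiceSignedArea(const std::vector<Point>& poly)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Point& a = poly[i];
        const Point& b = poly[(i + 1) % poly.size()];
        sum += a.x * b.y - b.x * a.y;
    }
    return sum;
}

// Counts the corners whose interior angle is larger than 180 degrees.
// Assumes a simple polygon; a result above zero means it is concave.
int countReflexVertices(const std::vector<Point>& poly)
{
    const std::size_t n = poly.size();
    if (n < 4) return 0; // a triangle cannot be concave
    const double winding = twiceSignedArea(poly);
    int reflex = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const Point& a = poly[i];
        const Point& b = poly[(i + 1) % n];
        const Point& c = poly[(i + 2) % n];
        // Cross product of the two edges meeting at b
        const double cross = (b.x - a.x) * (c.y - b.y) - (b.y - a.y) * (c.x - b.x);
        if (cross * winding < 0.0) ++reflex; // turns against the winding
    }
    return reflex;
}
```

A plain rectangle gives zero, while a box with one notch cut into each side gives two, matching the situation in the frame above.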
The font is one included in the archive, and not on the system, so we need to inform Qt that this is a font that's available. Luckily, Qt has a handy function, QFontDatabase::addApplicationFontFromData, which we can use to load truetype fonts from binary data, and then just sniff whatever the first family name is in the font once added. That makes things sort of easy, and we can add a handy dandy couple of functions that let the data structure Peruse uses to handle books in an archive return us that information.
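Before the font bytes can be handed to Qt, they have to be found. A rough, Qt-free sketch of the lookup order described earlier (ACBF's Data section first, then the archive, matching on the file title and taking the first hit) could look like the following. NamedBlob, fileTitle and findFontData are hypothetical names for this example only, and the sketch assumes forward slashes as path separators.

```cpp
#include <optional>
#include <string>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Hypothetical container for "some named lump of binary data", used here for
// both entries in ACBF's Data section and files in the archive.
struct NamedBlob { std::string name; Bytes data; };

// Strips any leading directories, so "fonts/bubble.ttf" becomes "bubble.ttf".
std::string fileTitle(const std::string& path)
{
    const std::size_t slash = path.find_last_of('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Looks in the ACBF binaries first, then in the archive, and returns the
// first match. Duplicate file titles are not detected: first hit wins.
std::optional<Bytes> findFontData(const std::string& fontFile,
                                  const std::vector<NamedBlob>& acbfBinaries,
                                  const std::vector<NamedBlob>& archiveEntries)
{
    const std::string wanted = fileTitle(fontFile);
    for (const NamedBlob& binary : acbfBinaries) {
        if (fileTitle(binary.name) == wanted) return binary.data;
    }
    for (const NamedBlob& entry : archiveEntries) {
        if (fileTitle(entry.name) == wanted) return entry.data;
    }
    return std::nullopt;
}
```

Whatever comes back from a lookup like this is what would then be passed on to QFontDatabase::addApplicationFontFromData.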
Which brings us to the layouting of the text into that polygon. Now, there's a bit more to it than this, and of course you can check the MR to see the nitty-gritty, but what follows is an overview of the logic behind what's being done to make this happen.
At its most basic, what we do is create a QQuickItem, and then use a QQuickTextNode to do the actual painting for us. However, before that can do any painting, it needs to know what to paint, which is where QTextLayout comes in. Now, unfortunately we cannot simply give a QTextLayout a polygon and ask it to fit some lines of text into it at some text size, until it fits. At least, not without assistance. Which is what we'll give it. The logic, in a super basic form, is:
Decide on a minimum and maximum font size (we'll just start at 2, because it's already impossibly tiny, and cap the size at the height of the bounding box of our polygon, minus twice whatever margin we're working with). Now recursively attempt to fit the text into the polygon by splitting the font size range and trying to fit the text with the smallest size first, then the middle point between the smallest and largest, and finally the largest, recursively until we find the largest size that fits. Layouting in each of those instances means using the given size as the font size (which is why we are not given a font size in the Styles), and to do this, we do...
For each paragraph of text, create a QTextLayout instance with the font we're using and the text from the given paragraph, and then:
  1. Start layouting from the top, using our margin as the first y position, by asking the QTextLayout to create a text line for us.
  2. Create a rectangle spanning the width of our polygon's bounding box, and the height of a line of text with our chosen font size
  3. Get the intersection of that rectangle and our polygon, and then find the innermost x values for both the top line and the bottom line
  4. Go through our intersection polygon, and find all the points between the top-rightmost and bottom-rightmost points, and use those as our innermost right hand side value, and the same for the left hand side of the polygon. This step is what makes it work with the concave polygon in the picture.
  5. Now set the position and width of our current text line to the values we just discovered
  6. The text line object will now be able to tell us whether it fits the first word of its text.
    • If it at least fits a single word, we can try to create a new line, and then jump back up to point 2 and lay out that new line.
    • If it didn't fit any text at all, we push the line down one pixel and do the same, without creating a new line (essentially just jump back to point 2 again, one pixel further down, to try to fit the text slightly further down).
    • If it did manage to fit at least some text, but there's more to do, create a new text line for that text and jump back up to point 2.
    • If we've actually managed to fit all the text, we report back that we have done so.
  7. If at any point we end up in a situation where there is more text to be laid out, but we don't have the vertical space left to do so, the function aborts and tells the caller that it could not lay out everything.
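Points 2 to 4 above are the geometric heart of the whole thing, so here is a small sketch of them without any Qt involved. It is a simplification rather than the code in the MR: instead of a real polygon intersection, it samples horizontal scanlines at the top and bottom of the line, plus at every polygon point in between, and it assumes that each scanline crosses the polygon in one single stretch. Point, Span, scanlineSpan and innermostSpan are names made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <optional>
#include <utility>
#include <vector>

struct Point { double x; double y; };
using Span = std::pair<double, double>; // left x, right x

// Leftmost and rightmost x at which the horizontal line at height y touches
// the polygon outline, or std::nullopt if the line misses the polygon.
std::optional<Span> scanlineSpan(const std::vector<Point>& poly, double y)
{
    double left = std::numeric_limits<double>::infinity();
    double right = -std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Point& a = poly[i];
        const Point& b = poly[(i + 1) % poly.size()];
        if (y < std::min(a.y, b.y) || y > std::max(a.y, b.y)) continue;
        if (a.y == b.y) { // horizontal edge: both end points count
            left = std::min({left, a.x, b.x});
            right = std::max({right, a.x, b.x});
        } else {          // interpolate where the edge crosses height y
            const double x = a.x + (y - a.y) * (b.x - a.x) / (b.y - a.y);
            left = std::min(left, x);
            right = std::max(right, x);
        }
    }
    if (left > right) return std::nullopt; // no edge touched this height
    return Span{left, right};
}

// The horizontal stretch that stays inside the polygon for the full height of
// a text line starting at `top`, or std::nullopt if there is no such stretch.
std::optional<Span> innermostSpan(const std::vector<Point>& poly,
                                  double top, double lineHeight)
{
    const double bottom = top + lineHeight;
    // The outline is made of straight edges, so the tightest spot is at the
    // top, at the bottom, or at a polygon point somewhere in between.
    std::vector<double> heights{top, bottom};
    for (const Point& p : poly) {
        if (p.y > top && p.y < bottom) heights.push_back(p.y);
    }
    double left = -std::numeric_limits<double>::infinity();
    double right = std::numeric_limits<double>::infinity();
    for (double y : heights) {
        const std::optional<Span> span = scanlineSpan(poly, y);
        if (!span) return std::nullopt; // the line pokes outside the polygon
        left = std::max(left, span->first);
        right = std::min(right, span->second);
    }
    if (left >= right) return std::nullopt;
    return Span{left, right};
}
```

Checking the points between the top and bottom of the line is the part that corresponds to point 4: for a box 100 wide with a notch poking 20 inwards on either side, a line straddling the notches gets the stretch from 20 to 80, rather than the wider one you would get from only looking at its top and bottom edges.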
The end result is that we will end up with text that invariably fits inside the polygon, at the largest pixel size that will allow it to fit. Additionally, we also do a bit of fun stuff with the semi-rich text formatting, but that's more just some general text layouting type stuff.
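The search for that largest size can be sketched on its own, separately from the layouting. Note that this is an iterative bisection rather than the recursive three-point split described above, and it assumes that if some size fits, every smaller size fits as well. The `fits` callback stands in for the whole "lay out all paragraphs at this size and report whether everything fitted" step, and largestFittingSize is a made-up name.

```cpp
#include <functional>
#include <optional>

// Finds the largest size in [minSize, maxSize] for which fits(size) is true,
// or std::nullopt if not even minSize fits.
// Assumption: fitting is monotonic (if a size fits, all smaller sizes do too).
std::optional<int> largestFittingSize(int minSize, int maxSize,
                                      const std::function<bool(int)>& fits)
{
    if (minSize > maxSize || !fits(minSize)) return std::nullopt;
    if (fits(maxSize)) return maxSize;

    int lo = minSize; // largest size known to fit
    int hi = maxSize; // smallest size known not to fit
    while (hi - lo > 1) {
        const int mid = lo + (hi - lo) / 2;
        if (fits(mid)) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return lo;
}
```

With a range of 2 to 100, this needs at most nine layout attempts (two for the end points and seven for the halving), which matters because every attempt means laying out all of the text again.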
Text rotation support is also a thing (this is much simpler than it seems, and uses QQuickItem's internal rotation support)

One final point to highlight is the anchors that Textareas can also have. Much like Mahmoud's work with Jumps, Textareas can link to other parts of the book, but also to things like References, which is a free-text type thing that ACBF also includes (and which we are working on figuring out how to sensibly show in Peruse), as well as to external resources like websites and email addresses (and don't worry, we'll warn you before opening any of those so you can feel safe clicking on stuff in your books). More on that in a later post on this very blog, so stay tuned ;)
Outside of the other shiny things, as mentioned in the introduction, the main purpose of text layers in ACBF is to allow for translations, so, an example :)
If you build and run a copy immediately, take note that the default is currently the "No Translation" option, so you will need to pick one from the list. Note also that it will not be remembered between sessions, though that is in the plans (likely on a per-book basis). If you have opinions on how this should work, give us a poke! :)

OK, sold, gimme!

So eager! Well, here's the thing: Peruse is, with this feature added, getting very near to being ready for its next release. There's polishing to be done, and the initial release of this will be, for the first time in Peruse's meandering life, a beta release.
What this means is that over the next little while, i'm going to be getting a bit of experience with the releaseme script that KDE uses to create new software releases, as well as working out how to get KDE's binary factory to spit out not only Windows packages, but also AppImages. I previously used OBS to build these, but given the binary factory is now able to do this, it seems much more reasonable to do it that way around.
You do not have to wait for that, though: For now, you can grab yourself some source code and build Peruse yourself. Hopefully this should be straightforward (and much more so than it used to be, no more submodules or anything like that), but if you run into trouble, give me a poke, either here or via any of the various other social networky things. Or even better, you could drop by our chat over on Matrix, where a few of us have been hanging out for a little while now :)

The word of the day is: Adequate. Because it may be worth striving for perfection, but then nothing would get published.



Blogger Carl Schwan said...

This looks like a very nice feature :) I see that you are using QQuickTextNode and sometimes I really wish that this sort of API would be public. It would make the life of everyone way easier :)

26 March, 2021 14:58  
Blogger Dan Leinir Turthra Jensen said...

@Carl Schwan It's all manner of powerful, yeah :D And yes, definitely so on the other part, having that be public would be more than a bit useful, not everybody's got the kind of freedom to just use the private stuff like this.

26 March, 2021 15:01  
