Enable multiple font use within each text #494

okomestudio · 2019-08-28T21:56:19Z

Sometimes the use of more than one font is desired, as in cases where multiple languages are mixed within a text and using a font with a fuller set of symbols as well as emojis is necessary. This PR implements this by allowing extra_font_paths to be defined, which uses regex to decide which font to apply for the matched (sub)text pattern.

Some comments on the code and other changes:

The LayoutItem class is introduced to make it easier to keep track of each item in WordCloud.layout_, so that unpacking becomes simpler. It also exposes methods for drawing a text and recoloring to reduce code redundancy.
Text rendering with font(s) is refactored into the FontInfo class. The basic idea is for a text that contains patterns that match regex objects defined in extra_font_paths, it keeps track of which font to apply at which text indices and switch fonts accordingly when actually drawing mask. If the text should be rendered with default (i.e., WordCloud.font_path) rendering behaves exactly the same way as before.
examples/emoji.py is renamed to examples/emoji_example.py to avoid a package import issue with an external package named emoji (which is only used in the new example script).
See examples/mix_fonts.py, as well as an example output examples/mix_fonts.png for usage.

Note that I have not fully taken care of Python 2.7/3.x compatibility (I'm on 3.7 personally), documentation, etc., as I would first like to know if the upstream benefits from this PR implementing this new feature, which involves modest refactoring. If a general agreement arises to eventually merge this PR, I would then work on cleaning things up to bring the PR to the standards.

amueller · 2019-09-03T19:08:17Z

Honestly I'm not sure how useful this is. If your goal is to support multiple languages and emojis, you can just combine multiple fonts into a single font.
Your regex approach is clearly more flexible, but if your goal is supporting multiple languages, it might not be the easiest to use as you need to specify regex for different languages, which might be tricky in general?

amueller · 2019-09-03T19:11:24Z

you can probably use https://github.com/fonttools/fonttools and pyftmerge to merge them on the fly even, right?

amueller · 2019-09-03T19:12:15Z

There's also tools for merging multiple noto fonts: https://github.com/googlefonts/nototools/blob/master/nototools/merge_fonts.py

changbowen · 2020-01-22T15:58:43Z

I would vote for supporting multiple fonts... even only for displaying English words in different fonts
perhaps make font_path accept a folder with font files inside?

amueller · 2020-01-22T20:19:53Z

@changbowen and how would you assign the fonts to words? Randomly? That could work though it's a different feature than this.

jnothman · 2020-01-22T20:57:49Z

Similar API to colors?

amueller · 2020-01-22T21:01:35Z

@jnothman yeah though colors can be done afterwards while changing the font changes the geometry.

okomestudio · 2020-01-22T22:08:22Z

I'd be happy to adjust the PR for whatever makes sense so that it can be merged eventually (though I'll be tied up for about a week and won't be able to start working again on this immediately).

A feature like @changbowen suggested would probably require a minor API adjustment, but if people want the feature, then this PR could be a good timing to discuss how it could be implemented without intrusive changes.

As for @amueller's initial suggestion on creating a new font (sorry I didn't respond earlier), I find it very tedious to merge fonts only for temporary use of rendering images. My initial motivation was to generate word clouds in multiple languages (like English and Japanese, as well as math/emoji symbols), and it's not very practical to repeat font merging every time I desire even a slight design change.

At the same time, I wanted to make sure that the effort would be warranted before pushing the multiple-font feature (as well as relatively modest refactoring involved). I'd rather contribute the code to the upstream library than keeping around a forked version for my personal use, but I do understand if there exists a desire to keep the library minimal and light for general use.

I needed to see the PR is aligned in the direction the library maintainers would wish to go.

changbowen · 2020-01-23T07:25:42Z

@amueller Yes a random font for each string would be enough for most people I think. Probably it's not something as complicated as this PR :)
@okomestudio Totally agree that merging fonts shouldn't be something required to generate a word cloud.

okomestudio · 2020-02-12T00:03:26Z

The PR has been updated to propagate changes to the recently included SVG output rendering. Three new examples are added with their SVG outputs (which should probably removed at some point) to demonstrate how font_path accommodates multiple fonts as well as the substring match feature.

I can spend some more time to reflect feedback on the code and APIs, adding more tests and documentation, etc., if this PR were to be pushed at some point. I'm fairly neutral as to what and how to do with the feature itself; if the feature itself is desired but the PR is too much of change, I can adjust the code accordingly, but I need feedback.

Either way, I would like to know at some point if I should keep trying to improve the PR with the expectation of eventually merging to the upstream, or just keep the feature as a fork for my personal use. Thanks!

amueller · 2020-09-04T14:39:23Z

I think it might actually be nice to merge this, but the PR is quite hard to review right now because it moves around a lot of code. I'm not sure if it's possible to make the refactoring be a separate PR?
It would also be nice not to let the main file get too big.

okomestudio · 2020-09-08T01:29:15Z

I can help with refactoring (in separate PRs) if you (or some other commiter(s) who are well-acquainted with the repo) give me some guideline as to how I should go about refactoring the code into a few components/submodules. Then I could comment and make suggestions based on what's included in this PR, if any.

The code is the way it is in the PR only because of the need to carefully keep track of text positioning when multiple fonts are mixed within a text, while sticking to DRY as much a possible. I certainly didn't want to move some many components around at once, and I'm obviously not the best person to make design decisions for the whole code base.

But I can help with coding, since I actually like cleaning up code and refactoring.

okomestudio · 2020-09-18T03:55:42Z

If possible, I'd like to take care of this sooner than later now that I have some time to work on this. (With open-source efforts it's often difficult to expect sustained, consistent commitment due to job change, etc.)

So let me make refactoring suggestion here, so that I don't have to change so much in this single PR, which is potentially intrusive due to necessary internal change.

My suggestion would be to break up into two steps:

Refactor image rendering code into the renderers submodule, so that the common pattern does not need to be duplicated for rendering bitmap images and SVG images from a WordCloud object. The submodule will basically consists of the Renderer classes, ImageRendere for bitmap and SVGRenderer for SVG, just as in this PR (but without the multi-font support to keep things simple). This should be almost pure refactoring of the current master, without any new features introduced.
Implement the multi-font feature which is the main thing in this PR.

If there are smaller components that can be independently refactored more or less, the two steps above should be further subdivided to smaller PRs to make code review easier, but currently the major reason why this PR appears harder to review is that the two things above are done at once.

Once there is some sort of consensus on this, I'll make a PR for each item, starting from (1).

How does this sound?

walkowl · 2022-04-12T23:10:37Z

Such a shame it wasn't finished and merged...I vote for it anyway!

AnkushChharri · 2023-12-29T05:23:39Z

We needed multiple fonts support to resolve all glyphs or unicodes. Why you not do that? @amueller @okomestudio

Enable multiple font use within each text

2533429

okomestudio added 2 commits February 6, 2020 11:30

Merge branch 'master' into ts/multifonts-rev1

8126348

Update to adapt to SVG rendering

81103bb

amueller mentioned this pull request Sep 4, 2020

Multiple fonts #584

Closed

amueller mentioned this pull request Sep 22, 2020

Removing glyphs (squares/rectangles) where character is not available #586

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable multiple font use within each text #494

Enable multiple font use within each text #494

okomestudio commented Aug 28, 2019

amueller commented Sep 3, 2019

amueller commented Sep 3, 2019

amueller commented Sep 3, 2019

changbowen commented Jan 22, 2020 •

edited

Loading

amueller commented Jan 22, 2020

jnothman commented Jan 22, 2020 via email

amueller commented Jan 22, 2020

okomestudio commented Jan 22, 2020

changbowen commented Jan 23, 2020

okomestudio commented Feb 12, 2020

amueller commented Sep 4, 2020

okomestudio commented Sep 8, 2020

okomestudio commented Sep 18, 2020

walkowl commented Apr 12, 2022

AnkushChharri commented Dec 29, 2023

Enable multiple font use within each text #494

Are you sure you want to change the base?

Enable multiple font use within each text #494

Conversation

okomestudio commented Aug 28, 2019

amueller commented Sep 3, 2019

amueller commented Sep 3, 2019

amueller commented Sep 3, 2019

changbowen commented Jan 22, 2020 • edited Loading

amueller commented Jan 22, 2020

jnothman commented Jan 22, 2020 via email

amueller commented Jan 22, 2020

okomestudio commented Jan 22, 2020

changbowen commented Jan 23, 2020

okomestudio commented Feb 12, 2020

amueller commented Sep 4, 2020

okomestudio commented Sep 8, 2020

okomestudio commented Sep 18, 2020

walkowl commented Apr 12, 2022

AnkushChharri commented Dec 29, 2023

changbowen commented Jan 22, 2020 •

edited

Loading