Spineplots and spinograms for factor y-variables #233

zeileis · 2024-10-10T16:45:47Z

Fixes #2

This PR will eventually support spineplots (factor ~ factor) and spinograms (factor ~ numeric) using the type_ infrastructure from the epic #222.

There is already some good progress. For example, you can do:

library("tinyplot")
aq = transform(
  airquality,
  Month = factor(Month, labels = month.abb[unique(Month)]),
  Hot = Temp > median(Temp)
)
tinyplot(Hot ~ Wind, facet = ~ Month, data = aq, type = type_spineplot(), breaks = 4)

ttnc <- as.data.frame(Titanic)
tinyplot(Survived ~ Sex, facet = ~ Class, data = ttnc, weights = ttnc$Freq, type = type_spineplot())

But as you can see the axis labeling is not great and the by handling does not work properly, yet. I will ask various questions, mostly to you Vincent @vincentarelbundock, I'm afraid. But I'm optimistic that we can sort out the details.

tbc

zeileis · 2024-10-10T17:08:04Z

First questions:

* The xaxs = "i" and yaxs = "i" are still not passed to the right place. Do I need to add these as explicit arguments somewhere else?
* Some arguments that I set in the data_ function I want to access in the draw_ function. What is the best way to do that? Currently I include these in the return value (e.g., https://github.com/grantmcdermott/tinyplot/blob/spineplot/R/type_spineplot.R#L156) and then fetch them from the parent.frame() (https://github.com/grantmcdermott/tinyplot/blob/spineplot/R/type_spineplot.R#L21-L22). But that's probably not a good solution...

vincentarelbundock · 2024-10-10T17:22:43Z

* The `xaxs = "i"` and `yaxis = "i"` are still not passed to the right place. Do I need to add these as explicit arguments somewhere else?

What functions consume these arguments? With your current code, the variables themselves should already be available in the tinyplot() scope, but I don't see any call anywhere that uses these variables as inputs. Should they be passed to the facet drawing function, the window creation, or to draw_spineplot()? I think things are set up correctly in your type_spineplot() code. It's probably just a matter of carrying them through as inputs to the proper functions.

> * Some arguments that I set in the `data_` function I want to access in the `draw_` function. What is the best way to do that? Currently I include these in the return value (e.g., https://github.com/grantmcdermott/tinyplot/blob/spineplot/R/type_spineplot.R#L156) and then fetch them from the `parent.frame()` (https://github.com/grantmcdermott/tinyplot/blob/spineplot/R/type_spineplot.R#L21-L22). But that's probably not a good solution...

What "shape" do these arguments have? One simple option might be to use data_spineplot() to insert this info as new columns with idiosyncratic names into datapoints. Then, draw_spineplot() has access to that data and can retrieve the info directly.

zeileis · 2024-10-11T01:33:06Z

The xaxs and yaxs properties have to be set when creating the outer plot to which the types are then adding the actual content. So I think it needs to be passed to draw_facet_window() and then ultimately plot.window(). Is there some way to achieve this? Or do I need to add arguments xaxs and yaxs explicitly for tinyplot() and then pass them through everything?

Regarding the extra arguments: The most prominent case is the breaks for the spinograms. These are the points at which I split the numeric x variable into categories. I want to compute them on the entire data only once, that's why I put them in data_spineplot().

As they are neither a scalar, nor of length "n", I did not put them as a column in datapoints. But I could put them there if I pad with NAs. So it's technically possible but also not elegant.

You could argue that I ought to cut() the x into categories in the data_spineplot() function already. And that's probably a good idea. But I would still need to pass along the underlying breaks because I need these for making nice axis labels.
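
The idea of computing the breaks once and then binning x can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses hist() to obtain the breaks and cut() to bin, and it is not the code in data_spineplot(); the variable names are made up for the example.

```r
# Illustrative sketch in base R (assumption: hist()-style breaks; not tinyplot internals)
aq = airquality

# Compute the breaks once, on the entire data, so all facets share them
breaks = hist(aq$Wind, breaks = 4, plot = FALSE)$breaks

# Bin the numeric x variable into categories using the shared breaks
aq$Wind_cat = cut(aq$Wind, breaks = breaks, include.lowest = TRUE)

# Counts per facet level (Month) and bin; the raw `breaks` vector is still
# needed separately, e.g., for placing numeric axis labels at bin boundaries
tab = table(aq$Month, aq$Wind_cat)
```

Note that `breaks` has one more element than there are bins, which is why it is neither a scalar nor of length n.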

vincentarelbundock · 2024-10-11T10:07:04Z

> The xaxs and yaxs properties have to be set when creating the outer plot to which the types are then adding the actual content. So I think it needs to be passed to draw_facet_window() and then ultimately plot.window(). Is there some way to achieve this? Or do I need to add arguments xaxs and yaxs explicitly for tinyplot() and then pass them through everything?

If we think users will want to specify xaxs explicitly themselves in other plots, then we could add them to the main function. But if you think it's mostly an internal thing, then you could add:

xaxs = yaxs = NULL

just before type_data() is called. Then, your function overrides the default NULL value. draw_facet_window() can then be modified to accept the internal xaxs value, ignore it if it is NULL, or act correctly if it is non-NULL.
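
A minimal sketch of the "ignore if NULL" pattern, assuming a hypothetical helper (window_par_args() is not a tinyplot function) that collects only those axis styles a type has actually set:

```r
# Hypothetical helper: keep only the axis-style settings that are non-NULL
window_par_args = function(xaxs = NULL, yaxs = NULL) {
  args = list(xaxs = xaxs, yaxs = yaxs)
  # drop NULL entries so that defaults are left untouched
  args[!vapply(args, is.null, logical(1))]
}

# Nothing set by the type: nothing to pass on
none = window_par_args()

# A type such as a spineplot sets both to "i"
both = window_par_args(xaxs = "i", yaxs = "i")
```

The resulting list could then be handed to par() via do.call() only when it is non-empty, so that types which never touch xaxs or yaxs keep the usual defaults.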

> As they are neither a scalar, nor of length "n", I did not put them as a column in datapoints. But I could put them there if I pad with NAs. So it's technically possible but also not elegant.

Here's one idea: In the main tinyplot function, just before calling type_data(), create an empty list called type_info. Then, data_spineplot() overrides that empty list with a named list of whatever you need in the drawing function. Finally, we modify the main tinyplot() function to pass type_info to type_draw().

Since every type_draw() function accepts ..., type_info will be ignored most of the time. But then custom types like yours have an easy mechanism to pass arbitrary data from type_data() to type_draw().
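
A toy version of this mechanism, with made-up function names (data_demo(), draw_custom(), draw_plain(), and plot_demo() are all hypothetical stand-ins, not tinyplot code), might look like this:

```r
# Hypothetical data step: returns the data plus arbitrary extra information
data_demo = function(x, breaks = 4) {
  brk = hist(x, breaks = breaks, plot = FALSE)$breaks
  list(
    datapoints = data.frame(x = x),
    type_info = list(breaks = brk)  # any shape, not tied to nrow(datapoints)
  )
}

# A custom draw step names type_info explicitly and can read from it
draw_custom = function(datapoints, type_info = list(), ...) {
  type_info$breaks
}

# Other draw steps never name type_info, so it is absorbed by `...`
draw_plain = function(datapoints, ...) {
  nrow(datapoints)
}

# Hypothetical main function wiring the two steps together
plot_demo = function(x, draw) {
  type_info = list()         # empty default, created before the data step
  out = data_demo(x)
  type_info = out$type_info  # the data step overrides the default
  draw(out$datapoints, type_info = type_info)
}
```

Here plot_demo(airquality$Wind, draw_custom) returns the breaks computed in the data step, while plot_demo(airquality$Wind, draw_plain) silently ignores them.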

zeileis · 2024-10-11T10:50:46Z

Great, Vincent @vincentarelbundock, very useful. I've added the xaxs and yaxs arguments now - also to tinyplot() as they are standard par() arguments. Grant @grantmcdermott, let me know if you disagree and would not have exported them.

zeileis · 2024-10-11T11:14:10Z

And nice idea with the type_info, I've also added that now! 💡

I didn't do extensive tests, yet, but I think that tinyplot() and plot() now give the same output for factor ~ factor and factor ~ numeric! 🎉

Next I want to polish the faceted displays and then have a stab at handling by variables. For the facets I have two questions:

1. Is there a recommended way to increase the margins between the displays? Because spine plots employ both the left and the right y-axis for labels, we need a little bit more space here.
2. Because for spineplots type_draw is drawing the axes rather than draw_facet_window: Can type_draw know whether it is in a facet on the very left or very right and at the top or at the bottom, respectively? Then, we could draw fewer axes, if we want.

vincentarelbundock · 2024-10-11T11:28:00Z

> I didn't do extensive tests, yet, but I think that tinyplot() and plot() now give the same output for factor ~ factor and factor ~ numeric! 🎉

Very, very cool!

> 1. Is there a recommended way to increase the margins between the displays? Because spine plots employ both the left and the right y-axis for labels, we need a little bit more space here.

I don't know about margins. Paging @grantmcdermott

> 2. Because for spineplots `type_draw` is drawing the axes rather than `draw_facet_window`: Can `type_draw` know whether it is in a facet on the very left or very right and at the top or at the bottom, respectively? Then, we could draw fewer axes, if we want.

I see a facet_window_args object in the main tinyplot function with a bunch of information in it. I bet if you pass this to type_draw(), you could match it to ifacet which gives you the index of the current facet.

grantmcdermott · 2024-10-11T16:17:22Z

Very exciting 🚀

> Is there a recommended way to increase the margins between the displays? Because spine plots employ both the left and the right y-axis for labels, we need a little bit more space here.

Yes. That's the fmar parameter, which can be accessed/set either: 1) temporarily as part of the list passed to tinyplot(..., facet.args = list(fmar = xx)), or 2) permanently via tpar(fmar).

Reading and typing quickly on my phone, so I hope I didn't misunderstand the question. I'll be able to look properly in an hour or so.

Edit: Details and default values are here: https://grantmcdermott.com/tinyplot/man/tpar.html#additional-graphical-parameters

zeileis · 2024-10-11T17:04:50Z

Thanks, Grant. Then I see two ways of setting this:

1. We include facet.args in the fargs list for type_data so that it can be modified and subsequently passed on to draw_facet_window.
2. We just call tpar(fmar = ...) within the type_data function.

2 is leaner but I guess 1 is cleaner?

vincentarelbundock · 2024-10-11T17:27:47Z

Yeah, I don't see a good reason to keep away too many things from type_draw(). Seems like a general design.

grantmcdermott · 2024-10-11T18:46:43Z

RE: facet margin adjustments. Another option would be to check for type=='spineplot' and then do an automatic adjustment similar to what we do for other special cases here:

tinyplot/R/facet.R

Lines 127 to 152 in 068b431

# Set facet margins (i.e., gaps between facets)
if (is.null(facet.args[["fmar"]])) {
  fmar = tpar("fmar")
} else {
  if (length(facet.args[["fmar"]]) != 4) {
    warning(
      "`fmar` has to be a vector of length four, e.g.",
      "`facet.args = list(fmar = c(b,l,t,r))`.",
      "\n",
      "Resetting to fmar = c(1,1,1,1) default.",
      "\n"
    )
    fmar = tpar("fmar")
  } else {
    fmar = facet.args[["fmar"]]
  }
}
# We need to adjust for n>=3 facet cases for correct spacing...
if (nfacets >= 3) {
  ## ... exception for 2x2 cases
  if (!(nfacet_rows == 2 && nfacet_cols == 2)) fmar = fmar * .75
}
# Extra reduction if no plot frame to reduce whitespace
if (isFALSE(frame.plot)) {
  fmar = fmar - 0.5
}

E.g. In the last bit of the above code chunk, we subtract 0.5 lines from the fmar values if the plot frame is turned off (to reduce unnecessary whitespace between the individual facets).

Summarising: maybe we just try adding the following below line 152?

if (type == "spineplot") fmar = fmar + 1    # or however many lines you want to increase by

zeileis · 2024-10-12T19:55:12Z

Thanks for the advice, as usual very helpful. I now did the following:

* Avoid any type == "spineplot" to keep type processing as modular as possible.
* Pass facet.args to type_data so that type_spineplot can increase the default fmar.
* Pass facet_window_args to type_draw so that the axis(4) is only drawn in the right-most panel in each row.
* For this I added a new helper function is_facet_position (in facet.R) which can determine whether the current facet panel is on the "left" or the "right" and at the "top" or the "bottom" of the facet grid. Maybe we want to leverage this in other types as well?
* I added an interpretation of xaxt/yaxt to type_spineplot although I had to adjust their meaning a little bit because the axes are non-standard.
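
To illustrate what such a position check could look like, here is a hedged sketch. The function name, its signature, and the assumption that facets are numbered row by row are all made up for this example; the actual is_facet_position in facet.R may differ.

```r
# Hypothetical sketch (assumes facets are numbered row-wise, starting at 1)
facet_position_sketch = function(position, ifacet, nfacets, nfacet_cols) {
  position = match.arg(position, c("left", "right", "top", "bottom"))
  switch(position,
    # first panel of each row
    left = (ifacet - 1L) %% nfacet_cols == 0L,
    # last panel of each row, or the very last panel of an incomplete row
    right = ifacet %% nfacet_cols == 0L || ifacet == nfacets,
    # panels in the first row
    top = ifacet <= nfacet_cols,
    # the last nfacet_cols panels
    bottom = ifacet > nfacets - nfacet_cols
  )
}

# Example: 5 facets arranged in 3 columns (rows: 1 2 3 / 4 5)
facet_position_sketch("right", ifacet = 5, nfacets = 5, nfacet_cols = 3)
```

With a check like this, a draw function could call axis(4) only when the "right" test is TRUE for the current panel.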

grantmcdermott · 2024-10-12T20:18:24Z

> For this I added a new helper function is_facet_position (in facet.R) which can determine whether the current facet panel is on the "left" or the "right" and at the "top" or the "bottom" of the facet grid. Maybe we want to leverage this in other types as well?

Thanks @zeileis, I'll take a look. Would this supplant (duplicate?) the existing logic that we use here for only drawing axes of "outer" facets if the plot frame is turned off?

tinyplot/R/facet.R

Lines 38 to 39 in 068b431

oxaxis = tail(ifacet, nfacet_cols)
oyaxis = seq(1, nfacets, by = nfacet_cols)

and
https://github.com/grantmcdermott/tinyplot/blob/main/R/facet.R#L250-L265

zeileis · 2024-10-12T23:39:01Z

Thanks, Grant, I overlooked that feature. Why is that logic only applied if there is no frame.plot? Shouldn't this be disentangled? This would be helpful in general I guess. But for spineplots in particular because I don't have a standard plotting region so that frame.plot has to be treated differently.

For spineplots, at the moment, I always repeat axis 1 and 3 but axis 4 is only shown for the last panel in a row. But I'm happy to adapt.

zeileis · 2024-10-13T00:15:25Z

Summary

The type_spineplot() is pretty decent now. The main missing feature is by which I will try to tackle next. Some fine details of margins and legends in the case of facets (see above) can still be improved but are no show-stoppers, I think.

My latest additions are:

* flip is supported now (by actually flipping the split direction and not just swapping the variables)
* more granular control of axes/xaxt/yaxt so that frames around the rectangles can be switched off
* facet colors are now supported in the usual way - by deriving simple sequential HCL-based palettes within each facet

Examples

library("tinyplot")
ttnc = as.data.frame(Titanic)
tinyplot(Survived ~ Sex | Class, facet = "by", data = ttnc, weights = ttnc$Freq, type = type_spineplot(),
  palette = "Dark 2", facet.args = list(nrow = 1), axes = "t", lwd = 5)

tinyplot(Survived ~ Class | Sex, facet = "by", data = ttnc, weights = ttnc$Freq, type = type_spineplot(),
  palette = "Dark 2", facet.args = list(ncol = 1), axes = "t", lwd = 5, flip = TRUE)

Problems

Legend symbols: Note that I have to set lwd = 5 in order to produce thick lines in the legend. It would be better to create filled rectangles there. Can I modify the type_draw function to achieve this?

Colors: Often one would want to select a single set of colors coding the levels of the y-variable like this:

Users transitioning from plot() would expect the following code to work but it leads to an error:

p = palette.colors(3, "Pastel 1")
tinyplot(Species ~ Sepal.Width, data = iris, breaks = 4, type = type_spineplot(), col = p)
## Error: `col` must be of length 1 or 1.

For now, I have worked around this by giving the type_spineplot() function another col argument but this isn't ideal. Maybe we would have to special case this once type_spineplot() is an official plot type?

tinyplot(Species ~ Sepal.Width, data = iris, breaks = 4, type = type_spineplot(col = p))

grantmcdermott · 2024-10-13T00:22:06Z

Whoa, these look great. I'll try to do a proper review tomorrow. (@vincentarelbundock please feel free to jump in first if you have time.) Really excited to see this long-standing issue nearing a resolution!

zeileis · 2024-10-13T00:39:20Z

Honestly, I wasn't sure whether we would really get here because there were so many special cases in the old monolithic code 🙈

Also I expected the modularization to be even more complicated. But Vincent's trick of passing a lot of arguments to the workers which can then overwrite them via list2env() is really neat. I hadn't seen this before. 💡

vincentarelbundock · 2024-10-13T00:50:09Z

> I hadn't seen this before. 💡

Me neither 😭

zeileis · 2024-10-13T01:40:06Z

How did you come across this idea? Was this Grant's input? (I didn't follow the details about the initial discussion of the modularization.)

Bonus question: The standard design for type_draw is to cycle through every facet level within each by level. For spineplots I would need to draw all by levels simultaneously within a given facet level. In this case I would only draw anything if iby == 1L but I would need to get the relevant subset of the data. I guess I could piece it together from data_by but I wondered whether you see a more elegant approach?
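
The guard described here can be sketched with a toy loop. Everything below is hypothetical (the data, the loop, and the `drawn` list stand in for the real by/facet machinery); it only illustrates acting once per facet, when iby == 1L, on a subset that contains all by levels.

```r
# Toy data with a by variable and a facet variable
dat = data.frame(
  by = rep(c("a", "b"), each = 4),
  facet = rep(c("f1", "f2"), times = 4),
  y = 1:8
)
by_levels = unique(dat$by)
facet_levels = unique(dat$facet)

drawn = list()
for (iby in seq_along(by_levels)) {
  for (ifacet in seq_along(facet_levels)) {
    # act only once per facet, namely on the first pass through the by loop
    if (iby != 1L) next
    # subset by facet only, so that all by levels are available together
    sub = dat[dat$facet == facet_levels[ifacet], ]
    drawn[[facet_levels[ifacet]]] = table(sub$by)
  }
}
```

Each entry of `drawn` holds the counts for all by levels in one facet, which is the information needed to draw them in a single call.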

vincentarelbundock · 2024-10-13T01:59:37Z

No, I just started by returning a bunch of arguments in a list and reassigning them. Then, I got lazy and used list2env() as a hack. Only later I realized it was kind of a neat trick.

Don't have a great idea now, and I'm not going to be able to concentrate on this for a few days at least since it's holiday here. Sorry!

zeileis · 2024-10-13T02:20:15Z

That's perfectly fine! I should really be doing other things as well (not vacations unfortunately). So I'll wait for Grant's feedback first and then return to handling the by variable later. Enjoy the vacations.

grantmcdermott · 2024-10-15T23:37:04Z

Sorry, I haven't had time to review this properly. I also have to head out of town now... but I just pushed a simple tweak (workaround) that gives square legend symbols.

pkgload::load_all("~/Documents/Projects/tinyplot")
#> ℹ Loading tinyplot

ttnc = as.data.frame(Titanic)

tinyplot(Survived ~ Sex | Class, facet = "by", data = ttnc, weights = ttnc$Freq, type = type_spineplot(),
         palette = "Dark 2", facet.args = list(nrow = 1), axes = "t")

It's a bit hacky and I'll also flag that the bespoke coloring override here means that we don't match the offset black correctly. For example, see the legend key for "1st" is darker than the plot region here.

tinyplot(Survived ~ Sex | Class, facet = "by", data = ttnc, weights = ttnc$Freq, type = type_spineplot(),
         facet.args = list(nrow = 1), axes = "t")

More generally, I need to think about the best way to pass arguments like col, lty, lwd etc. back and forth between the type_spineplot() constructors and the other drawing arguments. (I know that this is tricky b/c of spineplot's unique axes and colouring requirements, which are currently done "just in time" as part of draw_spineplot().)

zeileis · 2024-10-17T11:12:00Z

Grant, thanks for this! Some comments and thoughts below. Nothing urgent, so feel free not to respond any time soon...

* I did the special casing of the default black (#000000FF) for two reasons: (a) to match the default colors of plot()/spineplot() in base R and (b) because starting from black makes the sequential palette rather dark.
* Currently the color computations are indeed done just in time in type_draw but I could easily move them to type_data.
* However, as far as I can tell this would still not enable me to pass the right arguments to draw_legend, or am I overlooking something?
* In general, I think it would be good to give the type constructors some control over the legend as well. Either via type_data or possibly via an additional type_legend or so, if necessary.
* It would also be desirable to get rid of most of the special cases for certain types, especially in draw_legend. For the basic type = "p", "l", "o", and friends such special cases are ok IMO but everything else would ideally be in the type functions.

grantmcdermott · 2024-10-22T02:18:18Z

> However, as far as I can tell this would still not enable me to pass the right arguments to draw_legend, or am I overlooking something?

I don't think you're missing something. Or, at least, I didn't see how to do it either. Speaking of which...

> In general, I think it would be good to give the type constructors some control over the legend as well. Either via type_data or possibly via an additional type_legend or so, if necessary.

Yeah, this is a great shout-out. I don't know if we (collectively) have the time to fix this before the next release... which I was hoping to push through within the next week or two, once this gets merged. But doing so would greatly simplify / negate some of the legacy workarounds that we've carried through from pre-modularization. (Basically, just +1 to your final bullet point.)

zeileis added 2 commits October 10, 2024 12:30

start working on spineplot interface via new type_* infrastructure (4f3d882)
Merge branch 'main' into spineplot (76fe6e4)
introduce additional arguments 'xaxs' and 'yaxs' for tinyplot() and pass them on to draw_facet_window() where they are set via par() (80e31ee)
introduce internal type_info variable to pass information from type_data to type_draw (11c8ce3)

zeileis added 2 commits October 12, 2024 21:46

pass facet.args to type_data and facet_window_args to type_draw (e35abc9)
improve spineplot axis handling: increase facet margins, only draw axis(4) in the panels on the right, increase default facet margins (24103f3)
avoid <- assignment (a174ca8)
by/facet color support (via simply sequential HCL palettes), flip support, refined axis type and lwd handling (83c3fa4)
simplify is_facet_position() (7f010a4)

zeileis and others added 2 commits October 13, 2024 11:12

import grDevices functions for sequential HCL palettes (4bd8920)
legend squares (4232c90)
