Saturday, June 17, 2017

Cubing the sphere

I wrote back in 2015 about an improvement on standard latitude/longitude gridding for fields on Earth. That is essentially representing the earth on a cylinder, with big problems at the poles. It is much better to look to a more sphere-like shape, like a platonic solid. I described there a mesh derived from a cube. Even more promising is the icosahedron, and I wrote about that more recently, here and here.

I should review why and when gridding is needed. The original use was in mapping, so you could refer to a square where some feature might be found. The uniform lat/lon grid has a big merit - it is easy to decide which cell a place belongs in (just rounding). That needs to be preserved in any other scheme. Another use is in graphics, where shading or contouring is done. This is a variant of interpolation. If you know some values in a grid cell, you can estimate other places in the cell.

A variant of interpolation is averaging, or integration. You calculate cell averages, then add up to get the global. For this, the cell should be small enough that behaviour within it can be regarded as homogeneous. One sample point is reasonably representative of the whole. Then they are added according to area. Of course, the problem is that "small enough" may mean that many cells have no data.

A more demanding use still is in solution of partial differential equations, as in structural engineering or CFD, including climate GCMs. For that, you need to not only know about the cell, but its neighbors.

A cubed sphere is just a regular rectangular grid (think Rubik) on the cube projected, maybe after re-mapping on the cube, onto the sphere. I was interested to see that this is now catching on in the world of GCMs. Here is one paper written to support its use in the GFDL model. Here is an early and explanatory paper. The cube grid has all the required merits. It's easy enough to find the cell that a given place belongs in, provided you have the mapping. And the regularity means that, with some fiddly bits, you can pick out the neighbors. That supported the application that I wrote about in 2015, which resolved empty cells by using neighboring information. As described there, the resulting scheme is one of the best, giving results closely comparable with the triangular mesh and spherical harmonics methods. I called it enhanced infilling.

I say "easy enough", but I want to make it my routine basis (instead of lat/lon), so that needs support. Fortunately, the grids are generic; they don't depend on problem type. So I decided to make an R structure for standard meshes made by bisection. First the undivided cube, then 4 squares on each face, then 16, and so on. I stopped at 64, which gives 24576 cells. That is the same number of cells as in a 1.6° square mesh, but the lat/lon grid has some cells larger. You have to go to 1.4° to get equatorial cells of the same size.

I'll give more details in an appendix, with a link to where I have posted it. It has a unique cell numbering, with an area of each cell (for weighting), coordinated of the corners on the sphere, a neighbor structure, and I also give the cell numbers of all the measurement points that TempLS uses. There are also functions for doing the various conversions, from 3d coordinates on sphere to cube, and to cell numbering.

There is also a WebGL depiction of the tesselated sphere, with outline world map, and the underlying cube with and without remapping.

The basic process of mapping from sphere surface to cube goes as following (with options) starting from the lat/lon data
  1. You may want to rotate around the axis relative to the cube; add something to the longitudes
  2. Convert lat/lon to radians θ,φ by miltiplying by π/180.
  3. Convert to z (3D on unit sphere) by z=(cos(θ)*sin(φ),sin(θ), -cos(θ)*cos(φ)). This gives a conventional start view with 0N,0E as 0,0 in the x-y plane.
  4. Project to u on unit cube. For each z, divide by max(abs(z)), abs being abs of coordinates.
  5. Remap on cube. I use v = tan(b*u)/tan(b), applied to each coordinate, with b=0.86.
The reason for the last mapping is that the projection concentrates points towardd the face centers. As you go away from the center, the area ratio (sphere to cube) is (1/cos(α))^3, where α is angle from line through face center. So it reaches a maximum of sqrt(27) at corners. The last mapping counters this, reducing the discrepancy to about 20%. Any function with cubic-like behaviour will do; I use tan() because it is easily reversed, which is often needed.

So I'll show here a WebGL active plot of how it works. This is a 16x16 mesh on each face. The plot initially looks overly crowded, but you can hide things with the checkboxes top right. It shows the grid on the surface, with a map and dots for the stations. Underneath there is the cube that is projected onto the surface. It shows in mauve the logical regular grid, and in green the result of the local mapping. It is the green grid that is projected, but the regular grid is what the process seeks to match in area to the cells on the sphere surface.

Appendix - structure details.

The structure (list) as a 1 MB R data file, is here. At top level it has a readme string and 6 lists that are called cube$n1, ..n2 up to ..n64, or you can call them cube[[1]] etc. Each has material for a grid of the nominated degree of face division (1x1, 2x2 etc). I normally start use by writing, say, p=cube$n16, and thereafter dea only with p.

Most of the data is in the form of a "score". This is derived from numbering and underlying cubic grid of twice the division. Nodes and cells have scores, the latter derived from the face center. You don't need to know how it is derived, just that nodes and faces have them.

The main data for the cube are p$inodes and p$ifaces, which are the scores for the logical (regular) grid. There is also a triangular mesh, pointing to the nodes in score order. There is a neighbor list p$nb, which lists each cell and the scores of its 4 neighbors. There is also a list p$stns of the cells, by score, that each station in the TempLS database is mapped into. That is just what you need for gridded integration, along with p$wt, which are the areas of the cells on the sphere. There is also a convenience function p$lines, which gives the nodes in the order you need to draw the gridlines.

The main functionality is from the conversion programs, with names like c2s etc. The ones provided are c2s and s2c, c2m and m2c, s2i, i2s, c2i, i2c, c2m, m2c. The symbols stand for
  • s are 3d coordinates on the unit sphere
  • i are the scores. So s2i tells what cell score a surface point belongs in, and i2s gives the 3D coords of the face score i. Every function involving i must have the list p as a second argument.
  • c are the 3D coordinated of the regular grid on the unit cube
  • m are the mapped points (using that tan() function. You won't often need them.

You can of course nest the conversions. Each of these sublists such as cube$n16 has a $readme string.

The dataset is located as an R datafile here.


  1. Elsewhere Olof questions whether Gerald Browning is a talker or a doer.

    The issue with scientists such as Browning is that they create these complex worlds of math that they live in during the course of their career and are surprised that nothing ever comes of it. In his case, Browning doesn't ride into the sunset but instead stirs the pot by claiming :

    "Climate Models

    1. the climate models are based on the wrong set of differential equations, i.e., the hydrostatic equations instead of the well posed multi-scale hyperbolic system or the well posed reduced system (Browning and Kreiss 2002).

    2. The use of the hydrostatic system leads to columnar heating for the vertical velocity. In order to reduce the noise introduced by the point wise (lat,lon) heating, an unrealistically large dissipation must be applied and this reduces the numerical accuracy of the spectral method by two orders of magnitude (Browning, Hack and Swarztrauber and ECMWF plots shown on this site).

    3. For a model based on the hydrostatic system the accuracy of the numerical approximation is destroyed by the boundary layer parameterization within a matter of days (Sylvie Grsvel et al. on this site).

    4. There are no mathematical or numerical justifications for running a numerical model beyond the point where numerical accuracy is lost, let alone when it is based on the wrong equations and inaccurate parameterizations.

    Who knows, maybe he is right about all this. Yet, I doubt anything will come of it even if he is listened to. All the correct math in the world won't matter if one hasn't set down the right premise and that it can be applied, as in applied math leading to applied physics.

    Pierre-Simon Laplace was probably more responsible than anyone for Browning's consternation when he created his set of hydrostatic tidal equations in 1776. These equations are a reduced form of the complete set of primitive equations that go into every GCM developed, but also can be simplified to the extent that they can be used for applied physics. The ultimate example of this is how the tidal equations reduce to almost nothing (i.e. input forcing => scaling transform => output) when applied for straightforward tidal analysis.

    Something slightly more advanced than the reduced complexity of tidal analysis is likely what goes into ENSO and QBO analysis. We don't have to listen to Browning, but take the path of Laplace and iterate from there. Browning was likely down in the weeds from when he started decades ago and never really emerged.

    1. This paper "Recent progress and review of issues related to Physics Dynamics Coupling in geophysical models"

      They say at the end "Probably the most sensitive parameter is the time step."

      That is true if the forcing is not strong and the problem relies on uncertain initial conditions. If the forcing is strong and the response is stable and converges, that forcing flows through to the response as clear as a bell irrespective of the chosen time step. For example, allowing a 60 Hz hum to enter an amplifier's input stages will certainly result in a 60 Hz hum visible in the output.

      So what do they recommend?
      "One option is to reduce the equation set, as in section 5, which then renders the generation of a reference solution more straight forward and allows for more rigorous mathematical analysis. Another option, as discussed in section 6, is to reduce the complexity of the GCM. Obviously the balance has to be right. Oversimplification does not challenge the coupling as the real model would, overly complex setups make the analysis intractable."

      So why don't we do this?

      And that coupling may not be coupling but a common mode response to other external factors, such as a gravitational forcing. I have shown how the ocean (via ENSO) and atmosphere (via QBO) respond to precisely the same lunar forcing.

      Contrary to what Mr.Browning says, it appears that the consensus climate science community is well aware of the issues. I really think there will be a path forward if we start with the obvious candidates to model, i.e. ENSO and QBO.

    2. Follow on to this post

      If one thinks that wind is the forcing agent for ENSO and not angular momentum variations, consider the following physics of tidal forcing: Imparting a 1 millisecond slowdown (or speedup) on the rotation of the earth with a surface velocity of almost 500 meters/second over the course of a couple of weeks (a fortnight) will result in an inertial lateral movement of ~ 1/2 a meter in the volume of the Pacific ocean due to Newton's first law.

      This does not seem like a big deal until you realize that the thermocline can absorb this inertial impulse as a vertical sloshing, since the effective gravity is reduced by orders of magnitude due to the slight density differences above and below the thermocline. This is reflected as an Atwood number and shows up in Rayleigh-Taylor instability experiments, e.g. SEE THIS PAPER

      With an Atwood number less than 0.001 which is ~0.1% density differences in a stratified fluid, the 0.5 meter displacement that occurs over two weeks now occurs effectively over half an hour, or alternately is 1000X more sensitive than an unstratified volume. Either way, its an elementary scaling exercise to evaluate the impact.

      So intuitively, one has to ask the question of what would happen if the ocean was translated laterally by 1/2 a meter over the course of a 1/2 an hour? We know what happens with earthquakes in something as basic as a swimming pool or as threatening as a tsunami. But this is much more subtle because we can't obviously see it, and why it has likely been overlooked as a driver of ENSO.

      All that math modeling of ENSO described in the first link works backwards to this point. The actual forcing working on the earth's rotation can lead to the response shown, both in the dynamic sense of precisely tracking the measured ENSO time-series and now in terms of a physical order-of-magnitude justification.

      This effect is real and not imagined, and so should be accommodated in ENSO models.