Geometry of color perception. Part 1: structures and metrics of a homogeneous color space

Abstract

This is the first half of a two-part paper dealing with the geometry of color perception. Here we analyze in detail the seminal 1974 work by H.L. Resnikoff, who showed that there are only two possible geometric structures and Riemannian metrics on the perceived color space \(\mathcal{P} \) compatible with the set of Schrödinger’s axioms completed with the hypothesis of homogeneity. We recast Resnikoff’s model into a more modern colorimetric setting, provide a much simpler proof of the main result of the original paper, and motivate the need for psychophysical experiments to confirm or refute the linearity of background transformations, which act transitively on \(\mathcal{P} \). Finally, we show that the Riemannian metrics singled out by Resnikoff through an axiom on invariance under background transformations are not compatible with the crispening effect, thus motivating the need for further research on perceptual color metrics.

1 Introduction and state of the art

This first half of a two-part paper provides a thorough review and a critical analysis of the pioneering work of H.L. Resnikoff on color perception developed in the papers [1–3] and the book [4]. These works are among the major inspirations for a modern program of re-foundation of colorimetry that will be discussed in the second part, in which it will be shown how to recast Resnikoff’s model in a quantum-like theory via the framework of Jordan algebras.

Even if it may be surprising at first glance, Resnikoff belongs to a vast ensemble of mathematicians, physicists, biologists, philosophers, and even poets fascinated by the concept of color. The list is impressive, ranging from Plato to Wittgenstein, passing through Aristotle, Descartes, Hooke, Newton, Euler, Young, Helmholtz, Maxwell, Grassmann, Riemann, Goethe, Schopenhauer, Locke, Weber, Fechner, Dalton, Hering and, last but not least, Schrödinger (see e.g. [5] for a modern translation of Schrödinger’s work on color).

In fact, it was the research on the mathematical analogies between optics and color on one side and the oscillating behavior of quantum particles on the other that led Schrödinger to propose the famous equation which bears his name in quantum mechanics [6]. As will be recalled in Sect. 2, Schrödinger synthesized the most important findings about the mathematical theory of color perception into a coherent set of axioms, introducing one of his own. This can be thought of as a psycho-physical counterpart of what Maxwell did for electromagnetism.

The experiments of Wright and Guild, see e.g. [7], could have paved the way for further development in the mathematical understanding of color perception; however, the then recently founded Commission Internationale de l’Éclairage (CIE) took a more practical path by building geometrically flat color spaces, which had the advantage of being easier to handle for engineering purposes. While the XYZ space still stands today as a handy color space for colorimetry, its developments until recent years, see e.g. [8], lacked mathematical rigor and introduced ad-hoc parameters to adapt newly discovered phenomena to the existing color space structures instead of modifying their geometry to fit the new observations.

Resnikoff’s papers and book, instead, call into question the flat geometry of the color space. They were written in the mid-1970s, about the same period when researchers in relativistic quantum field theory developed the standard model of fundamental physical interactions and when some first attempts to fuse quantum mechanics and general relativity into a single theory were proposed. This zeitgeist could explain why Resnikoff decided to use techniques that are quite common in theoretical physics (such as Riemannian geometry, homogeneous spaces, Lie groups and algebras) to study color perception. In this sense, his achievements could be considered a very elegant example of ‘theoretical psycho-physics’.

In spite of its extreme originality and depth, Resnikoff’s work has remained practically unnoticed until today, probably because the mathematical knowledge needed to understand the meaning of his findings is quite vast and does not belong to the typical mathematical background of scientists working on colorimetry.

One of the aims of this paper is to rewrite Resnikoff’s results in more modern and pedagogical terms, thus making them accessible to a wider range of researchers in colorimetry, vision science, and image processing.

We will discuss in particular detail the homogeneity axiom for the space of perceived colors \(\mathcal {P}\), which led Resnikoff to show that, if we accept the homogeneity hypothesis, \(\mathcal {P}\) can only be the canonical Helmholtz–Stiles space well-known to colorimetrists, or a completely new space of hyperbolic nature. We will provide an alternative and much simpler proof of this result, and we will also point out an error in the original demonstration proposed by Resnikoff in [1].

Unlike the standard colorimetric setting, Resnikoff established his theory in what we call today color in context i.e. a color stimulus embedded in a uniform background. This aspect appears to be crucial for the development of his theory, because the group of transformations acting transitively on \(\mathcal {P}\) is identified with background changes. However, we remark that a major issue remains open: background transformations must be linear to fit in Resnikoff’s theory, but no experimental data are available yet to confirm or refute this hypothesis. For this reason, we will discuss a psycho-physical experiment that can be used to verify the linearity of background transformations.

Finally, we will show that the crispening effect contradicts Resnikoff’s hypothesis that background changes are isometries with respect to the perceptual color metric, thus underlining the need for further research on perceptual color metrics.

2 Review of Schrödinger’s axioms for the space of perceived colors

Schrödinger’s axioms for the space of perceived colors can be summarized by the following statement: the space of perceived colors is a regular convex cone embedded in a real vector space of dimension less than or equal to 3.

The validity of this sentence, however, is bounded within the limits of the standard observational conditions of colorimetry, which, as we are going to discuss, are very restrictive and far from those of a natural visual scene.

We start with the notation and nomenclature that we need for the rest of the paper.

  • \(\varLambda=[\lambda_{\text{min}},\lambda_{\text{max}}]\) denotes the visual spectrum. The extrema of Λ are left unspecified because their numerical values are not important and because there is no agreement about them. Typically, one chooses \(\lambda_{\text{min}}=380\mbox{ nm}\) (extreme violet) and \(\lambda_{\text{max}}=780\mbox{ nm}\) (extreme red).

  • \(\textbf{x}:\varLambda\to \mathbb {R}^{+}\) is the light function representing the electromagnetic radiation associated with the color stimulus or physical light. A spectral color stimulus \(\textbf{x}_{\lambda_{0}}(\lambda)\) relative to the wavelength \(\lambda_{0}\) is a narrow-band visible radiation, generally modeled by a Gaussian centered at \(\lambda_{0}\) with a small standard deviation (typically of the order of 1 nm), or by a piece-wise constant function that vanishes everywhere in Λ apart from a small interval of wavelengths containing \(\lambda_{0}\).

  • Since light stimuli have finite energy i.e. \(\int_{\varLambda}\textbf{x}(\lambda)^{2} \, d\lambda< +\infty\), a physical light can be considered as an element of \(L^{2}(\varLambda)\subset L^{1}(\varLambda)\), where the inclusion holds because the Lebesgue measure of Λ is finite. Light stimuli are real valued, so \(L^{2}(\varLambda)\) will be considered as a real vector space, and we will write

    $$ L_{+}^{2}(\varLambda)=\bigl\{ x:\varLambda\to[0,+\infty), x\in L^{2}(\varLambda)\bigr\} $$
    (1)

    for the space of color stimuli.

  • Standard colorimetric observational conditions: in the conventional tests, a physical light x is presented in a dim room to an observer with an aperture angle of either 2 degrees (foveal vision) or 10 degrees (extra-foveal vision). As we will see in Sect. 3.2, Resnikoff does not consider these standard observational conditions since, in his model, x is presented as a small central area seen against a uniformly illuminated background. In this latter case we talk about color in (uniform) context. Experiments about color in non-uniform context are still quite rare, see e.g. [9, 10], and confined to very simple geometric configurations, far from the complexity of natural scenes. In this section, only the standard colorimetric observational conditions will be considered.

  • Color matching: the typical way to compare the perception of two physical lights \(\textbf{x},\textbf{y} \in L_{+}^{2}(\varLambda)\) is to divide the field of view into two (creating a bipartite field) and to put the two color stimuli side-by-side. x and y are said to match if no edge is perceived between them. We stress that this is not the only way to perceptually compare color stimuli, but it is by far the most common for the standard colorimetric observational conditions described above.

  • For any \(\textbf{x},\textbf{y} \in L_{+}^{2}(\varLambda)\), we write \(\textbf{x}\sim\textbf{y}\) to mean that x and y are perceived as identical (see footnote 1) in a color matching experiment. We stress that, when we write the expression \(\textbf{x}\sim\textbf{y}\), the color stimulus x is shown on one side of the bipartite field and y on the other side. The symbol \(\sim\) is well posed since it is indeed an equivalence relation [11].

  • The space of perceived colors in the standard colorimetric observational condition is defined as the set of equivalence classes of equally perceived lights i.e. the quotient space

    $$ \mathcal {P}= L_{+}^{2}(\varLambda)/{\sim} . $$
    (2)

    The equivalence class of x will be simply denoted by x:

    $$ x=[\textbf{x}]_{\sim}. $$
    (3)

    Given \(x,y\in \mathcal {P}\), \(x=y\) means \([\textbf{x}]_{\sim}= [\textbf {y}]_{\sim}\) i.e. \(\textbf{x} \sim\textbf{y}\), so that equality in \(\mathcal {P}\) means perceptual match of color stimuli. The 0 of \(\mathcal {P}\) is the equivalence class of all physical lights that are so dim as to be perceived as black.

  • We can endow \(\mathcal {P}\) with the operations of sum and multiplication by a positive real scalar. For any \(\textbf {x},\textbf{y} \in L_{+}^{2}(\varLambda)\) and any positive scalar \(\lambda \in \mathbb {R}^{+}\), \(\textbf{x}+\textbf{y}\) is interpreted as superposition of light beams and λx as the intensity modulation of x by the factor λ. These operations defined on \(L^{2}_{+}(\varLambda)\) can be passed to the quotient space \(\mathcal {P}\) simply by defining

    $$ \lambda_{1} x+\lambda_{2} y=[ \lambda_{1} \textbf{x}+\lambda_{2} \textbf {y}]_{\sim}, \quad\forall\lambda_{1}, \lambda_{2} \in \mathbb {R}^{+}, \forall \textbf{x},\textbf{y} \in L_{+}^{2}(\varLambda), $$
    (4)

    so \(z=[\textbf{z}]_{\sim}\) can be written as \(z=\lambda_{1} x + \lambda _{2} y\) if and only if \(\textbf{z}\sim\lambda_{1} \textbf{x}+\lambda_{2} \textbf{y}\). Dubois proved in [11] that, if we consider the standard observational conditions of colorimetry, these operations are well defined in the sense that they do not depend on the choice of the representative in the perceptual equivalence class. To summarize, in \(\mathcal {P}\) it is possible to form conical combinations of perceived colors i.e. linear combinations with positive real coefficients. In particular, convex combinations of elements of \(\mathcal {P}\) i.e. conical combinations with coefficients summing to 1, are well defined.

  • The smallest vector space containing \(\mathcal {P}\) is

    $$ V=\mathcal {P} - \mathcal {P}=\{x-y, x,y\in\mathcal {P}\}, $$
    (5)

    see e.g. [12]. The elements of V written as −y, with \(y\in\mathcal {P}\), will be called virtual colors. To understand the role of virtual colors in colorimetry, it is worthwhile recalling the famous Wright and Guild experiments, see e.g. [7] or [8], which have shown that, for a fixed color stimulus \(\textbf{z}\in L^{2}_{+}(\varLambda)\), there are three spectral color stimuli \(\textbf{x},\textbf{y},\textbf{w}\in L^{2}_{+}(\varLambda)\) and three real positive coefficients \(a,b,c\in \mathbb {R}^{+}\) such that either \(a\textbf{x}+b\textbf{y}+c\textbf{w}\sim\textbf {z}\) or \(a\textbf{x}+b\textbf{y}\sim\textbf{z}+c\textbf{w}\). In the latter case, by recalling the convention above, we must superpose z with cw on one side of a bipartite field to color match the superposition \(a\textbf{x}+b\textbf{y}\) on the other side. This is where virtual colors come into play: given \(x,y,w,z\in \mathcal {P}\), \(a,b\in \mathbb {R}^{+}\), and \(c<0\), the term cw in \(z=ax+by+cw\) belongs to V but not to \(\mathcal {P}\), because c is negative. The colorimetric interpretation is the following: the color stimulus \(a\textbf{x}+b\textbf{y}\) shown to an observer on one side of a bipartite field matches \(\textbf {z}+(-c)\textbf{w}\) shown on the other side i.e. \(a\textbf {x}+b\textbf{y}\in[\textbf{z}+(-c)\textbf{w}]_{\sim}\).
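The role of a negative coefficient in a color match can be illustrated numerically. The following is a minimal sketch, assuming that perceived colors have already been given coordinates in \(V\cong \mathbb {R}^{3}\); the function names and all numerical values are hypothetical and chosen only for illustration, they are not measured colorimetric data. The coefficients a, b, c of \(z=ax+by+cw\) are obtained by Cramer's rule.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def match_coefficients(x, y, w, z):
    """Solve a*x + b*y + c*w = z for (a, b, c) by Cramer's rule.

    x, y, w are hypothetical coordinates of three primaries in V = R^3 and
    z is the target color; none of these values comes from the paper.
    """
    cols = [x, y, w]
    # Matrix whose columns are the three primaries.
    m = [[cols[j][i] for j in range(3)] for i in range(3)]
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three primaries are linearly dependent")
    coeffs = []
    for k in range(3):
        # Replace column k by the target z (Cramer's rule).
        mk = [[z[i] if j == k else cols[j][i] for j in range(3)]
              for i in range(3)]
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)


if __name__ == "__main__":
    # Illustrative (made-up) primaries and target.
    x, y, w = (1.0, 0.2, 0.0), (0.2, 1.0, 0.2), (0.0, 0.2, 1.0)
    z = (1.4, 2.15, 0.15)
    a, b, c = match_coefficients(x, y, w, z)
    if c < 0:
        # Negative c: w must be superposed with z on the target side.
        print("a*x + b*y matches z + (-c)*w with", a, b, -c)
```

With these made-up values the third coefficient is negative, which corresponds to the second kind of match described above: the primary w has to be moved to the side of the target.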

With this notation, Schrödinger’s axioms, see [13], can be stated like this.

  • Axiom 1 (Newton 1704, [14]) If \(x\in \mathcal {P}\) and \(\alpha\in \mathbb {R}^{+}\), then \(\alpha x \in \mathcal {P}\).

  • Axiom 2 (Schrödinger 1920, [13]) If \(x \in \mathcal {P}\), \(x\neq0\), then there does not exist any \(y \in \mathcal {P}\), \(y\neq0\) such that \(x+y=0\).

  • Axiom 3 (Grassmann 1853, [15] & Helmholtz 1866 [16]) For every \(x,y \in \mathcal {P}\) and for every \(\alpha\in[0,1]\), \(\alpha x+(1-\alpha)y \in \mathcal {P}\).

  • Axiom 4 (Grassmann 1853, [15]) For every quadruple of perceived lights \(x_{k}\in \mathcal {P}\), \(k=1,\ldots,4\), there are coefficients \(\alpha_{k}\in \mathbb {R}\), not all simultaneously null, such that \(\sum_{k=1}^{4} \alpha_{k} x_{k} =0\).

Let us now discuss the colorimetric and mathematical meaning of the axioms. A finer version of Axiom 4 will be obtained mixing Axioms 1, 2, and 4; furthermore, an important property of \(\mathcal {P}\) will be underlined as a consequence of Axioms 1 and 3.

Mathematically speaking, the meaning of Axiom 1 is simple: \(\mathcal {P}\) is an infinite cone embedded in V. However, notice that this is an idealization: when α gets very large, photoreceptors saturate until the glare limit is reached and vision is impaired. Instead, as α gets small, we pass to the mesopic or to the scotopic conditions, in which both cones and rods or only the rods, respectively, are activated. If α approaches 0, then we lose our ability to see. Thus, the cone \(\mathcal {P}\) is truncated both from above and below, with the shift from the photopic to the scotopic condition (the so-called Purkinje effect [7]) still representing a major challenge from both a mathematical and a colorimetric point of view. The discussion of these important issues deserves a paper of its own and here we will just consider the idealized model of \(\mathcal {P}\) as an infinite cone.

Axiom 2 means that no superposition of perceived colors different from 0 is perceived as the absence of light (see footnote 2). Mathematically speaking, this implies that the cone \(\mathcal {P}\) is regular (also called proper).

Axiom 3 means that the line segment that joins the perceived colors x and y consists entirely of perceived colors, thus Axioms 1, 2, and 3 altogether imply that \(\mathcal {P}\) is a regular convex cone. Axiom 3 is well known to be equivalent to \(\mathcal {P}\) being closed under convex combinations i.e. linear combinations with coefficients between 0 and 1 whose sum is 1.

This fact, along with Axiom 1, implies that \(\mathcal {P}\) is actually closed under conical combinations. Indeed, for all \(\alpha_{1},\alpha _{2}\in \mathbb {R}^{+}\) and \(x_{1},x_{2}\in \mathcal {P}\), \(\frac{1}{\alpha_{1}+\alpha _{2}} (\alpha_{1}x_{1}+\alpha_{2}x_{2} )\equiv z\) is a convex combination of elements of \(\mathcal {P}\), so \(z\in \mathcal {P}\) thanks to Axiom 3, but then also \((\alpha_{1}+\alpha_{2})z=\alpha_{1}x_{1}+\alpha_{2}x_{2}\in \mathcal {P}\) thanks to Axiom 1. By iterating this argument, we have that \(\sum_{k=1}^{n} \alpha_{k} x_{k} \in \mathcal {P}\) \(\forall\alpha_{k}\in \mathbb {R}^{+}\), \(x_{k}\in \mathcal {P}\), \(k=1,\ldots,n\).
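The two-step argument above (a convex combination followed by a rescaling) can be traced on a toy example. The sketch below assumes, purely for illustration, that the cone is the open positive octant of \(\mathbb {R}^{3}\); this choice and the helper names are hypothetical stand-ins and are not claimed to describe the actual space of perceived colors.

```python
def in_cone(v):
    """Membership test for the toy cone: the open positive octant of R^3."""
    return all(c > 0 for c in v)


def scale(alpha, v):
    """Multiplication of a vector by the scalar alpha."""
    return tuple(alpha * c for c in v)


def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))


def conical_via_axioms(alpha1, x1, alpha2, x2):
    """Build alpha1*x1 + alpha2*x2 using a convex combination, then a rescaling."""
    s = alpha1 + alpha2
    t = alpha1 / s                            # weight in (0, 1)
    z = add(scale(t, x1), scale(1 - t, x2))   # convex combination (Axiom 3)
    assert in_cone(z)                         # z stays in the toy cone
    return scale(s, z)                        # rescaling by s > 0 (Axiom 1)


if __name__ == "__main__":
    x1, x2 = (1.0, 2.0, 3.0), (4.0, 0.5, 1.0)
    print(conical_via_axioms(2.0, x1, 3.0, x2))
    print(add(scale(2.0, x1), scale(3.0, x2)))  # direct computation, for comparison
```

Up to floating-point rounding, both printed vectors coincide, which mirrors the identity \((\alpha_{1}+\alpha_{2})z=\alpha_{1}x_{1}+\alpha_{2}x_{2}\) used in the argument.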

Axioms 1 and 3 imply that \(\mathcal {P}\) is a connected and contractible cone i.e. \(\mathit{id}_{\mathcal{P}}\) is homotopic to a constant map [17]. Intuitively, this means that \(\mathcal {P}\) can be continuously contracted to a single point. This simple remark will turn out to be crucial for the analysis of \(\mathcal {P}\).

Axiom 4 means that every collection of more than three perceived colors is a linearly dependent family in V.

A finer version of Axiom 4 can be obtained with the following argument. First of all, notice that Axioms 1–3 prevent the \(\alpha_{k}\)s from all having the same sign. In fact, suppose that all the coefficients \(\alpha_{1},\ldots,\alpha_{4}\) are positive; then \(\bar{x} = \sum_{k=1}^{3} \alpha_{k} x_{k} \in \mathcal {P}\) (thanks to what has just been proven) and \(\bar{y} = \alpha_{4} x_{4} \in \mathcal {P}\) (thanks to Axiom 1), so that \(\sum_{k=1}^{4} \alpha_{k} x_{k} =0\) implies \(\bar{x} + \bar{y} = 0\), which is prevented by Axiom 2. A similar argument can be used when all the coefficients \(\alpha_{1},\ldots,\alpha_{4}\) are negative.

To summarize, Axioms 1–4 imply this stronger version of Axiom 4.

  • Axiom 4’ For every quadruple of perceived lights \(x_{k}\in \mathcal {P}\), \(k=1,\ldots,4\), there are coefficients \(\alpha_{k}\in \mathbb {R}\), not all with the same sign, such that \(\sum_{k=1}^{4} \alpha_{k} x_{k} =0\).

There are only two options coherent with Axiom 4’. The first option is that three coefficients have the same sign and the remaining one has the opposite sign. Since equality in V is color matching and a negative coefficient means that the corresponding light stimulus must be shown on the other side of the bipartite field, this means that one light stimulus color matches the superposition of the other three light stimuli.

In the second option, two coefficients are positive and two are negative: this means that the superposition of two light stimuli color matches the superposition of the other two light stimuli. We thus see that Schrödinger’s axioms are coherent with the experiments of Wright and Guild [8].
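The two sign patterns can be checked on concrete vectors. The sketch below assumes that the four perceived colors are given as coordinate triples in \(\mathbb {R}^{3}\) (hypothetical integer values, chosen only so that the arithmetic is exact) and computes a dependence relation with the classical cofactor formula: the k-th coefficient is, up to an alternating sign, the determinant of the other three vectors.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def dependence_coefficients(xs):
    """Coefficients (alpha_1, ..., alpha_4) with sum_k alpha_k * x_k = 0.

    xs is a list of four vectors of R^3. The formula returns the zero
    tuple when the four vectors span less than three dimensions, a
    degenerate case that this sketch does not treat.
    """
    alphas = []
    for k in range(4):
        others = [x for j, x in enumerate(xs) if j != k]
        alphas.append((-1) ** k * det3(others))
    return tuple(alphas)


def sign_pattern(alphas):
    """Return (number of positive, number of negative) coefficients."""
    pos = sum(1 for a in alphas if a > 0)
    neg = sum(1 for a in alphas if a < 0)
    return pos, neg


if __name__ == "__main__":
    # First option: one stimulus matches the superposition of the other three.
    print(dependence_coefficients([(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]))
    # Second option: two stimuli match the superposition of the other two.
    print(dependence_coefficients([(2, 1, 1), (1, 2, 3), (1, 1, 2), (2, 2, 2)]))
```

Since a dependence relation is only defined up to a common nonzero factor, the pattern (3, 1) is equivalent to (1, 3); only the split between the two sides of the bipartite field is meaningful.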

Another direct consequence of Axiom 4 is that the dimension of V is at most 3. In particular, we call an observer for which:

  • \(\operatorname{dim}(V)=3\): trichromat;

  • \(\operatorname{dim}(V)=2\): dichromat;

  • \(\operatorname{dim}(V)=1\): monochromat;

  • \(\operatorname{dim}(V)=0\): blind.

Following [18], we observe that the projection map

$$ \begin{aligned} \pi: L_{+}^{2}(\varLambda) & \longrightarrow \mathcal {P}\\ \textbf{x} & \longmapsto x \end{aligned} $$
(6)

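is far from being injective: since \(L_{+}^{2}(\varLambda)\) spans an infinite-dimensional space while \(\mathcal {P}\) is embedded in a vector space of dimension at most 3, infinitely many spectrally different lights coincide perceptually.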

In what follows, we will fix our attention only on the trichromatic case i.e. \(\operatorname{dim}(V)=3\), so that, from now on, \(\mathcal {P}\) will be interpreted as a regular convex cone embedded in a three-dimensional real vector space.

3 Resnikoff’s homogeneity axiom for \(\mathcal {P}\)

As stated in the introduction, in [1] Resnikoff used the theory of homogeneous spaces to study the geometry and the metrics of the perceived color space \(\mathcal {P}\). This is far from being a trivial task, since the equivalence classes that make up \(\mathcal {P}\) are very difficult to characterize from a mathematical point of view. Thus, a theory of \(\mathcal {P}\) that bypasses the use of these equivalence classes is highly desirable.

Understanding why Resnikoff considered homogeneity a paramount property in the analysis of \(\mathcal {P}\) is a key point. For this reason, it is worth recapping the basic information about homogeneous spaces.

If X is a nonempty set and G is a group, then a map \(\eta: G\times X \to X\), \((g,x)\mapsto g(x)\) is said to be a left action of G on X if \(e(x)=x\) for all \(x\in X\), e being the neutral element of G, and if \((gh)(x)=g(h(x))\) for all \(g,h\in G\) and \(x\in X\). In this case, X is called a G-space. If we fix any g, then the map \(\eta_{g}:X\to X\), \(x\mapsto g(x)\) is bijective, its inverse being \(\eta_{g^{-1}}\), because \(g(x) \mapsto g^{-1}(g(x))=e(x)=x\). So, a left G action on X can equivalently be defined as a group homomorphism \(G\to\operatorname{Aut}(X)\), where \(\operatorname{Aut}(X)\) is the group of all automorphisms (bijective functions) of X. Any element \(g\in G\) is called a symmetry of X.

Since \(\mathcal {P}\) is not merely a set but a convex cone embedded in a vector space, we are more interested in G-spaces with an intrinsic structure. In this case, we call X a G-space if the elements of G preserve the structure of X i.e. if \(\operatorname{Aut}(X)\) is the group of the automorphisms of the category in which X belongs e.g. homeomorphisms for topological spaces, diffeomorphisms for smooth manifolds, invertible linear maps for vector spaces, and so on.

For any fixed \(x\in X\), the set \(G\cdot x=\{g(x) \in X : g\in G\}\) is called the G-orbit of x. X is said to be a G-homogeneous space if \(G\cdot x=X\) for every \(x\in X\) i.e. if the action of G on X is transitive. In this case, for every pair of elements \(x,y \in X\), there exists a transformation \(g\in G\) such that \(g(x)=y\), which explains why the concept of homogeneity is said to translate into mathematical terms the fact that no point of the space is special. If \(Y\subset X\), then we define the group \(\operatorname{Aut}(Y)=\{\varphi\in\operatorname{Aut}(X) : \varphi(Y)=Y\}\) of automorphisms of X whose restrictions to Y are automorphisms of Y; Y is then said to be G-homogeneous if the action defined by the group homomorphism \(G\to\operatorname{Aut}(Y)\) is transitive on Y.

Let us apply this to the case of interest for us i.e. that of a general cone \(\mathcal {C}\) embedded in a real vector space V of finite dimension n. In this case, if we define \(\operatorname{GL}(\mathcal {C})=\{T\in\operatorname{GL}(V), T(\mathcal {C})=\mathcal {C}\}\), then \(\mathcal {C}\) is said to be a homogeneous cone in V if, for any two points \(a,b\in \mathcal {C}\), there exists \(T\in\operatorname{GL}(\mathcal {C})\) such that \(b=T(a)\). We will also need a localized version of this property: \(\mathcal {C}\) is a locally homogeneous cone in V if, for every \(a\in \mathcal {C}\), there is an open neighborhood \(U_{a}\) of a such that, for every \(b\in U_{a}\), there exists \(T\in\operatorname{GL}(\mathcal {C})\) such that \(b=T(a)\), where open refers to the Euclidean topology that \(\mathcal {C}\) inherits from V.
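A standard example of a homogeneous cone is the open positive octant of \(\mathbb {R}^{3}\), on which diagonal matrices with positive entries act transitively. The sketch below illustrates the definition on this toy cone only; the function names are hypothetical, and the octant is used as an example of the definition, not as a statement about \(\mathcal {P}\).

```python
def in_octant(v):
    """Membership test for the open positive octant of R^3."""
    return all(c > 0 for c in v)


def transitive_map(a, b):
    """Diagonal matrix T (list of rows) such that T(a) = b.

    Both a and b must lie in the open positive octant, so every diagonal
    entry b_i / a_i is positive: T is invertible and maps the octant
    onto itself.
    """
    if not (in_octant(a) and in_octant(b)):
        raise ValueError("both points must lie in the open positive octant")
    return [[b[i] / a[i] if i == j else 0.0 for j in range(3)]
            for i in range(3)]


def apply_map(T, v):
    """Matrix-vector product T v."""
    return tuple(sum(T[i][j] * v[j] for j in range(3)) for i in range(3))


if __name__ == "__main__":
    a, b = (1.0, 2.0, 4.0), (3.0, 0.5, 8.0)
    T = transitive_map(a, b)
    print(apply_map(T, a))  # coincides with b up to rounding
```

Because a suitable T exists for every pair of points, no point of the octant plays a special role, which is exactly the content of the homogeneity property.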

3.1 The one-dimensional motivation to study \(\mathcal {P}\) as a homogeneous space

Resnikoff declares that the motivation to study \(\mathcal {P}\) as a homogeneous space comes from the analysis of Weber–Fechner’s law [7] in metric terms. Weber–Fechner’s law, often described as the first psycho-physical law ever determined, describes the perceptual response of humans with respect to changes of achromatic stimuli i.e. visual inputs that depend only on their intensity (typically obtained by activating only the retinal rods with dim lights). Experiments showed that the perceptual counterpart of an achromatic stimulus of intensity \(x\in \mathbb {R}^{+}=(0,+\infty)\), called brightness and denoted by \(b(x)\), is proportional to \(\log x\) (for a wide range of intensities). Thus, the relative brightness \(b(x_{1})-b(x_{2})\) between two visual stimuli of intensity \(x_{1}\) and \(x_{2}\) is proportional to \(\log(x_{1})-\log(x_{2})=\log\frac{x_{1}}{x_{2}}=\log \frac{\lambda x_{1}}{\lambda x_{2}}\) for all positive coefficients λ belonging to the range of values for which Weber–Fechner’s law is valid. This explains why the relative brightness is invariant under the simultaneous modification of light intensity expressed by

$$ x_{1} \mapsto\lambda x_{1}, \qquad x_{2} \mapsto\lambda x_{2}, \quad\lambda>0. $$
(7)

\(\mathbb {R}^{+}=(0,+\infty)\), interpreted as the set of all possible visible light intensities, is both a cone embedded in the real one-dimensional vector space \(\mathbb {R}\) and a group with respect to the ordinary multiplication of positive real numbers. The very simple observation that

$$ \forall x,y\in \mathbb {R}^{+},\quad y=\frac{y}{x}x\equiv\lambda x, $$
(8)

shows that \(\mathbb {R}^{+}\) is an \(\mathbb {R}^{+}\)-homogeneous cone. Weber–Fechner’s law implies that the relative brightness between two perceived lights is an \(\mathbb {R}^{+}\)-invariant function defined on \(\mathbb {R}^{+}\). What is crucial here is that, up to the choice of a unit of measurement, the brightness difference expressed by Weber–Fechner’s law coincides with the unique \(\mathbb {R}^{+}\)-invariant Riemannian distance (see footnote 3) on \(\mathbb {R}^{+}\) i.e.

$$ d(x_{1},x_{2})= \bigl\vert \log(x_{1})- \log(x_{2}) \bigr\vert = \biggl\vert \log\frac {x_{1}}{x_{2}} \biggr\vert , \quad x_{1},x_{2}\in \mathbb {R}^{+}. $$
(9)
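The invariance of the distance (9) under the simultaneous rescaling (7) is easy to check numerically. The following minimal sketch uses only the formula above; the function name and the sample intensities are arbitrary illustrative choices.

```python
import math


def brightness_distance(x1, x2):
    """Distance d(x1, x2) = |log(x1) - log(x2)| between positive intensities."""
    if x1 <= 0 or x2 <= 0:
        raise ValueError("intensities must be strictly positive")
    return abs(math.log(x1) - math.log(x2))


if __name__ == "__main__":
    x1, x2 = 2.0, 50.0
    for lam in (0.1, 1.0, 7.5):
        # The rescaled pair has the same distance, up to rounding.
        print(lam, brightness_distance(lam * x1, lam * x2))
```

The same function also shows that the distance depends only on the ratio of the two intensities, so doubling an intensity always produces the same brightness step, whatever the starting level.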

This consideration represented a major inspiration for Resnikoff, who extended these ideas to the three-dimensional color space \(\mathcal {P}\).

3.2 \(\mathcal {P}\) as a homogeneous space

Resnikoff’s model for a homogeneous perceived color space is intimately connected with the nonstandard observational configuration that he assumed, which is depicted in Fig. 1.

Figure 1. Observational configuration assumed by Resnikoff. In Resnikoff’s model a color is always associated with a couple given by a physical light and a uniform background illumination in which it is embedded

In this setting, a color is always associated with a light stimulus embedded in a uniform context. It follows that, in this context, the definition of \(\mathcal {P}\) given in (2) must be modified as follows:

$$ \mathcal {P}= \bigl( L^{2}_{+}(\varLambda) \times L^{2}_{+}(\varLambda) \bigr) /{\sim} $$
(10)

i.e. a perceived color \(x\in \mathcal {P}\) here is defined as a perceptual equivalence class of couples \((\textbf{x},\textbf{b})\in L^{2}_{+}(\varLambda) \times L^{2}_{+}(\varLambda)\), where x is the central light stimulus and b is the light uniformly distributed on the background. Two distinct couples \((\textbf {x}_{1},\textbf{b}_{1})\) and \((\textbf{x}_{2},\textbf{b}_{2})\), \(\textbf {x}_{1}\neq\textbf{x}_{2}\), \(\textbf{b}_{1}\neq\textbf{b}_{2}\), belong to the same equivalence class if, for \(i=1,2\), the central light stimuli \(\textbf{x}_{i}\) embedded in the corresponding background \(\textbf {b}_{i}\) induce the same perceived color \(x\in \mathcal {P}\). Resnikoff does not comment on whether, in this nonstandard observational configuration, \(\mathcal {P}\) retains the properties of a regular convex cone; this is an issue of paramount importance that will be discussed in Sect. 3.3. For the moment, we assume that \(\mathcal {P}\) is a regular convex cone also in this setting.

As in Sect. 2, we define the smallest vector space containing \(\mathcal {P}\) as \(V=\mathcal {P}-\mathcal {P}\). The only difference is that in Resnikoff’s setting color match will be done for color stimuli embedded in a uniform background and not for stand-alone physical lights. This requires a novel specification of color matching that Resnikoff did not discuss.

Having established the model framework, we can begin with the mathematical construction that leads to the homogeneity axiom. First of all, since \(\mathcal {P}\) is a (positive) cone embedded in a three-dimensional real vector space V, \(\operatorname{Aut}(\mathcal {P})\) will be given by orientation-preserving linear transformations of V which preserve \(\mathcal {P}\) i.e.

$$ \operatorname{GL}^{+}(\mathcal {P}):=\bigl\{ B\in\operatorname{GL}(3,\mathbb {R}), \det (B)>0 \text{ and } B(x)\in \mathcal {P}\forall x\in \mathcal {P}\bigr\} , $$
(11)

where \(\operatorname{GL}(3,\mathbb {R})\) is identified with the group of real \(3\times3\) matrices with determinant different from zero i.e. the complement in \(\mathrm{M}(3,\mathbb {R}) \cong \mathbb {R}^{9}\) of \(\operatorname{det}^{-1}\{0\}\), the inverse image of 0 under the determinant function. Since the determinant is continuous in the Euclidean topology, \(\operatorname{det}^{-1}\{0\}\) is closed and so \(\operatorname{GL}(3,\mathbb {R})\) is an open subset of \(\mathbb {R}^{9}\). The requirement of a positive determinant is introduced to respect the direction of each generatrix of the cone \(\mathcal {P}\).

Resnikoff claimed that if we interpret \(B\in\operatorname{GL}^{+}(\mathcal {P})\) as a ‘change of background illumination’, or background transformations for short, then the action of \(\operatorname{GL}^{+}(\mathcal {P})\) on \(\mathcal {P}\) is transitive, thus making \(\mathcal {P}\) a homogeneous cone. The argument that he used follows this line of reasoning. First of all, it is generally accepted that any perceived color \(x\in \mathcal {P}\) can be transformed into any ‘sufficiently near’ one \(y\in \mathcal {P}\) by an appropriate change of background illumination, see Fig. 2 for a graphical representation of this phenomenon.

Figure 2. An example of the effect of a background change. The inner disks appearing in the center of the two images are exactly the same physical color stimuli. However, the one in the left image is perceived, with respect to its background illumination, as a very saturated green, whereas, after the change of background illumination shown by the image on the right, the color stimulus is perceived as yellowish. Due to the small size of the color stimuli in this document, they are surrounded by a thin black circle to enhance their visibility

For this reason, \(\mathcal {P}\) can be considered as a locally homogeneous space with respect to the group \(\operatorname{GL}^{+}(\mathcal {P})\). Notice that this is not a physical property of color, but a perceptual feature of human vision, usually referred to as chromatic induction, see e.g. [9, 10, 19] for more details about how induction can be measured.

We now need some topological considerations. \(\mathcal {P}\) inherits the structure of metric space from V, thought of as a three-dimensional Euclidean space. Local homogeneity implies that, for every \(x\in \mathcal {P}\), there exists an open neighborhood \(U_{x} \subset \mathcal {P}\) such that each \(y\in U_{x}\) can be expressed as \(y=B(x)\in \mathcal {P}\) for some \(B\in\operatorname{GL}^{+}(\mathcal {P})\), so every element of \(\mathcal {P}\) is an interior point. To summarize: \(\mathcal {P}\) is open in V.

Let us now consider local homogeneity in conjunction with Axiom 3 i.e. with the convexity of \(\mathcal {P}\): for every couple of perceived colors \(x,y\in \mathcal {P}\), the line segment L that joins x to y lies entirely in \(\mathcal {P}\). Local homogeneity assures that, for any \(z\in L\), there exists an open neighborhood \(U_{z}\subset \mathcal {P}\) that is a homogeneous space with respect to the group \(\operatorname{GL}^{+}(\mathcal {P})\). As z varies in L, we obtain the open covering \(\bigcup_{z\in L} U_{z}\) of L, and, since L is a compact subset of \(\mathcal {P}\), we can extract a finite open covering of L from it i.e. there exist \(x_{1},\ldots,x_{n} \in L\), \(n<+\infty\), such that \(\bigcup_{k=1}^{n} U_{x_{k}}\) is an open covering of L.

Let \(B_{k}\in\operatorname{GL}^{+}(\mathcal {P})\) be the change of background illumination which carries \(x_{k-1}\) to \(x_{k}\), where \(k=1,\ldots, n\), \(x_{0}\equiv x\), and \(x_{n}\equiv y\). Then, since \(\operatorname{GL}^{+}(\mathcal {P})\) is a group, the transformation \(B\equiv B_{n}\circ B_{n-1} \circ\cdots\circ B_{1}\) belongs to \(\operatorname{GL}^{+}(\mathcal {P})\) and carries x to y i.e. \(y=B(x)\). Since the pair of perceived colors \(x,y\in \mathcal {P}\) is arbitrary, \(\mathcal {P}\) is a \(\operatorname{GL}^{+}(\mathcal {P})\)-homogeneous space.
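The chaining of local transformations along a segment can be sketched on a toy model. The code below assumes, as a purely illustrative choice, that the cone is the open positive octant of \(\mathbb {R}^{3}\) and that each local step is a diagonal map with positive entries; neither assumption comes from Resnikoff's model, and the equally spaced intermediate points are a hypothetical stand-in for the centers of the finite covering.

```python
def segment_points(x, y, n):
    """n + 1 equally spaced points on the segment from x to y (x first, y last)."""
    return [tuple((1 - k / n) * xi + (k / n) * yi for xi, yi in zip(x, y))
            for k in range(n + 1)]


def diagonal_step(a, b):
    """Diagonal entries of the positive diagonal map carrying a to b."""
    return tuple(bi / ai for ai, bi in zip(a, b))


def compose_chain(x, y, n):
    """Diagonal entries of the composition of the n local steps from x to y."""
    pts = segment_points(x, y, n)
    total = (1.0, 1.0, 1.0)                       # identity map
    for k in range(1, n + 1):
        step = diagonal_step(pts[k - 1], pts[k])  # one point of the chain to the next
        total = tuple(s * t for s, t in zip(step, total))  # compose on the left
    return total


def apply_diagonal(d, v):
    """Action of a diagonal map on a vector."""
    return tuple(di * vi for di, vi in zip(d, v))


if __name__ == "__main__":
    x, y = (1.0, 2.0, 3.0), (4.0, 1.0, 0.5)
    B = compose_chain(x, y, 5)
    print(apply_diagonal(B, x))  # agrees with y up to rounding
```

In this toy model a single diagonal map would already carry x to y, so the chain is redundant; the sketch only illustrates how finitely many local steps compose into one element of the group.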

One might object that, operationally speaking, transforming any color sensation \(x\in \mathcal {P}\) to any other one \(y\in \mathcal {P}\) via a single change of background illumination would be illusory if x and y are very far apart in terms of chromatic attributes. The following considerations will clarify how to correctly interpret the composition of background transformations. Let us consider again Fig. 2 and search for a transformation B such that the green sensation \(x\in \mathcal {P}\), \(x=[(\textbf{x}_{0},\textbf {b}_{0})]_{\sim}\) of the color stimulus in the center of the image on the left is transformed into an arbitrarily different color \(y\in \mathcal {P}\). The first transformation \(B_{1}\) that we could use is, for example, the one shown on the right-hand side of Fig. 2. The key observation is that, thanks to what was stated at the beginning of this section, the yellowish perceived color \(x_{1}\in \mathcal {P}\), \(x_{1}=[(\textbf{x}_{0},\textbf{b}_{1})]_{\sim}\), \(\textbf{b}_{1} \neq \textbf{b}_{0}\), can be characterized by another couple \(( \textbf {x}_{1},\tilde{ \textbf{b}}_{1})\) that matches \(x_{1}\). Then, by performing a wisely chosen background change \(B_{2}\) on this alternative characterization of \(x_{1}\), we can transform it into a color \(x_{2}\in \mathcal {P}\), \(x_{2}=[(\textbf{x}_{1},\textbf{b}_{2})]_{\sim}\), perceptually closer to y than \(x_{1}\). As done before, \(x_{2}\) can be characterized by another couple \((\textbf{x}_{2},\tilde{ \textbf{b}}_{2})\), and a third background transformation \(B_{3}\) can be operated on this last configuration, obtaining a color \(x_{3}\in \mathcal {P}\) perceptually closer to y than \(x_{2}\). By iterating the previous steps, we arrive at the match with the desired color y. 
Of course, the experimental process has to be performed painstakingly and it is likely to be very time-consuming, but the mathematical argument discussed above guarantees that the procedure can be performed within a finite number of steps.

In Fig. 3 we report the perceived colors \(x_{1},\dots ,x_{5}\) obtained with the process described above, which shows how a color sensation can be gradually moved toward another one via composition of background transformations.

Figure 3

Composition of background transformations. From left to right: the effect of composing four background transformations with the procedure explained in the text

All the considerations discussed so far justify why Resnikoff was led to postulate his own fifth axiom on the structure of the color space.

  • Axiom 5 (Resnikoff 1974, [1]) \(\mathcal {P}\) is a \(\operatorname{GL}^{+}(\mathcal {P})\)-homogeneous space.

Axioms 1 to 5 imply that \(\mathcal {P}\) is an open convex regular homogeneous cone (and, as such, also connected and contractible) embedded in a three-dimensional vector space V.

Such objects have been classified and this is what will allow us to explicitly determine the only possible geometrical structures of \(\mathcal {P}\). However, we postpone this analysis to Sect. 4 after an interlude in which we discuss an important issue related to the linearity of background transformations.

3.3 The issue of linearity in Resnikoff’s model

The transitive action of the changes of background illuminations on \(\mathcal {P}\) has been extensively analyzed above. Here we concentrate on the remaining properties that they must fulfill.

Of course every B preserves \(\mathcal {P}\) because a perceived color is still such after a background change; moreover, all transformations B are clearly invertible since we can perform the reverse change and turn back to the original color sensation.

However, there is a crucial issue that Resnikoff failed to analyze: it is not clear why background changes should be linear. Actually, Resnikoff himself, in the paper [2] published shortly after [1], declared this issue to be ‘the least verified aspect’ of the group of transformations that he considered.

Mathematically, linear background transformations \(B\in\operatorname{GL}(V)\) should behave like this on elements of \(\mathcal {P}\):

$$ B(\alpha x+\beta y)=\alpha B(x)+\beta B(y), \quad\alpha,\beta\in \mathbb {R}^{+}, x,y\in \mathcal {P}. $$
(12)
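Condition (12) can be probed numerically once a candidate model for B is fixed. The sketch below assumes two hypothetical models that are not taken from the text: a linear map given by a \(3\times3\) matrix and a compressive power law applied component-wise. It only illustrates what satisfying or violating Eq. (12) means and says nothing about actual human vision.

```python
def linear_B(v, M=((0.9, 0.1, 0.0), (0.05, 0.8, 0.15), (0.0, 0.2, 1.1))):
    """Hypothetical linear background change: matrix-vector product M v."""
    return tuple(sum(m * c for m, c in zip(row, v)) for row in M)

def power_B(v, gamma=0.5):
    """Hypothetical nonlinear background change: component-wise power law."""
    return tuple(c ** gamma for c in v)

def linearity_defect(B, x, y, alpha, beta):
    """Largest component of |B(alpha x + beta y) - alpha B(x) - beta B(y)|."""
    mixed = tuple(alpha * a + beta * b for a, b in zip(x, y))
    lhs = B(mixed)
    rhs = tuple(alpha * p + beta * q for p, q in zip(B(x), B(y)))
    return max(abs(l - r) for l, r in zip(lhs, rhs))

x, y = (0.2, 0.5, 0.3), (0.6, 0.1, 0.4)
print(linearity_defect(linear_B, x, y, 0.7, 1.3))  # zero up to rounding
print(linearity_defect(power_B, x, y, 0.7, 1.3))   # clearly nonzero
```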

In Fig. 4 we outline a psycho-physical experiment to check the additivity of background transformations. A similar procedure can be used to verify if B behaves linearly with respect to scaling.

Figure 4

A psycho-physical experiment to check the additivity of background transformations. The experimental setup outlined in the picture can be used to check the additivity of background transformations, see the text for a detailed description

Consider two physical lights \(\textbf{x}\), \(\textbf{y}\) and their superposition \(\textbf{x}+\textbf{y}\), all three embedded in a background \(\textbf{b}\). They induce color sensations x, y and, assuming Eq. (4), \(x+y\). After the change of background B from \(\textbf{b}\) to \(\textbf{b}'\), the color sensations induced by \(\textbf{x}\), \(\textbf{y}\) and \(\textbf{x}+\textbf{y}\) will become \(B(x)=[(\textbf{x},\textbf{b}')]_{\sim}\), \(B(y)=[(\textbf {y},\textbf{b}')]_{\sim}\) and \(B(x+y)=[(\textbf{x}+\textbf{y},\textbf {b}')]_{\sim}\), respectively.

Then consider an auxiliary background \(\textbf{b}''\) and search for the physical lights \(\tilde{\textbf{x}}\) and \(\tilde{\textbf{y}}\) that, embedded in \(\textbf{b}''\), are perceived as \(B(x)\) and \(B(y)\) i.e. \(B(x)=[(\tilde{\textbf{x}},\textbf{b}'')]_{\sim}\) and \(B(y)=[(\tilde{\textbf{y}},\textbf{b}'')]_{\sim}\), respectively. Thus, the superposition of \(\tilde{\textbf{x}}\) and \(\tilde{\textbf {y}}\) will give \(B(x)+B(y)=[(\tilde{\textbf{x}}+\tilde{\textbf {y}},\textbf{b}'')]_{\sim}\). If \(B(x+y)=[(\textbf{x}+\textbf {y},\textbf{b}')]_{\sim}\) matches \(B(x)+B(y)=[(\tilde{\textbf {x}}+\tilde{\textbf{y}},\textbf{b}'')]_{\sim}\), then B is additive with respect to the auxiliary background \(\textbf{b}''\). By repeating the test with a sufficiently diversified set of auxiliary backgrounds, the additivity of B with respect to \(\textbf{x}\), \(\textbf{y}\) and \(\textbf{b}\), \(\textbf{b}'\) is tested. Finally, by varying also \(\textbf{x}\), \(\textbf{y}\) and \(\textbf{b}\), \(\textbf{b}'\), the additivity of B tout court is tested.

Until the linearity hypothesis about the changes of background is experimentally confirmed, it remains a conjecture that lies at the core of Resnikoff’s model.

If these transformations turned out to be nonlinear, this would not invalidate Resnikoff’s results that we will discuss in the following section; it would just mean that the group \(\operatorname{GL}^{+}(\mathcal {P})\), which is supposed to act transitively on \(\mathcal {P}\), cannot be represented by changes of background illuminations. On the other hand, the hypothesis of homogeneity of \(\mathcal {P}\) seems very reasonable and nothing prevents other (at the moment unknown) transformations from being identified with the elements of a group acting transitively on \(\mathcal {P}\).

4 Consequences of the homogeneity axiom on the geometrical structure of \(\mathcal {P}\)

In this section we make use of the standard results of homogeneous spaces theory to prove the most important outcome of [1]. Classical references are e.g. [20, 21], and [12].

An element \(B\in\operatorname{GL}^{+}(V)\) belongs to \(\operatorname{GL}^{+}(\mathcal {P})\) if and only if \(B(\overline{\mathcal {P}})=\overline{\mathcal {P}}\) (where \(\overline{\mathcal {P}}\) is the topological closure of \(\mathcal {P}\)), thus \(\operatorname{GL}^{+}(\mathcal {P})\) is a closed subgroup of \(\operatorname{GL}^{+}(V)\) and hence it is a Lie group itself. Moreover, for every fixed \(x\in \mathcal {P}\), the subgroup of \(\operatorname{GL}^{+}(\mathcal {P})\) defined by \(K_{x}=\{B\in\operatorname{GL}^{+}(\mathcal {P}), B(x)=x\}\) is called the stabilizer, or isotropy subgroup, of \(\operatorname{GL}^{+}(\mathcal {P})\) at x. In terms of color perception, for a fixed perceived color \(x\in \mathcal {P}\), generated by a light stimulus in a given background, \(K_{x}\) represents the subgroup of changes of background that leave the color sensation x unaltered.

Since \(\mathcal {P}\) is a homogeneous convex cone, the \(K_{x}\)s coincide with the maximal compact subgroups of \(\operatorname{GL}^{+}(\mathcal {P})\) and all the isotropy subgroups are isomorphic to each other since they are conjugate i.e. \(\forall x,y\in \mathcal {P}\) \(\exists\tilde{B}\in\operatorname{GL}^{+}(\mathcal {P})\) such that \(K_{y}=\tilde{B}K_{x}{\tilde{B}}^{-1}\). For this reason, they can be identified, and we can write simply K instead of \(K_{x}\).

The following result will be fundamental: if a differential manifold X is a G-homogeneous space w.r.t. the action \(\eta: G\times X \to X\) of a Lie group G and K is the stabilizer at any fixed \(x\in X\), then the map \(\beta: G/K \to X\) defined by \(\beta(gK)=\eta(g,x)\) is a diffeomorphism (see Note 4).

We have all the information that we need in order to give an alternative, simpler, proof of the main result of [1].

Theorem 1

Axioms 1–5 imply that \(\mathcal {P}\) is diffeomorphic to either

$$ \mathcal {P}_{1} \cong \mathbb {R}^{+} \times \mathbb {R}^{+} \times \mathbb {R}^{+} $$
(13)

or

$$ \mathcal {P}_{2} \cong \mathbb {R}^{+} \times\operatorname{SL}(2,\mathbb {R})/ \operatorname{SO}(2). $$
(14)

The first characterization embodies the well-known color spaces with three separated chromatic coordinates e.g. LMS, RGB, XYZ, and so on, see e.g. [22]. The second characterization obtained by Resnikoff is novel with respect to the classical flat color spaces, and it introduces the Poincaré–Lobachevsky 2-D space of constant negative curvature \(\operatorname{SL}(2,\mathbb {R})/\operatorname{SO}(2)\) in color theory.

Proof of Theorem 1

Applying the results recalled above to the homogeneous color space \(\mathcal {P}\), we get the diffeomorphic identification

$$ \mathcal {P}\cong\operatorname{GL}^{+}(\mathcal {P})/K. $$
(15)

Let us rewrite every \(B\in\operatorname{GL}^{+}(\mathcal {P})\) in the form \(B=\det(B)^{1/3}\,\frac{B}{\det(B)^{1/3}}\), where \(\det(B)^{1/3} \in \mathbb {R}^{+}\) and \(\frac{B}{\det(B)^{1/3}}\in \operatorname{SL}(\mathcal {P})\), the subgroup of \(\operatorname{GL}^{+}(\mathcal {P})\) given by the matrices of this group with determinant 1. The exponent \(1/3\) is needed because \(\det(\lambda B)=\lambda^{3}\det(B)\) for \(3\times3\) matrices.

It follows that \(\operatorname{GL}^{+}(\mathcal {P})=\mathbb {R}^{+} \times\operatorname{SL}(\mathcal {P})\) and, since the isotropy subgroup of \(\mathbb {R}^{+}\) is evidently \(\{1\}\) and \(\mathbb {R}^{+}/\{1\} \cong \mathbb {R}^{+}\), the only nontrivial part of the quotient operation is on a closed subgroup K of \(\operatorname{SL}(\mathcal {P})\), thus

$$ \mathcal {P}\cong \mathbb {R}^{+} \times\operatorname{SL}(\mathcal {P})/K, $$
(16)

where both \(\mathbb {R}^{+}\) and \(\operatorname{SL}(\mathcal {P})/K\) are homogeneous spaces.

As differential manifolds, \(\mathcal {P}\) has dimension 3 and \(\mathbb {R}^{+}\) has dimension 1, so \(\operatorname{SL}(\mathcal {P})/K\) has dimension 2. Moreover, \(\mathcal {P}\) and \(\mathbb {R}^{+}\) are connected and contractible, thus expression (16) implies that \(\operatorname{SL}(\mathcal {P})/K\) is also connected and contractible. Spaces of this type have been classified by Sophus Lie [23]; see also [24] and [25] for a more modern treatment. It turns out that the only two-dimensional connected contractible homogeneous spaces are either \(\mathbb {R}^{2}\), diffeomorphic to \(\mathbb {R}^{+}\times \mathbb {R}^{+}\) via the map \(\mathbb {R}^{2} \ni(x,y) \mapsto(\exp(x),\exp(y))\in \mathbb {R}^{+}\times \mathbb {R}^{+}\), or the hyperbolic plane \(\operatorname{SL}(2,\mathbb {R}) /\operatorname{SO}(2)\) (see Sect. 4.1 for more details about hyperbolic spaces). □

The proof provided by Resnikoff in [1] is not only much longer and more difficult to follow, but it is also flawed. In fact, one of the fundamental arguments of his proof is the statement on page 112 that, whatever the dimension of \(\mathcal {P}\), the contractibility of \(\mathcal {P}\) implies that the Lie group \(\operatorname{SL}(\mathcal {P})\) coincides with the exponential of its Lie algebra \(\mathfrak{sl}(\mathcal {P})\). This implication, however, is not true, as we show in the Appendix with a counter-example.

4.1 The models of the hyperbolic space \(\operatorname{SL}(2,\mathbb {R})/\operatorname{SO}(2)\)

With the perceived color space \(\mathcal {P}_{2}\), Resnikoff introduced in colorimetry a hyperbolic space. In his 1974 paper, he acknowledged the 1962 work of H. Yilmaz [26] which was, historically, the first one to consider hyperbolic structures to study color perception, even though with much less rigor than Resnikoff.

Unlike Euclidean spaces or spheres, hyperbolic spaces can be characterized in several equivalent ways, called models, each one useful for different purposes. For a general discussion about hyperbolic models, see e.g. [27]. Here, we just report the models of the hyperbolic space of interest for us i.e. \(\operatorname{SL}(2,\mathbb {R})/\operatorname{SO}(2)\):

  • the hyperboloid \(I^{2}=\{v\in \mathbb {R}^{3} : \langle v,v \rangle_{\mathcal{L}}=-1, v_{3} >0 \}\), where \(\langle v,v \rangle_{\mathcal{L}} = v_{1}^{2}+v_{2}^{2}-v_{3}^{2}\) is the Lorentz scalar product in \(\mathbb {R}^{3}\). The equation \(\langle v,v \rangle_{\mathcal{L}}=-1\) defines the two-sheet hyperboloid in \(\mathbb {R}^{3}\) so that \(I^{2}\) is the connected component with \(v_{3}>0\), also called the upper hyperboloid sheet;

  • the upper half plane \(H=\{(x,y)\in \mathbb {R}^{2} : y>0\}\cong\{z\in \mathbb {C}: \mathfrak{Im}(z)>0 \}\);

  • the Poincaré disk \(\mathcal{D}=\{(x,y)\in \mathbb {R}^{2} : x^{2}+y^{2} < 1\}\cong\{z\in \mathbb {C}: |z| < 1\}\);

  • \(\operatorname{Sym}_{1}^{+}(2,\mathbb {R})\), the set of \(2\times2\) real symmetric positive-definite matrices M with unit determinant i.e. \(M^{t}=M\), \(\operatorname{det}(M)=1\) and \(u^{t}Mu>0\) for all \(u\in \mathbb {R}^{2}\setminus\{0\}\).

This last characterization will be particularly important for the following. In fact, it will give us the possibility to interpret the elements of \(\mathcal {P}_{2}\) as matrices. To see how, let us define \(\operatorname{Sym}^{+}(2,\mathbb {R})\) to be the set of \(2\times2\) real symmetric positive-definite matrices. Any matrix \(M\in\operatorname{Sym}^{+}(2,\mathbb {R})\) can be written as \(M=\sqrt{\det(M)}\,\frac{M}{\sqrt{\det(M)}}\) with \(\sqrt{\det(M)}\in \mathbb {R}^{+}\) since M is positive-definite and \(N=\frac{M}{\sqrt{\det(M)}}\in\operatorname{Sym}_{1}^{+}(2,\mathbb {R})\); the square root is needed because \(\det(\lambda M)=\lambda^{2}\det(M)\) for \(2\times2\) matrices. This simple consideration implies that

$$ \mathcal {P}_{2} \cong\text{Sym}^{+}(2,\mathbb {R}). $$
(17)
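The models of Sect. 4.1 and the identification (17) can be explored numerically. The sketch below relies on standard identifications that are not spelled out in the text and should be read as assumptions: the Cayley transform \(z\mapsto(z-i)/(z+i)\) from the upper half plane to the Poincaré disk, and the map sending \(z=x+iy\) to the matrix \(\frac{1}{y}\bigl(\begin{smallmatrix} x^{2}+y^{2} & x \\ x & 1\end{smallmatrix}\bigr)\), which has unit determinant. The splitting of a matrix of \(\operatorname{Sym}^{+}(2,\mathbb {R})\) uses the square root of the determinant as scalar factor, which is the normalization that yields unit determinant for \(2\times2\) matrices. All function names are hypothetical.

```python
import math

def half_plane_to_sym1(z):
    """Map z = x + i y with y > 0 to a symmetric 2x2 matrix of determinant 1."""
    x, y = z.real, z.imag
    return (((x * x + y * y) / y, x / y), (x / y, 1.0 / y))

def half_plane_to_disk(z):
    """Cayley transform from the upper half plane to the Poincare disk."""
    return (z - 1j) / (z + 1j)

def det2(M):
    """Determinant of a 2x2 matrix stored as nested tuples."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def split_sym_plus(M):
    """Split M in Sym^+(2,R) into (scale, N) with M = scale * N and det(N) = 1."""
    scale = math.sqrt(det2(M))
    N = tuple(tuple(entry / scale for entry in row) for row in M)
    return scale, N

z = 0.3 + 2.0j
N = half_plane_to_sym1(z)
print(det2(N))                     # 1 up to rounding
print(abs(half_plane_to_disk(z)))  # strictly less than 1

scale, N2 = split_sym_plus(((2.0, 0.5), (0.5, 1.0)))
print(scale, det2(N2))             # positive scale, unit determinant up to rounding
```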

5 Selection of invariant Riemannian metrics for the color spaces \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\)

Once Resnikoff established the only possible geometrical structure of \(\mathcal {P}\) compatible with Axioms 1–5, he searched for Riemannian metrics on \(\mathcal {P}\) to measure color dissimilarity. As for the geometry of \(\mathcal {P}\), he uniquely singled out the metrics thanks to an invariance principle.

We recall that a Riemannian metric g on a differentiable manifold X of dimension n is a symmetric positive-definite tensor field of type \((0,2)\) on X i.e. a correspondence which assigns, smoothly with respect to each point \(x\in X\), a scalar product \(g_{x} : T_{x}X \times T_{x}X \to \mathbb {R}\), \((v,w) \mapsto g_{x}(v,w)\) for all \(v,w\in T_{x} X\), the tangent space to X in x. A differentiable manifold X endowed with a Riemannian metric g is called a Riemannian manifold \((X,g)\).

Let us also recall the local coordinate expression of g: we fix a local chart \((U,\varphi)\) around x, we denote by \((x^{1},\ldots,x^{n})\) the local coordinates of x and by \((\partial_{1},\ldots,\partial _{n})\) the corresponding local basis of the tangent space \(T_{x}X\). The smooth functions \(g_{\mu\nu}\in\mathcal{C}^{\infty}(U)\), \(\mu,\nu =1,\ldots,n\), defined by \(g_{\mu\nu}=g(\partial_{\mu}, \partial_{\nu})\) satisfy \(g= g_{\mu\nu} \,dx^{\mu}\otimes dx^{\nu}\), where Einstein’s summation convention over repeated upper and lower indices is implicitly used. The components \(g_{\mu\nu}\) can be organized in a symmetric matrix, and the previous expression for g is often written as \(ds^{2} = g_{\mu\nu }\, dx^{\mu}\, dx^{\nu}\).

A Riemannian manifold \((X,g)\) is also a metric space with respect to a distance canonically induced by g and defined with the help of the length of piecewise regular curves \(\gamma:[0,1] \to X\). If \((X,g)\) is a connected Riemannian manifold and we define the length of the curve γ as

$$ L(\gamma)= \int_{0}^{1} \bigl\Vert \dot{\gamma}(u) \bigr\Vert _{\gamma(u)} \, du, $$
(18)

where \(\|\dot{\gamma}(u)\|_{\gamma(u)}=\sqrt{g_{\gamma(u)}(\dot{\gamma}(u),\dot{\gamma}(u))}\) is the norm induced by g, then the function \(d:X\times X\to [0,+\infty)\) defined by

$$ d(x,y)=\inf\bigl\{ L(\gamma), \gamma:[0,1]\to X \text{ piecewise regular}, \gamma(0)=x, \gamma(1)=y\bigr\} $$
(19)

is a distance on X, called the Riemannian distance on X induced by the Riemannian metric g.
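The length functional (18) is easy to approximate numerically. The sketch below is a minimal illustration under stated assumptions: it uses a midpoint rule, and it takes as example the one-dimensional metric \(ds^{2}=(dx/x)^{2}\) on the positive half-line, for which the length of a monotone path from a to b is \(|\ln(b/a)|\). The function names are hypothetical.

```python
import math

def curve_length(gamma, norm_sq, steps=10000):
    """Approximate L(gamma) = int_0^1 ||gamma'(u)|| du with a midpoint rule.

    gamma   : function u -> point (tuple of floats)
    norm_sq : function (point, velocity) -> g_point(velocity, velocity)
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p0, p1 = gamma(i * h), gamma((i + 1) * h)
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))
        vel = tuple((b - a) / h for a, b in zip(p0, p1))
        total += math.sqrt(norm_sq(mid, vel)) * h
    return total

def log_metric(point, vel):
    """g = (dx / x)^2 on the positive half-line (one-dimensional example)."""
    return (vel[0] / point[0]) ** 2

a, b = 1.0, 5.0
straight = lambda u: (a + u * (b - a),)

# The two printed values agree up to the discretization error.
print(curve_length(straight, log_metric), math.log(b / a))
```

In one dimension every monotone path between two points has the same length, so in this example the computed value also approximates the distance \(d(a,b)\) of Eq. (19).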

A piecewise regular curve γ in X whose length is minimal among all piecewise regular curves joining a pair of points \(x,y\in X\) is said to be a minimizing geodesic connecting the two points. Thus, whenever such a curve exists, the Riemannian distance \(d(x,y)\) is the length of a minimizing geodesic connecting x to y.

Let us now consider X as the perceived color space \(\mathcal {P}\). Since Axioms 1–5 determine the geometric structure of \(\mathcal {P}\) as a homogeneous space, Resnikoff was naturally led to search for Riemannian metrics on \(\mathcal {P}\) coherent with these axioms.

If \(x,y \in \mathcal {P}\) are the perceived colors associated with \((\textbf {x},\textbf{b})\) and \((\textbf{y},\textbf{b})\), respectively, then, after a change of background illumination B from b to \(\textbf{b}'\neq\textbf{b}\), x and y will be modified into \(x'=B(x)\in \mathcal {P}\) and \(y'=B(y)\in \mathcal {P}\).

Resnikoff wanted to analyze the consequences of the following assumption (that he called Axiom 6): if \(d:\mathcal {P}\times \mathcal {P}\to [0,+\infty)\) is the Riemannian distance on \(\mathcal {P}\) that measures perceptual differences between pairs of perceived colors \(x,y\in \mathcal {P}\), then d satisfies

$$ d\bigl(B(x),B(y)\bigr)=d(x,y), \quad\forall x,y\in \mathcal {P}, \forall B\in\operatorname{GL}^{+}(\mathcal {P}) $$
(20)

i.e. the perceptual dissimilarity between x and y is the same as that between \(x'\) and \(y'\) or, in mathematical terms, B is an isometry for the distance d. This assumption, however, must be rejected because it is not coherent with human color perception, as clearly shown by the crispening effect represented in Fig. 5: the same pair of color stimuli is embedded in three different backgrounds, and it is clear that the perceptual difference is not background independent.

Figure 5

The crispening effect. The crispening effect (see text) used to refute Resnikoff’s Axiom 6 about the invariance of the perceptual color metric with respect to changes of background illuminations

It is, however, very interesting to follow Resnikoff’s argument and determine the metrics of \(\mathcal {P}\) that satisfy Eq. (20), because this will point out that those metrics are not fit to represent perceptual distances for color in context.

The requirement of \(\operatorname{GL}^{+}(\mathcal {P})\)-invariance allows us to identify in a unique way Riemannian metrics for \(\mathcal {P}_{1}\) and \(\mathcal {P}_{2}\). First of all, let us recall that every diffeomorphism \(f:X\to Y\) induces a linear isomorphism \(df_{x}:T_{x}X\to T_{f(x)}Y\) for every \(x\in X\); moreover, if \((X, g)\) and \((Y, h)\) are Riemannian manifolds and \(d_{h}\), \(d_{g}\) are the Riemannian distances associated with the Riemannian metrics h and g, respectively, then f is an isometry i.e. \(d_{h}(f(x),f(y))=d_{g}(x,y)\) for all \(x,y\in X\) if and only if

$$ h_{f(x)}\bigl(df_{x}(v),df_{x}(w)\bigr) = g_{x}(v,w), \quad\forall x\in X, \forall v,w\in T_{x}X. $$
(21)

In our case, by choosing \(X=Y=\mathcal {P}\) and \(f=B\), we have the possibility to reformulate the isometric condition expressed in Eq. (20) as follows:

$$ g_{B(x)}\bigl(dB_{x}(v),dB_{x}(w) \bigr)=g_{x}(v,w), \quad\forall B\in\operatorname{GL}^{+}(\mathcal {P}), \forall x\in \mathcal {P}, \forall v,w\in T_{x}(\mathcal {P}). $$
(22)

Recall now that \(\mathcal {P}\cong\operatorname{GL}^{+}(\mathcal {P})/K\) and that, by homogeneity of \(\mathcal {P}\), for every pair of elements \(x,y\in \mathcal {P}\), we can write \(y=B(x)\) for some \(B \in\operatorname{GL}^{+}(\mathcal {P})\). If we select for x the equivalence class to which the identity transformation of \(\operatorname{GL}^{+}(\mathcal {P})\) belongs i.e. the coset K itself, then, by definition, we get \(B(x)=x\) for all \(B\in K\). By transitivity, the K-invariance is independent of the choice of x, thus Eq. (22) implies that

$$ g_{x}\bigl(dB_{x}(v),dB_{x}(w) \bigr)=g_{x}(v,w), \quad\forall B\in K, \forall x\in \mathcal {P}, \forall v,w \in T_{x}(\mathcal {P}). $$
(23)

The quest for perceptual color metrics on \(\mathcal {P}\) is thus reduced to the much simpler task of searching for K-invariant Riemannian metrics for the spaces \(\mathcal {P}_{1}\) and \(\mathcal {P}_{2}\).

For \(\mathcal {P}_{1}\), K is the trivial subgroup, \(K=\{\mathrm{id}\}\), so K-invariance does not introduce any constraint. However, the metric must be the sum of \(\mathbb {R}^{+}\)-invariant metrics on each factor and all \(\mathbb {R}^{+}\)-invariant metrics on \(\mathbb {R}^{+}\) are proportional: once we have identified one such metric, all the others are positive multiples of it.

It is clear that an \(\mathbb {R}^{+}\)-invariant metric on \(\mathbb {R}^{+}\) is given by \(ds^{2}= (\frac{dx}{x} )^{2}\), thus the general color metric satisfying Eq. (20) on \(\mathcal {P}_{1}\) is

$$ ds^{2}= \alpha_{1} \biggl( \frac{dx_{1}}{x_{1}} \biggr)^{2} + \alpha_{2} \biggl( \frac{dx_{2}}{x_{2}} \biggr)^{2} + \alpha_{3} \biggl( \frac{dx_{3}}{x_{3}} \biggr)^{2}, \quad\alpha_{k}\in \mathbb {R}^{+}, k=1,2,3, $$
(24)

which is precisely Stiles’ generalization of Helmholtz’s metric (this last one corresponds to the particular case \(\alpha_{1}=\alpha_{2}=\alpha _{3}=1\)), see e.g. [7].
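The invariance of the metric (24) can be checked numerically. The sketch below uses an observation added here, not stated in the text: in the coordinates \(u_{k}=\ln x_{k}\) the metric (24) becomes a weighted Euclidean metric, so the induced distance is \(d(x,y)=\sqrt{\sum_{k}\alpha_{k}\ln^{2}(x_{k}/y_{k})}\). The background change is modeled as a positive diagonal scaling, and the function names are hypothetical.

```python
import math

def stiles_distance(x, y, alpha=(1.0, 1.0, 1.0)):
    """Distance induced by ds^2 = sum_k alpha_k (dx_k / x_k)^2 on the positive octant.

    In the coordinates u_k = log(x_k) the metric is Euclidean with weights alpha_k,
    so the distance is a weighted Euclidean distance between logarithms.
    """
    return math.sqrt(sum(a * math.log(p / q) ** 2 for a, p, q in zip(alpha, x, y)))

def scale(lam, x):
    """Component-wise positive scaling, the assumed model of a background change."""
    return tuple(l * c for l, c in zip(lam, x))

x, y = (0.2, 0.5, 0.3), (0.6, 0.1, 0.4)
lam = (3.0, 0.25, 1.7)

# Both values coincide up to rounding: the scaling is an isometry, as in Eq. (20).
print(stiles_distance(x, y), stiles_distance(scale(lam, x), scale(lam, y)))
```

The default weights correspond to the Helmholtz case \(\alpha_{1}=\alpha_{2}=\alpha_{3}=1\).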

For \(\mathcal {P}_{2}\), \(K=\operatorname{SO}(2)\), so that the tangent space of \(\mathcal {P}_{2}\) at any \(x\in \mathcal {P}_{2}\) is

$$ T_{x} \mathcal {P}_{2}=\mathbb {R}\oplus T_{K} \operatorname{SL}(2,\mathbb {R})/\operatorname{SO}(2), \quad \forall x\in \mathcal {P}_{2}. $$
(25)

In this case, K-invariance means invariance under rotations, so that a background-invariant color metric for this realization of \(\mathcal {P}\) must be the sum of a one-dimensional and a two-dimensional Euclidean metric. This implies that, also for \(\mathcal {P}_{2}\), the color metric satisfying Eq. (20) is unique up to the selection of units of measure on each Cartesian factor \(\mathbb {R}^{+}\) and \(\operatorname{SL}(2,\mathbb {R})/\operatorname{SO}(2)\).

An explicit characterization of such a metric on \(\mathcal {P}_{2}\) can be written by recalling from Eq. (17) of Sect. 4.1 that \(\mathcal {P}_{2} \cong\text{Sym}^{+}(2,\mathbb {R})\), thus interpreting a perceived color x as a \(2\times2\) positive-definite real symmetric matrix.

The action of \(\operatorname{GL}(2,\mathbb {R})\) on \(\mathcal {P}_{2}\) is given by \(\operatorname{GL}(2,\mathbb {R}) \times \mathcal {P}_{2} \to \mathcal {P}_{2}\), \((A,x) \mapsto AxA^{t}\); thus, every background transformation \(B:\mathcal {P}_{2} \to \mathcal {P}_{2}\) can be parameterized by a matrix \(A\in\operatorname{GL}(2,\mathbb {R})\) and written as follows: \(B_{A}(x)=AxA^{t}\). It turns out that every \(\operatorname{GL}(2,\mathbb {R})\)-invariant Riemannian metric on \(\mathcal {P}_{2}\) is a scalar multiple of the so-called Rao–Siegel metric [28–30]

$$ ds^{2} = \operatorname{Tr}\bigl(x^{-1} dx x^{-1} dx\bigr), $$
(26)

Tr being the matrix trace. Let us verify the \(\operatorname{GL}(2,\mathbb {R})\)-invariance: first of all notice that \(B_{A}(x)^{-1}={(A^{t})}^{-1} x^{-1} A^{-1}\) and that, by linearity, \(dB_{A}(x) = A \,dx A^{t}\). So

$$\begin{aligned} \operatorname{Tr}\bigl(B_{A}(x)^{-1} \,dB_{A}(x) B_{A}(x)^{-1} \,dB_{A}(x)\bigr) = &\operatorname{Tr}\bigl({\bigl(A^{t} \bigr)}^{-1} x^{-1} A^{-1} A \, dx A^{t} {\bigl(A^{t}\bigr)}^{-1} x^{-1} A^{-1} A \, dx A^{t}\bigr) \\ = &\operatorname{Tr}\bigl({\bigl(A^{t}\bigr)}^{-1} x^{-1} \,dx x^{-1} \, dx A^{t}\bigr). \end{aligned}$$
(27)

By using the cyclic property of the trace, we have

$$ \operatorname{Tr}\bigl(B_{A}(x)^{-1} \,dB_{A}(x) B_{A}(x)^{-1} \,dB_{A}(x)\bigr) = \operatorname{Tr}\bigl(x^{-1} dx x^{-1} \, dx\bigr), $$
(28)

\(\forall A\in\operatorname{GL}(2,\mathbb {R})\), thus confirming the \(\operatorname{GL}(2,\mathbb {R})\)-invariance.
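The computation in Eqs. (27) and (28) can be double-checked numerically. The sketch below evaluates the quadratic form \(\operatorname{Tr}(x^{-1}vx^{-1}v)\) of Eq. (26) at a base point x and a symmetric tangent vector v, and again after the transformation \(x\mapsto AxA^{t}\), \(v\mapsto AvA^{t}\). The specific matrices and the helper names are arbitrary choices made here for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(A):
    return tuple(tuple(A[j][i] for j in range(2)) for i in range(2))

def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))

def trace(A):
    return A[0][0] + A[1][1]

def quadratic_form(x, v):
    """Tr(x^{-1} v x^{-1} v) for a base point x and a symmetric tangent vector v."""
    xi_v = matmul(inverse(x), v)
    return trace(matmul(xi_v, xi_v))

def congruence(A, m):
    """A m A^t, used both for the action on points and for its differential."""
    return matmul(matmul(A, m), transpose(A))

x = ((2.0, 0.5), (0.5, 1.0))     # symmetric positive-definite base point
v = ((0.3, -0.2), (-0.2, 0.7))   # symmetric tangent vector
A = ((1.5, -0.4), (0.8, 0.9))    # invertible matrix, det = 1.67

# Both values coincide up to rounding, in agreement with Eq. (28).
print(quadratic_form(x, v), quadratic_form(congruence(A, x), congruence(A, v)))
```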

To summarize, the Helmholtz–Stiles metric on \(\mathcal {P}_{1}\) and the Rao–Siegel metric on \(\mathcal {P}_{2}\) cannot be considered perceptual metrics for color in context since they are incompatible with the crispening effect. By color in context we mean color stimuli perceived in a visual scene in which the background can undergo temporal and/or spatial changes. Of course, the crispening effect does not disqualify the metrics above when the background is fixed. However, in this case, we can no longer use the argument about the invariance under background changes to single them out.

In 1974, at the time of Resnikoff’s paper [1], only a few papers about this perceptual phenomenon were available. This explains why, not being aware of it, he wrongly assumed that a perceptual color metric should be background-invariant.

6 Conclusions

A detailed analysis of the Resnikoff model and his homogeneity axiom for the space of perceived colors \(\mathcal {P}\) led us to the following results: first of all, we have provided an alternative and much simpler proof of the main result contained in [1] i.e. the existence of only two geometric structures compatible with Schrödinger’s axioms together with Resnikoff’s homogeneity one. We have also shown, via a counter-example, that the original proof is flawed by a mathematical assumption that is not true.

Secondly, we have underlined the need to develop psycho-physical experiments to check the linearity of background transformations that Resnikoff supposed to be identified with the group of symmetries acting transitively on \(\mathcal {P}\). A proposal for such an experiment has been detailed.

Finally, we have discussed Resnikoff’s hypothesis about the isometric character of background transformations with respect to the Riemannian metrics on \(\mathcal {P}\) that should represent perceptual color differences in his theory. The crispening effect shows that Resnikoff’s hypothesis must be rejected; thus, both the Helmholtz–Stiles and the Rao–Siegel metrics, singled out by using this hypothesis, cannot be perceptual color distances in context.

In the second half of this two-part paper, \(\mathcal {P}\) will be analyzed by using a different strategy that relies on Jordan algebras. The link between the two parts is given by the following consideration: Schrödinger’s axioms imply that \(\mathcal {P}\) is a regular convex cone embedded in a real vector space of dimension three. If we accept Resnikoff’s homogeneity axiom, then \(\mathcal {P}\) becomes an open regular homogeneous convex cone. By adding the last hypothesis, the so-called self-duality, \(\mathcal {P}\) becomes a symmetric cone, and it turns out that these objects can be identified with the positive elements of a (formally real) Jordan algebra. The rich mathematical results associated with Jordan algebras will permit building a novel, quantum-like, theory of \(\mathcal {P}\).

Notes

  1. Resnikoff avoids using the term metameric equivalence, which refers to the case when x and y have different spectral radiant power distribution, but they generate the same tristimulus values [7] i.e. the weights of three fixed primary colors that are needed to match a reference color. Following Resnikoff, we do not employ the metameric equivalence because the primaries do not intervene in his model.

  2. This assumption is true for non-coherent light because, for coherent light, destructive interference can extinguish light intensity in certain spatial positions when two light beams are superposed.

  3. That is to say, the only Riemannian distance for which the multiplication by a positive scalar is an isometry, see Sect. 5 for more details about this.

  4. We recall that, given a topological group G, a normal subgroup H of G is a subgroup of G such that \(gH=Hg\) \(\forall g\in G\), where \(gH=\{gh, h\in H\}\) is the left coset of H in G w.r.t g and \(Hg=\{hg, h\in H\}\) is the right coset of H in G w.r.t g. Given a normal closed subgroup H of G, the quotient (or factor) group \(G/H\) is the group of all cosets (left or right, since they are the same because H is normal) with the following group structure: \((gH) (g'H) = (gg')H\).

References

  1. Resnikoff HL. Differential geometry and color perception. J Math Biol. 1974;1:97–131.

  2. Resnikoff HL. On the geometry of color perception. In: Some mathematical questions in biology. VI; 1974. p. 217–32. (Lectures on mathematics in the life sciences; vol. 7).

  3. Resnikoff HL. On the psychophysical function. J Math Biol. 1975;2(3):265–76.

  4. Resnikoff HL. The illusion of reality. Berlin: Springer; 2012.

  5. Niall KK, editor. Erwin Schrödinger’s color theory: translated with modern commentary; 2017.

  6. Schrödinger E. Collected papers on wave mechanics. Providence: American Mathematical Society; 2003.

  7. Wyszecki G, Stiles WS. Color science: concepts and methods, quantitative data and formulas. New York: Wiley; 1982.

  8. Schanda J. Colorimetry: Understanding the CIE system. New York: Wiley; 2007.

  9. Rudd ME, Zemach IK. Quantitative properties of achromatic color induction: an edge integration analysis. Vis Res. 2004;44:971–81.

  10. Gronchi G, Provenzi E. A variational model for context-driven effects in perception and cognition. J Math Psychol. 2017;77:124–41.

  11. Dubois E. The structure and properties of color spaces and the representation of color images; 2009. (Synthesis lectures on image, video, and multimedia processing; vol. 4).

  12. Faraut J, Koranyi A. Analysis on symmetric cones. Oxford: Clarendon Press; 1994.

  13. Schrödinger E. Grundlinien einer Theorie der Farbenmetrik im Tagessehen. Ann Phys. 1920;63(4):397–456; 481–520. English translation: Outline of a theory of colour measurement for daylight vision. In: MacAdam DL, editor. Sources of colour science. Cambridge: MIT Press; 1970. p. 134–182.

  14. Newton I. Opticks, or, a treatise of the reflections, refractions, inflections & colours of light. North Chelmsford: Courier Corporation; 1952.

  15. Grassmann H. Zur theorie der farbenmischung. Ann Phys. 1853;165(5):69–84.

  16. von Helmholtz H, Southall JPC. Treatise on physiological optics. Vol. 3. North Chelmsford: Courier Corporation; 2005.

  17. Munkres J. Topology. 2nd ed. Upper Saddle River: Pearson; 2000.

  18. Ashtekar A, Corichi A, Pierri M. Geometry in color perception. In: Black holes, gravitational radiation and the universe; 1999. p. 535–50.

  19. Wallach H. Brightness constancy and the nature of achromatic colors. J Exp Psychol. 1948;38(3):310–24.

  20. Helgason S. Differential geometry, Lie groups, and symmetric spaces. New York: Academic Press; 1979. (Pure and applied mathematics; vol. 80).

  21. Warner FW. Foundations of differentiable manifolds and Lie groups. Berlin: Springer; 2013. (Graduate texts in mathematics; vol. 94).

  22. Gonzalez RC, Woods RE. Digital image processing. New York: Prentice Hall; 2002.

  23. Lie S. Theorie der Transformationsgruppen III. Leipzig: Teubner; 1893.

  24. Komrakov B, Churyumov A, Doubrov B. Two-dimensional homogeneous spaces. Matematisk Institutt, Universitetet i Oslo (1993).

  25. Doubrov B, Komrakov B. Low-dimensional pseudo-Riemannian homogeneous spaces. Matematisk Institutt, Universitetet i Oslo (1995).

  26. Yilmaz H. Color vision and a new approach to general perception. In: Biological prototypes and synthetic systems; 1962. p. 126–41.

  27. Martelli B. An introduction to geometric topology. arXiv:1610.02592 (2016).

  28. Amari S. Differential-geometrical methods in statistics. Berlin: Springer; 2012. (Lecture notes in statistics; vol. 28).

  29. Calvo M, Oller JM. A distance between multivariate normal distributions based in an embedding into the Siegel group. J Multivar Anal. 1990;35(2):223–42.

  30. Siegel CL. Symplectic geometry. Amsterdam: Elsevier; 2014.

Acknowledgements

This paper is part of a program for a geometric re-foundation of colorimetry inspired by the brilliant work of H.L. Resnikoff (1937–2018), and it is dedicated to his memory. The counter-example and the novel proof of the main result of Sect. 4 were developed in collaboration with Francesco Bottacin during the supervision of the master's thesis of Fiammetta Cirrone at the University of Padova; both are warmly acknowledged.

Availability of data and materials

The images contained in this manuscript are available from the author upon request.

Funding

Partial funding for this paper has been provided by the 80primes grant from the French CNRS.

Author information

Contributions

The author wrote the whole paper. The author read and approved the final manuscript.

Corresponding author

Correspondence to Edoardo Provenzi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The author declares that he has no competing interests.

Consent for publication

Not applicable.

Appendix

In this appendix we discuss a counter-example showing that, contrary to what Resnikoff claimed, the contractibility of \(\mathcal {P}\) does not imply \(\exp(\mathfrak{sl}(\mathcal {P}))=\operatorname{SL}(\mathcal {P})\).

The argument can be discussed already for the case \(\operatorname{SL}(\mathcal {P})=\operatorname{SL}(2,\mathbb {R})\) and its Lie algebra \(\mathfrak{sl}(2,\mathbb {R})\) given by the real \(2\times2\) traceless matrices. If the exponential map \(\exp:\mathfrak {sl}(2,\mathbb {R}) \to\operatorname{SL}(2,\mathbb {R})\) were onto, then the matrix

$$ T= \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} \in\operatorname{SL}(2,\mathbb {R}) $$

could be written as \(T=\exp(A)\) for a suitable \(A\in\mathfrak {sl}(2,\mathbb {R})\). By the Schur decomposition theorem, A is similar to an upper triangular matrix U whose diagonal elements are the eigenvalues of A, i.e. \(A=PUP^{-1}\), where \(P\in\operatorname{GL}(2,\mathbb {C})\). By its cyclic property, the trace is similarity-invariant, so \(\operatorname{Tr}(A)=\operatorname{Tr}(U)\) and, since \(\operatorname{Tr}(A)=0\), U must have the following form:

$$ U= \begin{pmatrix} \alpha& \beta\\ 0 & -\alpha \end{pmatrix} , $$

\(\alpha,\beta\in \mathbb {C}\), so that

$$ A=P \begin{pmatrix} \alpha& \beta\\ 0 & -\alpha \end{pmatrix} P^{-1}, $$

α and −α being the eigenvalues of A. Recalling that \(\exp(A)=\sum_{n\in \mathbb {N}} \frac{A^{n}}{n!}\), we have that

$$ T=\exp(A)=\sum_{n\in \mathbb {N}}\frac{(PUP^{-1})^{n}}{n!}=P \biggl(\sum_{n\in \mathbb {N}}\frac{U^{n}}{n!} \biggr)P^{-1} =P\exp \begin{pmatrix} \alpha& \beta\\ 0 & -\alpha \end{pmatrix} P^{-1}. $$
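The conjugation step used in the last display, \(\exp(PUP^{-1})=P\exp(U)P^{-1}\), can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names and the sample matrices U and P are hypothetical, and P is taken real for simplicity although the appendix allows \(P\in\operatorname{GL}(2,\mathbb {C})\).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def exp_series(m, terms=40):
    """Truncated power series: sum of m^n / n! for n < terms."""
    result = ((1.0, 0.0), (0.0, 1.0))
    power = result
    for n in range(1, terms):
        power = matmul(power, m)
        result = tuple(
            tuple(result[i][j] + power[i][j] / math.factorial(n) for j in range(2))
            for i in range(2)
        )
    return result

# Hypothetical sample data: an upper triangular traceless U and an invertible P.
U = ((0.7, -1.3), (0.0, -0.7))
P = ((2.0, 1.0), (1.0, 1.0))
P_inv = inverse(P)

A = matmul(matmul(P, U), P_inv)                      # A = P U P^{-1}
lhs = exp_series(A)                                  # exp(P U P^{-1})
rhs = matmul(matmul(P, exp_series(U)), P_inv)        # P exp(U) P^{-1}
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest entrywise difference:", max_gap)
```

The truncated series is adequate here only because the sample matrices have small eigenvalues; for general matrices a dedicated matrix-exponential routine would be a safer choice.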

We can now show the contradiction. First of all, if \(\alpha\neq0\), then α and −α are two distinct eigenvalues of the \(2\times2\) matrix A, so A is similar to a diagonal matrix: there exists \(Q\in\operatorname{GL}(2,\mathbb {C})\) such that \(A=QDQ^{-1}\) with \(D= \operatorname{diag}(\alpha,-\alpha)\). But then, \(T=\exp(A)=Q\exp(D)Q^{-1}\) with \(\exp(D)=\operatorname{diag}(e^{\alpha},e^{-\alpha})\), i.e. T would be diagonalizable, which contradicts the fact that T is not diagonalizable.
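That T is not diagonalizable can be verified by a direct computation, given here as a short check that is not taken from the original text (\(I_{2}\) denotes the \(2\times2\) identity matrix, a notation introduced only for this remark). The characteristic polynomial of T is \((\lambda+1)^{2}\), so −1 is its only eigenvalue, and a diagonalizable matrix with the single eigenvalue −1 must be equal to \(-I_{2}\). However,

$$ T+I_{2}= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} , $$

so \(T\neq-I_{2}\), and the eigenspace of T relative to −1, i.e. the kernel of \(T+I_{2}\), is only one-dimensional.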

If, instead, \(\alpha=0\), then

$$ \exp \begin{pmatrix} 0 & \beta\\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} 0 & \beta\\ 0 & 0 \end{pmatrix} + \sum_{n=2}^{\infty}\frac{1}{n!} \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & \beta\\ 0 & 1 \end{pmatrix} , $$

so that \(\operatorname{Tr} ( \exp(U) \vert _{\alpha =0} )=2\), whereas \(\operatorname{Tr}(T)=-2\). Since \(T=P\exp(U)P^{-1}\), this contradicts the similarity-invariance of the trace. Hence no such A exists and the exponential map is not onto.
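As an independent cross-check, not part of the original proof, one can use the fact that a traceless \(2\times2\) matrix satisfies \(A^{2}=-\det(A)I_{2}\) by the Cayley-Hamilton theorem (\(I_{2}\) being the identity matrix). The power series of \(\exp(A)\) then collapses to \(fI_{2}+gA\), with \(\operatorname{Tr}(\exp(A))=2f\) equal to \(2\cos\omega\) if \(\det(A)=\omega^{2}>0\), to 2 if \(\det(A)=0\), and to \(2\cosh s\) if \(\det(A)=-s^{2}<0\). The trace is therefore never below −2, and it equals −2 only when \(\sin\omega=0\), which forces \(\exp(A)=-I_{2}\neq T\). The sketch below illustrates this numerically; the function names, the sampling range and the seed are assumptions made for the illustration.

```python
import math
import random

def exp_traceless(a, b, c):
    """Closed-form exp of the real traceless matrix [[a, b], [c, -a]].

    Relies on A^2 = -det(A) * I, so that exp(A) = f*I + g*A.
    """
    det = -a * a - b * c
    if det > 0:        # purely imaginary eigenvalues +/- i*w
        w = math.sqrt(det)
        f, g = math.cos(w), math.sin(w) / w
    elif det < 0:      # real eigenvalues +/- s
        s = math.sqrt(-det)
        f, g = math.cosh(s), math.sinh(s) / s
    else:              # nilpotent case, A^2 = 0
        f, g = 1.0, 1.0
    return ((f + g * a, g * b), (g * c, f - g * a))

def trace(m):
    return m[0][0] + m[1][1]

random.seed(0)  # arbitrary seed, used only for reproducibility
samples = [tuple(random.uniform(-5.0, 5.0) for _ in range(3)) for _ in range(10000)]
min_trace = min(trace(exp_traceless(a, b, c)) for a, b, c in samples)
print("smallest sampled trace:", min_trace)

# The boundary value -2 is reached, for instance, by a rotation generator
# with w = pi, and the result is -I up to rounding, not the matrix T.
boundary = exp_traceless(0.0, math.pi, -math.pi)
print("exp at the boundary case:", boundary)
```

Random sampling proves nothing by itself; it only illustrates the closed-form bound stated above, which is what actually excludes T from the image of the exponential map.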

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Provenzi, E. Geometry of color perception. Part 1: structures and metrics of a homogeneous color space. J. Math. Neurosc. 10, 7 (2020). https://doi.org/10.1186/s13408-020-00084-x

  • DOI: https://doi.org/10.1186/s13408-020-00084-x

Keywords