 Research
 Open access
 Published:
A subRiemannian model of the visual cortex with frequency and phase
The Journal of Mathematical Neuroscience volume 10, Article number: 11 (2020)
Abstract
In this paper, we present a novel model of the primary visual cortex (V1) based on orientation, frequency, and phase selective behavior of V1 simple cells. We start from the firstlevel mechanisms of visual perception, receptive profiles. The model interprets V1 as a fiber bundle over the twodimensional retinal plane by introducing orientation, frequency, and phase as intrinsic variables. Each receptive profile on the fiber is mathematically interpreted as rotated, frequency modulated, and phase shifted Gabor function. We start from the Gabor function and show that it induces in a natural way the model geometry and the associated horizontal connectivity modeling of the neural connectivity patterns in V1. We provide an image enhancement algorithm employing the model framework. The algorithm is capable of exploiting not only orientation but also frequency and phase information existing intrinsically in a twodimensional input image. We provide the experimental results corresponding to the enhancement algorithm.
1 Introduction
The question of how we perceive has been an intriguing topic for different disciplines. One of the first schools that faced the problem is the Berlin school of experimental psychology, called Gestalt psychology school [68, 69, 103], which formulates precise laws explaining visual perception. The Gestalt psychology is a theory for understanding the principles underlying the emergence of perceptual units as the result of a grouping process. The main idea is that perception is a global phenomenon, which considers the scene as a whole and is much more than the pure sum of local perception. The first perceptual laws are of qualitative type, based on similarity, closure, good continuation, and alignment.
After that, there have been many psychophysical studies that attempted to provide a quantitative version of the grouping process. With the development of neuroscience studies, researchers started to look for cortical implementation of Gestalt laws, with a particular attention to neural architectures of the visual cortex. A particularly important one for our study is the pioneering work of Field et al. [43], which models Gestalt principles of good continuation and alignment. They experimentally proved that fragments aligned along a curvilinear path can be perceived as a unique perceptual unit much better than fragments with rapidly changing orientations. The results of their experiments were summarized in a representation, called association fields, which represent a complete set of paths with fixed initial position and orientation, which can be perceived as perceptual units.
The visual cortex is a part of the mammalian brain responsible for the firstlevel processing tasks of perceptual organization of local visual features in a visual stimulus (twodimensional image). It is known from neurophysiological experiments that the visual cortex contains neurons (simple cells) that are locally sensitive to several visual features, namely, orientation [54–56, 58], spatial frequency [59–61, 77, 89, 90, 97, 98], phase [30, 72, 80, 86], scale [10], and ocular dominance [61, 71, 95].
The simple cells are organized in a hypercolumnar architecture, which was first discovered by Hubel and Wiesel [57]. In this architecture, a hypercolumn is assigned to each point \((x,y)\) of the retinal plane \(M\simeq \mathbb {R}^{2}\) (if we disregard the isomorphic cortical mapping between retinal and cortical planes), and the hypercolumn contains all the simple cells sensitive to a particular value of the same feature type. The simple cells are able to locally detect features of the visual stimulus, and neural connectivity between the simple cells integrates them in a coherent global unity. Those two mechanisms, the feature detection and the neural connectivity, comprise the functional geometry of V1.
1.1 Previous models and applications
Several models were proposed for the functional geometry of V1 associated with the simple cells that were only orientation sensitive. Early models date back to the 1980s. Koenderink and van Doorn [66, 67] revealed the similarity between Gaussian derivative functions and simple cell receptive profiles. They proposed visual models based on the functions of Gaussian derivatives as mathematical representations of the receptive profiles. Their findings indeed encouraged many studies relying on the choice of a family of Gaussian derivative functions and Gaussian kernels, among which we would like to mention the works of Young [104] and Lindeberg [74, 76].
A different modeling approach from the mentioned ones was employing Gabor functions as mathematical representations of the orientationsensitive simple cell receptive profiles. The motivation for this choice was relying on an uncertainty principle as was elaborated by Daugman [29] through a generalization of the hypothesis of Marĉelja [78] (see also [62], where Jones and Palmer compared statistically the results obtained via Gabor functions and the neurophysiological results collected from V1 of a cat). Furthermore, Hoffman [51, 52] proposed to model the hypercolumnar architecture of V1 as a fiber bundle. Following the second school (which uses the Gabor functions) and further developing the model proposed by Petitot and Tondut [84] (see also Petitot [82, 83]), where hypercolumnar architecture was interpreted as a fiber bundle associated with a contact geometry, Citti and Sarti [25] introduced a groupbased approach. They proposed a new model of the functional geometry of V1, which considered the subRiemannian geometry of the rototranslation group (\(\mathrm {SE}(2)\)) as a suitable model geometry. The main reason for employing \(\mathrm {SE}(2)\) geometry was due to that the corresponding Lie algebra to \(\mathrm {SE}(2)\) was providing a good model of the actual neural connectivity in V1. The model proposed in [25] has been extended to other visual features in addition to orientation, such as scale by Sarti et al. [92], and to other cell types such as complex cells sensitive to velocity and movement direction by Barbieri et al. [2] and Cocci et al. [27].
Furthermore, image processing applications employing Gabor transform to extract visual features from medical images were proposed by Duits et al. [38] (see also [94], where the phase of complexvalued outputs obtained from a multiscale orientation decomposition represented in a similarity group). In [38] an image enhancement together with an inpainting procedure was explained. Their method performs a multifrequency and multiphase Gabor transform on a twodimensional image. The result of the transform is represented in the fivedimensional subRiemannian geometry of a reduced Heisenberg group. A combination of preprocessing via leftinvariant diffusion and leftinvariant convection–diffusion is applied for processing the lifted images to the Heisenberg group geometry. The convection provides sharpening of the image. Evolving phase of the diffused Gabor coefficients is handled via phasecovariant leftinvariant convection–diffusion procedure by taking advantage of a Gabor transform. This Gabor transform takes into account spatial frequencies along both horizontal and vertical axes, differently from our case, where we consider spatial frequencies perpendicular to a single reference axis and provide a frequency decomposition along this axis for each orientation. Their procedure deals with the diffusion taking place in both phase and frequency dimensions, and it is applied to the detection of the cardiac wall deformations in MRItagging images. Our framework differs from this setting in two points. First of all, our geometry is derived from the receptive profile, Gabor function. The geometry is a natural result of the choice of the receptive profile. In other words, it is not assigned to the result of the Gabor transform as in [38]. Secondly, the model geometry we propose is not restricted to image processing. The image enhancement application we propose uses multifrequency channels. Its performance is comparable to multiscale methods and provides an improvement in comparison to a singlefrequency method using a projected version of our model geometry. It uses pure Laplace–Beltrami evolution without any combination of pre/postprocessing techniques or any additional nonlinear procedure. The motivation of the algorithm we propose is rather showing the importance of inclusion of frequency in cortical modeling and its potential to be an alternative framework to the multiscale image enhancement settings.
Other applications in medical image analysis employing scale and orientation information can be found in [18] and [63], where the Gabor transform is employed for the detection of local frequencies in tagging MRI (magnetic resonance imaging) images and thus for the computation of local frequency deformations in those images. The interested reader can also refer to [42] for different applications of geometric approach in computer vision and robotics. Additionally to those studies, the models in terms of cortical orientation and orientationfrequency selectivity, provided by Bressloff and Cowan [16, 17], can be useful references for the reader. We refer to [26] for a review of several cortical models including many mentioned ones.
We would like to address the reader to [14, 79], and [105] for some other both theoretical and practical aspects provided for subRiemannian image reconstruction/inpainting techniques. Moreover, such aspects are also provided and used in optimal control theory; see, for example, [13, 34]. The reader can refer to [19, 20] for image denoising approaches based on totalvariation flows and to [21] for optimization approach used in image denoising and deblurring. Finally, we refer to [15] for a curvaturebased approach, which can be applied to image denoising and inpainting.
1.2 Neurophysiological motivation
The theoretical criterion underpinning the modeling we propose in this paper relies on the socalled neurogeometrical approach described by Citti and Sarti [25], Petitot and Tondut [84], and Sarti et al. [92]. Following this approach, processing capabilities of sensorial cortices and in particular of the visual cortex are modeled based on the geometrical structure of cortical neural connectivity. Global and local symmetries of the visual stimuli are captured by the cortical structure, which is invariant under those symmetries (see Sanguinetti et al. [91]). We follow a similar framework and will start from the firstlevel perceptual tasks performed by the simple cells from local feature extraction. This starting point will lead us to the model geometry of V1 associated with the simple cells sensitive to orientation, spatial frequency, and phase information at each position in a given twodimensional image.
At the level of Gestalt organization, the neurogeometrical architecture in \(\mathrm {SE}(2)\) [25] implements the psychophysical law of good continuation. The architecture in the affine group [92] implements good continuation and ladder (parallel chain of contours). The architecture in the Galilean group [2, 27] implements common fate. Finally, the architecture we consider here in a Gaborbased subRiemannian geometry implements similarity between textures/patterns and contains all the previous models employing the neurogeometrical approach.
It is known that the simple cells are selective not only to orientation but also to spatial frequency and the adjacent cells may have different phases. It was experimentally shown in [77] that the cortical architecture of a cat regarding spatial frequency of the stimulus is complementary to the hypercolumnar architecture associated with orientation. It was reported in [77] that the cortical neurons have the same preferred orientation but a variety of spatial frequency values along the electrode penetrations perpendicular to the cortical surface (i.e., along columns) and vice versa was valid for the penetrations parallel to the surface.
Deoxyglucose uptake increases in the regions of the brain where the neural activity increases. It was reported in [101] that the cats that were exposed to visual patterns containing a single spatial frequency and all orientations show columns of increased deoxyglucose uptake extending through all cortical layers. On the other hand, a stimulus containing all spatial frequencies and all orientations does not result in any difference in columnar density. Similar columnar organizations were reported in [96] and [12]. In [11], it was discovered that frequency preference maps were organized in frequency pinwheels around which all possible preferred frequency values are represented in an analogous way to the orientation pinwheels. In addition to those, it was shown in [61] that a wide range of frequency values were represented continuously in V1. Domains of different preferred frequency values were separated by \(3/4\) mm (as in the case of hypercolumnar organization of orientation selectivity) at the extremes of the frequency continuum. Those frequency extremes were found mostly at the pinwheels. Finally, in [31], it was shown that the simple cells that are locally within the same cortical region respond to different frequencies. The range of the preferred frequency values of a cell does not overlap with the values of another nearby located one in the same cortical region. Those studies suggest a similar organization of frequencies to the orientation organization discovered by Hubel and Wiesel [55, 56].
Moreover, in [31] the simple cells were shown to be phase selective via a procedure based on the EnrothCugell–Robson null phase test [41] for spatial summation. The cell fires, does not respond, or performs inhibition on the grating pattern depending on the spatial position of the pattern with respect to the receptive field of the cell. Furthermore, Pollen and Ronner [87, 88] reported from their experiments that the adjacent simple cells in cat V1 that have a common preference for orientation and spatial frequency differ in spatial phase from each other by approximately \(\pi /2\). This result is in coherence with that the receptive fields are conjugate pairs, that is, one even symmetric pair and one odd symmetric pair located around the same axis. Those experimental results support our choice of Gabor functions in such a way that adjacent simple cells can be interpreted as paired sine and cosine filters or Gabor functions. Finally, those aforementioned studies provide a neurophysiological basis for our choice of modeling the cortical architecture associated with frequency and phase selectivity in a similar columnar fashion as in the orientation selectivity case.
1.3 Choice of receptive profile
Once the light reflects from a visual stimulus and arrives at the retina, it evokes some spikes, which are transmitted along the neural pathways to the simple cells in V1. Each simple cell gives a response called a receptive profile to those spikes. In other words, a receptive profile is the impulse response of a simple cell. The simple cells extract the information of local visual features by using their receptive profiles, and it is possible to represent the extracted features mathematically in a higherdimensional space than in the given twodimensional image plane. We call this space the lifted space or the lifted geometry. We will use an extended Gabor function as the receptive profile of the simple cells. We will see that this choice naturally induces the corresponding Lie algebra of the subRiemannian structure, which is the corresponding lifted geometry to our model. The Lie algebra and its integral curves model neural connectivity between the simple cells. Moreover, since some pairs of the algebra are not commutative, it is possible to formulate an uncertainty principle, and this principle is satisfied by the extended Gabor function. That is, the extended Gabor function minimizes uncertainties arising from simultaneous detection of frequencyphase and simultaneous detection of positionorientation (see also [33, Sect. 7.5], [1, 3, 4], and [94] for similar phenomena in different frameworks).
Concerning the question of which family of functions to use as receptive profiles, let us recall that receptive field models consisting of cascades of linear filters and static nonlinearities may be adequate to account for responses to simple stimuli such as gratings and random checkerboards, but their predictions of responses to complicated stimuli (such as natural scenes) are correct only approximately. A variety of mechanisms such as response normalization, gain controls, crossorientation suppression, and intracortical modulation can intervene to change radically the shape of the profile. Then any static and linear model for the receptive profiles has to be considered just as a very first approximation of the complex behavior of a real dynamic receptive profile, which is not perfectly described by any of the static wavelet frames.
For example, the derivatives or differences of Gaussian functions are suitable approximations of the behavior of classical receptive profiles of the simple cells. Lindeberg [75, 76] starts from certain symmetry properties of the surrounding world and derives axiomatically the functions of Gaussian derivatives obtained from the extension of the family of rotationally symmetric Gaussian kernels to the family of affine Gaussian kernels, and proposes to model the simple cell receptive fields in terms of those Gaussian derivatives (see also Koenderink [66, 67], Young [104], and Landy and Movshon [70]). Indeed, Gaussian functions are good models of the receptive profiles if we restrict ourselves to the visual features except for frequency and phase. They provide good results for orientation and scale detection as shown by the scalespace school (see, e.g., the works of Lindeberg [73, 74, 76], Florack [44], ter Haar Romeny [99, 100], and Hannink et al. [50]). However, we are interested here in twodimensional visual perception based on orientation, frequency, and phasesensitive simple cells. Differently from the case with orientationscale sensitive simple cells, frequencyphase sensitive simple cells cannot be modeled in a straightforward way by Gaussian derivative functions. A different order Gaussian derivative must be used for the extraction of each frequency component of a given image. This requires the use of different functions, each corresponding to a certain frequency and thus to a certainorder derivative. In other words, the frequency is not a parameter as in the case of scale, but each frequency corresponds to a different function. It is not possible to derive a natural geometry starting from the derivatives of the Gaussian, and it is rather required to assign an adequate geometric setting to the set of extracted feature values by the Gaussian derivatives to represent those values.
At this point, a Gabor function seems to be a good candidate for the detection of different orientation, frequency, and phase values in a twodimensional image, since orientation, frequency, and phase are parameters of the Gabor function. In other words, instead of using different functions, we can use a single Gabor function corresponding to a set of parameter values to detect different feature values. In this way, we obtain a subRiemannian model geometry as the natural geometry induced directly by the Gabor function (i.e., by the receptive profile itself).
Moreover, the Gabor function is able to model both asymmetric simple cells and even/odd symmetric simple cells thanks to its phase offset term appearing in its wave content, whereas the functions of the Gaussian derivatives account only for the symmetric simple cells.
We take into account those aforementioned points and propose to use a Gabor function with frequency and phase parameters as the receptive profile model. The Gabor function allows us to extend the model provided in [25] to the true distribution of the profiles in V1 (including the asymmetric receptive profiles with phase shifts) in a straightforward way. Finally, we would like to refer to Duits and Franken [35–37], Franken and Duits [47], Sharma and Duits [94], Zhang et al. [106], and Bekkers et al. [9] for information about applications that employ other wavelets corresponding to unitary transforms for feature extraction.
Finally, the studies on group convolutional neural networks (GCNN) are to be mentioned. A particularly relevant one to \(\mathrm {SE}(2)\) subRiemannian geometry is explained by Bekkers [8]. Those neural networks use several neural layers for extraction and representation of the features necessary to perform proper highlevel visual tasks such as object recognition. Feature extraction and the representation of the extracted features take advantage of the lifting of the image to a proper subRiemannian geometry (e.g., \(\mathrm {SE}(2)\) geometry). Differently from the aforementioned approaches using a model function as a receptive profile, GCNN learns the receptive profile through a feedback mechanism updating an initial arbitrary kernel by comparing the outputs of the whole network with the objects in the input image. The outputs are recognized objects by the neural network. We refer to [40] and [8] for a recent overview.
1.4 Novelties
Here we consider the model framework provided in [25] as the departure point of our study. We extend this model from orientation selective framework to an orientation, frequency, and phase selective framework. Furthermore, we provide the neural connectivity among the simple cells not only orientation selective but also frequency selective with different phases. Thanks to the use of all frequency components of the Gabor functions, the Gabor transform can be followed by an exact inverse Gabor transform, which was not the case in the model presented in [25] since a single frequency component of the Gabor function was used. The projection of our generalized model onto \(\mathrm {SE}(2)\) can be considered as equivalent to the model provided in [25]. The procedure we use to obtain the extended framework can be employed for the extension to a model associated with orientationscale selective simple cells as well (see [7]).
We provide an image enhancement algorithm based on the Laplace–Beltrami procedure applied on all frequency channels in a reduced version of the model geometry. The reduced framework isolates each frequency and eliminates the activity propagation between different frequencies and phases. The Laplace–Beltrami procedure is then applied separately in each frequency channel and at a single phase, since in the reduced framework, each phase corresponds to a rotated version of the same image. It is an approximation of the Laplace–Beltrami procedure taking advantage of the full model geometry, and it partially employs the model geometry. In turn, it removes the excessive diffusion, which may destroy object boundaries and elongated structures in the image. Furthermore, it allows us to perform a threedimensional Laplace–Beltrami procedure at every frequency instead of a fivedimensional Laplace–Beltrami in the full geometry, avoiding high computational and memory load, which is considerably heavy in the full fivedimensional model geometry. Finally, it provides the flexibility to process only the significant frequencies without altering the other frequency channels, which can be important in some texture images. This algorithm is inspired by the enhancement technique explained in [64], and it is different on three main points. The first point is that the technique in [64] relies on a Riemannian geometry and thus a Riemannian metric structure. We work in a subRiemannian geometry endowed with a subRiemannian metric. This results in that the subRiemannian differential operators are degenerate, that is, they perform in a subspace of the tangent space. The second point is that the technique presented in [64] employs a multiscale Gabor transform with fixed frequency and employs the scale as the additional feature to the orientation, whereas we employ a multifrequency Gabor transform with fixed scale and use the frequency as the additional feature for the enhancement. This results in that we can use exact the inverse Gabor transform when we project the processed lifted image in G to the twodimensional image plane. The final point is the use of color. We study only grayscale images, but our approach has the potential to be extended to color images. Moreover, our subRiemannian model geometry is neurally inspired and induced by the receptive profiles. This is not the case in [64], where a proper Riemannian geometry is chosen for the image enhancement task, and it is not biologically motivated. Finally, we remark that the approach we use here for the construction of the model geometry is generic, and it provides a coherent way to derive the natural model geometry arising from the model function of the receptive profile.
1.5 Outline
We will see in Sect. 2 the model structure. We will show how the model geometry with the associated horizontal connectivity can be derived starting from the receptive profile model, that is, from the Gabor function. Then in Sect. 3, we will provide explicit expressions of the horizontal integral curves with constant coefficients, which are considered as the models of the association fields in V1. Finally, in Sect. 4, we provide an image enhancement algorithm using the model framework together with the results obtained by applying a discrete version of the algorithm on some test images. We provide a conclusion in Sect. 5.
2 The model
The model is based on two mechanisms. The first one is the linear feature extraction mechanism. The second mechanism is the horizontal connectivity, which models the neural connectivity in V1. We describe the model by using those two mechanisms in terms of both group and subRiemannian structures.
2.1 Feature extraction and representation
2.1.1 Receptive profiles, symplectic structure, and contact form
Inspired by the receptive profile models proposed in [25] for the orientation selective behavior and in [2, 28, 32] for the spatiotemporal behavior of the simple cells, we propose to represent the receptive profile of a simple cell in our setting with the Gabor functions of the type
with spatial frequency^{Footnote 1}\(\omega >0\) and \(r=(r_{1}, r_{2})=(\omega \sin \theta, \omega \cos \theta )\), where we represent a point in a sixdimensional space \(\mathcal{N}\) with \(\alpha =(q_{1}, q_{2},\phi, r_{1}, r_{2}, v)\in \mathbb {R}^{6}\). The complex exponential is the wave content, and it is the main component detecting orientation, frequency, and phase of the objects in the given twodimensional image. The second exponential is the Gaussian window. It provides the spatial localization around the point \((q_{1}, q_{2})\). The frequency ω determines how many wave peaks are found within the localizing window scaled by the Gaussian. The higher the number of wave peaks, the finer the structures that the Gabor can detect. The orientation θ determines how much the axis along which the waves lie deviate from the horizontal axis. It coincides with the orientation to which the simple cell associated with the Gabor function is sensitive. The parameter ϕ is the reference phase value, and it introduces a phase shift in the waves of the Gabor function as it varies. In the case of V1 motion sensitive cells with spatiotemporal dynamics, the term v represents the velocity of a twodimensional plane wave propagation (see Barbieri et al. [2] for details). In our framework, we interpret \(s\phi \) as the phase centered at ϕ in a static sense. We are not interested in any temporal behavior or in motion sensitivity. Therefore we will fix v to a finite number.
Note that the coordinate variables are \((x,y,\theta,\omega, s)\). Here \((q_{1}, q_{2})\) and ϕ should be considered as fixed parameter values since they are the reference spatial position and phase values.
We may express α as a complex number \(\alpha =(q_{1},q_{2},\phi )+i(r_{1},r_{2},v)\in \mathbb{C}^{3}\) and write the symplectic structure defined on the complex structure \(\mathbb{C}^{3}\) (in which α is a fixed point) as follows [2, 45]:
The standard Liouville form follows from the symplectic structure by definition as
As was mentioned previously, we may fix v to a finite number in our static case. We choose \(v=1\) for simplicity.
In this way, we obtain the fivedimensional space
where z denotes the feature variables \((\theta, \omega ,\phi )\in S^{1}\times \mathbb {R}^{+}\times S^{1}\). Then we may write the associated Gabor function centered at \(q\in M\) and sensitive to feature values z by using (1) as follows:
The standard Liouville form \(r_{1}\,dx+r_{2}\,dy  v \,ds\) reduces to
Indeed, Θ is a contact form since
is a volume form. In other words, it is maximally nondegenerate and does not vanish at any point on the manifold \(\mathcal{M}\).
The features which we would like to measure are embedded in the receptive profile. Those features manifest themselves as the plane waves corresponding to the Gabor functions in our model framework. Those plane waves describe the orthogonal directions to the simple cell connectivity. In other words, those plane waves are in orthogonal direction to the horizontal vector fields of the associated geometry, and their relation to the horizontal vector fields are found through the Liouville form. We may associate a Liouville form with each Gabor function given by (1). We write the Liouville form by following the coupling relations between the differential variables appearing in the wave content of the Gabor function. The considered differential variables in the Liouville form are not independent, but they are related to each other through a differential relation. The vanishing of the Liouville form expresses this differential constraint, and hence it reduces the dimension of free variables on the tangent space. This defines the horizontal tangent space and thus the horizontal vector fields.
2.1.2 Set of receptive profiles
An important property of Gabor functions is that they are invariant under certain symmetries. Therefore any Gabor function can be obtained from a reference Gabor function (mother Gabor function) up to a certain transformation law.
Let us denote the origin for the layer of a frequency ω by \(0_{\omega }=(0,0,\omega,0)\in \mathcal{M}\). Then a suitable choice of the mother Gabor function with frequency ω is
We set
which describes at each frequency the relation between a generic receptive profile centered at \(z=(q,\theta,\omega,\phi )\) and the mother Gabor function through
The set of all receptive profiles obtained from the mother Gabor function with all possible combinations of feature values at each possible frequency is called the set of receptive profiles.
2.1.3 Output of a simple cell
We obtain the output response of a simple cell (which is located at the point \(q=(q_{1},q_{2})\in M\simeq \mathbb {R}^{2}\) and sensitive to the feature values \(z=(\theta, \phi, \omega )\)) to a generic image \(I: M\rightarrow \mathbb {R}\) as a convolution with Gabor filter banks:
We apply the convolution for all feature values z at every point q to obtain the output responses of all receptive profiles in the set of receptive profiles. It is equivalent to applying a multifrequency Gabor transform on the given twodimensional image. Since we use all frequency components of the transform, we can employ the exact inverse Gabor transform to obtain the initial image:
with Ψ̄ denoting the complex conjugate. We will call the output response lifted image and the Gabor transform lifting.
Note that the theory provided in [38] takes into account the complexvalued functions resulting from the Gabor transform and explains a proper way to apply image enhancement and inpainting on the complex structure. Differently from this example, cortical models have employed the real part, imaginary part, or the absolute value of the output responses resulting from the convolution with corresponding Gabor filters so far (see, e.g., [25, 92, 93]). In other words, they have not taken into account half of the information they obtained from an image. Furthermore, the inverse Gabor transform was not possible in the previous models of the visual cortex given in [25, 92, 93] since in those models a singlefrequency Gabor transform was employed to obtain the output responses.
We remark that we consider the whole complex structure of the result of the convolution (9) as the output response of a simple cell and use the exact inverse Gabor transform (10) for the reconstruction of the image. This formula is adapted from [38, Sect. 2] as explained in [5, Chap. 8] (see also [36] for explanations of wavelet reconstruction formulas, in particular, in the case of socalled cake wavelets).
2.2 Horizontal vector fields and connectivity
Horizontal vector fields are defined as the elements of
where \(T\mathcal{M}\) denotes the tangent bundle of the fivedimensional manifold \(\mathcal{M}\). They are naturally induced by the 1form Θ given in (4). The horizontal vector fields are found explicitly as
The corresponding horizontal distribution is therefore as follows:
All nonzero commutators related to the horizontal vector fields given in (12) are as follows:
Note that the horizontal vector fields are bracket generating since
for all \(\alpha \in \mathcal{M}\), where \(T_{\alpha }\mathcal{M}\) denotes the tangent space of \(\mathcal{M}\) at α. Obviously, (15) shows that the horizontal vector fields fulfill the Hörmander condition [53], and consequently they provide the connectivity of any two points on \(\mathcal{M}\) through the horizontal integral curves defined on \(\mathcal{M}\) due to the Chow–Rashevski theorem [23]. This connectivity property is particularly important since it guarantees that any two points in V1 can be connected via the horizontal integral curves, which are the models of the neural connectivity patterns in V1.
2.3 Functional architecture of the visual cortex
2.3.1 The architecture as a Lie group
Receptive profiles evoke a group structure at each frequency \(\omega \in \mathbb {R}^{+}\). We can describe the group structure underlying the set of receptive profiles by using the transformation law given in (7).
First, we notice that the elements \((q,\theta,\phi )\) induce the group given by
which is indeed a Lie group associated with fixed frequency ω.
Then using (7), we write the group multiplication law for two elements
as
The differential \(L_{g^{\ast }}\) of the lefttranslation
is given by
The vector fields \(X_{1}\), \(X_{2}\), and \(X_{3}\) are bracket generating due to that
for every \(g\in G_{\omega }\). Hence \(X_{1}\), \(X_{2}\), and \(X_{3}\) generate the Lie algebra corresponding to \(G_{\omega }\).
2.3.2 The architecture as a subRiemannian structure
The functional geometry is associated with a subRiemannian structure at each frequency ω. We denote by \(G_{\omega }\) the submanifold of \(\mathcal{M}\) with points \(h=(q,\theta,\phi, \omega )=(q,z)\) restricted to a fixed ω. In this case the horizontal distribution is found by
Furthermore the induced metric \((g_{ij})^{G_{\omega }}_{h}: \mathcal{D}^{G_{\omega }}\times \mathcal{D}^{G_{ \omega }}\rightarrow \mathbb {R}\) is defined on \(\mathcal{D}^{G_{\omega }}\) and at every point \(h\in G_{\omega }\) makes \(X_{1}, X_{2}, X_{3} \) orthonormal.
Finally, the associated subRiemannian structure with frequency ω is written as the triple
3 Horizontal integral curves
The lifting mechanism leaves each lifted point isolated from each other since there is no connection between the lifted points. Horizontal vector fields endow the model with an integration mechanism that provides an integrated form of the local feature vectors obtained from the lifted image at each point on \(\mathcal{M}\).
Once a simple cell is stimulated, its activation propagates between the simple cells along certain patterns, which can be considered as the integrated forms of the local feature vectors. This propagation machinery is closely related to the association fields [43], which are the neural connectivity patterns between the simple cells residing in different hypercolumns (longrange horizontal connections) within V1. The association fields coincide with the anisotropic layout of the longrange horizontal connections at the psychophysical level. In the classical framework of [25], those association fields were modeled as the horizontal integral curves of \(\mathrm {SE}(2)\). We follow a similar approach and propose to model the association fields in our model framework as the horizontal integral curves associated with the fivedimensional subRiemannian geometry of \(\mathcal{M}\). We conjecture that those horizontal integral curves coincide with the longrange horizontal connections between orientation, frequency, and phase selective simple cells in V1.
We denote a time interval by \(\mathcal{I}=[0,T]\) with \(0< T<\infty \) and then consider a horizontal integral curve \((q_{1},q_{2},\theta, \omega ,\phi )=\gamma:\mathcal{I}\rightarrow \mathcal{M}\) associated with the horizontal vector fields given in (12) and starting from an initial point \(\hat{\alpha }=(\hat{q}_{1},\hat{q}_{2},\hat{\theta },\hat{\omega }, \hat{\phi })\). Let us denote the velocity of γ by \(\gamma ^{\prime }\). At each time \(t\in \mathcal{I}\) the velocity is a vector \(\gamma ^{\prime }(t)\in \operatorname{span}(X_{1},X_{2},X_{3},X_{4}) (\gamma (t) )\) at \(\gamma (t)=(q_{1}(t),q_{2}(t),\theta (t),\omega (t),\phi (t))\in \mathcal{M}\). To compute the horizontal integral curves, we first consider the vector field \(\gamma ^{\prime }\) given by
with coefficients \(c_{i}\) (which are not necessarily constants), \(i\in \{1,2,3,4 \}\). Then we can write each component of \(\gamma ^{\prime }(t)\) as follows:
In the case where the coefficients \(c_{i}\) are real constants and \(c_{2}\neq 0\), we solve the system of ordinary differential equations (25) of t with initial condition α̂ and find the following solution:
If \(c_{2}=0\), then the solution becomes
Note that (26) and (27) describe a family of horizontal integral curves described by the horizontal distribution
We are interested rather in two specific subfamilies corresponding to the horizontal vector fields that reside in one of the two orthogonal \(\mathcal{D}_{\alpha }^{\mathcal{M}}\) subspaces defined at every point \(\alpha =(q,\theta, \omega ,\phi )\in \mathcal{M}\) as
satisfying
Figure 1 gives an illustration of the orthogonal layout of \(S_{1}\mathcal{D}^{\mathcal{M}}_{\alpha }\) and \(S_{2}\mathcal{D}^{\mathcal{M}}_{\alpha }\) at points α on an orientation fiber, that is, on a horizontal integral curve along \(X_{1}+X_{2}\) corresponding to some fixed ω and ϕ. See also Fig. 2, where the integral curves along the vector fields \(X_{1}+c_{2} X_{2}\) and \(X_{3}+c_{4} X_{4}\) with varied \(c_{2}\) and \(c_{4}\) values, respectively, are presented.
We remark that \(S_{1}\mathcal{D}^{\mathcal{M}}_{\alpha }\) is the horizontal tangent space \(T_{(q,\theta )}\mathrm {SE}(2)\) of \(\mathrm {SE}(2)\) at the point α once the frequency ω and phase ϕ are fixed. In other words, at each point \(\alpha =(q,\theta, \omega ,\phi )\) with ω and ϕ fixed on \(\mathcal{M}\), we find the submanifold \(\mathrm {SE}(2)\), which is the classical subRiemannian geometry corresponding to the model given in [25]. This property allows the simple cell activity to be propagated in each subspace corresponding to a frequencyphase pair separately, and it will be important for image enhancement applications employing our model framework.
Finally, we note that the notion of horizontal curves is mathematically more general than that we consider in our model framework. It refers to the integral curves that are everywhere tangent to the horizontal tangent space. Therefore the coefficients may be time varying as well.
4 Enhancement
Image enhancement refers to smoothing a given input image by reducing the noise and at the same time preserving the geometric structures (edges, corners, textures, etc.). We perform our image enhancement procedure on the output responses instead of those on the input image. Since the output responses encode the local feature values of orientation, frequency, and phase, this allows us to exploit the additional information obtained from those features. Our enhancement procedure is based on an iterative Laplace–Beltrami procedure on the simple cell output responses in the fivedimensional subRiemannian geometry \(\mathcal{M}\), and it results in a mean curvature flow in the geometry.
4.1 Laplace–Beltrami procedure
An anisotropic metric on the space \(\mathcal{M}\) of simple cell output responses defines the subRiemannian Laplacian in the subRiemannian space generated by the simple cells:
where the coefficients \(c_{i}\) are nonnegative constants representing the weights of the secondorder horizontal vector fields given in (12). The weights are used to adjust the operator to the subRiemannian homogeneity of \(\mathcal{M}\). They are particularly important in the discrete case, where different dimensions of the space need not necessarily be sampled in the same way.
It has been proved by Franceschiello et al. [46] that the output induces a metric on the space of the model geometry proposed in [25] and the metric elicits certain visual illusions. Franceschiello et al. [46] used a simplified diagonal metric. On the other hand, following the approach of Kimmel et al. [64, 65], we choose the metric induced by the output \(O^{I}(q, z)\) on \(\mathcal{M}\) and use a simplified version of this metric for applications.
The metric \((g_{ij})\) induced by the output responses is defined as follows.
Definition 1
with constants \(c_{1},c_{2},c_{3},c_{4}\geq 0\).
We denote the inverse metric by \((g^{ij})\) and its elements by \(g^{ij}\).
The mean curvature flow provides an adapted enhancement to the surface underlying the image function I since the flow is restricted to the evolving level sets of the image. The Laplace–Beltrami procedure is a generalization of Laplacian from flat spaces to manifolds. It restricts the diffusion of a function to the manifold on which the function is defined thanks to the metric given in Definition 1. Metrics of type (1) are commonly used on Riemannian manifolds where the induced metric is found in terms of the vector fields spanning the whole tangent space at each point on the manifold. In the subRiemannian setting in our model, the metric is composed by the horizontal vector fields. Those vector fields span the horizontal subset of the whole tangent space at each point. Therefore the model geometry is degenerate. Moreover, the horizontal vector fields are noncommutative, and this results in some diffusion taking place in orthogonal directions to the manifold. Such a diffusion should be kept small; otherwise, it might result in excessive blurring, which destroys contextual information (contours, object boundaries, etc.) in the image. This is one of the reasons for us to use a reduced version of the metric given in Definition 1. Yet, the noncommutative nature of the vector fields provides the full connectivity of the geometry through the horizontal integral curves due to the Hörmander condition and Chow–Rashevski theorem.
The coefficients \(c_{1}, c_{2}, c_{3}, c_{4}\) can be chosen suitably to determine the amount of diffusion in the orthogonal direction. The Laplace–Beltrami operator is written as
where \(\operatorname{det}(g_{ij})\) is the determinant of the induced metric. The Laplace–Beltrami operator can be considered as the linearization of the motion by curvature explained in [6]. It performs a more adaptive (and restricted) diffusion to the surface in comparison to the horizontal diffusion characterized by (30).
For practical reasons, we will use the Laplace–Beltrami process with operator (32) associated with a reduced version of the metric provided in Definition 1. It is equivalent to consider each frequency channel as a separate space. It provides the freedom to work on a Laplace–Beltrami procedure separately in each frequency subspace differently from a nonlinear diffusion in a singlefrequency channel. We refer to [24] for an example of a singlefrequency Laplace–Beltrami procedure restricted to the subRiemannian metric of \(\mathrm {SE}(2)\) and its application to image inpainting and enhancement. In addition to a Laplace–Beltrami procedure, other procedures based on totalvariation flows in subRiemannian geometries for successful image processing can be found in [22, 24]. Finally, we address the reader to [39] for some generalizations of totalvariation and mean curvature flows to \(\text{SE}(d)\simeq \mathbb {R}^{3}\rtimes S^{d1}\) and their applications in the higherdimensional subRiemannian space of \(\text{SE}(3)\).
The evolution equation for the enhancement via a subRiemannian Laplace–Beltrami procedure is written as
for all \((q,p)\in \mathcal{M}\) and \(0< t\leq T\). Here the operator L depends on the metric evolving at each time instant due to the evolving output responses. This evolution equation models the neural activity propagation between simple cells along the neural connections modeled via the horizontal connectivity in our framework.
4.1.1 Reduced equation
It is possible to perform the Laplace–Beltrami procedure in each frequency and phase subspace separately in a reduced framework. In that case, we choose \(c_{1},c_{2}> 0\) and \(c_{3}=c_{4}=0\). In this way, we apply the evolution equation on surfaces in each frequency and phase subspace, that is, on each \(\mathrm {SE}(2)_{(\omega ,\phi )}\) manifold, which is the submanifold with elements \((q,\theta )\) representing the points \((q,\theta, \omega ,\phi )\in \mathcal{M}\) with fixed ω and ϕ. In this framework the metric \((g_{ij})\) boils down to
We choose \(c_{1}\) and \(c_{2}\) suitably by regarding the fixed ω values.
There are two reasons for employing the reduced setting. First, we would like to avoid, by choosing \(c_{3}=0\), excessive diffusion in the direction of the vector field \(\sin \theta \partial _{x} +\cos \theta \partial _{y}\), which is the first part of \(X_{3}\). We already have sufficient diffusion in this direction due to the commutator \([X_{1}, X_{2}]\). Direct application of \(X_{3}\) introduces excessive diffusion in orthogonal directions to the object boundaries, which is not desired since it may destroy object boundaries and contour structures in the input image. Moreover, the diffusion in phase results in multiplication of the evolving output responses by a constant since (see (1) and (9))
where \(v=1\) for all output responses (see [38]). Second, we eliminate the use of \(X_{4}\) by fixing \(c_{4}=0\) to reduce the computational load. We perform in this way a Laplace–Beltrami procedure in each frequency channel by avoiding any interaction between different frequency channels. It is an approximation of the exact flow. Yet, it captures the frequency content of the input image. This provides a diffusion where the dominant frequency components determine the resultant image in an analogous way to the multiscale leftinvariant diffusion procedures presented in [36, 37, 47]. It is thanks to the lifting representing dominant frequency components with large output response values in \(\mathcal{M}\). This reduced version results in multiple Laplace–Beltrami procedures applied in the threedimensional subRiemannian geometry \(\mathrm {SE}(2)_{(\omega, \phi )}\) at each frequency ω instead of the fivedimensional subRiemannian geometry \(\mathcal{M}\). Finally, the vector fields \(X_{3}\) and \(X_{4}\) play a role rather in image inpainting (where the diffusion in \(X_{1}\) and \(X_{2}\) directions is not desired) than image enhancement.
Although in the present study we provide no results related to the image inpainting task of the Laplace–Beltrami procedure, we would like to mention a few related points. The use of the vector field \(X_{3}\) becomes important in texture image inpainting. In that case, on the contrary to the enhancement, we would like to have information flow in orthogonal directions to the object boundaries and reduce the flow along the boundaries. In that case, since also the spatial frequency of the texture patterns have a great importance, we would like to keep the track of the frequency and phase of the evolving output responses. This requires fixing \(c_{1}\) and \(c_{2}\) to zero instead of \(c_{3}\) and \(c_{4}\) in that case.
4.2 Implementation of the algorithm
4.2.1 The algorithm
We present the steps of our algorithm based on (33) by starting from the initial image function \(I:\mathbb {R}^{2}\simeq M\rightarrow \mathbb {R}\) at \(q\in M\).

1.
Lift the image \(I(q)\) to \(O^{I}(q, p)\) by using (9). Choose this output as the initial value \(u_{\vert t=0}\) of the solution to (33) at time \(t=0\).

2.
Denote the discrete step in time by Δt. At the kth iteration (i.e., \(t=k\Delta t\)), compute the result of the discretized version L̄ (of the operator L) applied on the current value of u at time instant t as \(\bar{L} u(t)\) and update the solution and the value of \(u(t)\) by using (33) as follows:
$$ u(t+\Delta t)=u(t) +\Delta t \bar{L}u(t). $$ 
3.
Repeat step 2 until the final time \(T=(\text{number of iterations})\times \Delta t\) is achieved.

4.
Apply the inverse Gabor transform given by (10) on \(u(T)\).
4.2.2 Discrete simple cell output responses
We discretize the image function I on a uniform spatial grid as
with \(i,j\in \{1,2,\dots,N\}\) (N is the number of samples in spatial dimensions) and \(\Delta x,\Delta y\in \mathbb {R}^{+}\) denoting the pixel width (in general, we use square images as input image, and we fix \(\Delta x=\Delta y=1\) in terms of pixel unit). Furthermore, the discretized simple cell response \(O^{I}(q_{1,i},q_{2,j},\theta _{k},\omega _{l},\phi _{m})\) of \(I[i,j]\) on uniform orientation, frequency, and phase grids with points \(\theta _{k}=k\Delta \theta \), \(\omega _{l}=l\Delta \omega \), and \(\phi _{m}=m\Delta s\) (\(k\in \{1,2,\dots, K\}\), \(l\in \{1,2,\dots,L\}\), \(m\in \{1,2,\dots, M\}\) (where we denote the number of samples in the orientation dimension by K, in the frequency dimension by L, and in the phase dimension by M, and the distances between adjacent samples in the orientation dimension by Δθ, in the frequency dimensions by Δω, and in the phase dimension by Δs) is denoted by
where \(q_{1,i}=i\Delta x\) and \(q_{2,j}=j\Delta y\).
In this case the discrete version of the Gabor function given by (8) is written as
where \(\tilde{i},\tilde{j}\in \{1,2,\dots,\tilde{N}\}\), \(\tilde{k}\in \{1,2,\dots,\tilde{K}\}\), \(\tilde{n}\in \{1,2,\dots,\tilde{M}\}\). Then we fix \(s_{\tilde{n}}=0\) (i.e., \(\tilde{n}=0\)) in the reduced framework (which was explained in Sect. 4.1.1) and write the discrete cell response obtained from the image \(I[i,j]\) via the discrete Gabor transform as follows:
The time correspondence in the discrete case is represented by the time index \(h_{p}\), where the time interval is discretized by \(P\in \mathbb{N}^{+}\) samples, and \(h_{p}\) represents the time instant \(h_{p}=p\Delta t\) with Δt satisfying \(T=P\Delta t\) and \(p\in \{1,2,\dots,P\}\). In this case the discretized Gabor coefficient is written as
4.2.3 Explicit scheme with finite differences
Here we provide the discrete scheme related to the numerical approximation of the algorithm. We propose an explicit finite difference scheme to iterate the evolution equation (33). The reason for choosing an explicit scheme is that an implicit scheme requires large memory in our fourdimensional (reduced) anisotropic framework.
We obtain the explicit scheme first by writing (33) in terms of the horizontal vector fields \(X_{1}\), \(X_{2}\), \(X_{3}\), and \(X_{4}\) given in (12). Then following Unser [102] and Franken [49], we implement the horizontal vector fields by using central finite differences that are interpolated by Bsplines on a uniform spatial sample grid. Note that Bspline interpolation is required since not all horizontal vectors are aligned with the spatial sample grid.
The interpolation is achieved by determining the coefficients \(b(i,j)\) in
in such a way that the spline polynomial \(s(x,y)\) with the Bspline basis functions \(\rho (xi, yj)\) coincides with the horizontal derivatives of the output \(O^{I}\) at the grid points. For example, in the case of the first horizontal derivative \(X_{1} O^{I}\), the condition \(s(i\Delta x, j\Delta y)=X_{1}O^{I}[i,j,k,l,m]\) must hold if we consider a discrete output as explained in Sect. 4.2.2. We refer to the explanations of Unser [102] for details.
We fix \(\Delta x=\Delta y=1\) and define
See Fig. 3 for an illustration of those vectors. We write the central finite differences of the firstorder horizontal derivatives as
and of the secondorder horizontal derivatives we use as
Then the numerical iteration (discretized from step 2 of the algorithm provided in Sect. 4.2.1) with a time step \(\Delta t>0\) is written as follows:
where L̄ represents the discretized version of L given in (32) (with coefficients \(c=\{c_{1}>0,c_{2}>0,c_{3}=0,c_{4}=0\}\)) in terms of the central finite differences.
4.2.4 Stability analysis
We must consider two points for the stability of our finite discrete scheme:

1.
Suitable choice of the time step size Δt,

2.
Preserving the space homogeneity during the Laplace–Beltrami evolution.
The stability analysis for the \(\mathrm {SE}(2)\) case is explained in [36, 47, 49], and [39] based on Gershgorin theory. We adapt the analysis to our reduced framework and find the upper limit for the time step Δt as
where \(s_{\theta }=\frac{2\pi }{K}\) is the sampling distance between adjacent orientation samples. In our experiments the worst case corresponds to \(c_{1}/c_{2}=0.25\). In that case, \(\Delta t=0.1\leq 0.87\), which is in accordance with the upper bound given in (45).
The second point is due to that we sample each dimension by using a different number of samples. To perform subRiemannian diffusion by regarding the sample unit coherency, we must choose the parameters \(c_{1}\), \(c_{2}\) of the operator L in such a way that the space homogeneity of \(\mathcal{M}\) is preserved. For this reason, we choose \(c_{1}=1\) and \(c_{2}=\beta ^{2}\).
4.3 Experiments
We first show the effects of the numbers of frequency and orientation samples. Then we present our simulation results together with the results presented in [64] and [47] for a comparison. The method in [64] lifts a twodimensional image to a fourdimensional geometry, where the orientation and scale of the image is represented explicitly, and then it applies on the lifted image a multiscale Laplace–Beltrami procedure with a Riemannian metric. In [47] the twodimensional image is lifted to the subRiemannian geometry of \(\mathrm {SE}(2)\) via a socalled cake wavelet transform. In the lifted geometry the orientation corresponding to each point on the twodimensional image is represented explicitly. Then a leftinvariant coherenceenhancing and crossingpreserving diffusion (which is named CEDOS in [47]) is applied on the lifted image. It is a nonlinear adaptive diffusion procedure, which uses the local features curvature, deviation from horizontality, and orientation confidence.
4.3.1 Gabor transform
The delicate point related to the lifting and inversion process is that the Gabor functions \(\varPsi _{(q,\theta, \omega ,\phi )}(x,y,s)\) must be sampled (in orientation θ, frequency ω, and phase ϕ dimensions) in such a way that they cover all the spectral domain (i.e., they must fulfill the Plancherel formula [85]).
We present some results on the Gabor transforminverse transform procedure associated with our setting and the effects of the number of samples in the orientation dimension in Fig. 4. We use the Gabor filter banks obtained from (6) and (8) with scale value of 2 pixels (the total filter size is 24 pixels) to lift the test images (see Fig. 5 for some examples of those Gabor functions). On the top row, we see the results related to an artificial \(64\times 64\) test image (left), and at the bottom, we see the results related to a real \(64\times 64\) test image (left) taken from Kimmel et al. [64] We see in the middle and right columns those two images now transformed and then inverse transformed with different numbers of orientation samples. We sample the space at frequencies \(\omega \in \{0.25, 0.5,\dots, 1.25, 1.375,\dots, 2.25,2.3125,\dots, 3.25 \}\), orientations \(\theta \in \{\frac{2\pi }{32},\frac{4\pi }{32},\dots,\frac{62\pi }{32} \}\) (middle) and \(\theta \in \{0,\frac{2\pi }{8},\dots, \frac{14\pi }{8} \}\) (right), and phases \(\phi \in \{0, \frac{\pi }{8},\dots, \frac{15\pi }{8}\}\). We observe that the decrease in the number of orientation samples reduces the quality of the transformation procedure noticeably in both test images.
Discrete orientation and frequency sampling of the Gabor functions used in the Gabor transform must be done in such a way that the Fourier transform of the filter bank must cover the whole spectral domain (due to the Parseval–Plancherel identity [81, 85]). This is essential for the reconstructability (see (10)) of the transformed signal. See Fig. 6 for the effect of discretization in orientation and frequency on the coverage of the Gabor transform in the spectral domain.
Discrete orientation and frequency sampling of the Gabor functions used in the Gabor transform must be done in such a way that the Fourier transform of the filter bank must cover the whole spectral domain (due to the Parseval–Plancherel identity [81, 85]). This is essential for the reconstructability (see (10)) of the transformed signal. See Fig. 6, where we show the effects of changing frequency and orientation of the Gabor functions and their effects on the spectral domain coverage. As the number of frequency and orientation samples decreases, the coverage becomes weaker, resulting in a degenerate reconstruction from the Gabor transform (see Figs. 4 and 11).
4.3.2 Enhancement
The lifting procedure is performed by the Gabor filters of the type given by (6) and (8) with \(\text{scale}=2\) pixels (the filter size is \(12\times \text{scale}=24\) pixels) and time step \(\Delta t=0.1\) in the experiments.
In Fig. 7, we see the results of the enhancement procedure applied on an artificially produced \(64\times 64\) grayscale test image with white noise. The lifting is achieved with frequency samples \(\omega \in \{0.25, 0.5,\dots, 1,1.125,\dots, 2.25 \}\), phase samples \(\phi =\{ 0,\frac{\pi }{8},\dots,\frac{\pi }{2}\}\), and orientation samples \(\theta \in \{0, \frac{2\pi }{16},\frac{4\pi }{16},\dots, \frac{30\pi }{16}\}\). Note that \(\text{number of orientations}=16\), and thus \(\beta =\frac{\text{number of orientations}}{\text{image size}}=0.25\). To fulfill physical unit coherency, we choose \(c_{1}=1\) and \(c_{2}=\beta ^{2}\). The experiments are done with 15 and 30 iterations.
We continue with Fig. 8, where we apply our procedure on a real \(128\times 128\) image taken from Kimmel et al. [64], who use a multiscale Laplace–Beltrami procedure with fixed frequency. We use the same phase and orientation samples as in the case of Fig. 7, but we employ the frequency samples \(\omega \in \{ 0.25, 0.5,\dots, 1,1.125,\dots, 2.25, 2.3125,\dots, 3 \}\) for the lifting. Here the coefficients \(c_{1}\), \(c_{2}\) are chosen as in the case of Fig. 7. We perform the experiments with 30 and 50 iterations.
In Fig. 9, we show the results related to our Laplace–Beltrami procedure applied on another real image, with dimensions \(64\times 64\), taken from Kimmel et al. [64]. We use the same sampling parameters as in the previous case of Fig. 8 for the lifting. We perform our Laplace–Beltrami procedure with 6 and 15 iterations. The results are presented together with the multiscale Laplace–Beltrami results obtained by Kimmel et al. [64] for a qualitative comparison. In Figs. 8 and 9, we highlight with a red dashed curve a few details related to contextual structures, particularly related to the frequency feature: Our algorithm (bottom right) takes advantage of different frequencies present in the images and therefore can preserve texture structures and preserves these structures better than that of Kimmel et al. [64] (middle right).
We compare our technique to the enhancement method provided in [47], where nonlinear adaptivity mechanisms are combined with a leftinvariant diffusion procedure performing in the subRiemannian geometry of \(\mathrm {SE}(2)\). In Fig. 10, we provide the results of our algorithm applied with different numbers of iterations on a \(256\times 256\) grayscale image taken from [47]. We use the Gabor filter banks obtained from (6) and (8) with scale value of 2 pixels (total filter size is 48 pixels) to lift the test images. We sample the space at frequencies \(\omega \in \{1.45, 1.51, 1.58, 1.66, 1.74, 1.82, 1.91, 2, 2.09, 2.19\}\) and orientations \(\theta \in \{\frac{2\pi }{16},\frac{4\pi }{16},\dots,\frac{32\pi }{16} \}\). We use \(c_{1}=1\) and \(c_{2}=\beta ^{2}\) as before and choose the time step \(\Delta t=0.1\). Here we note that CEDOS algorithm presented in [47] performs an adaptive diffusion in the threedimensional subRiemannian geometry where only the orientation feature is represented explicitly. Despite the fact that it does not perform a multiscale or multifrequency wavelet transform, it provides good enhancement results thanks to the nonlinear adaptation mechanisms taking into account the local features curvature, deviation from horizontality, and orientation confidence. Our results are comparable to those of CEDOS when such nonlinear adaptive mechanisms are not used (see the bottom left and bottom right in comparison to the middle left of Fig. 10). At this point, we remark that the reduced model framework is flexible in the sense that it contains already \(\mathrm {SE}(2)\) at each frequency separately. Hence \(\mathrm {SE}(2)\) can be considered as a particular case of our model framework. There is no obstacle to employing nonlinear operators in this setting (as long as they are adequately used in \(\mathrm {SE}(2)\)) as in the case of CEDOS algorithm explained in [47] or as in the case of diffusionconcentration procedure presented in [25]. We can choose the desired frequency channel in our model framework and apply such nonlinear algorithms in the corresponding \(\mathrm {SE}(2)\) to that frequency channel.
Finally, we see in Fig. 11 a comparison of our multifrequency procedure to its singlefrequency counterpart. The procedure and simulation parameters are the same as given in the case of Fig. 10. We observe that the singlefrequency Laplace–Beltrami procedure results in a lower quality enhancement in comparison to the multifrequency Laplace–Beltrami procedure.
5 Conclusion
In this paper, we have shown that the multifeature selective simple cells and the associated V1 functional geometry can be modeled starting from a suitably chosen receptive profile, which in our framewor kwas the extended Gabor function. We have derived the whole model subRiemannian geometry and the corresponding horizontal connectivity directly from the receptive profile. In addition to this construction of the model, we have also provided an image processing application employing our model framework, image enhancement via a subRiemannian Laplace–Beltrami procedure. We have provided the algorithm and its discretization explicitly as well as some experimental results. We have also mentioned that, in fact, the enhancement procedure could be switched to an image inpainting procedure via a modification of the reduced metric used for the enhancement.
As far as the complexity and the richness of the visual semantics are considered, it is natural to think that the visual system samples all features once it is given a visual input to find a unique correspondence between the visual input and output. The necessity of handling such a variety of images results in a suitable compromise in rendering the visual features. This compromise manifests itself as the visual system being restricted to a psychophysically and neurophysiologically relevant architecture. As such, the visual system is limited in highly specialized tasks, which are used in several domains such as medical image analysis, image processing, radar imaging, and computer and robotic vision.
Our model is not particularly for image processing. It is not motivated by image processing problems and it should not be interpreted as an image processing model. Firstly, image processing models are specialized on specific visual tasks, which are required generally for a certain category of images (medical images, radar images, etc.). They are not necessarily restricted to psychophysical and neurophysiological findings providing information about the architecture of the visual system. Secondly, they can make use of several nonlinear mechanisms, which need not be motivated by any biological reasoning. Our model is motivated biologically, and it relies on the psychophysically and neurophysiologically relevant cortical architecture. It is a phenomenological model, and it provides a geometrical explanation for the cortical architecture, which is compatible with the architecture. Finally, the application of the model to image enhancement was provided to show effects of the use of spatial frequency by comparing it qualitatively to some other image enhancement algorithms.
Notes
Spatial frequency refers to \(\omega =\frac{2\pi }{\lambda }\) with wavelength \(\lambda >0\) in our terminology.
Abbreviations
 V1:

Primary visual cortex
 MRI:

Magnetic resonance imaging
References
Barbieri D, Citti G. Reproducing kernel Hilbert spaces of CR functions for the Euclidean motion group. Anal Appl. 2015;13(03):331–46.
Barbieri D, Citti G, Cocci G, Sarti A. A corticalinspired geometry for contour perception and motion integration. J Math Imaging Vis. 2014;49(3):511–29.
Barbieri D, Citti G, Sanguinetti G, Sarti A. Coherent states of the Euclidean group and activation regions of primary visual cortex. arXiv preprint arXiv:1111.0669 (2011).
Barbieri D, Citti G, Sanguinetti G, Sarti A. An uncertainty principle underlying the functional architecture of V1. J Physiol (Paris). 2012;106(5):183–93.
Baspinar E. Minimal surfaces in subRiemannian structures and functional geometry of the visual cortex. Ph.D. thesis. University of Bologna; 2018.
Baspinar E, Citti G. Uniqueness of viscosity mean curvature flow solution in two subRiemannian structures. SIAM J Math Anal. 2019;51(3):2633–59.
Baspinar E, Citti G, Sarti A. A geometric model of multiscale orientation preference maps via Gabor functions. J Math Imaging Vis. 2018;60(6):900–12.
Bekkers EJ. Bspline CNNs on Lie groups. arXiv preprint arXiv:1909.12057 (2019).
Bekkers EJ, Lafarge MW, Veta M, Eppenhof KA, Pluim JP, Duits R. Rototranslation covariant convolutional networks for medical image analysis. In: International conference on medical image computing and computerassisted intervention. Berlin: Springer; 2018. p. 440–8.
Blakemore CT, Campbell F. On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. J Physiol. 1969;203(1):237.
Bonhoeffer T, Grinvald A et al.. Isoorientation domains in cat visual cortex are arranged in pinwheellike patterns. Nature. 1991;353(6343):429–31.
Bonhoeffer T, Kim DS, Malonek D, Shoham D, Grinvald A. Optical imaging of the layout of functional domains in area 17 and across the area 17/18 border in cat visual cortex. Eur J Neurosci. 1995;7(9):1973–88.
Boscain U, Duits R, Rossi F, Sachkov Y. Curve cuspless reconstruction via subRiemannian geometry. ESAIM Control Optim Calc Var. 2014;20(3):748–70.
Boscain U, Gauthier JP, Prandi D, Remizov A. Image reconstruction via nonisotropic diffusion in Dubins/Reed–Shepplike control systems. In: 53rd IEEE conference on decision and control. New York: IEEE Press; 2014. p. 4278–83.
Bredies K, Pock T, Wirth B. Convex relaxation of a class of vertex penalizing functionals. J Math Imaging Vis. 2013;47(3):278–302.
Bressloff PC, Cowan JD. The functional geometry of local and horizontal connections in a model of V1. J Physiol (Paris). 2003;97(2):221–36.
Bressloff PC, Cowan JD, Golubitsky M, Thomas PJ, Wiener MC. Geometric visual hallucinations, Euclidean symmetry and the functional architecture of striate cortex. Philos Trans R Soc Lond B, Biol Sci. 2001;356(1407):299–330.
Bruurmijn LM, Kause HB, Filatova OG, Duits R, Fuster A, Florack LM, van Assen HC. Myocardial deformation from local frequency estimation in tagging MRI. In: International conference on functional imaging and modeling of the heart. Berlin: Springer; 2013. p. 284–91.
Chambolle A. An algorithm for total variation minimization and applications. J Math Imaging Vis. 2004;20(1–2):89–97.
Chambolle A, Lions PL. Image recovery via total variation minimization and related problems. Numer Math. 1997;76(2):167–88.
Chambolle A, Pock T. An introduction to continuous optimization for imaging. Acta Numer. 2016;25:161–319.
Chambolle A, Pock T. Total rototranslational variation. Numer Math. 2019;142(3):611–66.
Chow WL. Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung. In: The collected papers of Wei–Liang Chow. Singapore: World Scientific; 2002. p. 47–54.
Citti G, Franceschiello B, Sanguinetti G, Sarti A. SubRiemannian mean curvature flow for image processing. SIAM J Imaging Sci. 2016;9(1):212–37.
Citti G, Sarti A. A cortical based model of perceptual completion in the rototranslation space. J Math Imaging Vis. 2006;24(3):307–26.
Citti G, Sarti A. Neuromathematics of vision. Berlin: Springer; 2014.
Cocci G, Barbieri D, Citti G, Sarti A. Cortical spatiotemporal dimensionality reduction for visual grouping. Neural Computation. 2015.
Cocci G, Barbieri D, Sarti A. Spatiotemporal receptive fields of cells in V1 are optimally shaped for stimulus velocity estimation. JOSA A. 2012;29(1):130–8.
Daugman JG. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by twodimensional visual cortical filters. JOSA A. 1985;2(7):1160–9.
De Valois KK, Tootell R. Spatialfrequencyspecific inhibition in cat striate cortex cells. J Physiol. 1983;336(1):359–76.
De Valois RL, Albrecht DG, Thorell LG. Spatial frequency selectivity of cells in macaque visual cortex. Vis Res. 1982;22(5):545–59.
DeAngelis GC, Ohzawa I, Freeman R. Spatiotemporal organization of simplecell receptive fields in the cat’s striate cortex. I. General characteristics and postnatal development. J Neurophysiol. 1993;69(4):1091–117.
Duits R. Perceptual organization in image analysis: a mathematical approach based on scale, orientation and curvature. PhD thesis. Technische Universiteit Eindhoven; 2005.
Duits R, Boscain U, Rossi F, Sachkov Y. Association fields via cuspless subRiemannian geodesics in se (2). J Math Imaging Vis. 2014;49(2):384–417.
Duits R, Franken E. Line enhancement and completion via linear left invariant scale spaces on SE(2). In: International conference on scale space and variational methods in computer vision. Berlin: Springer; 2009. p. 795–807.
Duits R, Franken E. Leftinvariant parabolic evolutions on SE(2) and contour enhancement via invertible orientation scores part I: linear leftinvariant diffusion equations on SE(2). Q Appl Math. 2010;68(2):255–92.
Duits R, Franken E. Leftinvariant parabolic evolutions on SE(2) and contour enhancement via invertible orientation scores part II: nonlinear leftinvariant diffusions on invertible orientation scores. Q Appl Math. 2010;68(2):293–331.
Duits R, Führ H, Janssen B, Bruurmijn M, Florack L, van Assen H. Evolution equations on Gabor transforms and their applications. Appl Comput Harmon Anal. 2013;35(3):483–526.
Duits R, StOnge E, Portegies J, Smets B. Total variation and mean curvature PDEs on the space of positions and orientations. In: International conference on scale space and variational methods in computer vision. Berlin: Springer; 2019. p. 211–23.
Ecker AS, Sinz FH, Froudarakis E, Fahey PG, Cadena SA, Walker EY, Cobos E, Reimer J, Tolias AS, Bethge M. A rotationequivariant convolutional neural network model of primary visual cortex. arXiv preprint arXiv:1809.10504 (2018).
EnrothCugell C, Robson JG. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol. 1966;187(3):517–52.
Faugeras O. Threedimensional computer vision: a geometric viewpoint. Cambridge: MIT Press; 1993.
Field DJ, Hayes A, Hess RF. Contour integration by the human visual system: evidence for a local association field. Vis Res. 1993;33(2):173–93.
Florack L. Image structure. Series in Mathematical Imaging and Vision. 1997.
Folland GB. Harmonic analysis in phase space. (AM122). vol. 122. Princeton: Princeton University Press; 2016.
Franceschiello B, Sarti A, Citti G. A neuromathematical model for geometrical optical illusions. J Math Imaging Vis. 2018;60(1):94–108.
Franken E, Duits R. Crossingpreserving coherenceenhancing diffusion on invertible orientation scores. Int J Comput Vis. 2009;85(3):253.
Franken E, Duits R, Ter Haar Romeny B. Nonlinear diffusion on the 2D Euclidean motion group. In: Scale space and variational methods in computer vision. Berlin: Springer; 2007. p. 461–72.
Franken EM. Enhancement of crossing elongated structures in images. Eindhoven: Eindhoven University of Technology; 2008.
Hannink J, Duits R, Bekkers E. Crossingpreserving multiscale vesselness. In: International conference on medical image computing and computerassisted intervention. Berlin: Springer; 2014. p. 603–10.
Hoffman WC. Higher visual perception as prolongation of the basic Lie transformation group. Math Biosci. 1970;6:437–71.
Hoffman WC. The visual cortex is a contact bundle. Appl Math Comput. 1989;32(2–3):137–67.
Hörmander L. Hypoelliptic second order differential equations. Acta Math. 1967;119(1):147–71.
Hubel DH, Wiesel T. Shape and arrangement of columns in cat’s striate cortex. J Physiol. 1963;165(3):559–68.
Hubel DH, Wiesel TN. Receptive fields of single neurons in the cat’s striate cortex. J Physiol. 1959;148(3):574–91.
Hubel DH, Wiesel TN. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol. 1962;160(1):106–54.
Hubel DH, Wiesel TN. Uniformity of monkey striate cortex: a parallel relationship between field size, scatter, and magnification factor. J Comp Neurol. 1974;158(3):295–305.
Hubel DH, Wiesel TN. Ferrier lecture: functional architecture of macaque monkey visual cortex. Proc R Soc Lond B, Biol Sci. 1977;198(1130):1–59.
Hübener M, Shoham D, Grinvald A, Bonhoeffer T. Spatial relationships among three columnar systems in cat area 17. J Neurosci. 1997;17(23):9270–84.
Issa NP, Rosenberg A, Husson TR. Models and measurements of functional maps in V1. J Neurophysiol. 2008;99(6):2745–54.
Issa NP, Trepel C, Stryker MP. Spatial frequency maps in cat visual cortex. J Neurosci. 2000;20(22):8504–14.
Jones JP, Palmer LA. An evaluation of the twodimensional Gabor filter model of simple receptive fields in cat striate cortex. J Neurophysiol. 1987;58(6):1233–58.
Kause HB, Filatova OG, Duits R, Bruurmijn LM, Fuster A, Westenberg JJ, Florack LM, van Assen HC. Direct myocardial strain assessment from frequency estimation in tagging MRI. In: International workshop on statistical atlases and computational models of the heart. Berlin: Springer; 2013. p. 212–9.
Kimmel R, Malladi R, Sochen N. Images as embedded maps and minimal surfaces: movies, color, texture, and volumetric medical images. Int J Comput Vis. 2000;39(2):111–29.
Kimmel R, Sochen NA, Malladi R. On the geometry of texture. Tech. rep., Technion—Israel Inst. of Tech. Haifa. Dept. of Computer Science; 2000.
Koenderink JJ. The structure of images. Biol Cybern. 1984;50(5):363–70.
Koenderink JJ, van Doorn AJ. Representation of local geometry in the visual system. Biol Cybern. 1987;55(6):367–75.
Koffka K. Principles of gestalt psychology. London: Routledge; 2013.
Köhler W. Gestalt psychology: an introduction to new concepts in modern psychology. New York: Norton; 1970.
Landy MS, Movshon JA et al.. Computational models of visual processing. Cambridge: MIT Press; 1991.
LeVay S, Stryker MP, Shatz CJ. Ocular dominance columns and their development in layer iv of the cat’s visual cortex: a quantitative study. J Comp Neurol. 1978;179(1):223–44.
Levitt J, Sanchez R, Smith E, Movshon J. Spatiotemporal interactions and the spatial phase preferences of visual neurons. Exp Brain Res. 1990;80(2):441–5.
Lindeberg T. Scalespace theory: a basic tool for analyzing structures at different scales. J Appl Stat. 1994;21(1–2):225–70.
Lindeberg T. Feature detection with automatic scale selection. Int J Comput Vis. 1998;30(2):79–116.
Lindeberg T. Generalized Gaussian scalespace axiomatics comprising linear scalespace, affine scalespace and spatiotemporal scalespace. J Math Imaging Vis. 2011;40(1):36–81.
Lindeberg T. A computational theory of visual receptive fields. Biol Cybern. 2013;107(6):589–635.
Maffei L, Fiorentini A. Spatial frequency rows in the striate visual cortex. Vis Res. 1977;17(2):257–64.
Marĉelja S. Mathematical description of the responses of simple cortical cells. J Opt Soc Am A. 1980;70(11):1297–300.
Mashtakov AP, Ardentov AA, Sachkov YL. Parallel algorithm and software for image inpainting via subRiemannian minimizers on the group of rototranslations. Numer Math, Theory Methods Appl. 2013;6(1):95–115.
Mechler F, Reich DS, Victor JD. Detection and discrimination of relative spatial phase by V1 neurons. J Neurosci. 2002;22(14):6129–57.
Parseval MA. Mémoire sur les séries et sur l’intégration complète d’une équation aux différences partielles linéaires du second ordre, à coefficients constants. Mém. Prés. par Divers Savants. Acad Sci, Paris. 1806;1(1):638–48.
Petitot J. The neurogeometry of pinwheels as a subRiemannian contact structure. J Physiol (Paris). 2003;97(2):265–309.
Petitot J. Neurogéométrie de la vision. Modeles mathématiques et physiques des architectures fonctionelles. Paris: Éd. École Polytech; 2008.
Petitot J, Tondut Y. Vers une neurogéométrie. Fibrations corticales, structures de contact et contours subjectifs modaux. Math Inform Sci Hum. 1999;145:5–102.
Plancherel M, Leffler M. Contribution à l’étude de la représentation d’une fonction arbitraire par des intégrales définies. Rend. Circ. Mat. Palermo (1884–1940). 1910;30(1):289–335.
Pollen DA, Gaska JP, Jacobson LD. Responses of simple and complex cells to compound sinewave gratings. Vis Res. 1988;28(1):25–39.
Pollen DA, Ronner SF. Phase relationships between adjacent simple cells in the visual cortex. Science. 1981;212(4501):1409–11.
Pollen DA, Ronner SF. Visual cortical neurons as localized spatial frequency filters. IEEE Trans Syst Man Cybern. 1983;5:907–16.
Ribot J, Aushana Y, BuiQuoc E, Milleret C. Organization and origin of spatial frequency maps in cat visual cortex. J Neurosci. 2013;33(33):13,326–13,343.
Ribot J, Romagnoni A, Milleret C, Bennequin D, Touboul J. Pinwheeldipole configuration in cat early visual cortex. NeuroImage. 2016;128:63–73.
Sanguinetti G, Citti G, Sarti A. A model of natural image edge cooccurrence in the rototranslation group. J Vis. 2010;10(14):37.
Sarti A, Citti G, Petitot J. The symplectic structure of the primary visual cortex. Biol Cybern. 2008;98(1):33–48.
Sarti A, Citti G, Petitot J. Functional geometry of the horizontal connectivity in the primary visual cortex. J Physiol (Paris). 2009;103(1):37–45.
Sharma U, Duits R. Leftinvariant evolutions of wavelet transforms on the similitude group. Appl Comput Harmon Anal. 2015;39(1):110–37.
Shatz CJ, Stryker MP. Ocular dominance in layer iv of the cat’s visual cortex and the effects of monocular deprivation. J Physiol. 1978;281(1):267–83.
Silverman MS. Deoxyglucose and electrophysiological evidence for spatial frequency columns in cat striate cortex. San Francisco: University of California; 1984.
Sirovich L, Uglesich R. The organization of orientation and spatial frequency in primary visual cortex. Proc Natl Acad Sci. 2004;101(48):16,941–16,946.
Tani T, Ribot J, O’Hashi K, Tanaka S. Parallel development of orientation maps and spatial frequency selectivity in cat visual cortex. Eur J Neurosci. 2012;35(1):44–55.
Ter Haar Romeny B. Frontend vision and multiscale image analysis multiscale computer vision theory and applications. Mathematica. 2003.
Ter Haar Romeny BM. Multiscale and multiorientation medical image analysis. In: Biomedical image processing. Berlin: Springer; 2010. p. 177–96.
Tootell RB, Silverman MS, De Valois RL. Spatial frequency columns in primary visual cortex. Science. 1981;214(4522):813–5.
Unser M, Splines: a perfect fit for signal and image processing. IEEE Signal Process Mag. 1999;16(6):22–38.
Wertheimer M. Laws of organization in perceptual forms. 1938.
Young RA. The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spat Vis. 1987;2(4):273–93.
Zhang C, Berkels B, Wirth B, Voyles PM. Joint denoising and distortion correction for atomic column detection in scanning transmission electron microscopy images. Microsc Microanal. 2017;23(S1):164–5.
Zhang J, Dashtbozorg B, Bekkers E, Pluim JP, Duits R, Ter Haar Romeny BM. Robust retinal vessel segmentation via locally adaptive derivative frames in orientation scores. IEEE Trans Med Imaging. 2016;35(12):2631–44.
Acknowledgements
Not applicable.
Availability of data and materials
Not applicable.
Funding
This study was funded by Progetto Europeo GAIA (H2020 MCSA RISE), GA, n 777822, PRIN 2015 “Variational and perturbative aspects of nonlinear differential problems”.
Author information
Authors and Affiliations
Contributions
All authors contributed equally. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Baspinar, E., Sarti, A. & Citti, G. A subRiemannian model of the visual cortex with frequency and phase. J. Math. Neurosc. 10, 11 (2020). https://doi.org/10.1186/s13408020000896
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s13408020000896