Neural Green’s function for Laplacian systems

Abstract
Solving linear systems of equations stemming from Laplacian operators is at the heart of a wide range of applications. Due to the sparsity of these linear systems, iterative solvers such as Conjugate Gradient and Multigrid are usually employed when the solution has a large number of degrees of freedom. These iterative solvers can be seen as sparse approximations of the Green's function of the Laplacian operator. In this paper we propose a machine learning approach that regresses a Green's function from boundary conditions. This is enabled by a Green's function that can be effectively represented in a multi-scale fashion, drastically reducing the cost associated with a dense matrix representation. Additionally, since the Green's function depends solely on the boundary conditions, training the proposed neural network does not require sampling the right-hand side of the linear system. We show that our method outperforms state-of-the-art Conjugate Gradient and Multigrid methods.

Introduction
Efficiently solving linear systems originating from discrete Partial Differential Equations (PDEs) is central to many modern applications, ranging from modelling natural phenomena to image processing. Since these equations model local relationships, their discretized counterparts are sparse and only a few entries are non-zero. Therefore, direct matrix inversion is not efficient, since it yields a dense representation, and iterative solvers such as Conjugate Gradient and Multigrid are preferred.
In this work, we propose a novel framework that encompasses iterative methods for solving linear systems of equations stemming from Poisson equations. Our method is inspired by the theory of Green's functions: integral kernels obtained from the PDE and its corresponding boundary conditions. Once computed, they provide the solution to the PDE by a simple convolution. However, Green's function methods are not adopted in practical applications because analytic solutions exist only in simple settings, and discretizing them is impractical due to their wide kernel support. Our first contribution, thus, is a novel multi-level discrete Green's function formulation that takes advantage of a sparse and more efficient design. Our second contribution exploits the fact that Green's functions only depend on the boundaries of the domain, training a neural network model to regress Green's functions from general boundary settings. Lastly, due to the spectral properties of our multi-level discrete Green's function, our method can be applied as an iterative solver on the error residual. These combined contributions create a linear system solver with unprecedented error convergence in 2-D, surpassing state-of-the-art Conjugate Gradient and Multigrid methods.

Related Works
We revisit methods for solving linear systems of equations arising from linear discrete PDEs. Classical methods are extensively covered by LeVeque [1], so our discussion focuses on recent attempts involving learning-based methods and approximations of Green's functions.
Learning Green's functions. Alkhalifah et al. [2] proposed to represent the Green's function of wave equations through fully-connected neural networks. The network takes spatial coordinates and source locations as input and predicts the Green's function value for a given domain. The formulation is similar to Physics-Informed Neural Networks (PINNs) [3], where each specific PDE is represented by one fully-connected network. Similarly to our work, Ichimura et al. [4] developed a neural network trained to fit local patches of the discretized Green's function. The training depends on right-hand side and solution pairs, and thus requires a very large dataset, containing around 16.2 million samples. As the proposed Green's function is still an approximation of the ground truth, the authors adopted it as a preconditioner for the linear system. In contrast, we adopt the approximated Green's function as a multi-level iterative solver.
Green's functions were also explored in a multi-scale fashion to solve specific electromagnetic potentials. In a series of papers, Li et al. [5,6,7] proposed neural networks as non-linear neural operators that map coefficients of differential operators directly to solutions. However, the proposed neural operators always assume fixed forcing terms in the PDE, and these methods do not deal with complex boundary conditions. Feliu-Faba et al. [8] decompose the Green's function of pseudodifferential operators through a nonstandard wavelet transform, training neural networks that jointly learn the mapping from Laplacian coefficients and wavelet parameters. This approach, however, does not take complex boundary conditions into consideration. Gin et al. [9] design an autoencoder structure that maps forcing terms of non-linear PDEs into a linear space where the PDEs can be solved through Green's functions. Recent advances in learning Green's functions also involve applying them to other domains, such as solving Helmholtz and Sturm-Liouville problems [9], computing discrete Markov chains [10], and quantum field theory [11].

Multi-Resolution Analysis and Sparse Approximate Inverses.
Our work is inspired by several recent contributions on multi-resolution analysis and Sparse Approximate Inverses. Multiresolution matrix factorization (MMF) [12,13,14] extends classic multi-resolution analysis [15] to matrix representations. Haar wavelets (which act as ideal low-pass filters) are the natural basis when constructing hierarchical representations of the Laplacian operator [12]. Similarly to our work, progressive mollification is applied to Kronecker deltas for multi-scale analysis with diffusion wavelets [16]. Sparse Approximate Inverses (SPAI), on the other hand, are more general low-rank matrix approximations of the inverse of a discrete matrix. These can be employed as preconditioners [17,18,19,20] and as Multigrid smoothers [21,22].
Solving Linear PDEs with Convolutional Neural Networks. More recent endeavors in the deep learning era leaned towards a direct mapping between image-represented boundary conditions of 2-D Laplace equations and their solutions through convolutional neural networks (CNNs). Barati Farimani et al. [23] trained a U-Net model by combining L1 and adversarial losses, while Sharma et al. [24] adopted a weakly-supervised residual loss. Other works solved the Poisson equation for applied problems, including pressure projection in fluid simulation [25,26], electric potentials [27], and particle simulation [28]. Differently from the previous work, Hsieh et al. [29] proposed to use neural networks to modify Jacobi-style iterative solvers. The neural network operates on the error term at each iteration, and is designed to be linear (without bias or non-linear activation functions) to guarantee convergence to the correct fixed-point solution. After supervised training, the network can moderately improve the convergence speed of a Jacobi and a Multigrid solver.

Background
We briefly review linear partial differential equations (PDEs) and their Green's function solutions in this section. The symbols used throughout the paper can be found in Table 1.

Linear Partial Differential Equations
Linear PDE solvers aim to find functions that satisfy a set of linear differential equations. Consider F = {u : Ω ⊂ ℝ^k → ℝ}, a space of smooth scalar field functions on a k-dimensional domain, A : F → F a linear differential operator, and u ∈ F a candidate function that satisfies the equation Au(x) = f(x), for f ∈ F. In this paper, we assume A to be the Laplacian operator, which yields a Poisson equation of the form ∇²u(x) = f(x). Solving a linear PDE involves finding a function u ∈ F that satisfies the above conditions.
The solution of a linear PDE depends on specified boundary conditions (BC). Assuming ∂Ω is the boundary of the domain with an oriented normal vector n, homogeneous Dirichlet (u(x) = 0, x ∈ ∂Ω_D) and Neumann (∂u(x)/∂n = 0, x ∈ ∂Ω_N) are common boundary conditions. These conditions can model obstacles and domain boundaries. Thus, a Poisson equation with prescribed boundary conditions is formulated as

∇²u(x) = f(x), x ∈ Ω,
u(x) = 0, x ∈ ∂Ω_D,
∂u(x)/∂n = 0, x ∈ ∂Ω_N.    (1)

Green's Function
A Green's function G(x, x′) of the linear differential operator A is defined as

A G(x, x′) = δ(x − x′),    (2)

where δ is the Dirac delta function (δ(x) = 0 for x ≠ 0, ∫ δ(x) dx = 1) and x′ ∈ Ω is a fixed point. The Green's function is useful to solve inhomogeneous linear PDEs: once it is computed for a specific operator, the PDE Au(x) = f(x) is immediately solved by

u(x) = ∫_Ω G(x, x′) f(x′) dx′    (3)

for any arbitrary forcing function f(x). Several analytical solutions of Green's functions exist; for example, the three-dimensional Green's function for the Laplace equation with Dirichlet conditions at infinity is given by

G(x, x′) = −1 / (4π ‖x − x′‖).    (4)

However, once more complex boundary conditions are introduced, there is no known closed-form Green's function for the general case. The goal of this paper, thus, is to efficiently approximate G(x, x′) for a given boundary configuration. This provides a solution operator for any Poisson equation, independent of the forcing function f(x).

The Discretized Setting
Since there are no trivial analytical solutions for linear PDEs in the general case, it is common to solve them numerically. To do so, the first step is to discretize the operator A spatially. Among grid-based methods, regular (Cartesian), curvilinear, or unstructured discretizations are common [1], and each can potentially be coupled with distinct approximation schemes such as finite differences, finite volumes, or finite elements. In this paper we focus on approximating Green's functions for regular grids in 2-D with embedded objects discretized by fully filled cells [30], and no fractional boundary treatment. We notice, however, that all the aforementioned discretizations yield a discrete linear matrix, and the discussion presented here is not limited to the particular choice of regular grids coupled with finite differences. The 2-D Laplacian operator discretized with a second-order approximation at regular grid node x_{i,j} is given by

∇²u(x_{i,j}) ≈ (u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1} − 4u_{i,j}) / h²,    (5)

where h is the grid spacing and u_{i+1,j}, u_{i,j+1}, u_{i−1,j}, u_{i,j−1}, u_{i,j} are discrete grid values at different locations (inset). Notice that this discretization of the Laplacian operator is compact, since each node only interacts with its immediate neighbors. Extending this relationship to all grid nodes yields the linear system

Au = f,    (6)

where we use A to discretize the negative of the Laplacian operator. This system is sparse, with 5 non-zero entries per row/column. Boundary conditions can change the stencil of the discrete Laplacian kernel (Equation (5)) and the structure of the matrix A. Moreover, if the boundary conditions are compatible, the matrix A is symmetric positive definite. It is straightforward to see that, in the discrete case, the Green's function is simply the inverse of the Laplacian matrix, since discretizing Equation (2) yields

AG = I,    (7)

where G is the discretized Green's function for the operator A. Therefore, for an arbitrary discrete forcing term f, the solution of the PDE is

u = Gf.    (8)

Neural Green's Function For Laplacian Systems
The most straightforward way to find a Green's function solution for a Poisson equation is to invert its discrete matrix representation, G = A⁻¹. However, this is computationally inefficient, as G is dense. Even if the Green's function for the discrete Laplacian is known a priori, the cost of computing the solution by dense matrix multiplication u = Gf is O(N²) for a regular grid with N nodes, which is sub-par compared with state-of-the-art iterative solvers. This approach, therefore, is not employed in practice. The dense Green's function for the Laplacian operator, however, has a sparse counterpart in frequency space: consider its one-dimensional representation plotted for Dirichlet boundary conditions in Figure 1. Despite being dense in the original space, its power spectrum shows that it is sparse in frequency space. This property suggests that there is another representation, besides the standard matrix format, in which the Green's function can be compactly discretized. Previous works on Sparse Approximate Inverses (SPAI) employed wavelets [20] to similarly make the representation of G more efficient. We therefore propose a multi-level compact representation for the Green's function in the next section.
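This spectral sparsity is easy to verify numerically: in 1-D with Dirichlet boundaries, the eigenvectors of the discrete Laplacian are sine modes, and projecting a row of G = A⁻¹ onto them concentrates almost all of the energy in the first few frequencies. A small NumPy check (sizes are arbitrary):

```python
import numpy as np

n = 127
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1-D Dirichlet Laplacian
row = np.linalg.inv(A)[n // 2]                         # one row of the Green's function

# Sine eigenbasis of A: v_k[j] = sin(pi k (j+1) / (n+1)), k = 1..n
j = np.arange(n)
V = np.sin(np.pi * np.outer(np.arange(1, n + 1), j + 1) / (n + 1))
energy = (V @ row) ** 2
energy /= energy.sum()

low_share = energy[:8].sum()   # fraction of spectral energy in the 8 lowest modes
```

Although `row` itself has no zero entries, `low_share` is very close to 1, mirroring the power spectrum shown in Figure 1.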

A Multi-level Green's Function Representation
Our multi-level Green's approximation (MLGA) relies on lower-resolution grids that are progressively upsampled until a target resolution:

G ≈ Σ_{ℓ=1}^{L} U_ℓ^L G*_ℓ D_L^ℓ,    (9)

where G*_1 and G*_L are matrices representing the coarsest and finest discretizations, respectively; and U_ℓ^q (D_ℓ^q) is the upsampling (downsampling) operator that maps a vector from discretization level ℓ (of N_ℓ nodes) to discretization level q. We implement the upsampling (downsampling) operators by composing a series of ratio-2 operators; e.g., the upsampling operator U_ℓ^L is composed as

U_ℓ^L = U_{L−1}^L ··· U_{ℓ+1}^{ℓ+2} U_ℓ^{ℓ+1}.    (10)

Similar to the Multigrid method [31], the ratio-2 upsampling and downsampling operators are represented by 3 × 3 full-weighting kernels. The downsampling kernels are constructed by computing the tensor product D = B ⊗ B between one-dimensional stencils. We choose B to be a linear stencil defined as

B = [1/4, 1/2, 1/4].    (11)

The upsampling operator is chosen as the scaled transpose of the downsampling operator, U = 4Dᵀ [31]. Boundary conditions are easily handled by manipulating the operand: a Dirichlet condition imposes that values on the boundaries are set to 0, while a Neumann condition requires extrapolation of values to the boundary in the direction of the derivative. When a boundary cell has multiple Neumann conditions applied to it, the extrapolation simply averages contributions from different directions. Moreover, we notice that linear 3 × 3 downsampling kernels can introduce aliasing; however, this was not an observed problem in our experiments.
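A minimal NumPy sketch of these ratio-2 operators on vertex-centred grids (boundary handling simplified relative to the paper's Dirichlet/Neumann treatment):

```python
import numpy as np

def restriction_1d(n_fine):
    """1-D full-weighting restriction with stencil B = [1/4, 1/2, 1/4];
    n_fine = 2 * n_coarse - 1 nodes (vertex-centred grids)."""
    n_coarse = (n_fine + 1) // 2
    B = np.zeros((n_coarse, n_fine))
    for i in range(n_coarse):
        c = 2 * i                       # fine node under coarse node i
        B[i, c] = 0.5
        if c - 1 >= 0:
            B[i, c - 1] = 0.25
        if c + 1 < n_fine:
            B[i, c + 1] = 0.25
    return B

B = restriction_1d(9)   # 9 -> 5 nodes
D = np.kron(B, B)       # 2-D restriction via tensor product, 81 -> 25
U = 4 * D.T             # prolongation as the scaled transpose
```

Interior rows of D are weighted averages (they sum to 1), and U reproduces constants at interior fine nodes, which is the usual consistency check for full-weighting/bilinear pairs.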

Enforcing Sparsity
Simply approximating the matrix inverse with the formulation in Equation (9) is not enough for an efficient representation. Therefore, similarly to SPAI approaches [14], our goal is to find a sparse approximation of the Green's function G, which can be expressed through the following optimization:

Ĝ = argmin_Ĝ ‖ĜA − I‖²_F,    (12)

where Ĝ is sparse. One of the contributions of this paper is to show that the multi-level approximation can sparsely and efficiently represent the Green's function by solving Equation (12) independently per level. To show that, we first define G_ℓ as

G_ℓ = U_{ℓ−1}^ℓ G_{ℓ−1} D_ℓ^{ℓ−1} + G*_ℓ.    (13)

The equation above states that the approximation at the ℓ-th level is defined by summing the upsampled previous level G_{ℓ−1} and a residual matrix G*_ℓ defined at level ℓ. Similarly to the Galerkin approximation, our derivation assumes that coarser levels should solve a Laplace system progressively downsampled from the finest level. Defining A_ℓ = D_L^ℓ A U_ℓ^L and using the Green's function definition of Equation (13), the sparse optimization can be written as

Ĝ*_ℓ = argmin ‖(U_{ℓ−1}^ℓ Ĝ_{ℓ−1} D_ℓ^{ℓ−1} + Ĝ*_ℓ) A_ℓ − I_ℓ‖²_F,    (14)

where I_ℓ = D_L^ℓ I U_ℓ^L is the downsampled identity matrix. Directly optimizing Equation (14) is memory-inefficient, since the approximation at a level depends on the previous, coarser one. By reformulating Equation (14), we can solve for the mismatch of the downsampled identity between grid resolutions of adjacent levels, I_ℓ − U_{ℓ−1}^ℓ D_ℓ^{ℓ−1} I_ℓ, and rewrite the optimization to be level-independent as

Ĝ*_ℓ = argmin ‖Ĝ*_ℓ A_ℓ − (I_ℓ − U_{ℓ−1}^ℓ D_ℓ^{ℓ−1} I_ℓ)‖²_F.    (15)

Notice that this method only works because it is bound to a target finest grid resolution; therefore, it cannot be reused for distinct grids with a varying number of nodes/levels. The full derivation from Equation (14) to Equation (15) is presented in the supplemental material. Lastly, the per-level Green's approximation Ĝ*_ℓ is mostly sparse. That happens because each row (a 2-D kernel converted to an array) has compact support, due to the predominance of low frequencies in the Green's function of elliptic operators.
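The per-level problem has the same row-separable structure as classical SPAI: each row of the approximation, restricted to a prescribed sparsity pattern, is an independent small least-squares problem. A minimal dense sketch (the pattern, operator, and target here are placeholders; the paper solves Equation (15) with L-BFGS instead):

```python
import numpy as np

def spai_rows(A, masks, T):
    """For each row i, find g supported on masks[i] minimizing ||g A - T[i]||_2."""
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        J = masks[i]
        # g_J @ A[J, :] ~= T[i]   <=>   A[J, :].T @ g_J ~= T[i]
        g, *_ = np.linalg.lstsq(A[J, :].T, T[i], rcond=None)
        G[i, J] = g
    return G

n = 10
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy 1-D Laplacian
masks = [np.arange(n)] * n                             # full pattern -> exact inverse
G = spai_rows(A, masks, np.eye(n))
```

With the full pattern the rows recover A⁻¹ exactly; shrinking each `masks[i]` to a compact window around i yields the sparse trade-off the optimization above exploits.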
The kernel with compact support is centered around the point of evaluation as illustrated in Figure 2. This assumption requires that contributions from distant nodes are inherently modelled by interpolation of Green's Functions from coarser levels. Scenarios that violate this assumption will be discussed in Section 6.
We choose to represent the residual Green's function approximations Ĝ*_ℓ through spatially varying convolutions, i.e. a sliding window of compact kernels G*_ℓ(i_ℓ, j_ℓ) that vary at each position. Here 1 ≤ i_ℓ ≤ m_ℓ, 1 ≤ j_ℓ ≤ n_ℓ, with m_ℓ and n_ℓ being the sizes in the x and y directions of the regular grid at level ℓ, and N_ℓ = m_ℓ n_ℓ. We can conveniently write them as sparse matrix-vector multiplications: the values of G*_ℓ(i_ℓ, j_ℓ) represent the non-zero values at row (i_ℓ n_ℓ + j_ℓ) of the matrix G*_ℓ ∈ ℝ^{N_ℓ × N_ℓ}. For coarser levels we adopt kernels of size k_ℓ × k_ℓ that cover most of the domain, since they are still relatively cheap to compute. As the discretization progresses to more refined levels, the kernel size progressively decreases until it covers only a small neighbourhood (e.g., for all our examples the finest level has kernel size 5 × 5). The number of operations needed for sparse matrix-vector multiplication depends on the number of non-zero matrix entries; the total number of non-zero elements across the G*_ℓ matrices is Σ_{ℓ=1}^{L} k_ℓ² N_ℓ. The number of operations needed for the upsampling and downsampling operators implemented by 3 × 3 kernels is Σ_{ℓ=1}^{L−1} 3² N_ℓ each. Therefore, the total number of operations for multiplying the multi-level Green's function with a vector is

Σ_{ℓ=1}^{L} k_ℓ² N_ℓ + 2 Σ_{ℓ=1}^{L−1} 3² N_ℓ.    (16)

Thus, as long as k_ℓ² ≪ N_ℓ at the higher-resolution levels, evaluating Equation (8) with the proposed multi-level Green's approximation requires significantly fewer operations than the dense Green's counterpart, which requires (N_L)² operations.

Solving the Poisson System Iteratively
Once the optimization problem in Equation (15) is solved, the final approximation of the Green's function can be obtained by combining all Ĝ*_ℓ as in Equation (9). However, naively computing this matrix-vector multiplication is memory-inefficient, since there is no need to upsample the per-level residual Green's function Ĝ*_ℓ to level L. Instead, Equation (8) can be implemented by applying Ĝ*_ℓ to the right-hand side vector f in a level-by-level fashion. Starting at level ℓ = 1, a first approximation of the solution is obtained as u_1 = Ĝ*_1(D_L^1 f). At each following level ℓ, we apply Ĝ*_ℓ to the corresponding right-hand side vector to get an error-correction term e_ℓ = Ĝ*_ℓ(D_L^ℓ f). This correction is then added to the upsampled coarser-level solution, which yields the current level's solution u_ℓ = U_{ℓ−1}^ℓ u_{ℓ−1} + e_ℓ. The process continues until the finest level L.
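The coarse-to-fine application described above can be sketched as follows. The per-level kernels and the resampling operators are passed in as plain (placeholder) matrices; in the actual method they are the sparse Ĝ*_ℓ and the full-weighting operators:

```python
import numpy as np

def apply_multilevel_green(f, G_lvls, down, up):
    """Coarse-to-fine application: u_1 = G*_1 (D f), then
    u_l = U u_{l-1} + G*_l (D f) at each finer level."""
    u = G_lvls[0] @ (down[0] @ f)           # coarsest-level approximation
    for l in range(1, len(G_lvls)):
        e = G_lvls[l] @ (down[l] @ f)       # per-level residual correction
        u = up[l - 1] @ u + e               # upsample previous level, then correct
    return u

# Toy two-level setup with random placeholder operators (sizes 3 and 6).
rng = np.random.default_rng(1)
down = [rng.standard_normal((3, 6)), np.eye(6)]   # restriction to each level
up = [rng.standard_normal((6, 3))]                # prolongation between levels
G_lvls = [rng.standard_normal((3, 3)), rng.standard_normal((6, 6))]
f = rng.standard_normal(6)
u = apply_multilevel_green(f, G_lvls, down, up)
```

Since every step is a matrix-vector product, the composite map f ↦ u is linear, which is what allows it to be used inside the residual iteration of the next section.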
As the approximated Green's function Ĝ is not exact due to the cut-offs introduced by the compact kernels, directly applying it to the right-hand side vector usually does not provide a sufficiently precise solution. Thus, we apply the Green's function iteratively to refine the prediction. After obtaining a first approximation of the result as û = u_L, a residual r = f − Aû is computed. Then, the approximated Green's function is applied to this residual and added back to û to obtain the next approximation: û ← û + Ĝr. The process is repeated until a user-defined residual tolerance or a maximum number of iterations is reached. The presented method has similarities with a Multigrid solver, but the relaxation step is replaced by the application of our approximated Green's kernels. The whole solving process is summarized in Algorithm 1.
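The residual iteration can be sketched independently of how Ĝ is represented: any linear approximation of A⁻¹ for which I − ĜA is a contraction will converge. In this toy sketch a deliberately damped exact inverse stands in for the approximate multi-level kernels:

```python
import numpy as np

def green_iterate(A, f, apply_G, tol=1e-10, max_iter=200):
    """u <- u + G r with r = f - A u, until the residual tolerance is met."""
    u = apply_G(f)                      # initial guess from the approximate Green
    for _ in range(max_iter):
        r = f - A @ u                   # current residual
        if np.linalg.norm(r, 1) < tol:
            break
        u = u + apply_G(r)              # refine with the same linear operator
    return u

n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy 1-D Laplacian
G_approx = 0.9 * np.linalg.inv(A)      # imperfect stand-in for compact kernels
f = np.random.default_rng(0).standard_normal(n)
u = green_iterate(A, f, lambda v: G_approx @ v)
```

Here I − ĜA = 0.1 I, so the residual shrinks by 10× per iteration; the quality of the compact kernels controls exactly this contraction factor in the real solver.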

Learning Green's Functions
The optimization in Equation (15) depends on the specific system matrix A that it was solved for, which restricts its application to a Laplace operator under specific boundary conditions. Assuming that the Laplacian operator has homogeneous coefficients, we observe that the kernel weights of G * ℓ (i, j) can be determined solely by the boundary conditions in a given domain. Therefore, we can obtain Green's functions for more general configurations by training a Multilayer Perceptron (MLP) to directly map boundary conditions into kernel weights.
Contrary to position-based features that were used in previous implicit approaches [32], we can adopt features that are based on distances of an evaluation point relative to all domain boundaries. Including all these distances (or, alternatively, the whole domain as input) as features of the neural network, however, would result in a large parametrization space that is challenging to generalize, since it would contain features that would only weakly correlate with the outputs of our Neural Green's function. Instead, we rely on a feature vector that takes patches of Signed Distance Functions (SDFs) from the point of evaluation to the domain boundaries. These functions are smooth, and once coupled with their derivatives -which are automatically encoded by the patch information -they can represent non-local information of the underlying geometries embedded on the domain.
The neural network that maps boundary conditions to Green's function kernels at level ℓ is defined as M_{Θ_ℓ} : ϕ_ℓ(i, j) ∈ ℝ^{k_ℓ × k_ℓ} → G*_ℓ(i, j) ∈ ℝ^{k_ℓ × k_ℓ}. This mapping takes an SDF patch ϕ_ℓ centered at (i, j) and maps it to a Green's kernel G*_ℓ(i, j) of the same size. Since the mapping is defined continuously by the signed distance functions, the mapping function M_{Θ_ℓ} is well represented by an MLP. The MLP flattens the input SDF patch into a vector and outputs a vector of the same shape. The output vector is then reshaped into a k_ℓ × k_ℓ patch and used as G*_ℓ(i, j). We define separate MLPs for each level. The training of each level is also performed independently, according to the objective defined in Equation (15). The only difference lies in the parameters to be optimized:

Θ_ℓ = argmin_{Θ_ℓ} Σ_r Σ_{i,j} ‖M_{Θ_ℓ}(ϕ_ℓ^r(i, j)) A_ℓ^r − (I_ℓ − U_{ℓ−1}^ℓ D_ℓ^{ℓ−1} I_ℓ)^r_{(i,j)}‖²,    (17)

where r denotes the training example index and (i, j) selects the corresponding matrix row. The multiplication of M_{Θ_ℓ}(ϕ_ℓ(i, j)) and the matrix A is a sparse vector-matrix multiplication. In practice, we do not sequentially take all patches from one boundary setting to form a training batch, but rather randomly pick (i, j) from different boundary settings.
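The per-level mapping can be sketched as a plain PyTorch MLP. Widths and depth are illustrative, and the skip connection and dropout used in the paper's implementation are omitted for brevity:

```python
import torch
import torch.nn as nn

class GreenKernelMLP(nn.Module):
    """Maps a k x k SDF patch to a k x k residual Green's kernel."""
    def __init__(self, k, hidden=256, depth=6):
        super().__init__()
        layers, dim = [], k * k
        for _ in range(depth - 1):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, k * k))
        self.net, self.k = nn.Sequential(*layers), k

    def forward(self, sdf_patch):               # (batch, k, k) SDF patches
        out = self.net(sdf_patch.flatten(1))    # flatten patch, regress k^2 weights
        return out.view(-1, self.k, self.k)     # reshape back to a kernel

model = GreenKernelMLP(k=5)
kernels = model(torch.randn(8, 5, 5))           # a batch of predicted kernels
```

Each predicted kernel fills one row of the sparse matrix Ĝ*_ℓ, so the training loss is evaluated through the same sparse vector-matrix products as the direct optimization.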
At test time, we extract all sliding local patches from ϕ ℓ to form a batch {ϕ ℓ (i, j) : ∀1 ≤ i ≤ m ℓ , ∀1 ≤ j ≤ n ℓ }. The whole batch is used as input to the current level's MLP to get predictions of the corresponding residual Green's function kernels {G * ℓ (i, j) : ∀1 ≤ i ≤ m ℓ , ∀1 ≤ j ≤ n ℓ }. This process is performed for all levels ℓ = 1, . . . , L to obtain the full multi-level Green's function representation. Given G * ℓ (i, j) for all levels, we can use the procedure defined in Algorithm 1 to solve for arbitrary right-hand side. An overall illustration of the training and test pipeline can be seen in Figure 3.

Experiments and Results
We evaluate the ability of our approximated Green's function to solve Poisson equations (Algorithm 1) by comparing its performance with classical linear solvers. Random right-hand side vectors f are generated with either Gaussian noise (Figure 5, top row) or Perlin noise [33] (Figure 5, bottom row). The classical linear solvers evaluated include Jacobi, Conjugate Gradient (CG), Multigrid (MG), and Multigrid Preconditioned Conjugate Gradient (MGPCG). The residual L1-norm of the different solvers is plotted against the number of multiply-add operations, since a comparison based on the number of iterations would be biased by the computational cost required at each iteration step.

Implementation Details
We implement all our models, as well as classical (Conjugate Gradient (CG), Jacobi, Multigrid, and Multigrid Preconditioned Conjugate Gradient) solvers in PyTorch [34]. All experiments are run on a NVIDIA GeForce RTX 2080Ti GPU with 11 GB of dedicated memory.
Multi-level representation. Grids at all levels are discretized with a resolution of N_ℓ = (2^{ℓ+1} + 1) × (2^{ℓ+1} + 1) nodes, where ℓ = 1, . . . , L is the index of the level. This means that we always fix the coarsest level ℓ = 1 to have 5 × 5 nodes. The kernel sizes k_ℓ of each level are determined heuristically. For example, setups with a resolution of N = 129² employ different kernel sizes for each level: k_1 = 9, k_2 = 11, k_3 = 9, k_4 = 7, k_5 = 5, k_6 = 5. The total number of multiply-add operations in this setting when applying the multi-level Green's function to the right-hand side is ∼7.1 × 10⁵, while the number of operations needed for applying the dense Green's function is ∼2.8 × 10⁸. Equation (15) is solved as an unconstrained optimization problem with a Quasi-Newton (L-BFGS) optimizer. We choose a history size of 10 to approximate the Hessian, and the Armijo criterion is used for the line search, which is limited to 10 steps. The same L-BFGS hyper-parameters are used for all tests.
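The operation counts above follow directly from the level resolutions and kernel sizes (compare Equation (16)):

```python
# Multiply-add count for the N = 129^2 setup described above.
N = [(2 ** (l + 1) + 1) ** 2 for l in range(1, 7)]   # 25, 81, ..., 16641 nodes
k = [9, 11, 9, 7, 5, 5]                              # per-level kernel sizes

kernel_ops = sum(ki ** 2 * Ni for ki, Ni in zip(k, N))
resample_ops = 2 * sum(9 * Ni for Ni in N[:-1])      # 3x3 down- and upsampling
total = kernel_ops + resample_ops                    # ~7.1e5 operations
dense = N[-1] ** 2                                   # ~2.8e8 for the dense G
```

The exact totals are 713,008 multi-level operations versus 276,922,881 for the dense Green's function, matching the ∼7.1 × 10⁵ and ∼2.8 × 10⁸ figures quoted above.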
Neural networks. The model in Section 4.4 uses distinct MLPs for each level. The MLPs of the coarser levels (up to ℓ = 3) have 6 fully-connected layers, while those of the remaining levels have 12 layers. Each hidden layer of the MLP has 256 channels and uses ReLU activations. The input layer is skip-connected to the 4th layer. Dropout is applied to all layers during training. Each MLP is trained with the Adam optimizer, with the initial learning rate set to 10⁻⁴. The learning rate decays by a ratio of 0.5 every 300 epochs until convergence. The MLPs of different levels are trained independently in parallel. Training takes around 24 hours per level.
Multigrid Settings. We implement a geometric Multigrid solver [31] with weighted Jacobi (ω = 2/3) as the smoothing operator. We use 8 pre-smoothing steps and 16 post-smoothing steps for all levels, and 20 smoothing steps for the coarsest level. When used as a preconditioner for Conjugate Gradient, we change the parameters to 2 pre-smoothing steps, 2 post-smoothing steps, and 4 smoothing steps for the coarsest level. The same set of parameters is used for all experiments.
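For reference, the weighted-Jacobi smoothing step used in the Multigrid baseline is a one-liner (a toy 1-D check; the solver applies it on the 2-D grid):

```python
import numpy as np

def weighted_jacobi(A, f, u, omega=2/3, steps=1):
    """u <- u + omega * D^{-1} (f - A u), with D = diag(A)."""
    Dinv = 1.0 / np.diag(A)
    for _ in range(steps):
        u = u + omega * Dinv * (f - A @ u)
    return u

n = 16
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy 1-D Laplacian
f = np.random.default_rng(0).standard_normal(n)
u = weighted_jacobi(A, f, np.zeros(n), steps=3000)
```

On its own the smoother converges very slowly (it damps only high-frequency error efficiently), which is exactly why Multigrid pairs it with coarse-grid corrections, and why the paper replaces it with approximate Green's kernels.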

Representing Green's Function for a Single Scene
We evaluate our method in two parts: first, we validate that our multi-level representation can adequately represent a Green's function for a fixed boundary configuration. Next, we measure the capacity of the MLPs to predict Green's functions for arbitrary obstacles. The purpose of our first experiment is to validate that our multi-level convolution structure is sufficient to accurately and sparsely approximate the discrete Green's function (the dense inverse of the matrix A). To this end, we fix the boundary conditions and directly optimize the values of Ĝ*_ℓ for all ℓ (Equation (15)). We note that the discretization of the Green's function can be exactly represented by our multi-level model if we allow the radius of the convolution to span the whole domain. Although this would be impractical for large domains, it allows us to compute a set of ground-truth kernels G*_ℓ at a low resolution (33 × 33). Figure 4 shows comparisons between ground-truth G*_ℓ(i, j) (b) and approximated Ĝ*_ℓ(i, j) (c) kernels at different locations for level ℓ = 3, with the difference between them shown in the right-most column. This difference does not exceed 3% of the magnitude of the ground-truth kernel, which indicates that the most dominant values are well represented by the compact kernels Ĝ*_ℓ(i, j). In Figure 4(e), we show that applying our kernels iteratively on the residual (Section 4.3) results in an algorithm that outperforms state-of-the-art solvers.
Additional examples for a grid resolution of 257 × 257 nodes are shown in Figure 5. The MGPCG solver (top row) outperforms our model for the simplest case of boundary conditions (Dirichlet exterior boundaries, no interior boundaries). However, when more complex boundaries are present (bottom row, mixed Dirichlet and Neumann exterior and interior boundaries), our model outperforms all classical solvers. We chose two different types of randomly generated right-hand side vectors for the top and bottom rows to showcase that, once optimized, our Green's function representation is oblivious to the right-hand side function: distinct functions do not influence the convergence curves shown on the right, and additional examples are demonstrated in the supplemental material. We also compare the wall-clock time across resolutions for different solvers in Table 2. Although our method reduces the number of multiply-add operations by only about 2× compared to MGPCG, its intrinsically parallel nature enables it to reach speedups of up to 12× across all resolutions.

Predicting Green's Function for Various Boundary Geometries
We further evaluate the performance of MLPs to predict Green's function kernels based on general boundary conditions of arbitrary scenes. Note that the training does not rely on either the solution or the right-hand-side vectors. It only requires SDF values for all levels and the corresponding discrete Laplacian operator (ϕ 1 , . . . , ϕ L , A).
Dataset generation. We randomly place spheres and rectangles of random sizes into the scene to represent interior Neumann boundaries. Spheres and rectangles keep canonical orientations for simplicity, but are allowed to overlap in order to create more complex shapes. SDF values are computed at the different discretization levels from the parameters of the spheres and rectangles to obtain ϕ_1, . . . , ϕ_L. The voxelized obstacle setting and its corresponding discrete Laplacian A are computed at the finest discretization level. The discrete Laplacian is not stored for coarser levels; instead, it is downsampled from the finest resolution on the fly during training. We generate 1,000 scenes for the dataset (see examples in Figure 6), but the number of training samples is larger, as we randomly select local patches. Moreover, 10 scenes are generated in a similar fashion (randomly placing spheres and rectangles) to create the testing dataset. This means that none of the test samples is seen during training. The left side of the exterior boundary of all scenes is assumed to be Dirichlet, while the other sides are assumed to be Neumann.
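A minimal sketch of such scene generation, combining analytic circle and box SDFs with a pointwise minimum (all ranges and shape counts are illustrative, not the paper's exact sampling distribution):

```python
import numpy as np

rng = np.random.default_rng(42)

def sdf_circle(X, Y, cx, cy, r):
    return np.hypot(X - cx, Y - cy) - r

def sdf_rect(X, Y, cx, cy, hx, hy):
    dx, dy = np.abs(X - cx) - hx, np.abs(Y - cy) - hy
    outside = np.hypot(np.maximum(dx, 0), np.maximum(dy, 0))
    inside = np.minimum(np.maximum(dx, dy), 0)
    return outside + inside

def random_scene_sdf(n):
    """SDF of a random union of circles and axis-aligned rectangles on an n x n grid."""
    xs = np.linspace(0, 1, n)
    X, Y = np.meshgrid(xs, xs, indexing="ij")
    phi = np.full((n, n), np.inf)
    for _ in range(rng.integers(1, 4)):            # 1-3 circles
        cx, cy = rng.uniform(0.2, 0.8, 2)
        r = rng.uniform(0.05, 0.15)
        phi = np.minimum(phi, sdf_circle(X, Y, cx, cy, r))
    for _ in range(rng.integers(1, 4)):            # 1-3 rectangles
        cx, cy = rng.uniform(0.2, 0.8, 2)
        phi = np.minimum(phi, sdf_rect(X, Y, cx, cy, *rng.uniform(0.05, 0.15, 2)))
    return phi

phi = random_scene_sdf(33)   # one coarse-level SDF sample
```

Evaluating `random_scene_sdf` on each level's grid from the same shape parameters yields consistent ϕ_1, . . . , ϕ_L without ever storing per-level geometry.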
Predicting on new boundary settings. We train MLPs for grids with 129 × 129 nodes and show test results evaluated by solving Poisson equations in Figure 7 (a,b). Similar to the optimization results in Figure 5, the Green's kernel produced by the MLPs can outperform classical solvers in terms of residual convergence.
Fine-tuning model output. Most of the test examples perform similarly well to Figure 7 (a,b). For some examples with more complex interior boundary geometry, though, inaccurate kernel predictions can result in slow convergence or even divergence of the residual, as shown in Row (c) of Figure 7. The predicted kernels are suitable approximations in most regions, except at the few grid points next to the interior Neumann boundaries. To fix this issue, we run a post-processing step on the failure cases: we solve the optimization in Equation (15) using the kernels predicted by the MLPs as the starting value. The results of the fixed kernels are shown in Row (d) of Figure 7. This initialization reduces the number of iterations needed for the optimization from 20 to 10, and its runtime from ∼35 s to ∼13 s.
Ablation tests. To evaluate the variation of results over multiple training runs, we train the model with different random seeds for initialization (Figure 8 (b, d)). Both seeds show similar results in terms of accuracy and convergence (Figure 8 (e)). We also tried using spatial gradients of the SDF patch as extra input channels (Figure 8 (c)). Plot (e) shows that incorporating the spatial gradient results in a slightly inferior convergence rate. As the spatial gradient is inherently contained in the input SDF patch, we argue that using it as extra channels is redundant.

An Example Application
Our multi-level Green's function can be used to replace classical solvers for Poisson equations. We showcase an application to a fluid (smoke) simulation in Figure 9, where a Poisson solver is necessary in the pressure projection step [30]. In particular, the pressure projection equation is

∇²p(x) = ∇ · u(x), x ∈ Ω,
∂p(x)/∂n = 0, x ∈ ∂Ω_N,
p(x) = 0, x ∈ ∂Ω_D,

where ∂Ω_N and ∂Ω_D are fluid-solid and fluid-air interfaces, respectively. This step is the computational bottleneck in fluid solvers; therefore, it is crucial that its solution is computed efficiently. We restrict our study to a 2-D, 257 × 257 scene with solid obstacles, as shown in the top-left inset images of the figure. The multi-level Green's function approximations are obtained by optimizing kernels for this single scene. We compare our results over several frames (top) to a MGPCG solver (bottom). Both solvers are set to stop at a residual tolerance of 10⁻⁴ in the L∞ norm. The resulting density fields of the smoke are similar between the two solvers. The simulation takes 104 ms per frame using our multi-level Green's function, and 576 ms per frame using the MGPCG solver.

Conclusions and Discussions
A multi-level Green's function for 2-D Poisson equations was presented as an alternative representation to standard SPAIs. Our novel optimization scheme is level-independent, which makes its evaluation efficient and memory-bound. Moreover, we show that our representation can be used to solve the discrete linear system by iteratively applying it to the residual of the error.
The locality of the Laplace operator (its evaluation depends only on the neighborhood of the evaluated position) is key to our method's efficiency. The Green's function, on the other hand, exhibits a sparsity pattern in the frequency domain: low-frequency global information is crucial to reconstruct the solution at the current position, while the magnitudes of high frequencies are often small (Figure 1). Our multi-level Green's function approximation takes advantage of this property by representing low-frequency information on coarser grids, while high-frequency information is localized on finer grids. Therefore, our method can potentially be applied to other types of elliptic PDEs, as their differential operators are typically local.
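This coarse-to-fine division of labor can be sketched minimally as follows; the restriction/prolongation operators and the per-level kernels are assumptions for illustration, not the paper's exact operators:

```python
import numpy as np

def multilevel_apply(kernels, r, downsample, upsample):
    """Sketch of a multi-level Green's function application.

    kernels[l] is a local (compact-support) linear operator at level l;
    the residual is restricted level by level, each level's kernel is
    applied, and the contributions are upsampled and accumulated, so
    coarse levels carry low frequencies and fine levels high frequencies.
    """
    contributions = []
    rl = r
    for l in range(len(kernels)):
        contributions.append(kernels[l](rl))  # local correction at level l
        if l + 1 < len(kernels):
            rl = downsample(rl)               # restrict to coarser grid
    out = contributions[-1]
    for l in range(len(kernels) - 2, -1, -1):
        out = contributions[l] + upsample(out)  # accumulate coarse-to-fine
    return out
```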
By taking advantage of the fact that the Green's function of the Poisson equation depends only on the boundary conditions, we show that neural networks can be trained to regress Green's functions for general boundary settings. Once trained or optimized, our model can surpass state-of-the-art linear system solvers in certain settings in terms of convergence rate and runtime. Lastly, we use our Green's function representation to replace pressure solvers in fluid simulation, achieving a speedup of ∼ 5x compared to state-of-the-art classical solvers. The up-front cost of computing our approximated multi-level Green's function can pay off when the system is solved for multiple different right-hand sides (single-scene optimization). Examples of such applications include Poisson matting [35], in which a Poisson equation with a different right-hand side is solved at each iteration; Poisson image editing [36], where the boundary condition changes when a different blending region is selected; and grid-based fluid simulations [30], where the right-hand side, computed from the advected velocity, changes at each time step.

Limitations
Inaccurate MLP kernel predictions. As shown in Figure 7 (c), our trained model underperforms on some individual patches around solid obstacles. This small portion of inaccurate kernels can result in slow convergence or even divergence if the SPD property is locally violated. We suspect that this behavior stems from SDF patches that are not well represented in the original dataset and from the limited capacity of the simple MLP architectures employed. An improved network design and better sampling of the input examples during training may therefore alleviate this issue. Furthermore, when testing our model on scenes with unseen shapes (e.g., triangles) representing interior Neumann boundaries, our model also demonstrates inferior performance (Figure 11). We note, however, that our method can be fine-tuned for these scenarios by performing additional optimization iterations through Equation (15) (Figure 7 (d)).
Kernels without compact support. We found in our experiments that ground-truth residual Green's kernels G*_ℓ(i, j) can have non-compact support in some cases, as shown in Figure 10. The figure compares the approximated residual Green's kernel Ĝ*_ℓ(i, j) (b) with G*_ℓ(i, j) (c), computed as in Figure 4, at spatial locations ((10, 9), top) and ((4, 23), bottom) for levels ℓ = 3 and ℓ = 4, respectively. The absolute difference between G*_ℓ(i, j) and Ĝ*_ℓ(i, j) is shown on the right. Compared to the kernels in Figure 4, the ground-truth kernels in this example have more significant values outside the defined compact range. Nevertheless, our approximated kernels still perform well when evaluating convergence for solving Poisson equations (Figure 10 (e)). We suspect that non-compact kernels may have a larger effect on the convergence rate at higher resolutions; additional fine-tuning and ablation experiments are needed.

Future Work
We plan to extend our work to support higher resolutions and 3-D settings. Moreover, our method could be applied to Poisson equations on arbitrary meshes, which would require more sophisticated numerical methods and efficient implementations of the downsampling and upsampling operators. Lastly, our method could offer significant speedups when dealing with varying boundaries (e.g., simulations with moving solids or liquids). These scenarios were not explored in this paper; however, we believe the method can be extended to handle them, as long as the training dataset accurately represents moving boundaries. These extensions would greatly broaden the applicability of the proposed approach, since solving the Poisson equation is a pervasive problem.