-
Notifications
You must be signed in to change notification settings - Fork 643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Computing the gradient of a level-set function #2235
Comments
Recall that our level set requires a filter step (or that you transform your grid into a signed distance function) in order to work properly. Then you must internally set β=∞. I may have missed it, but it looks like you are simply populating the grid with a binary array. |
See #2067 for a SDF. |
Unfortunately, work on this issue is blocked by the bug in #2226 (comment) involving an incorrect mode (an oblique planewave) as computed by the Newton-method solver via MPB in |
@oskooi check out this comment too when you are back to looking at this. It describes essentially what you want to do (as a basic example). |
We are finally able to return to this issue with the use of #2285 and #2767. I have modified the example from #2054 (comment) of computing the adjoint gradient of a 1D grating to specify the grating structure as a level set using the signed-distance function from #2067 ( @smartalecH: to validate the adjoint gradient of the level set using the brute-force finite difference, should we just perturb one of the geometric parameters of the grating such as the height or duty cycle? This would be a more useful validation than simply applying a random perturbation to the entire design region as we have been doing in all of our previous examples involving a density-based representation of the design region. from enum import Enum
from typing import Tuple
from autograd.extend import primitive, defvjp
from autograd import numpy as anp
from autograd import tensor_jacobian_product
import matplotlib.pyplot as plt
import meep as mp
import meep.adjoint as mpa
RESOLUTION_UM = 100
WAVELENGTH_UM = 0.5
GRATING_PERIOD_UM = 2.5
PADDING_UM = 3.0
Polarization = Enum("Polarization", "S P")
def mapping(
weights: anp.ndarray,
eta: float,
beta: float,
filter_radius_um: float,
design_region_size: mp.Vector3,
design_region_resolution: float
) -> anp.ndarray:
"""A differentiable mapping function for the design weights."""
# Note: weights should be a 2d array. If it is 1d, it will be reshaped
# to 2d by conic_filter. There is no need to reshape it here.
filtered_field = mpa.conic_filter(
weights,
filter_radius_um,
design_region_size.x,
design_region_size.y,
design_region_resolution,
)
if beta == 0:
return filtered_field.flatten()
else:
projected_field = mpa.smoothed_projection(
filtered_field,
beta,
eta,
design_region_resolution,
)
return projected_field.flatten()
def design_region_to_meshgrid(
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> Tuple[anp.ndarray, anp.ndarray]:
"""Returns the meshgrid coordinate arrays for the design region."""
xcoord = anp.linspace(
-0.5*design_region_size.x,
+0.5*design_region_size.x,
nx
)
ycoord = anp.linspace(
-0.5*design_region_size.y,
+0.5*design_region_size.y,
ny
)
xv, yv = anp.meshgrid(xcoord, ycoord, indexing='ij')
return xv, yv
@primitive
def grating_levelset(
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Returns the weights for the grating as a 1D array."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um,
1,
0
),
0
)
return weights.flatten()
def grating_levelset_vjp(
ans,
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Returns the vector-Jacobian product."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
# pixel dimensions
delta_x = design_region_size.x / (nx - 1)
delta_y = design_region_size.y / (ny - 1)
# Gradient of the level set with respect to the grating height.
# The gradient is obtained using a finite difference. The gradient is
# therefore a zero matrix with non-zero entries only at the locations
# of the height boundary. Since the height boundary is normal to the
# x axis, these non-zero matrix elements have a value of 1 / Δx.
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um + 0.5 * delta_x,
anp.where(
xv >= xv[0][0] + grating_height_um - 0.5 * delta_x,
1 / delta_x,
0
),
0
),
0
)
jacobian = weights.flatten()
return lambda g: anp.tensordot(g, jacobian, axes=1)
def grating_1d(
pol: Polarization,
grating_height_um: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> mpa.OptimizationProblem:
"""Sets up the adjoint optimization of a 1D grating."""
frequency = 1 / WAVELENGTH_UM
substrate_um = 3.0
pml_um = 1.0
pml_layers = [mp.PML(direction=mp.X, thickness=pml_um)]
n_glass = 1.5
glass = mp.Medium(index=n_glass)
size_x_um = pml_um + substrate_um + grating_height_um + PADDING_UM + pml_um
size_y_um = GRATING_PERIOD_UM
cell_size = mp.Vector3(size_x_um, size_y_um, 0)
k_point = mp.Vector3()
if pol.name == "S":
eig_parity = mp.ODD_Z
src_cmpt = mp.Ez
else:
eig_parity = mp.EVEN_Z
src_cmpt = mp.Hz
src_pt = mp.Vector3(-0.5 * size_x_um + pml_um, 0, 0)
sources = [
mp.Source(
mp.GaussianSource(frequency, fwidth=0.1 * frequency),
component=src_cmpt,
center=src_pt,
size=mp.Vector3(0, size_y_um, 0),
)
]
matgrid = mp.MaterialGrid(
mp.Vector3(nx, ny),
mp.air,
glass,
weights=anp.ones((nx, ny)),
do_averaging=False,
)
matgrid_region = mpa.DesignRegion(
matgrid,
volume=mp.Volume(
center=mp.Vector3(
(-0.5 * size_x_um + pml_um + substrate_um +
0.5 * design_region_size.x),
0,
0
),
size=design_region_size,
),
)
geometry = [
mp.Block(
material=glass,
size=mp.Vector3(pml_um + substrate_um, mp.inf, mp.inf),
center=mp.Vector3(
-0.5 * size_x_um + 0.5 * (pml_um + substrate_um), 0, 0
),
),
mp.Block(
material=matgrid,
size=matgrid_region.size,
center=matgrid_region.center,
),
]
sim = mp.Simulation(
resolution=RESOLUTION_UM,
cell_size=cell_size,
boundary_layers=pml_layers,
k_point=k_point,
sources=sources,
geometry=geometry,
)
tran_pt = mp.Vector3(0.5 * size_x_um - pml_um, 0, 0)
order_y = 1
kdiff = mp.Vector3(
(frequency**2 - (order_y / GRATING_PERIOD_UM)**2)**0.5,
order_y / GRATING_PERIOD_UM,
0
)
print(f"kdiff = ({kdiff.x:.5f}, {kdiff.y:.5f}, {kdiff.z:.5f})")
obj_list = [
mpa.EigenmodeCoefficient(
sim,
mp.Volume(
center=tran_pt,
size=mp.Vector3(0, size_y_um, 0),
),
mode=1,
kpoint_func=lambda *not_used: kdiff,
eig_parity=eig_parity,
eig_vol=mp.Volume(
center=tran_pt,
size=mp.Vector3(0, 1 / RESOLUTION_UM, 0)
),
),
]
def J(tran_mon):
return anp.abs(tran_mon)**2
opt = mpa.OptimizationProblem(
simulation=sim,
objective_functions=J,
objective_arguments=obj_list,
design_regions=[matgrid_region],
frequencies=[frequency],
)
return opt
if __name__ == "__main__":
grating_height_um = 0.5
grating_duty_cycle = 0.5
height_perturbation_um = 1 / RESOLUTION_UM
design_region_size = mp.Vector3(
grating_height_um + PADDING_UM,
GRATING_PERIOD_UM,
0
)
design_region_resolution_um = int(2 * RESOLUTION_UM)
nx = int(design_region_size.x * design_region_resolution_um) + 1
ny = int(design_region_size.y * design_region_resolution_um) + 1
# Properties of the convolution filter and projection operator.
beta = 0.2
eta_intrinsic = 0.5
eta_erosion = 0.75
min_feature_size_um = 0.5
filter_radius_um = mpa.get_conic_radius_from_eta_e(
min_feature_size_um,
eta_erosion
)
opt = grating_1d(
Polarization.P,
grating_height_um,
design_region_size,
nx,
ny,
)
unperturbed_design_weights = grating_levelset(
grating_height_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
mapped_design_weights = mapping(
unperturbed_design_weights,
eta_intrinsic,
beta,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_unperturbed, grad_unperturbed = opt(
[mapped_design_weights],
need_gradient=True,
)
# Backpropagate the adjoint gradient through two functions in the correct order.
grad_backprop = tensor_jacobian_product(mapping, 0)(
mapped_design_weights,
eta_intrinsic,
beta,
filter_radius_um,
design_region_size,
design_region_resolution_um,
grad_unperturbed,
)
defvjp(grating_levelset, grating_levelset_vjp)
grad_backprop = tensor_jacobian_product(grating_levelset, 0)(
grating_height_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
grad_backprop,
)
fig, ax = plt.subplots()
opt.plot2D(init_opt=False, ax=ax)
if mp.am_master():
fig.savefig(
'grating_1d_plot2D.png', dpi=150, bbox_inches='tight'
)
perturbed_design_weights = grating_levelset(
grating_height_um + height_perturbation_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
mapped_design_weights = mapping(
perturbed_design_weights,
eta_intrinsic,
beta,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_perturbed, _ = opt(
[mapped_design_weights],
need_gradient=False
)
adj_dirderiv = height_perturbation_um * grad_backprop
fnd_dirderiv = obj_value_perturbed[0] - obj_value_unperturbed[0]
rel_err = abs(adj_dirderiv - fnd_dirderiv) / fnd_dirderiv
print(
f"dirderiv:, {fnd_dirderiv} (finite difference), "
f"{adj_dirderiv} (adjoint), {rel_err:.6f} (error)"
)
|
Frankly, I would use the smoothed projection feature we recently implemented in python (rather than what we tried to implement in c++). We know there are bugs in 3D, I imagine there are bugs in 2D too.
Yes that should work fine. You may have to play with how big the step needs to be. |
Why is it necessary to apply a smoothing projection to the design weights when they are already smoothed using the signed-distance function? Separately, I have updated the script in my previous comment to compute the adjoint gradient of the diffraction efficiency with respect to the height of the grating. However, the adjoint gradient back propagated through the signed-distance function using Autograd's
I suspect this is because the signed-distance function implemented in the function |
So keep in mind that the underlying framework using the material grid assumes a density based approach to TO. In order to turn it into a level set, you need to do subpixel smoothing, and our modified projection function lets us do that easily without leaving the realm of density-based TO. If you already have a level-set function that does smoothing for you, and you can backprop through it, then you don't need any extra smoothing (and you should turn off both the smoothing and set beta=0 in the material grid). The challenge is that we don't always have level set functions available to us. For trivial geometries, like squares or circles it's easy. But if you are trying to evolve something more difficult, (like starting with a simple waveguide crossing and doing shape optimization) then using the smoothed projection function makes this easy.
I would just write a custom level set for this. Rectangles are easy and you can do it in native autograd. Lots of resources online. |
I have modified the script to disable the internal subpixel smoothing in 1. unperturbed grating # Obtain the grating as a level set.
unperturbed_design_weights = grating_weights(
grating_height_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
# Apply a filter and projection to the level set.
mapped_unperturbed_design_weights = mapping(
unperturbed_design_weights,
eta_intrinsic,
npa.inf,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_unperturbed, grad_unperturbed = opt(
[mapped_unperturbed_design_weights],
need_gradient=True,
) 2. grating with perturbed height # Obtain the grating as a level set.
height_perturbation_um = 1e-2
perturbed_design_weights = grating_weights(
grating_height_um + height_perturbation_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
# Apply a filter and projection to the level set.
mapped_perturbed_design_weights = mapping(
perturbed_design_weights,
eta_intrinsic,
npa.inf,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_perturbed, _ = opt(
[mapped_perturbed_design_weights],
need_gradient=False
) Now it seems we just need to backpropagate twice: the first time through the def mapping(
weights: npa.ndarray,
eta: float,
beta: float,
filter_radius_um: float,
design_region_size: mp.Vector3,
design_region_resolution: float
) -> np.ndarray:
"""A differentiable mapping function for the design weights."""
filtered_field = mpa.conic_filter(
weights,
filter_radius_um,
design_region_size.x,
design_region_size.y,
design_region_resolution,
)
if beta == 0:
return filtered_field.flatten()
else:
projected_field = mpa.smoothed_projection(
filtered_field,
beta,
eta,
design_region_resolution,
)
return projected_field.flatten() Is this the correct approach? |
You want to skip the conic filter, which moves your level set. |
There is a comment in the docstrings of the meep/python/adjoint/filters.py Lines 717 to 718 in 87bc763
@smartalecH: what will happen if we remove the |
@oskooi the only assumption for this new projection function is that you have a level set and that it's smoothly varying. You don't appear to have that. You just have binary shapes derived from fundamental parameters. So in your case, you do need to filter. But you would also need to map this ad hoc levelset to your new shape using eg a marching squares algorithm. It would be much easier to simply use a signed distance function of a rectangle. Lots of resources online like this one. Offset the zero levelset to be 0.5 (our standard eta value) and use the new smoothed projection function to do the proper smoothing. You should be able to code it up in autograd. |
(To be clear, the main application of this smoothed projection is for topology optimization, i.e. when the density/level-set function itself comprises the degrees of freedom.) |
As a test to verify that the gradient calculation has been set up correctly, I would like to demonstrate that we can compute a non-zero gradient of the diffraction efficiency with respect to the grating height. (The gradient is incorrect until we have implemented the signed-distance function using Autograd and used it to replace the For now, applying a
It is unclear why the gradient of the levelset function is always zero. @smartalecH, any thoughts? |
Looks like the problem is that Autograd does not support gradients of def grating_levelset(
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> npa.ndarray:
"""Returns the weights for the grating as a 1D array."""
xcoord = npa.linspace(
-0.5*design_region_size.x,
+0.5*design_region_size.x,
nx
)
ycoord = npa.linspace(
-0.5*design_region_size.y,
+0.5*design_region_size.y,
ny
)
xv, yv = npa.meshgrid(xcoord, ycoord, indexing='ij')
weights = npa.where(
npa.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
npa.where(
xv <= xcoord[0] + grating_height_um,
1,
0
),
0
)
return weights.flatten() Also, it does not seem to be an issue but Autograd does support gradients of The fix is that we therefore just need to rewrite |
Here is an easier approach:
This is feasible because your number of parameters p is very small, and F(f(p)) is very fast (no PDE solve), so finite differences are cheap. Just make sure that the image is high enough resolution that finite differences are sufficiently accurate. |
(Your f(p) function above, i.e. |
This seems to be working now after I added a custom VJP function for the levelset construction function (shown below). We can now demonstrate good agreement in the directional derivative computed using the adjoint gradient and finite differences:
These results were obtained using a resolution of 200 pixels/μm and a runtime of nearly 1.5 hours using a single Xeon 4.2 GHz. Unfortunately, using finite differences to approximate the Jacobian necessarily requires large resolutions to obtain accurate results. This feature has already been noted in the comments above:
As a check, I cranked up the resolution to 400 pixels/μm and the relative error decreased. However, this run took nearly 12 hours with 4 Xeon cores.
Details of the calculation involving composition of the objective function and computing the Jacobian matrix using finite differences are summarized in the slide below. The current calculation does not use subpixel smoothing (which will be added next by extending this approach to the signed-distance function in order to replace the conic filter in the def grating_levelset_vjp(
ans,
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> np.ndarray:
"""Returns the vector-Jacobian product."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
# pixel dimensions
delta_x = design_region_size.x / (nx - 1)
delta_y = design_region_size.y / (ny - 1)
# Gradient of the level set with respect to the grating height.
# The gradient is obtained using a finite difference. The gradient is
# therefore a zero matrix with non-zero entries only at the locations
# of the height boundary. Since the height boundary is normal to the
# x axis, these non-zero matrix elements have a value of 1 / Δx.
weights = np.where(
np.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
np.where(
xv <= xv[0][0] + grating_height_um + 0.5 * delta_x,
np.where(
xv >= xv[0][0] + grating_height_um - 0.5 * delta_x,
1 / delta_x,
0
),
0
),
0
)
jacobian = weights.flatten()
return lambda g: np.tensordot(g, jacobian, axes=1) |
Using the same approach involving a custom VJP described in my previous comment applied to the signed-distance function via the module Here are the results comparing the directional derivatives obtained using the two methods (finite difference vs. adjoint gradient) for the S polarization
P polarization
There is good agreement for the S polarization
P polarization
The script used to generate the results is available in this gist. It contains two key parts:
matgrid = mp.MaterialGrid(
mp.Vector3(nx, ny),
mp.air,
glass,
weights=anp.ones((nx, ny)),
do_averaging=True,
beta=anp.inf
) def signed_distance(weights: anp.ndarray) -> anp.ndarray:
"""Maps the 2D weights using a signed-distance function."""
# Create signed distance function.
sd = skfmm.distance(weights - 0.5)
# Linear interpolation of zero-levelset onto 0.5-levelset.
xp = [anp.min(sd.flatten()), 0, anp.max(sd.flatten())]
yp = [0, 0.5001, 1]
return anp.interp(sd.flatten(), xp, yp)
@primitive
def levelset_and_smoothing(
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Generates the grating as a levelset with signed-distance smoothing."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um,
1,
0
),
0
)
return signed_distance(weights)
def levelset_and_smoothing_vjp(
ans: anp.ndarray,
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Computes the vector-Jacobian product."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
# pixel dimensions
delta_x = design_region_size.x / (nx - 1)
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um + delta_x,
1,
0
),
0
)
jacobian = (signed_distance(weights) - ans) / delta_x
return lambda g: anp.tensordot(g, jacobian, axes=1) This figure shows the smoothed grating obtained by applying the signed-distance function to its level-set representation. |
Related to #2011.
This is an initial attempt to use the new subpixel smoothing feature of the adjoint solver to compute the gradient of a structure parameterized by its geometry (a level set) as an alternative to the conventional density-based (
MaterialGrid
) representation. The example involves computing the gradient of the diffraction efficiency of the m=+1 transmitted order of a 1D binary grating with respect to its height and fill factor. The results are to be validated by the usual method of the brute-force gradient computed using a finite difference. Once everything is working correctly, this demonstration will be converted into a tutorial for the user manual and a unit test.The main feature of this calculation is a differentiable wrapper function (
grating_matgrid
in the script below) that takes the grating height and fill factor as arguments and returns the weights of aMaterialGrid
for the binary grating. These weights are then passed to the adjoint solver via aDesignRegion
object in the usual way and the gradient which is computed is then backpropagated through the wrapper function.An image of the geometry created using this approach confirms that the structure is set up correctly. However, it seems there is a bug in the current setup because the final gradient is zero. Also, there is an important warning from autograd:
This message suggests there is a likely bug in the backpropagation step.
output of running the script below
The text was updated successfully, but these errors were encountered: