From 861be49c32a211b5877bd366c1b5a8296e67b0ee Mon Sep 17 00:00:00 2001
From: Josh Rosen <joshrosen@databricks.com>
Date: Fri, 10 Jan 2025 14:09:52 -0800
Subject: [PATCH] Several performance optimizations, primarily aimed at
 reducing garbage object creation in common cases (#258)

This PR bundles together several small optimizations, most aimed at
reducing garbage object creation. Collectively, these changes result in
a large performance improvement for some of our largest jsonnet inputs.

## Optimizations

These are somewhat coherently split into multiple smaller intermediate
commits, which can help for navigating this change. I'll describe each
optimization in more detail below.

### Optimizing `Obj` key lookup methods for objects without `super`

The `hasKeys`, `containsKey`, `containsVisibleKey`, `allKeyNames`, and
`visibleKeyNames` methods can be optimized in the common case of objects
that don't have `super` objects.

We already perform an optimization for `static` objects, pre-populating
`allKeys`, but for non-static objects we had to run `gatherAllKeys` to
populate a `LinkedHashMap` of keys and a boolean indicating visibility.

If a non-static object has no `super` then we can just use `value0` to
compute the keys: this avoids an additional `LinkedHashMap` allocation
and also lets us pre-size the resulting string arrays, avoiding wasteful
array copies from resizes or trims.

In `visibleKeyNames`, I chose to pre-size the output builder based on
the _total_ key count: this is based a common-case assumption that most
objects don't have hidden keys.

This optimization makes a huge difference in `std.foldl(std.megePatch,
listOfPatches, {})`, where the intermediate merge targets' visible are
repeatedly recomputed. In these cases, the intermediate objects contain
_only_ visible keys, allowing this optimization to maximally avoid
unnecessary array allocations.

### Pre-size various hash maps

This builds on an idea from
https://github.com/databricks/sjsonnet/pull/197 : there are multiple
places where we construct hashmaps that are either over- or under-sized:
an over-sized map wastes space (I saw >90% of backing array slots wasted
in some heap dumps) and an under-sized map wastes time and space in
re-sizing upwards during construction.

Here, I've generalized that PR's pre-sizing to apply in more contexts.

One notable special case is the `valueCache`: if an object inherits
fields then it's not free to determine the map size. As a result, I've
only special-sized this for `super`-free objects. This map is a little
bit different from `value0` or `allFields` because its final size is a
function of whether or not object field values are actually computed.
Given this, I've chosen to start pretty conservatively by avoiding
changing the size in cases where it's not an obvious win; I may revisit
this further in a future followup.

### Change `valueCache` from a Scala map to a Java map

This was originally necessary because the Scala 2.12 version of
`mutable.HashMap` does not support capacity / load factor configuration,
which got in the way with the pre-sizing work described above.

But a nice secondary benefit is that Java maps let me avoid closure /
anonfun allocation in `map.getOrElse(k, default)` calls: even if we
don't invoke `default`, we still end up doing some allocations for the
lambda / closure / thunk. I had noticed this overhead previously in
`Obj.value` and this optimization should fix it.

### Remove most Scala sugar in `std.mergePatch`, plus other
optimizations

The `recMerge` and `recSingle` methods used by `std.mergePatch`
contained big Scala `for` comprehensions and used `Option` for handling
nulls. This improves readability but comes at a surprising performance
cost. I would have naively assumed that most of those overheads would
have been optimized out but empirically this was not the case in my
benchmarks.

Here, I've rewritten this with Java-style imperative `while` loops and
explicit null checks.

### Optimize `std.mergePatch`'s distinct key computation

After fixing other bottlenecks, I noticed that the

```scala
val keys: Array[String] = (l.visibleKeyNames ++ r.visibleKeyNames).distinct
```

step in `std.mergePatch` was very expensive. Under the hood, this
constructs a combined array, allocates an ArrayBuilder, and uses an
intermediate HashSet for detecting already-seen keys.

Here, I've added an optimized fast implementation for the cases where
`r.visibleKeyNames.length < 8`. I think it's much more common for the
LHS of a merge to be large and the RHS to be small, in which case we're
conceptually better off by building a hash set on the RHS and removing
RHS elements as they're seen during the LHS traversal. But if the RHS is
small enough then the cost of hashing and probing will be higher than a
simple linear scan of a small RHS array.

Here, `8` is a somewhat arbitrarily chosen threshold based on some local
microbenchmarking.

### Special overload of `Val.Obj.mk` to skip an array copy

Pretty self-explanatory: we often have an `Array[(String, Val.Member)]`
and we can avoid a copy by defining a `Val.Obj.mk` overload which
accepts the array directly.

### Make `PrettyNamed` implicits into constants

This is pretty straightforward, just changing a `def` to a `val`, but it
makes a huge different in reducing ephemeral garabge in some parts of
the evaluator.

## Other changes

I also added `Error.fail` calls in a couple of `case _ =>` matches which
should never be hit. We weren't actually hitting these, but it felt like
a potentially dangerous pitfall to silently ignore those cases.
---
 sjsonnet/src/sjsonnet/Evaluator.scala  |  17 ++-
 sjsonnet/src/sjsonnet/Expr.scala       |   4 +-
 sjsonnet/src/sjsonnet/Std.scala        | 122 ++++++++++++++++----
 sjsonnet/src/sjsonnet/Util.scala       |  17 ++-
 sjsonnet/src/sjsonnet/Val.scala        | 152 +++++++++++++++++++------
 sjsonnet/src/sjsonnet/ValVisitor.scala |   2 +-
 6 files changed, 248 insertions(+), 66 deletions(-)

diff --git a/sjsonnet/src/sjsonnet/Evaluator.scala b/sjsonnet/src/sjsonnet/Evaluator.scala
index 48213291..765705e8 100644
--- a/sjsonnet/src/sjsonnet/Evaluator.scala
+++ b/sjsonnet/src/sjsonnet/Evaluator.scala
@@ -579,7 +579,7 @@ class Evaluator(resolver: CachedResolver,
       newScope
     }
 
-    val builder = new java.util.LinkedHashMap[String, Val.Obj.Member]
+    val builder = Util.preSizedJavaLinkedHashMap[String, Val.Obj.Member](fields.length)
     fields.foreach {
       case Member.Field(offset, fieldName, plus, null, sep, rhs) =>
         val k = visitFieldName(fieldName, offset)
@@ -604,8 +604,14 @@ class Evaluator(resolver: CachedResolver,
           builder.put(k, v)
         }
       case _ =>
+        Error.fail("This case should never be hit", objPos)
     }
-    cachedObj = new Val.Obj(objPos, builder, false, if(asserts != null) assertions else null, sup)
+    val valueCache = if (sup == null) {
+      Val.Obj.getEmptyValueCacheForObjWithoutSuper(fields.length)
+    } else {
+      new java.util.HashMap[Any, Val]()
+    }
+    cachedObj = new Val.Obj(objPos, builder, false, if(asserts != null) assertions else null, sup, valueCache)
     cachedObj
   }
 
@@ -636,7 +642,12 @@ class Evaluator(resolver: CachedResolver,
           case _ =>
         }
       }
-      new Val.Obj(e.pos, builder, false, null, sup)
+      val valueCache = if (sup == null) {
+        Val.Obj.getEmptyValueCacheForObjWithoutSuper(builder.size())
+      } else {
+        new java.util.HashMap[Any, Val]()
+      }
+      new Val.Obj(e.pos, builder, false, null, sup, valueCache)
     }
 
     newSelf
diff --git a/sjsonnet/src/sjsonnet/Expr.scala b/sjsonnet/src/sjsonnet/Expr.scala
index 8216142c..14643887 100644
--- a/sjsonnet/src/sjsonnet/Expr.scala
+++ b/sjsonnet/src/sjsonnet/Expr.scala
@@ -58,7 +58,7 @@ object Expr{
     }
     case class Field(pos: Position,
                      fieldName: FieldName,
-                     plus: Boolean,
+                     plus: Boolean, // see https://jsonnet.org/ref/language.html#nested-field-inheritance
                      args: Params,
                      sep: Visibility,
                      rhs: Expr) extends Member {
@@ -175,7 +175,7 @@ object Expr{
                        preLocals: Array[Bind],
                        key: Expr,
                        value: Expr,
-                       plus: Boolean,
+                       plus: Boolean, // see https://jsonnet.org/ref/language.html#nested-field-inheritance
                        postLocals: Array[Bind],
                        first: ForSpec,
                        rest: List[CompSpec]) extends ObjBody {
diff --git a/sjsonnet/src/sjsonnet/Std.scala b/sjsonnet/src/sjsonnet/Std.scala
index 9fd9c9eb..1bfe6846 100644
--- a/sjsonnet/src/sjsonnet/Std.scala
+++ b/sjsonnet/src/sjsonnet/Std.scala
@@ -381,7 +381,7 @@ class Std(private val additionalNativeFunctions: Map[String, Val.Builtin] = Map.
       val func = _func.asFunc
       val obj = _obj.asObj
       val allKeys = obj.allKeyNames
-      val m = new util.LinkedHashMap[String, Val.Obj.Member]()
+      val m = Util.preSizedJavaLinkedHashMap[String, Val.Obj.Member](allKeys.length)
       var i = 0
       while(i < allKeys.length) {
         val k = allKeys(i)
@@ -392,7 +392,8 @@ class Std(private val additionalNativeFunctions: Map[String, Val.Builtin] = Map.
         m.put(k, v)
         i += 1
       }
-      new Val.Obj(pos, m, false, null, null)
+      val valueCache = Val.Obj.getEmptyValueCacheForObjWithoutSuper(allKeys.length)
+      new Val.Obj(pos, m, false, null, null, valueCache)
     }
   }
 
@@ -909,39 +910,110 @@ class Std(private val additionalNativeFunctions: Map[String, Val.Builtin] = Map.
     builtin(Range),
     builtin("mergePatch", "target", "patch"){ (pos, ev, target: Val, patch: Val) =>
       val mergePosition = pos
-      def createMember(v: => Val) = new Val.Obj.Member(false, Visibility.Normal) {
+      def createLazyMember(v: => Val) = new Val.Obj.Member(false, Visibility.Normal) {
         def invoke(self: Val.Obj, sup: Val.Obj, fs: FileScope, ev: EvalScope): Val = v
       }
       def recPair(l: Val, r: Val): Val = (l, r) match{
         case (l: Val.Obj, r: Val.Obj) =>
-          val kvs = for {
-            k <- (l.visibleKeyNames ++ r.visibleKeyNames).distinct
-            lValue = if (l.containsVisibleKey(k)) Option(l.valueRaw(k, l, pos)(ev)) else None
-            rValue = if (r.containsVisibleKey(k)) Option(r.valueRaw(k, r, pos)(ev)) else None
-            if !rValue.exists(_.isInstanceOf[Val.Null])
-          } yield (lValue, rValue) match{
-            case (Some(lChild), None) => k -> createMember{lChild}
-            case (Some(lChild: Val.Obj), Some(rChild: Val.Obj)) => k -> createMember{recPair(lChild, rChild)}
-            case (_, Some(rChild)) => k -> createMember{recSingle(rChild)}
-            case (None, None) => Error.fail("std.mergePatch: This should never happen")
+          val keys: Array[String] = distinctKeys(l.visibleKeyNames, r.visibleKeyNames)
+          val kvs: Array[(String, Val.Obj.Member)] = new Array[(String, Val.Obj.Member)](keys.length)
+          var kvsIdx = 0
+          var i = 0
+          while (i < keys.length) {
+            val key = keys(i)
+            val lValue = if (l.containsVisibleKey(key)) l.valueRaw(key, l, pos)(ev) else null
+            val rValue = if (r.containsVisibleKey(key)) r.valueRaw(key, r, pos)(ev) else null
+            if (!rValue.isInstanceOf[Val.Null]) { // if we are not removing the key
+              if (lValue != null && rValue == null) {
+                // Preserve the LHS/target value:
+                kvs(kvsIdx) = (key, new Val.Obj.ConstMember(false, Visibility.Normal, lValue))
+              } else if (lValue.isInstanceOf[Val.Obj] && rValue.isInstanceOf[Val.Obj]) {
+                // Recursively merge objects:
+                kvs(kvsIdx) = (key, createLazyMember(recPair(lValue, rValue)))
+              } else if (rValue != null) {
+                // Use the RHS/patch value and recursively remove Null or hidden fields:
+                kvs(kvsIdx) = (key, createLazyMember(recSingle(rValue)))
+              } else {
+                Error.fail("std.mergePatch: This should never happen")
+              }
+              kvsIdx += 1
+            }
+            i += 1
           }
 
-          Val.Obj.mk(mergePosition, kvs:_*)
+          val trimmedKvs = if (kvsIdx == i) kvs else kvs.slice(0, kvsIdx)
+          Val.Obj.mk(mergePosition, trimmedKvs)
 
         case (_, _) => recSingle(r)
       }
       def recSingle(v: Val): Val  = v match{
         case obj: Val.Obj =>
-          val kvs = for{
-            k <- obj.visibleKeyNames
-            value = obj.value(k, pos, obj)(ev)
-            if !value.isInstanceOf[Val.Null]
-          } yield (k, createMember{recSingle(value)})
-
-          Val.Obj.mk(obj.pos, kvs:_*)
+          val keys: Array[String] = obj.visibleKeyNames
+          val kvs: Array[(String, Val.Obj.Member)] = new Array[(String, Val.Obj.Member)](keys.length)
+          var kvsIdx = 0
+          var i = 0
+          while (i < keys.length) {
+            val key = keys(i)
+            val value = obj.value(key, pos, obj)(ev)
+            if (!value.isInstanceOf[Val.Null]) {
+              kvs(kvsIdx) = (key, createLazyMember(recSingle(value)))
+              kvsIdx += 1
+            }
+            i += 1
+          }
+          val trimmedKvs = if (kvsIdx == i) kvs else kvs.slice(0, kvsIdx)
+          Val.Obj.mk(obj.pos, trimmedKvs)
 
         case _ => v
       }
+      def distinctKeys(lKeys: Array[String], rKeys: Array[String]): Array[String] = {
+        // Fast path for small RHS size (the common case when merging a small
+        // patch into a large target object), avoiding the cost of constructing
+        // and probing a hash set: instead, perform a nested loop where the LHS
+        // is scanned and matching RHS entries are marked as null to be skipped.
+        // Via local microbenchmarks simulating a "worst-case" (RHS keys all new),
+        // the threshold of `8` was empirically determined to be a good tradeoff
+        // between allocation + hashing costs vs. nested loop array scans.
+        if (rKeys.length <= 8) {
+          val rKeysCopy = new Array[String](rKeys.length)
+          rKeys.copyToArray(rKeysCopy)
+          var i = 0
+          var numNewRKeys = rKeysCopy.length
+          while (i < lKeys.length) {
+            val lKey = lKeys(i)
+            var j = 0
+            while (j < rKeysCopy.length) {
+              // This LHS key is in the RHS, so mark it to be skipped in output:
+              if (lKey == rKeysCopy(j)) {
+                rKeysCopy(j) = null
+                numNewRKeys -= 1
+              }
+              j += 1
+            }
+            i += 1
+          }
+          // Combine lKeys with non-null elements of rKeysCopy:
+          if (numNewRKeys == 0) {
+            lKeys
+          } else {
+            val outArray = new Array[String](lKeys.length + numNewRKeys)
+            System.arraycopy(lKeys, 0, outArray, 0, lKeys.length)
+            var outIdx = lKeys.length
+            var j = 0
+            while (j < rKeysCopy.length) {
+              if (rKeysCopy(j) != null) {
+                outArray(outIdx) = rKeysCopy(j)
+                outIdx += 1
+              }
+              j += 1
+            }
+            outArray
+          }
+        } else {
+          // Fallback: Use hash-based deduplication for large RHS arrays:
+          (lKeys ++ rKeys).distinct
+        }
+      }
       recPair(target, patch)
     },
     builtin("sqrt", "x"){ (pos, ev, x: Double) =>
@@ -1404,12 +1476,12 @@ class Std(private val additionalNativeFunctions: Map[String, Val.Builtin] = Map.
       }
       def rec(x: Val): Val = x match{
         case o: Val.Obj =>
-          val bindings = for{
+          val bindings: Array[(String, Val.Obj.Member)] = for{
             k <- o.visibleKeyNames
             v = rec(o.value(k, pos.fileScope.noOffsetPos)(ev))
             if filter(v)
           }yield (k, new Val.Obj.ConstMember(false, Visibility.Normal, v))
-          Val.Obj.mk(pos, bindings: _*)
+          Val.Obj.mk(pos, bindings)
         case a: Val.Arr =>
           new Val.Arr(pos, a.asStrictArray.map(rec).filter(filter).map(identity))
         case _ => x
@@ -1500,12 +1572,12 @@ class Std(private val additionalNativeFunctions: Map[String, Val.Builtin] = Map.
       )))
     },
     builtin("objectRemoveKey", "obj", "key") { (pos, ev, o: Val.Obj, key: String) =>
-      val bindings = for{
+      val bindings: Array[(String, Val.Obj.Member)] = for{
         k <- o.visibleKeyNames
         v = o.value(k, pos.fileScope.noOffsetPos)(ev)
         if k != key
       }yield (k, new Val.Obj.ConstMember(false, Visibility.Normal, v))
-      Val.Obj.mk(pos, bindings: _*)
+      Val.Obj.mk(pos, bindings)
     },
     builtin(MinArray),
     builtin(MaxArray),
diff --git a/sjsonnet/src/sjsonnet/Util.scala b/sjsonnet/src/sjsonnet/Util.scala
index c6e87885..7fe73827 100644
--- a/sjsonnet/src/sjsonnet/Util.scala
+++ b/sjsonnet/src/sjsonnet/Util.scala
@@ -52,4 +52,19 @@ object Util{
       s"<$s>"
     }
   }
-}
\ No newline at end of file
+
+  def preSizedJavaLinkedHashMap[K, V](expectedElems: Int): java.util.LinkedHashMap[K, V] = {
+    // Set the initial capacity to the number of elems divided by the default load factor + 1
+    // this ensures that we can fill up the map to the total number of fields without resizing.
+    // From JavaDoc - true for both Scala & Java HashMaps
+    val hashMapDefaultLoadFactor = 0.75f
+    val capacity = (expectedElems / hashMapDefaultLoadFactor).toInt + 1
+    new java.util.LinkedHashMap[K, V](capacity, hashMapDefaultLoadFactor)
+  }
+
+  def preSizedJavaHashMap[K, V](expectedElems: Int): java.util.HashMap[K, V] = {
+    val hashMapDefaultLoadFactor = 0.75f
+    val capacity = (expectedElems / hashMapDefaultLoadFactor).toInt + 1
+    new java.util.HashMap[K, V](capacity, hashMapDefaultLoadFactor)
+  }
+}
diff --git a/sjsonnet/src/sjsonnet/Val.scala b/sjsonnet/src/sjsonnet/Val.scala
index 9e02311d..33400201 100644
--- a/sjsonnet/src/sjsonnet/Val.scala
+++ b/sjsonnet/src/sjsonnet/Val.scala
@@ -74,13 +74,13 @@ sealed abstract class Val extends Lazy {
 
 class PrettyNamed[T](val s: String)
 object PrettyNamed{
-  implicit def strName: PrettyNamed[Val.Str] = new PrettyNamed("string")
-  implicit def numName: PrettyNamed[Val.Num] = new PrettyNamed("number")
-  implicit def arrName: PrettyNamed[Val.Arr] = new PrettyNamed("array")
-  implicit def boolName: PrettyNamed[Val.Bool] = new PrettyNamed("boolean")
-  implicit def objName: PrettyNamed[Val.Obj] = new PrettyNamed("object")
-  implicit def funName: PrettyNamed[Val.Func] = new PrettyNamed("function")
-  implicit def nullName: PrettyNamed[Val.Null] = new PrettyNamed("null")
+  implicit val strName: PrettyNamed[Val.Str] = new PrettyNamed("string")
+  implicit val numName: PrettyNamed[Val.Num] = new PrettyNamed("number")
+  implicit val arrName: PrettyNamed[Val.Arr] = new PrettyNamed("array")
+  implicit val boolName: PrettyNamed[Val.Bool] = new PrettyNamed("boolean")
+  implicit val objName: PrettyNamed[Val.Obj] = new PrettyNamed("object")
+  implicit val funName: PrettyNamed[Val.Func] = new PrettyNamed("function")
+  implicit val nullName: PrettyNamed[Val.Null] = new PrettyNamed("null")
 }
 object Val{
 
@@ -144,6 +144,25 @@ object Val{
 
   object Obj{
 
+    /**
+     * Helper for saving space in valueCache for objects without a super object.
+     * For objects with no super, we (cheaply) know the exact number of fields and
+     * therefore can upper bound the number of fields that _might_ be computed.
+     */
+    def getEmptyValueCacheForObjWithoutSuper(numFields: Int): util.HashMap[Any, Val] = {
+      // We only want to pre-size if it yields a smaller initial map size than the default.
+      if (numFields >= 12) {
+        new util.HashMap[Any, Val]()
+      } else {
+        Util.preSizedJavaHashMap[Any, Val](numFields)
+      }
+    }
+
+    /**
+     * @param add whether this field was defined the "+:", "+::" or "+:::" separators, corresponding
+     *            to the "nested field inheritance" language feature; see
+     *            https://jsonnet.org/ref/language.html#nested-field-inheritance
+     */
     abstract class Member(val add: Boolean, val visibility: Visibility, val cached: Boolean = true) {
       def invoke(self: Obj, sup: Obj, fs: FileScope, ev: EvalScope): Val
     }
@@ -154,28 +173,63 @@ object Val{
     }
 
     def mk(pos: Position, members: (String, Obj.Member)*): Obj = {
-      val m = new util.LinkedHashMap[String, Obj.Member]()
+      val m = Util.preSizedJavaLinkedHashMap[String, Obj.Member](members.length)
       for((k, v) <- members) m.put(k, v)
       new Obj(pos, m, false, null, null)
     }
+
+    def mk(pos: Position, members: Array[(String, Obj.Member)]): Obj = {
+      val m = Util.preSizedJavaLinkedHashMap[String, Obj.Member](members.length)
+      var i = 0
+      while (i < members.length) {
+        val e = members(i)
+        m.put(e._1, e._2)
+        i += 1
+      }
+      new Obj(pos, m, false, null, null)
+    }
   }
 
+  /**
+   * Represents json/jsonnet objects.
+   *
+   * Obj implements special optimizations for "static objects", which are objects without
+   * `super` where all fields are constant and have default visibility. Static objects can
+   * be created during parsing or in [[StaticOptimizer]].
+   *
+   * @param value0 maps fields to their Member definitions. This is initially null for
+   *               static objects and is non-null for non-static objects.
+   * @param static true if this object is static, false otherwise.
+   * @param triggerAsserts callback to evaluate assertions defined in the object.
+   * @param `super` the super object, or null if there is no super object.
+   * @param valueCache a cache for computed values. For static objects, this is pre-populated
+   *                   with all fields. For non-static objects, this is lazily populated as
+   *                   fields are accessed.
+   * @param allKeys a map of all keys in the object (including keys inherited from `super`),
+   *                where the boolean value is true if the key is hidden and false otherwise.
+   *                For static objects, this is pre-populated and the mapping may be interned
+   *                and shared across instances. For non-static objects, it is dynamically
+   *                computed only if the object has a `super`
+   */
   final class Obj(val pos: Position,
                   private[this] var value0: util.LinkedHashMap[String, Obj.Member],
                   static: Boolean,
                   triggerAsserts: Val.Obj => Unit,
                   `super`: Obj,
-                  valueCache: mutable.HashMap[Any, Val] = mutable.HashMap.empty[Any, Val],
+                  valueCache: util.HashMap[Any, Val] = new util.HashMap[Any, Val](),
                   private[this] var allKeys: util.LinkedHashMap[String, java.lang.Boolean] = null) extends Literal with Expr.ObjBody {
     var asserting: Boolean = false
 
     def getSuper = `super`
 
     private[this] def getValue0: util.LinkedHashMap[String, Obj.Member] = {
+      // value0 is always defined for non-static objects, so if we're computing it here
+      // then that implies that the object is static and therefore valueCache should be
+      // pre-populated and all members should be visible and constant.
       if(value0 == null) {
-        val value0 = new java.util.LinkedHashMap[String, Val.Obj.Member]
+        val value0 = Util.preSizedJavaLinkedHashMap[String, Val.Obj.Member](allKeys.size())
         allKeys.forEach { (k, _) =>
-          value0.put(k, new Val.Obj.ConstMember(false, Visibility.Normal, valueCache(k)))
+          value0.put(k, new Val.Obj.ConstMember(false, Visibility.Normal, valueCache.get(k)))
         }
         // Only assign to field after initialization is complete to allow unsynchronized multi-threaded use:
         this.value0 = value0
@@ -221,18 +275,45 @@ object Val{
       allKeys
     }
 
-    @inline def hasKeys = !getAllKeys.isEmpty
+    @inline def hasKeys: Boolean = {
+      val m = if (static || `super` != null) getAllKeys else value0
+      !m.isEmpty
+    }
 
-    @inline def containsKey(k: String): Boolean = getAllKeys.containsKey(k)
+    @inline def containsKey(k: String): Boolean = {
+      val m = if (static || `super` != null) getAllKeys else value0
+      m.containsKey(k)
+    }
 
-    @inline def containsVisibleKey(k: String): Boolean = getAllKeys.get(k) == java.lang.Boolean.FALSE
+    @inline def containsVisibleKey(k: String): Boolean = {
+      if (static || `super` != null) {
+        getAllKeys.get(k) == java.lang.Boolean.FALSE
+      } else {
+        val m = value0.get(k)
+        m != null && (m.visibility != Visibility.Hidden)
+      }
+    }
 
-    lazy val allKeyNames: Array[String] = getAllKeys.keySet().toArray(new Array[String](getAllKeys.size()))
+    lazy val allKeyNames: Array[String] = {
+      val m = if (static || `super` != null) getAllKeys else value0
+      m.keySet().toArray(new Array[String](m.size()))
+    }
 
-    lazy val visibleKeyNames: Array[String] = if(static) allKeyNames else {
-      val buf = mutable.ArrayBuilder.make[String]
-      getAllKeys.forEach((k, b) => if(b == java.lang.Boolean.FALSE) buf += k)
-      buf.result()
+    lazy val visibleKeyNames: Array[String] = {
+      if (static) {
+        allKeyNames
+      } else {
+        val buf = new mutable.ArrayBuilder.ofRef[String]
+        if (`super` == null) {
+          // This size hint is based on an optimistic assumption that most fields are visible,
+          // avoiding re-sizing or trimming the buffer in the common case:
+          buf.sizeHint(value0.size())
+          value0.forEach((k, m) => if (m.visibility != Visibility.Hidden) buf += k)
+        } else {
+          getAllKeys.forEach((k, b) => if (b == java.lang.Boolean.FALSE) buf += k)
+        }
+        buf.result()
+      }
     }
 
     def value(k: String,
@@ -240,24 +321,31 @@ object Val{
               self: Obj = this)
              (implicit evaluator: EvalScope): Val = {
       if(static) {
-        valueCache.getOrElse(k, null) match {
+        valueCache.get(k) match {
           case null => Error.fail("Field does not exist: " + k, pos)
           case x => x
         }
       } else {
         val cacheKey = if(self eq this) k else (k, self)
-        valueCache.getOrElse(cacheKey, {
+        val cachedValue = valueCache.get(cacheKey)
+        if (cachedValue != null) {
+          cachedValue
+        } else {
           valueRaw(k, self, pos, valueCache, cacheKey) match {
             case null => Error.fail("Field does not exist: " + k, pos)
             case x => x
           }
-        })
+        }
       }
     }
 
     private def renderString(v: Val)(implicit evaluator: EvalScope): String =
       evaluator.materialize(v).transform(new Renderer()).toString
 
+    /**
+     * Merge two values for "nested field inheritance"; see
+     * https://jsonnet.org/ref/language.html#nested-field-inheritance for background.
+     */
     def mergeMember(l: Val,
                     r: Val,
                     pos: Position)
@@ -286,12 +374,12 @@ object Val{
     def valueRaw(k: String,
                  self: Obj,
                  pos: Position,
-                 addTo: mutable.HashMap[Any, Val] = null,
+                 addTo: util.HashMap[Any, Val] = null,
                  addKey: Any = null)
                 (implicit evaluator: EvalScope): Val = {
       if(static) {
-        val v = valueCache.getOrElse(k, null)
-        if(addTo != null && v != null) addTo(addKey) = v
+        val v = valueCache.get(k)
+        if(addTo != null && v != null) addTo.put(addKey, v)
         v
       } else {
         val s = this.`super`
@@ -306,7 +394,7 @@ object Val{
                 case supValue => mergeMember(supValue, vv, pos)
               }
             } else vv
-            if(addTo != null && m.cached) addTo(addKey) = v
+            if(addTo != null && m.cached) addTo.put(addKey, v)
             v
         }
       }
@@ -341,13 +429,8 @@ object Val{
       fields: Array[Expr.Member.Field],
       internedKeyMaps: mutable.HashMap[StaticObjectFieldSet, java.util.LinkedHashMap[String, java.lang.Boolean]],
       internedStrings: mutable.HashMap[String, String]): Obj = {
-    // Set the initial capacity to the number of fields divided by the default load factor + 1 -
-    // this ensures that we can fill up the map to the total number of fields without resizing.
-    // From JavaDoc - true for both Scala & Java HashMaps
-    val hashMapDefaultLoadFactor = 0.75f
-    val capacity = (fields.length / hashMapDefaultLoadFactor).toInt + 1
-    val cache = mutable.HashMap.empty[Any, Val]
-    val allKeys = new util.LinkedHashMap[String, java.lang.Boolean](capacity, hashMapDefaultLoadFactor)
+    val cache = Util.preSizedJavaHashMap[Any, Val](fields.length)
+    val allKeys = Util.preSizedJavaLinkedHashMap[String, java.lang.Boolean](fields.length)
     val keys = new Array[String](fields.length)
     var idx = 0
     fields.foreach {
@@ -357,7 +440,8 @@ object Val{
         allKeys.put(uniqueKey, false)
         keys(idx) = uniqueKey
         idx += 1
-      case _ =>
+      case other =>
+        throw new Error(s"Unexpected non-literal field in static object: ${other} of class ${other.getClass}")
     }
     val fieldSet = new StaticObjectFieldSet(keys)
     new Val.Obj(pos, null, true, null, null, cache, internedKeyMaps.getOrElseUpdate(fieldSet, allKeys))
diff --git a/sjsonnet/src/sjsonnet/ValVisitor.scala b/sjsonnet/src/sjsonnet/ValVisitor.scala
index 441acabe..85901c00 100644
--- a/sjsonnet/src/sjsonnet/ValVisitor.scala
+++ b/sjsonnet/src/sjsonnet/ValVisitor.scala
@@ -17,7 +17,7 @@ class ValVisitor(pos: Position) extends JsVisitor[Val, Val] { self =>
   }
 
   def visitObject(length: Int, index: Int): ObjVisitor[Val, Val] = new ObjVisitor[Val, Val] {
-    val cache = mutable.HashMap.empty[Any, Val]
+    val cache = new java.util.HashMap[Any, Val]()
     val allKeys = new util.LinkedHashMap[String, java.lang.Boolean]
     var key: String = null
     def subVisitor: Visitor[_, _] = self