Add option to insert list in ets:insert, ets:lookup refactor #1405

TheSobkiewicz · 2024-12-17T16:01:18Z

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

Changes:

Enabled ets:insert/2 to accept lists for bulk insertion.
Extracted helper functions for ets:lookup/2 and ets:insert/2 that do not apply table locks.

Use Cases for the Helper Functions:

The new helper functions can be utilized in the following ETS operations to reduce code duplication:

ets:update_element/3
ets:insert_new/2
ets:update_counter/3
ets:update_counter/4
ets:take/2
ets:delete_object/2

Each of the functions listed above will be implemented after this PR is merged.

bettio · 2024-12-22T21:16:11Z

src/libAtomVM/ets.c

+        }
+        EtsErrorCode result = ets_table_insert(ets_table, tuple, ctx);
+        if (result != EtsOk) {
+            AVM_ABORT(); // Abort because operation might not be atomic.


We usually don't abort the VM: calling AVM_ABORT() means that something unrecoverable happened, such as memory corruption, a serious internal bug, or any other situation that requires crashing and rebooting the entire VM.
Are we in this specific situation?

Right now we don't have any other tool to ensure atomicity here. If the insert fails at the Nth element, elements 0 through N-1 will already have been inserted, which could result in hard-to-debug behavior. It is unlikely to happen.

To expand: without this abort, if we're short on memory, we'd leave the list partially inserted. If someone tries to persist inserts someday, we'd leave the system in an inconsistent state.

To avoid that, we need to either abort or allocate a list of the previous values and roll back in case of error (ensuring that nothing allocates in the rollback path, since we're most likely dealing with OOM). Abort is easier to do here.

This check needs to have UNLIKELY.

Ok, fair point. There is another feasible approach:
table nodes can be pre-allocated before making any change to the list, so if an allocation fails, the nodes allocated so far can easily be freed before any actual change is made.

ets_hashtable_insert will need an additional node parameter, and a dedicated allocation function could be created (e.g. ets_hashtable_new_node). Furthermore, the key and entry parameters can be moved to the ets_hashtable_new_node function if that helps.
This change will have a very small impact, since ets_hashtable_insert is used in just one or two places.
I suggest doing this as an additional commit inside this PR, so that the review is easier and the work is split into 2 tasks.

This change will remove any implicit allocation and make the abort unnecessary.
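
To make the pre-allocation idea concrete, here is a minimal sketch of a two-phase bulk insert. It is not the AtomVM code: `PreallocNode`, `PreallocTable`, `sketch_new_node`, `sketch_link_node` and `sketch_insert_many` are hypothetical stand-ins, keys and values are plain ints, and replacing an existing key is left out. The `fail_at` parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical, simplified stand-ins for the real table structures.
struct PreallocNode
{
    struct PreallocNode *next;
    int key;
    int value;
};

struct PreallocTable
{
    struct PreallocNode *head;
    size_t count;
};

// The only place that allocates. simulate_oom forces a failure for demonstration.
static struct PreallocNode *sketch_new_node(int key, int value, bool simulate_oom)
{
    if (simulate_oom) {
        return NULL;
    }
    struct PreallocNode *node = malloc(sizeof(struct PreallocNode));
    if (node == NULL) {
        return NULL;
    }
    node->next = NULL;
    node->key = key;
    node->value = value;
    return node;
}

// Links an already allocated node. It cannot fail because it never allocates.
static void sketch_link_node(struct PreallocTable *table, struct PreallocNode *node)
{
    assert(table != NULL && node != NULL);
    node->next = table->head;
    table->head = node;
    table->count++;
}

// Inserts all entries or none. Pass fail_at >= size to disable the simulated failure.
static bool sketch_insert_many(struct PreallocTable *table, const int *keys,
    const int *values, size_t size, size_t fail_at)
{
    if (size == 0) {
        return true;
    }
    struct PreallocNode **nodes = calloc(size, sizeof(struct PreallocNode *));
    if (nodes == NULL) {
        return false;
    }

    // Phase 1: allocate every node. The table is not touched yet.
    for (size_t i = 0; i < size; i++) {
        nodes[i] = sketch_new_node(keys[i], values[i], i == fail_at);
        if (nodes[i] == NULL) {
            // Free what was allocated so far and leave the table unchanged.
            for (size_t j = 0; j < i; j++) {
                free(nodes[j]);
            }
            free(nodes);
            return false;
        }
    }

    // Phase 2: commit. Nothing in this loop can fail.
    for (size_t i = 0; i < size; i++) {
        sketch_link_node(table, nodes[i]);
    }
    free(nodes);
    return true;
}
```

With this shape, a failure at any index leaves `table` exactly as it was before the call, so neither an abort nor a rollback list is required.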

jakub-gonet · 2025-01-12T01:44:52Z

src/libAtomVM/ets.c

+        return EtsTableNotFound;
+    }
+
+    EtsErrorCode result = ets_table_lookup(ets_table, key, ret, ctx);


While working on this code recently, I noticed that the hashtable lookup takes a keypos argument that isn't needed (we have node->key, and keypos can't change after table creation). It may be worth removing it in this PR or in a follow-up.
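
A rough sketch of what a lookup without keypos could look like, assuming (as the comment says) that each node already stores the key extracted at insert time. `LookupNode`, `LookupTable` and `sketch_lookup` are hypothetical names, and keys are plain ints rather than terms.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-ins: the key is stored in the node when it is inserted.
struct LookupNode
{
    struct LookupNode *next;
    int key;
    int value;
};

struct LookupTable
{
    struct LookupNode *head;
};

// No keypos parameter: the comparison uses the key cached in each node,
// so the stored entry never has to be inspected again to find its key.
static const struct LookupNode *sketch_lookup(const struct LookupTable *table, int key)
{
    assert(table != NULL);
    for (const struct LookupNode *node = table->head; node != NULL; node = node->next) {
        if (node->key == key) {
            return node;
        }
    }
    return NULL;
}
```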

jakub-gonet · 2025-01-15T16:24:09Z

src/libAtomVM/ets.c

    }
    if (!term_is_nil(iter)) {
        return EtsBadEntry;
    }

+    struct HNode **hnode_list = malloc(size * sizeof(struct HNode *));


Is this the correct HNode? I don't see it exposed via a header file, and we have a few structures named HNode in the project. Small nit: maybe hnodes or nodes? This is an array, not a list.

jakub-gonet · 2025-01-15T16:26:05Z

src/libAtomVM/ets.c

+        return EtsAllocationFailure;
+    }
+
+    int cur = 0;


Why not i?

jakub-gonet · 2025-01-15T16:27:19Z

src/libAtomVM/ets.c

        list = term_get_list_tail(list);
    }

+    for (size_t i = 0; i < size; i++) {


jakub-gonet · 2025-01-15T16:28:01Z

src/libAtomVM/ets.c

+        return EtsAllocationFailure;
+    }
+
+    int cur = 0;
    while (term_is_nonempty_list(list)) {


Can't we do it in the previous loop instead of iterating twice?

jakub-gonet · 2025-01-15T16:29:33Z

src/libAtomVM/ets_hashtable.c

+        return NULL;
+    }
+    size_t size = (size_t) memory_estimate_usage(entry);
+    if (memory_init_heap(heap, size) != MEMORY_GC_OK) {


I was wondering: why do we create a new heap instead of piggybacking on the owner process's heap?

jakub-gonet · 2025-01-15T16:30:10Z

src/libAtomVM/ets_hashtable.c

+
+void free_hashtable_node_array(struct HNode **allocated, size_t size, GlobalContext *global)
+{
+    for (size_t j = 0; j < size; j++) {


Why j instead of i? Also change post-increment to pre-increment please.

jakub-gonet · 2025-01-15T16:31:07Z

src/libAtomVM/ets_hashtable.c

+void free_hashtable_node_array(struct HNode **allocated, size_t size, GlobalContext *global)
+{
+    for (size_t j = 0; j < size; j++) {
+        if (allocated[j]) {


Do we need to check it?
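
Whether the check is needed depends on how the array is filled. As a sketch under one assumption, namely that the array starts zeroed and may be only partially populated when cleanup runs, the check is what keeps the cleanup from dereferencing an empty slot. `OwnedNode`, `sketch_owned_node_new` and `sketch_free_node_array` are hypothetical; the `storage` buffer stands in for a node's heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical stand-in: a node that owns a separately allocated buffer.
struct OwnedNode
{
    int *storage;
};

static struct OwnedNode *sketch_owned_node_new(int value)
{
    struct OwnedNode *node = malloc(sizeof(struct OwnedNode));
    if (node == NULL) {
        return NULL;
    }
    node->storage = malloc(sizeof(int));
    if (node->storage == NULL) {
        free(node);
        return NULL;
    }
    *node->storage = value;
    return node;
}

// Frees every populated slot and returns how many nodes were released.
// free(NULL) alone would be harmless, but reading nodes[i]->storage through
// a NULL slot would not be, so empty slots are skipped.
static size_t sketch_free_node_array(struct OwnedNode **nodes, size_t size)
{
    assert(nodes != NULL);
    size_t released = 0;
    for (size_t i = 0; i < size; i++) {
        if (nodes[i] != NULL) {
            free(nodes[i]->storage);
            free(nodes[i]);
            nodes[i] = NULL;
            released++;
        }
    }
    return released;
}
```

If the caller instead passes only the number of slots that were actually filled, every slot in range is populated and the check can be dropped.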

jakub-gonet · 2025-01-15T16:31:26Z

src/libAtomVM/ets_hashtable.c

+    return new_node;
+}
+
+void free_hashtable_node_array(struct HNode **allocated, size_t size, GlobalContext *global)


I'm not sure if this should live here or in ets.c instead.

jakub-gonet · 2025-01-15T16:32:38Z

src/libAtomVM/ets_hashtable.c

+                    memory_destroy_heap(new_node->heap, global);
+                    free(new_node);


I don't like this; we shouldn't take ownership of the node here.

jakub-gonet · 2025-01-15T16:33:58Z

src/libAtomVM/ets_hashtable.c

                    memory_destroy_heap(node->heap, global);
-                    node->heap = heap;
+                    node->heap = new_node->heap;
+                    free(new_node);


I think we should either swap the node entirely or pass its contents instead of using it as an impromptu container, especially since we do swap it on hash collision.
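
For illustration, the "swap the node entirely" option might look roughly like this. It is a sketch over a plain singly linked chain with int keys, not the actual bucket code; `ChainNode`, `sketch_make_node` and `sketch_insert_or_swap` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical stand-in for a node in one hash bucket chain.
struct ChainNode
{
    struct ChainNode *next;
    int key;
    int value;
};

static struct ChainNode *sketch_make_node(int key, int value)
{
    struct ChainNode *node = malloc(sizeof(struct ChainNode));
    if (node == NULL) {
        return NULL;
    }
    node->next = NULL;
    node->key = key;
    node->value = value;
    return node;
}

// Puts new_node into the chain. If a node with the same key exists, new_node
// takes its place and the old node is freed as a whole; otherwise new_node is
// prepended. In both cases the chain ends up owning new_node itself, so the
// caller's node is never emptied out and thrown away.
static void sketch_insert_or_swap(struct ChainNode **head, struct ChainNode *new_node)
{
    assert(head != NULL && new_node != NULL);
    for (struct ChainNode **link = head; *link != NULL; link = &(*link)->next) {
        struct ChainNode *old = *link;
        if (old->key == new_node->key) {
            new_node->next = old->next;
            *link = new_node;
            free(old);
            return;
        }
    }
    new_node->next = *head;
    *head = new_node;
}
```

The pointer-to-pointer walk means the replace path and the prepend path both hand over ownership the same way, so there is one rule for who frees what.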

This was referenced Dec 17, 2024

Add ets:update_counter TheSobkiewicz/AtomVM#1

Closed

Add ets:update_counter #1406

Open

bettio requested changes Dec 22, 2024

TheSobkiewicz mentioned this pull request Jan 7, 2025

Add ets:delete/1, ets:delete refactor #1461

Open

TheSobkiewicz force-pushed the thesobkiewicz/nifs/ets/refactor_insert branch 4 times, most recently from 76774f0 to 6ac7831 Compare January 9, 2025 15:44

jakub-gonet suggested changes Jan 12, 2025

TheSobkiewicz force-pushed the thesobkiewicz/nifs/ets/refactor_insert branch from 6ac7831 to c2bc9d2 Compare January 12, 2025 03:18

Add option to insert list in ets:insert, ets:lookup refactor

01456a3

Signed-off-by: Tomasz Sobkiewicz <[email protected]>

TheSobkiewicz force-pushed the thesobkiewicz/nifs/ets/refactor_insert branch from c2bc9d2 to 01456a3 Compare January 12, 2025 03:29

Insert list refactor. Moved node creation to separate function

45eccdc

Signed-off-by: Tomasz Sobkiewicz <[email protected]>

TheSobkiewicz force-pushed the thesobkiewicz/nifs/ets/refactor_insert branch from f54ef62 to 45eccdc Compare January 15, 2025 16:13

jakub-gonet suggested changes Jan 15, 2025
