Add option to insert list in ets:insert, ets:lookup refactor #1405
src/libAtomVM/ets.c
Outdated
}
EtsErrorCode result = ets_table_insert(ets_table, tuple, ctx);
if (result != EtsOk) {
    AVM_ABORT(); // Abort because operation might not be atomic.
We usually don't abort the VM: calling AVM_ABORT() means that something unrecoverable happened, such as memory corruption, a bad internal bug, or any other situation that requires an entire VM crash and reboot.
Are we in this specific situation?
Right now we don't have any other tool to ensure atomicity here. If the insert fails at the Nth element, elements 0 to N-1 of the list will already have been inserted, which could result in hard-to-debug behavior. It is unlikely to happen.
To expand: without this abort, if we're short on memory, we'd leave the list partially inserted. If someone tries to persist inserts someday, we'd leave the system in an inconsistent state.
To avoid that, we need to either abort or allocate a list of the previous values and roll back in case of error (ensuring that nothing allocates in the rollback path, since we're most likely dealing with OOM). Abort is easier to do here.
This check needs to have UNLIKELY.
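A minimal sketch of what the requested hint could look like, assuming a GCC/Clang-style macro built on __builtin_expect (the project's own UNLIKELY may be defined differently). Status and count_until_failure are hypothetical names used only for illustration; the hint changes branch layout, not behavior.

```c
#include <assert.h>
#include <stddef.h>

// Assumed definition: falls back to a no-op on compilers without the builtin.
#ifndef UNLIKELY
#if defined(__GNUC__) || defined(__clang__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif
#endif

// Hypothetical stand-in for EtsErrorCode.
enum Status
{
    StatusOk,
    StatusFailed
};

// Walks per-element results and stops at the first failure.
// Returns how many elements succeeded before stopping.
static size_t count_until_failure(const enum Status *results, size_t count)
{
    assert(results != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        // The error path is marked as the cold branch.
        if (UNLIKELY(results[i] != StatusOk)) {
            return i;
        }
    }
    return count;
}
```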
Ok, fair point. There is another feasible approach: table nodes can be pre-allocated before making any change to the list, so if an allocation fails, the nodes allocated so far can easily be freed before any actual change is made.
ets_hashtable_insert will need an additional node parameter, and a dedicated allocation function might be created (e.g. ets_hashtable_new_node). Furthermore, the key and entry parameters can be moved to the ets_hashtable_new_node function if it helps.
This change will have a very small impact, since ets_hashtable_insert is used in just one or two places.
I suggest doing this as an additional commit inside this PR, so the review is easier and the work is separated into two tasks.
This change will remove any implicit allocation and make the abort unnecessary.
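The pre-allocation idea can be sketched roughly as follows. This is a simplified illustration with integer keys and values, not the AtomVM implementation: struct Node, struct Table, new_node, insert_node, insert_many, table_size and the alloc_budget test hook are all hypothetical names, and duplicate keys are not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BUCKETS 8

struct Node
{
    struct Node *next;
    int key;
    int value;
};

struct Table
{
    struct Node *buckets[BUCKETS];
};

// Hypothetical test hook: number of node allocations allowed before
// new_node starts failing; a negative value means no limit.
static long alloc_budget = -1;

// Allocates a node without touching the table.
static struct Node *new_node(int key, int value)
{
    if (alloc_budget == 0) {
        return NULL;
    }
    if (alloc_budget > 0) {
        --alloc_budget;
    }
    struct Node *node = malloc(sizeof(struct Node));
    if (node == NULL) {
        return NULL;
    }
    node->next = NULL;
    node->key = key;
    node->value = value;
    return node;
}

// Links an already-allocated node. It allocates nothing, so it cannot fail.
static void insert_node(struct Table *table, struct Node *node)
{
    assert(node != NULL);
    size_t index = (size_t) ((unsigned int) node->key % BUCKETS);
    node->next = table->buckets[index];
    table->buckets[index] = node;
}

// All-or-nothing insert: allocate every node first, then commit.
static bool insert_many(struct Table *table, const int *keys, const int *values, size_t count)
{
    if (count == 0) {
        return true;
    }
    struct Node **nodes = malloc(count * sizeof(struct Node *));
    if (nodes == NULL) {
        return false;
    }
    // Phase 1: allocation only; the table is still untouched.
    for (size_t i = 0; i < count; ++i) {
        nodes[i] = new_node(keys[i], values[i]);
        if (nodes[i] == NULL) {
            for (size_t j = 0; j < i; ++j) {
                free(nodes[j]);
            }
            free(nodes);
            return false;
        }
    }
    // Phase 2: commit; nothing here can fail.
    for (size_t i = 0; i < count; ++i) {
        insert_node(table, nodes[i]);
    }
    free(nodes);
    return true;
}

// Counts the nodes currently linked into the table.
static size_t table_size(const struct Table *table)
{
    size_t total = 0;
    for (size_t i = 0; i < BUCKETS; ++i) {
        for (const struct Node *node = table->buckets[i]; node != NULL; node = node->next) {
            ++total;
        }
    }
    return total;
}
```

Because the commit phase performs no allocation, a failure can only occur while the table is still unmodified, which is what makes the abort unnecessary in this sketch.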
76774f0 to 6ac7831
    return EtsTableNotFound;
}

EtsErrorCode result = ets_table_lookup(ets_table, key, ret, ctx);
While working on this code recently, I noticed that the hashtable lookup takes a keypos argument which isn't needed (we have node->key, and keypos can't change after table creation). It may be worth fixing in this PR or in a follow-up.
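To illustrate the point about keypos, here is a minimal sketch in which the key is copied into the node at insert time, so the lookup only compares node->key. The types are simplified stand-ins (integer tuples of fixed arity); struct Node, node_init and chain_lookup are hypothetical names, not the project's API.

```c
#include <assert.h>
#include <stddef.h>

#define TUPLE_ARITY 3

struct Node
{
    struct Node *next;
    int key;                // copy of entry[keypos], fixed at insert time
    int entry[TUPLE_ARITY]; // stand-in for the stored tuple
};

// keypos is only needed here, when the node is filled in.
static void node_init(struct Node *node, const int *entry, size_t keypos)
{
    assert(keypos < TUPLE_ARITY);
    for (size_t i = 0; i < TUPLE_ARITY; ++i) {
        node->entry[i] = entry[i];
    }
    node->key = entry[keypos];
    node->next = NULL;
}

// Lookup needs no keypos argument: the key is already cached in each node.
static const struct Node *chain_lookup(const struct Node *head, int key)
{
    for (const struct Node *node = head; node != NULL; node = node->next) {
        if (node->key == key) {
            return node;
        }
    }
    return NULL;
}
```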
6ac7831 to c2bc9d2
Signed-off-by: Tomasz Sobkiewicz <[email protected]>
c2bc9d2 to 01456a3
Signed-off-by: Tomasz Sobkiewicz <[email protected]>
f54ef62 to 45eccdc
}
if (!term_is_nil(iter)) {
    return EtsBadEntry;
}

struct HNode **hnode_list = malloc(size * sizeof(struct HNode *));
Is this the correct HNode? I don't see it exposed via a header file, and we have a few structures named HNode in the project. Small nit: maybe hnodes or nodes? This is an array, not a list.
    return EtsAllocationFailure;
}

int cur = 0;
Why not i?
    list = term_get_list_tail(list);
}

for (size_t i = 0; i < size; i++) {
Nit: ++i
    return EtsAllocationFailure;
}

int cur = 0;
while (term_is_nonempty_list(list)) {
Can't we do it in the previous loop instead of iterating twice?
    return NULL;
}
size_t size = (size_t) memory_estimate_usage(entry);
if (memory_init_heap(heap, size) != MEMORY_GC_OK) {
I was wondering: why do we create a new heap instead of piggybacking on the owner process's heap?
void free_hashtable_node_array(struct HNode **allocated, size_t size, GlobalContext *global)
{
    for (size_t j = 0; j < size; j++) {
Why j instead of i? Also, please change the post-increment to a pre-increment.
void free_hashtable_node_array(struct HNode **allocated, size_t size, GlobalContext *global)
{
    for (size_t j = 0; j < size; j++) {
        if (allocated[j]) {
Do we need to check it?
    return new_node;
}

void free_hashtable_node_array(struct HNode **allocated, size_t size, GlobalContext *global)
I'm not sure if this should live here or in ets.c instead.
memory_destroy_heap(new_node->heap, global);
free(new_node);
I don't like this, we shouldn't take ownership of the node here.
memory_destroy_heap(node->heap, global);
node->heap = heap;
node->heap = new_node->heap;
free(new_node);
I think we should either swap the node entirely or pass its contents instead of using it as an impromptu container, especially when we do swap it on hash collision.
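One way to read the "swap the node entirely" option is sketched below: the new node takes the old node's place in the chain, and the displaced node is handed back so the caller keeps ownership of freeing it. This is a simplified illustration with integer keys; struct Node and chain_put are hypothetical names, not the code under review.

```c
#include <assert.h>
#include <stddef.h>

struct Node
{
    struct Node *next;
    int key;
    int value;
};

// Inserts new_node into the chain. If a node with the same key exists,
// new_node replaces it in place and the displaced node is returned so the
// caller decides how to release it. Returns NULL when nothing was displaced.
static struct Node *chain_put(struct Node **head, struct Node *new_node)
{
    assert(head != NULL && new_node != NULL);
    for (struct Node **link = head; *link != NULL; link = &(*link)->next) {
        struct Node *old = *link;
        if (old->key == new_node->key) {
            new_node->next = old->next;
            *link = new_node;
            old->next = NULL;
            return old;
        }
    }
    // No match: prepend to the chain.
    new_node->next = *head;
    *head = new_node;
    return NULL;
}
```

With this shape the insert function never frees anything itself, which also addresses the ownership concern raised above.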
These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Changes:
Use Cases for the Helper Functions:
The new helper functions can be utilized in the following ETS operations to reduce code duplication:
Each of the mentioned functions will be implemented after this PR is merged.