Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embed hash value in hash type entry #1579

Open
wants to merge 7 commits into
base: unstable
Choose a base branch
from

Conversation

zuiderkwast
Copy link
Contributor

@zuiderkwast zuiderkwast commented Jan 17, 2025

For a hash type key represented as a hash table, embed the field and value in one allocation when they together fit in the size of one cache line. For larger fields and values, another layout is used where the value is a separately allocated sds string.

Implementation

The hashTypeEntry pointer is changed to be the field sds, i.e. a pointer to the embedded field content, which is located after the sds header in memory. We encode the entry layout in the field sds field for unused space, so sdsavail(entry) gives the entry encoding.

Entry with embedded field and value, used when it fits in a cache line:

+--------------+----------------+--------------+
| field        | 1 byte         | value        |
| hdr "foo" \0 | value-hdr-size | hdr "bar" \0 |
+------^-------+----------------+--------------+
       |
       |
     entry pointer = field sds

Entry with value-pointer and embedded field, used for larger field-value pairs:

+-------+--------------+
| value | field        |
| ptr   | hdr "foo" \0 |
+-------+------^-------+
               |
               |
            entry pointer = field sds

This implementation does not require the fields nor values to be of a specific SDS encoding, but it does convert the field from sds5 to sds8 in the second entry layout (the value-pointer embedded-field), so we can encode the type. An embedded sds5 field implies the first entry layout.

Fixes #1551.

Copy link

codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 88.33333% with 14 lines in your changes missing coverage. Please review.

Project coverage is 71.03%. Comparing base (7fc958d) to head (e40ccfd).

Files with missing lines Patch % Lines
src/t_hash.c 88.13% 14 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1579      +/-   ##
============================================
+ Coverage     70.80%   71.03%   +0.23%     
============================================
  Files           121      121              
  Lines         65132    65226      +94     
============================================
+ Hits          46118    46336     +218     
+ Misses        19014    18890     -124     
Files with missing lines Coverage Δ
src/object.c 82.04% <100.00%> (ø)
src/sds.c 86.87% <100.00%> (+0.02%) ⬆️
src/server.h 100.00% <ø> (ø)
src/t_hash.c 94.72% <88.13%> (-1.41%) ⬇️

... and 10 files with indirect coverage changes

Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
@zuiderkwast zuiderkwast force-pushed the hash-entry-embed-value branch from d3a6269 to 62cb46f Compare January 21, 2025 09:14
Signed-off-by: Viktor Söderqvist <[email protected]>
@zuiderkwast zuiderkwast requested a review from ranshid January 21, 2025 09:51
@zuiderkwast zuiderkwast marked this pull request as ready for review January 21, 2025 09:51
Comment on lines +126 to +132
/* We can't use SDS_TYPE_5 to encode extra information in the unused
* allocation size, so convert to SDS_TYPE_8. */
struct sdshdr8 *sh = (void *)entry->field_data;
sh->flags = SDS_TYPE_8;
sh->len = sdslen(field);
embedded_field_sds = (sds)(entry->field_data + sizeof(struct sdshdr8));
memcpy(embedded_field_sds, field, sdslen(field) + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest something?
We can add another type of sdscopytobuffer which forces a specific type and can also be used by sdscopytobuffer:

/* This method copies the sds `s` into `buf` which is the target character buffer. */
size_t sdscopytobufferwithtype(unsigned char *buf, size_t buf_len, char type, const sds s, uint8_t *hdr_size) {
    assert((s[-1] & SDS_TYPE_MASK) <= type);
    size_t header_len = sdsHdrSize(type);
    size_t str_len = sdslen(s);
    size_t alloc_size = header_len + str_len + 1;
    if (hdr_size) *hdr_size = sdsHdrSize(type);
    if (buf == NULL) {
        return alloc_size;
    }
    assert(buf_len >= alloc_size);
    sds embed_sds = (sds)(buf + header_len);
    embed_sds[-1] = type & SDS_TYPE_MASK;
    sdssetlen(embed_sds, str_len);
    sdssetalloc(embed_sds, alloc_size);
    memcpy(embed_sds, s, str_len + 1);
    return alloc_size;
}

Comment on lines +38 to +60
/* The hashTypeEntry pointer is the field sds. We encode the entry layout in the
* field sds field for unused space, so sdsavail(entry) gives the entry encoding.
*
* ENTRY_ENC_EMB_VALUE, used when it fits in a cache line:
*
* +--------------+----------------+--------------+
* | field | 1 byte | value |
* | hdr "foo" \0 | value-hdr-size | hdr "bar" \0 |
* +------^-------+----------------+--------------+
* |
* |
* entry pointer = field sds
*
* ENTRY_ENC_PTR_VALUE, used for larger fields and values:
*
* +-------+--------------+
* | value | field |
* | ptr | hdr "foo" \0 |
* +-------+------^-------+
* |
* |
* entry pointer = field sds
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for turning into Scrooge McDuck.
I actually thought about this the opposite way around. I think we can FORCE non sds5 fields only when the value IS embedded.
So the question of: canUseEmbeddedValueEntry can take into account the new field sds header length and the value sds size.
This may also allow us to use the alloc of embedded fields to hold the value header size, so when the field sds alloc is 0 I actually know this is a ENTRY_ENC_PTR_VALUE and in case it is not 0 I know it is ENTRY_ENC_EMB_VALUE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Follow-up of #1502: Embed field and value in hashTypeEntry
2 participants