Reading for 10/26: Fancy Memory Management #406
-
I was impressed by the cleverness of hiding the time cost of redirection for compaction-without-relocation inside the existing virtual-page TLB lookup. A 16% memory saving for 2% runtime overhead is a tradeoff appealing to a wide range of applications.
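The trick, as I understand it, is that pointers never have to change because only the virtual-to-physical mapping changes beneath them. Here is a toy model of that idea (the names like `PAGE_SIZE` and `ToyVM` are mine, not the paper's; real meshing does this with `mmap` over a shared file):

```python
# Toy model of compaction without relocation: virtual pages stay fixed,
# only the virtual->physical mapping changes, so pointers never move.

PAGE_SIZE = 8

class ToyVM:
    def __init__(self):
        self.page_table = {}   # virtual page -> physical page
        self.phys = {}         # physical page -> list of words

    def map_page(self, vpage):
        ppage = len(self.phys)
        self.phys[ppage] = [None] * PAGE_SIZE
        self.page_table[vpage] = ppage

    def write(self, vaddr, value):
        ppage = self.page_table[vaddr // PAGE_SIZE]
        self.phys[ppage][vaddr % PAGE_SIZE] = value

    def read(self, vaddr):
        ppage = self.page_table[vaddr // PAGE_SIZE]
        return self.phys[ppage][vaddr % PAGE_SIZE]

    def mesh(self, vpage_a, vpage_b):
        """Merge two virtual pages whose live slots don't overlap:
        copy b's live words into a's physical page, then retarget b's
        virtual page. Virtual addresses (pointers) are unchanged."""
        pa, pb = self.page_table[vpage_a], self.page_table[vpage_b]
        assert all(x is None or y is None
                   for x, y in zip(self.phys[pa], self.phys[pb]))
        for i, word in enumerate(self.phys[pb]):
            if word is not None:
                self.phys[pa][i] = word
        self.page_table[vpage_b] = pa   # remap: one physical page freed
        del self.phys[pb]

vm = ToyVM()
vm.map_page(0); vm.map_page(1)
vm.write(2, "x")              # slot 2 on virtual page 0
vm.write(PAGE_SIZE + 5, "y")  # slot 5 on virtual page 1
vm.mesh(0, 1)
print(vm.read(2), vm.read(PAGE_SIZE + 5))  # same pointers still work
```

The redirection cost the comment mentions is hidden in the `page_table` lookup, which the hardware already performs (and caches in the TLB) on every access.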
-
The paper doesn't give us much context on what a Robson bound is. That led me to this paper, which gives upper and lower bounds for "the amount of store necessary to operate a dynamic storage allocation system, subject to certain constraints, with no risk of breakdown due to storage fragmentation." Robson defines the relevant quantities and then puts lower and upper bounds on them on the second page. These are really hairy equations. I think in big-O notation the lower bound works out to roughly M log n, where M is the most memory live at any one time and n is the ratio of the largest to smallest allocation size.
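Spelled out (this is my reading of Robson's result, not something stated in the thread), the bound is usually summarized as:

```latex
\text{worst-case memory} \;=\; \Theta\!\left(M \log_2 n\right),
\qquad M = \text{max bytes live at once},\quad
n = \frac{\text{largest allocation size}}{\text{smallest allocation size}}
```

In other words, any non-compacting allocator can be forced to hold a logarithmic factor more memory than is ever live, which is the gap that meshing tries to claw back.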
-
The introduction writes that
I can imagine why we would do this when there is a strict limit on the memory a program can use. I'm not sure, but maybe early Nintendo games used these techniques when address spaces were small? I wonder if any entity is doing such heinous optimizations in the current world, and why.
-
The paper only compares with Firefox, Redis, and SPECint2006. For SPECint2006, the authors write that
This is surprising because this paper is fairly new (2019). Are there really no benchmarks that folks can use to measure memory operations?
-
I thought that the paper offered an intriguing approach to tackling memory fragmentation, a prevalent issue in C/C++ applications. By presenting a novel memory allocator that opportunistically merges disjoint free blocks, they provide a potential solution that bridges the gap between traditional memory allocators and garbage collectors. One key question that arises is how generalizable this approach is across a wide variety of real-world applications and workloads. While their results are promising, it would be interesting to explore the overhead introduced by this system in different scenarios and whether certain use cases might present edge cases where Mesh's benefits diminish. Additionally, how does Mesh perform in multi-threaded or concurrent applications, which have their own unique set of challenges?
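The "opportunistically merges disjoint free blocks" part has a pleasingly simple core, as I read it: two spans can be combined when their occupancy bitmaps never overlap. A minimal sketch (the bitmap encoding here is my own simplification):

```python
# Sketch of the core meshability test: two spans can be "meshed" when
# no slot offset is live in both of their occupancy bitmaps.

def meshable(bitmap_a: int, bitmap_b: int) -> bool:
    """Spans mesh iff no bit is set in both occupancy bitmaps."""
    return (bitmap_a & bitmap_b) == 0

# span A has live objects in slots 0 and 3; span B in slots 1 and 2
a = 0b1001
b = 0b0110
print(meshable(a, b))   # True: live slots never collide
print(meshable(a, a))   # False: identical occupancy always collides
```

The cheap bitwise test is what makes it practical to search for mesh candidates at runtime without scanning the objects themselves.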
-
Like @rcplane, I was also pretty impressed with the clever way the authors perform relocation. One thing this paper made me realize is how big a deal fragmentation is. Mesh itself introduces quite a few opportunities for increased memory usage: allocating a 32-byte and a 64-byte object requires 8 KB because each needs its own span; a 33-byte object gets rounded up to the next object size; and the allocation bitmaps and other required metadata take extra memory. I suppose I never needed to think about fragmentation too much, but seeing the significant overall memory savings despite all this extra memory made it stick out to me. That said, when running larger programs that need a lot of memory, the extra page per object size per thread and the other additional memory introduced by Mesh probably don't account for much of the total usage anyway.
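The overheads above are easy to sanity-check with a back-of-envelope model (the size classes and span size below are simplified assumptions of mine, not Mesh's exact configuration):

```python
# Back-of-envelope sketch of size-class overheads: rounding requests up
# to a size class, and one span per size class.

SPAN_BYTES = 4096  # one span = one page in this toy model
SIZE_CLASSES = [16, 32, 64, 128, 256]

def size_class(request: int) -> int:
    """Round a request up to the next size class (internal fragmentation)."""
    for c in SIZE_CLASSES:
        if request <= c:
            return c
    raise ValueError("large allocation; handled outside size classes")

# A 32-byte and a 64-byte object land in different size classes, so each
# needs its own span: 8 KB of pages holding under 100 bytes of data.
spans_needed = len({size_class(32), size_class(64)})
print(spans_needed * SPAN_BYTES)   # 8192

# A 33-byte request is rounded up to the 64-byte class.
print(size_class(33))              # 64
```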
-
Like what others commented, I am really impressed by the 16% memory usage reduction with only a 2% performance hit. I think Mesh would be a really good drop-in replacement for a system's implementation of `malloc`. Also, I am really intrigued by Mesh's use of carefully placed locks to avoid a stop-the-world scheme. The paper does mention some concurrent garbage collectors that also avoid stop-the-world pauses, but those add a decent amount of overhead that Mesh doesn't seem to. The locks do seem more fine-grained, but it's not super clear to me how exactly Mesh avoids the same level of overhead as those collectors.
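My rough mental model of the "carefully placed locks" point, sketched below: instead of stopping the world, only the two spans being meshed are locked, so threads working on unrelated spans keep allocating. All names here are illustrative, not from the paper:

```python
# Fine-grained locking sketch: mesh two spans under their own locks only.
import threading

class Span:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.live = set()   # offsets of live objects

def mesh_pair(a: Span, b: Span) -> bool:
    # lock in a fixed (address-like) order to avoid deadlock
    first, second = sorted([a, b], key=lambda s: s.name)
    with first.lock, second.lock:
        if a.live & b.live:
            return False    # raced: no longer disjoint, give up
        a.live |= b.live    # merge b's live objects into a
        b.live.clear()
        return True

a, b, c = Span("a"), Span("b"), Span("c")
a.live, b.live = {0, 3}, {1, 2}
# threads using span c are never blocked by this mesh
ok = mesh_pair(a, b)
print(ok, sorted(a.live))
```

Compared to a concurrent GC, there is no tracing and no read/write barrier on every access; the synchronization cost is confined to the rare meshing events, which may be part of the overhead difference.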
-
I'm especially fascinated by the use of randomization for memory management; I'm used to the paradigm of programming languages and compilers seeking completely verified correctness, but it seems like there are plenty of performance gains to be made if we relax some guarantees. I'm wondering if there are other applications of randomized/approximate algorithms in PL? In practice, I would feel very comfortable with the probabilities claimed in the paper. My opinion is that if one wants better certainty than, say,
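The flavor of the probabilistic claim is easy to reproduce with a quick simulation: if objects land at random offsets, two lightly occupied spans are very likely disjoint, hence meshable. The parameters below are mine, not the paper's:

```python
# Simulation: probability that two randomly occupied spans are meshable.
import random

def random_bitmap(slots: int, live: int, rng: random.Random) -> set:
    return set(rng.sample(range(slots), live))

def mesh_probability(slots=256, live=8, trials=10_000, seed=0) -> float:
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = random_bitmap(slots, live, rng)
        b = random_bitmap(slots, live, rng)
        if not (a & b):          # disjoint -> meshable
            hits += 1
    return hits / trials

p = mesh_probability()
print(round(p, 3))  # should be close to (1 - live/slots)**live, about 0.78
```

Randomized allocation is exactly what makes this analysis go through: an adversarial (or merely unlucky) deterministic placement could make disjoint spans rare, while random placement gives a provable floor.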
-
I found the analysis section super enjoyable to read. It was neat that, for a tool with such performance improvements and so little overhead, the proofs of the theoretical guarantees were quite simple and elegant. Since our first discussion, I've been noticing whether papers do a good job of describing their experimental setup, and I appreciated that this one in particular does pretty well.
-
I thought this was a really cool paper overall! The way algorithm analysis was integrated with lower-level systems concepts particularly intrigued me—it's not often that you see a reduction to an NP-hard problem and detailed descriptions of how paging works in the same paper. I also really liked the way the authors evaluated their method: profiling real-world use cases beyond standard benchmark suites is certainly something to commend.
-
This is the best paper we've read so far, in my opinion. It is a really clever use of the virtual memory system. However, I wonder if relocating the physical pages can cause some momentary cache thrashing. Maybe, as a fraction of total runtime, the additional cache misses are very small. But I'd imagine Mesh still wouldn't be good for programs that need very consistent pacing with no stutter, like video games.
-
This was a great read! I did have a couple of thoughts:
-
The paper uses the canonical idea of "randomize when you are (literally) stuck" in a clever way. Adding random jitter when backing off during transmission, or flipping local coins to break out of a deadlock, are examples of the same mindset. Basically, if you try to fit all possible scenarios into the same bag, you pay the price of the worst one among them, whereas by randomizing you give yourself a higher chance of dodging that price. Compared with the GC dichotomy we've been seeing, namely RC vs. mark'n'sweep, this paper takes a more holistic approach to memory management rather than just collecting garbage. I guess this is the key perspective that enabled the authors to present a seemingly new approach with new potential. Taken as a whole, memory management from allocation through collection gives us more options and knobs to play with.
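The backoff example above can be made concrete; this is the standard full-jitter variant of exponential backoff (parameter values are mine). Two colliding parties that wait a deterministic amount of time collide forever, while jitter breaks the symmetry:

```python
# "Randomize when stuck": full-jitter exponential backoff.
import random

def backoff_delays(attempts: int, base=0.1, cap=5.0, rng=None):
    """Attempt k waits a uniform random time in [0, min(cap, base * 2^k)]."""
    rng = rng or random.Random(42)
    return [rng.uniform(0, min(cap, base * 2 ** k)) for k in range(attempts)]

d = backoff_delays(6)
print(all(0 <= x <= 5.0 for x in d))  # every delay stays within the cap
```

The same reasoning applies to Mesh's randomized placement: a deterministic allocator can be driven into its worst case, whereas a randomized one makes the worst case improbable rather than impossible.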
-
Hi everyone!
Here's the discussion thread for the Fancy Memory Management paper discussion on Thursday October 26. The paper can be found here.
Post any thoughts, questions, or comments you might have before the discussion. Looking forward to it!