SlideShare ist ein Scribd-Unternehmen logo
1 von 22
Downloaden Sie, um offline zu lesen
Down Memory Lane:
Two Decades with the Slab Allocator
CTO
bryan@joyent.com
Bryan Cantrill
@bcantrill
Two decades ago…
Aside: Me, two decades ago…
Aside: Me, two decades ago…
Slab allocator, per Vahalia
First <3: Allocator footprint
I <3: Cache-aware data structures
I <3: Cache-aware data structures
Hardware Antiques Roadshow!
I <3: Magazine layer
I <3: Magazine autoscaling
I <3: Debugging support
• For many, the most salient property of the slab allocator is its
eponymous allocation properties…
• …but for me, its debugging support is much more meaningful
• Rich support for debugging memory corruption issues:
allocation auditing + detection of double-free/use-after-free/
use-before-initialize/buffer overrun
• Emphasis was leaving the functionality compiled in (and
therefore available in production) — even if off by default
• kmem_flags often used to debug many kinds of problems!
I <3: Debugging support
I <3: Debugger support
I <3: MDB support
• Work in crash(1M) inspired deeper kmem debugging support
in MDB, including commands to:
• Walk the buffers from a cache (::walk kmem)
• Display buffers allocated by a thread (::allocdby)
• Verify integrity of kernel memory (::kmem_verify)
• Determine the kmem cache of a buffer (::whatis)
• Find memory leaks (::findleaks)
• And of course, for Bonwick, ::kmastat and ::kmausers!
I <3: libumem
• Tragic to have a gold-plated kernel facility while user-land
suffered in the squalor of malloc(3C)…
• In the (go-go!) summer of 2000, Jonathan Adams (then a Sun
Microsystems intern from CalTech) ported the kernel memory
allocator to user-land as libumem
• Jonathan returned to Sun as a full-time engineer to finish and
integrate libumem; became open source with OpenSolaris
• libumem is the allocator for illumos and derivatives like
SmartOS — and has since been ported to other systems
libumem performance
• The slab allocator was designed to be scalable — not
necessarily to accommodate pathological software
• At user-level, pathological software is much more common…
• Worse, the flexibility of user-level means that operations that
are quick in the kernel (e.g., grabbing an uncontended lock)
are more expensive at user-level
• Upshot: while libumem provided great scalability, its latency
was worse than other allocators for applications with small,
short-lived allocations
libumem performance: node.js running MD5
I <3: Per-thread caching libumem
• Robert Mustacchi added per-thread caching to libumem:
• free() doesn’t free buffer, but rather enqueues on a per-
size cache on ulwp_t (thread) structure
• malloc() checks the thread’s cache at given size first
• Several problems:
• How to prevent cache from growing without bound?
• How to map dynamic libumem object cache sizes to fixed
range in ulwp_t structure without slowing fast path?
I <3: Per-thread caching libumem
• Cache growth problem solved by having each thread track
the total memory in its cache, and allowing per-thread cache
size to be tuned (default is 1M)
• Solving problem of optimal malloc() in light of arbitrary
libumem cache sizes is a little (okay, a lot) gnarlier; from

usr/src/lib/libumem/umem.c:
I <3 the slab allocator!
• It shows the importance of well-designed, well-considered,
well-described core services
• We haven’t need to rewrite it — it has withstood ~4 orders of
magnitude increase in machine size
• We have been able to meaningfully enhance it over the years
• It is (for me) the canonical immaculate system: more one to
use and be inspired by than need to fix or change
• It remains at the heart of our system — and its ethos very
much remains the zeitgeist of illumos!
Acknowledgements
• Jeff Bonwick for creating not just an allocator, but a way of
thinking about systems problems and of implementing their
solutions — and the courage to rewrite broken software!
• Jonathan Adams for not just taking on libumem, but also
writing about it formally and productizing it
• Robert Mustacchi for per-thread caching libumem
• Ryan Zezeski for bringing the slab allocator to a new
generation with his Papers We Love talk
Further reading
• Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory
Allocator, Summer USENIX, 1994
• Jeff Bonwick and Jonathan Adams, Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX
Annual Technical Conference, 2001
• Robert Mustacchi, Per-thread caching in libumem, blog entry on
dtrace.org, July 2012
• Ryan Zezeski, Memory by the Slab: The Tale of Bonwick’s Slab
Allocator, Papers We Love NYC, September 2015
• Uresh Vahalia, UNIX Internals, 1996

Weitere ähnliche Inhalte

Was ist angesagt?

Run containers on bare metal already!
Run containers on bare metal already!Run containers on bare metal already!
Run containers on bare metal already!bcantrill
 
The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbookbcantrill
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondbcantrill
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guidebcantrill
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsbcantrill
 
Docker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIDocker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIbcantrill
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of databcantrill
 
node.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicornnode.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicornbcantrill
 
Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)bcantrill
 
Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...Hakka Labs
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathbcantrill
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarebcantrill
 
Docker-N-Beyond
Docker-N-BeyondDocker-N-Beyond
Docker-N-Beyondsantosh007
 
Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU Ahmed El-Arabawy
 
Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0dominion
 
Course 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded SystemsCourse 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded SystemsAhmed El-Arabawy
 
Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014Charles Anderson
 
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factoryMeetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factoryLaurent Grangeau
 
DevOps'n the Operating System
DevOps'n the Operating SystemDevOps'n the Operating System
DevOps'n the Operating SystemC4Media
 

Was ist angesagt? (20)

Run containers on bare metal already!
Run containers on bare metal already!Run containers on bare metal already!
Run containers on bare metal already!
 
The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbook
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guide
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generations
 
Docker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIDocker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote API
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of data
 
node.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicornnode.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicorn
 
Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)
 
Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data path
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system software
 
Docker-N-Beyond
Docker-N-BeyondDocker-N-Beyond
Docker-N-Beyond
 
Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU
 
Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0
 
Course 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded SystemsCourse 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded Systems
 
Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014
 
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factoryMeetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
 
DevOps'n the Operating System
DevOps'n the Operating SystemDevOps'n the Operating System
DevOps'n the Operating System
 
Elatt Presentation
Elatt PresentationElatt Presentation
Elatt Presentation
 

Andere mochten auch

Leadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling EngineersLeadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling Engineersbcantrill
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionbcantrill
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in productionbcantrill
 
Corporate Open Source Anti-patterns
Corporate Open Source Anti-patternsCorporate Open Source Anti-patterns
Corporate Open Source Anti-patternsbcantrill
 
Debugging microservices in production
Debugging microservices in productionDebugging microservices in production
Debugging microservices in productionbcantrill
 
A crime against common sense
A crime against common senseA crime against common sense
A crime against common sensebcantrill
 
Software Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesSoftware Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesAngelos Kapsimanis
 
Application resiliency using netflix hystrix
Application resiliency using netflix hystrixApplication resiliency using netflix hystrix
Application resiliency using netflix hystrixSenthilkumar Gopal
 
Concepts of React
Concepts of ReactConcepts of React
Concepts of Reactinovex GmbH
 

Andere mochten auch (10)

Leadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling EngineersLeadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling Engineers
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destruction
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in production
 
Corporate Open Source Anti-patterns
Corporate Open Source Anti-patternsCorporate Open Source Anti-patterns
Corporate Open Source Anti-patterns
 
Debugging microservices in production
Debugging microservices in productionDebugging microservices in production
Debugging microservices in production
 
A crime against common sense
A crime against common senseA crime against common sense
A crime against common sense
 
Software Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesSoftware Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniques
 
Application resiliency using netflix hystrix
Application resiliency using netflix hystrixApplication resiliency using netflix hystrix
Application resiliency using netflix hystrix
 
Concepts of React
Concepts of ReactConcepts of React
Concepts of React
 
Testing Microservices
Testing MicroservicesTesting Microservices
Testing Microservices
 

Ähnlich wie Down Memory Lane: Two Decades with the Slab Allocator

Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, EvernoteSearch Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, EvernoteLucidworks
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scalejimriecken
 
Hard Coding as a design approach
Hard Coding as a design approachHard Coding as a design approach
Hard Coding as a design approachOren Eini
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remanijaxconf
 
Make It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version ControlMake It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version Controlindiver
 
Vulnerability, exploit to metasploit
Vulnerability, exploit to metasploitVulnerability, exploit to metasploit
Vulnerability, exploit to metasploitTiago Henriques
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling SoftwareAbdelmonaim Remani
 
Introduction of vertical crawler
Introduction of vertical crawlerIntroduction of vertical crawler
Introduction of vertical crawlerJinglun Li
 
But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?gagravarr
 
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...Gaetano Giunta
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationAmir Hossein Sorouri
 
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel ZikmundNDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel ZikmundKarel Zikmund
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP120bi
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsAchievers Tech
 
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...MayaData
 
2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar seriesOpen Mainframe Project
 
MPI Requirements of the Network Layer
MPI Requirements of the Network LayerMPI Requirements of the Network Layer
MPI Requirements of the Network Layerinside-BigData.com
 
Linux operating system
Linux operating systemLinux operating system
Linux operating systemITz_1
 
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...Marina Peregud
 

Ähnlich wie Down Memory Lane: Two Decades with the Slab Allocator (20)

Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, EvernoteSearch Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
Hard Coding as a design approach
Hard Coding as a design approachHard Coding as a design approach
Hard Coding as a design approach
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remani
 
Make It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version ControlMake It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version Control
 
Vulnerability, exploit to metasploit
Vulnerability, exploit to metasploitVulnerability, exploit to metasploit
Vulnerability, exploit to metasploit
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling Software
 
Introduction of vertical crawler
Introduction of vertical crawlerIntroduction of vertical crawler
Introduction of vertical crawler
 
But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?
 
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestration
 
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel ZikmundNDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
 
Random House
Random HouseRandom House
Random House
 
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
 
2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series
 
MPI Requirements of the Network Layer
MPI Requirements of the Network LayerMPI Requirements of the Network Layer
MPI Requirements of the Network Layer
 
Linux operating system
Linux operating systemLinux operating system
Linux operating system
 
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
 

Mehr von bcantrill

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Presentbcantrill
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmakingbcantrill
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...bcantrill
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsbcantrill
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systemsbcantrill
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolutionbcantrill
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Agebcantrill
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesbcantrill
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Lawbcantrill
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineeringbcantrill
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemapsbcantrill
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?bcantrill
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the unionbcantrill
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsbcantrill
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after darkbcantrill
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadershipbcantrill
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindbcantrill
 

Mehr von bcantrill (17)

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systems
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Age
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Law
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the union
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systems
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after dark
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadership
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
 

Kürzlich hochgeladen

Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
The Kubernetes Gateway API and its role in Cloud Native API Management
The Kubernetes Gateway API and its role in Cloud Native API ManagementThe Kubernetes Gateway API and its role in Cloud Native API Management
The Kubernetes Gateway API and its role in Cloud Native API ManagementNuwan Dias
 
IEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
IEEE Computer Society’s Strategic Activities and Products including SWEBOK GuideIEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
IEEE Computer Society’s Strategic Activities and Products including SWEBOK GuideHironori Washizaki
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Governance in SharePoint Premium:What's in the box?
Governance in SharePoint Premium:What's in the box?Governance in SharePoint Premium:What's in the box?
Governance in SharePoint Premium:What's in the box?Juan Carlos Gonzalez
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
100+ ChatGPT Prompts for SEO Optimization
100+ ChatGPT Prompts for SEO Optimization100+ ChatGPT Prompts for SEO Optimization
100+ ChatGPT Prompts for SEO Optimizationarrow10202532yuvraj
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5DianaGray10
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
99.99% of Your Traces Are (Probably) Trash (SRECon NA 2024).pdf
99.99% of Your Traces  Are (Probably) Trash (SRECon NA 2024).pdf99.99% of Your Traces  Are (Probably) Trash (SRECon NA 2024).pdf
99.99% of Your Traces Are (Probably) Trash (SRECon NA 2024).pdfPaige Cruz
 

Kürzlich hochgeladen (20)

Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
The Kubernetes Gateway API and its role in Cloud Native API Management
The Kubernetes Gateway API and its role in Cloud Native API ManagementThe Kubernetes Gateway API and its role in Cloud Native API Management
The Kubernetes Gateway API and its role in Cloud Native API Management
 
IEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
IEEE Computer Society’s Strategic Activities and Products including SWEBOK GuideIEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
IEEE Computer Society’s Strategic Activities and Products including SWEBOK Guide
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Governance in SharePoint Premium:What's in the box?
Governance in SharePoint Premium:What's in the box?Governance in SharePoint Premium:What's in the box?
Governance in SharePoint Premium:What's in the box?
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
100+ ChatGPT Prompts for SEO Optimization
100+ ChatGPT Prompts for SEO Optimization100+ ChatGPT Prompts for SEO Optimization
100+ ChatGPT Prompts for SEO Optimization
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5UiPath Studio Web workshop series - Day 5
UiPath Studio Web workshop series - Day 5
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
99.99% of Your Traces Are (Probably) Trash (SRECon NA 2024).pdf
99.99% of Your Traces  Are (Probably) Trash (SRECon NA 2024).pdf99.99% of Your Traces  Are (Probably) Trash (SRECon NA 2024).pdf
99.99% of Your Traces Are (Probably) Trash (SRECon NA 2024).pdf
 

Down Memory Lane: Two Decades with the Slab Allocator

  • 1. Down Memory Lane: Two Decades with the Slab Allocator CTO bryan@joyent.com Bryan Cantrill @bcantrill
  • 3. Aside: Me, two decades ago…
  • 4. Aside: Me, two decades ago…
  • 7. I <3: Cache-aware data structures
  • 8. I <3: Cache-aware data structures Hardware Antiques Roadshow!
  • 10. I <3: Magazine autoscaling
  • 11. I <3: Debugging support • For many, the most salient property of the slab allocator is its eponymous allocation properties… • …but for me, its debugging support is much more meaningful • Rich support for debugging memory corruption issues: allocation auditing + detection of double-free/use-after-free/ use-before-initialize/buffer overrun • Emphasis was leaving the functionality compiled in (and therefore available in production) — even if off by default • kmem_flags often used to debug many kinds of problems!
  • 12. I <3: Debugging support
  • 13. I <3: Debugger support
  • 14. I <3: MDB support • Work in crash(1M) inspired deeper kmem debugging support in MDB, including commands to: • Walk the buffers from a cache (::walk kmem) • Display buffers allocated by a thread (::allocdby) • Verify integrity of kernel memory (::kmem_verify) • Determine the kmem cache of a buffer (::whatis) • Find memory leaks (::findleaks) • And of course, for Bonwick, ::kmastat and ::kmausers!
  • 15. I <3: libumem • Tragic to have a gold-plated kernel facility while user-land suffered in the squalor of malloc(3C)… • In the (go-go!) summer of 2000, Jonathan Adams (then a Sun Microsystems intern from CalTech) ported the kernel memory allocator to user-land as libumem • Jonathan returned to Sun as a full-time engineer to finish and integrate libumem; became open source with OpenSolaris • libumem is the allocator for illumos and derivatives like SmartOS — and has since been ported to other systems
  • 16. libumem performance • The slab allocator was designed to be scalable — not necessarily to accommodate pathological software • At user-level, pathological software is much more common… • Worse, the flexibility of user-level means that operations that are quick in the kernel (e.g., grabbing an uncontended lock) are more expensive at user-level • Upshot: while libumem provided great scalability, its latency was worse than other allocators for applications with small, short-lived allocations
  • 18. I <3: Per-thread caching libumem • Robert Mustacchi added per-thread caching to libumem: • free() doesn’t free buffer, but rather enqueues on a per- size cache on ulwp_t (thread) structure • malloc() checks the thread’s cache at given size first • Several problems: • How to prevent cache from growing without bound? • How to map dynamic libumem object cache sizes to fixed range in ulwp_t structure without slowing fast path?
  • 19. I <3: Per-thread caching libumem • Cache growth problem solved by having each thread track the total memory in its cache, and allowing per-thread cache size to be tuned (default is 1M) • Solving problem of optimal malloc() in light of arbitrary libumem cache sizes is a little (okay, a lot) gnarlier; from
 usr/src/lib/libumem/umem.c:
  • 20. I <3 the slab allocator! • It shows the importance of well-designed, well-considered, well-described core services • We haven’t need to rewrite it — it has withstood ~4 orders of magnitude increase in machine size • We have been able to meaningfully enhance it over the years • It is (for me) the canonical immaculate system: more one to use and be inspired by than need to fix or change • It remains at the heart of our system — and its ethos very much remains the zeitgeist of illumos!
  • 21. Acknowledgements • Jeff Bonwick for creating not just an allocator, but a way of thinking about systems problems and of implementing their solutions — and the courage to rewrite broken software! • Jonathan Adams for not just taking on libumem, but also writing about it formally and productizing it • Robert Mustacchi for per-thread caching libumem • Ryan Zezeski for bringing the slab allocator to a new generation with his Papers We Love talk
  • 22. Further reading • Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory Allocator, Summer USENIX, 1994 • Jeff Bonwick and Jonathan Adams, Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX Annual Technical Conference, 2001 • Robert Mustacchi, Per-thread caching in libumem, blog entry on dtrace.org, July 2012 • Ryan Zezeski, Memory by the Slab: The Tale of Bonwick’s Slab Allocator, Papers We Love NYC, September 2015 • Uresh Vahalia, UNIX Internals, 1996