
VMware ESXi Purple Screen of Death (PSOD): Diagnose and Recover (2026)

by People Are Geek
June 14, 2026

Fix guide VMware ESXi · 9 min read · Published June 2026

A host just dropped every VM it was running and threw a wall of purple text at the console. Welcome to the Purple Screen of Death (PSOD), which is ESXi’s take on a Windows blue screen. The VMkernel hit something it couldn’t recover from and froze on purpose. The alternative? Letting it scribble garbage all over your VM data, and that’s worse. Looks like the end of the world. It usually isn’t. That screen is basically a confession, and once you know which lines to read, it’ll point straight at whoever did it.

Here’s the order I work a PSOD in: read the screen, narrow it to one of maybe five usual suspects, pull the coredump while it’s still there, bring the host back, then make sure the same thing isn’t waiting to bite you next week.

Annotated ESXi purple screen of death showing the five things to read: the ESXi build version, the exception type (#PF Exception 14 = page fault = driver), the backtrace top frame naming the failing storage HBA driver, the coredump written confirmation, and a warning when no dump target is configured.
Figure 1. The five bits of a PSOD I actually read. The exception type and the top backtrace frame hand you the cause between them. A #PF Exception 14 sitting next to a driver name is nearly always that driver misbehaving. A LINT1/NMI is the hardware screaming. And please, photograph the whole thing before you reboot. I’ve kicked myself for skipping that, more than once.

Contents

  1. What a PSOD actually is
  2. The five things to read on the screen
  3. The causes behind almost every PSOD
  4. Capture the coredump before you reboot
  5. Recover the host
  6. Stop it happening again
  7. FAQ

What a PSOD actually is

Under the hood, ESXi is a tiny purpose-built kernel called the VMkernel, sitting right on the metal. When it trips over something it can’t safely keep running from (a bad memory access, or a non-maskable interrupt thrown up by the hardware, or some internal sanity check that just doesn’t add up), it doesn’t try to limp along. It stops dead and paints that purple diagnostic screen. Continuing would mean gambling with your VM data, which is a worse outcome than an outage, so it doesn’t. Every VM on the box freezes in that instant. Treat the PSOD as a symptom, never the disease. Most of the time the real culprit is a flaky driver, or hardware on its way out. An actual ESXi bug? That one’s rare, and honestly I’d bet against it before I’d bet on it, though I’ll admit I’ve been surprised once or twice.


The five things to read on the screen

First thing, before you lay a finger on the keyboard: photograph the whole screen. Then walk these five spots, numbered on Figure 1. That’s where the answer’s hiding.

  1. The ESXi build up top. You’ll want it later to check the failing driver against the VMware Compatibility Guide for that exact build. “Close enough” doesn’t count here, and I learned that the annoying way.
  2. The exception type. A #PF Exception 14 is a page fault, which usually means a driver reached into memory it had no business touching. See LINT1 motherboard interrupt or NMI instead? That bubbled up from the hardware. A VMFS or heap message is storage waving its hand at you.
  3. The top of the backtrace. Read top-down. The first named module is almost always your guy: a NIC driver, or some storage HBA driver, maybe a multipathing plugin. Everything below it is mostly the kernel falling over. It’s that top line I care about.
  4. The coredump status. “Successfully wrote dump file” is the line you’re hoping for, because now you’ve got something to dig into. “No place on disk to dump data” means nothing was set up to catch it. You just lost your evidence.
  5. Whether a dump target even exists. If it doesn’t, fix that before anything else, so the next PSOD actually leaves you something to work with.
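If you typed the backtrace out from your photo, or it also landed in a log, picking out the top frame can be scripted. What follows is a minimal sketch, not a VMware tool. The function name top_frame is made up for illustration, and it assumes each frame looks like 0xADDR:[0xADDR]symbol@module#version+0xOFFSET, which may not hold on every build. It prints the first frame that does not belong to vmkernel itself.

```shell
#!/bin/sh
# Sketch: print "symbol@module" for the first backtrace frame that is not
# part of vmkernel. Reads backtrace lines on stdin.
# ASSUMPTION: frames look like
#   0xADDR:[0xADDR]symbol@module#version+0xOFFSET
# top_frame is a hypothetical helper name, not an ESXi command.
top_frame() {
    # Keep only "symbol@module" from each frame line, drop vmkernel frames,
    # and stop at the first one that is left.
    sed -n 's/.*\]\([^@ ]*@[^#]*\)#.*/\1/p' \
        | grep -v '@vmkernel$' \
        | head -n 1
}
```

Feed it a text file holding the backtrace lines, for example top_frame < backtrace.txt. If every frame belongs to vmkernel it prints nothing, which is a hint to look harder at hardware.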

The causes behind almost every PSOD

| Signature on screen | Most likely cause |
|---|---|
| #PF Exception 14 + a driver name | Faulty or mismatched driver (NIC, HBA, RAID). Update the driver, or roll it back to the version listed on the HCL. |
| LINT1 / NMI | Hardware: bad memory, or a failing CPU or PCIe card. Check the server hardware logs. |
| VMFS / heap exhaustion | Storage heap ran out. Raise the heap setting or upgrade ESXi; rebalance large VMDKs. |
| PCPU N locked up | A CPU stuck in a driver/firmware spinlock, usually firmware. Update the BIOS and the implicated driver. |
| Repeated after a recent change | The driver, firmware or VIB you just installed. Roll it back. |

If I had to bet on what’s actually behind a PSOD in the wild, my money goes on a network or storage driver that doesn’t line up with the ESXi build. It loves showing up right after an upgrade, where the inbox driver quietly shifted underneath you and nobody clocked it. Second place? Hardware, and usually that means memory. So read the exception line and figure out which family you’re sitting in. Most of the diagnosis is already behind you at that point. What’s left is mostly confirming the hunch.
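To make the "which family am I in" step concrete, here is a rough sketch that maps an exception line to one of the families in the table. The function name classify_psod and the family labels are invented for this example, and the patterns only cover the signatures listed on this page. Treat the answer as a first guess.

```shell
#!/bin/sh
# Sketch: map a PSOD exception line to a likely cause family.
# classify_psod and the labels it prints are hypothetical names; the
# patterns mirror the signature table above and nothing more.
classify_psod() {
    case "$1" in
        *"#PF Exception 14"*)  echo "driver" ;;        # page fault
        *LINT1*|*NMI*)         echo "hardware" ;;      # raised by the platform
        *VMFS*|*[Hh]eap*)      echo "storage-heap" ;;  # heap exhaustion
        *PCPU*"locked up"*)    echo "cpu-lockup" ;;    # stuck CPU
        *)                     echo "unknown" ;;
    esac
}
```

Call it with the exception line copied from your photo, for example classify_psod "PCPU 3 locked up". The order of the patterns matters: the page-fault check runs first, so a line that mentions both a page fault and a heap is filed under the driver family.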

Capture the coredump before you reboot

The coredump is the gold here. It’s the thing that lets you, or VMware support once you open a ticket, pin down the exact instruction that blew up. Not a vague “something in the storage driver,” the actual instruction. So before anything else, make sure there’s somewhere for the host to write one:

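# list candidate diagnostic partitions, then show the active/configured one
esxcli system coredump partition list
esxcli system coredump partition get
# if none is set, configure the local diagnostic partition:
esxcli system coredump partition set --enable true --smart
# or send dumps to a network collector. Set the parameters first;
# --enable cannot be combined with them in the same call:
esxcli system coredump network set --interface-name vmk0 \
  --server-ipv4 10.0.0.50 --server-port 6500
esxcli system coredump network set --enable true
# confirm the host can actually reach the collector:
esxcli system coredump network check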

Got a dump? Good. Once the host is back, run vm-support to bundle it together with the logs that go with it. That one archive is what you hand to support. It spares you the “can you also send us the vmkernel log” reply, the one that always lands three emails deep.
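Checking that a dump target really exists is easy to script on the host itself. This is a sketch under assumptions: partition_active and network_enabled are hypothetical helper names, and they assume the get commands print an "Active:" line and an "Enabled:" line respectively. Look at the real output on your build before relying on it.

```shell
#!/bin/sh
# Sketch: decide whether this host has somewhere to write a coredump.
# ASSUMPTION: "esxcli system coredump partition get" prints "Active: <device>"
# and "esxcli system coredump network get" prints "Enabled: true|false".
# Both helper names are hypothetical.

# Succeeds if the "Active:" line on stdin names a partition.
partition_active() {
    awk -F': *' '/^ *Active:/ { found = ($2 != "") } END { exit(found ? 0 : 1) }'
}

# Succeeds if the network dump output on stdin says it is enabled.
network_enabled() {
    grep -Eq '^ *Enabled: *true *$'
}

# Only run the live check where esxcli exists, i.e. on an ESXi host.
if command -v esxcli >/dev/null 2>&1; then
    if esxcli system coredump partition get | partition_active ||
       esxcli system coredump network get | network_enabled; then
        echo "dump target configured"
    else
        echo "WARNING: no dump target, the next PSOD leaves no evidence"
    fi
fi
```

The helpers read from stdin so you can try them against saved command output before pointing them at a live host.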

Seriously, don’t reboot a PSOD’d host until you’ve photographed the screen. The moment it cycles, that on-screen backtrace is gone for good. The only thing that survives the reboot is the coredump, and that’s assuming one even got written in the first place.

Recover the host

The host is wedged. No graceful save here. Recovery means a clean restart, then the real work of figuring out why. In that order:

  1. Restart the host. Power-cycle it through the out-of-band controller (iLO, iDRAC, whatever your vendor badges it as), or walk over and do it by hand if you have to. If you set up HA ahead of time, it will already have restarted the VMs somewhere else. If not, power them back on once the host is up, unless you configured autostart to do it for you.
  2. Make sure it came back clean. Run esxcli system version get, then read /var/log/vmkernel.log for whatever happened in the seconds right before the crash. That’s usually where the breadcrumb’s waiting.
  3. Go after the cause. Update or roll back the driver the backtrace named (esxcli software vib list | grep <driver>), push fresh firmware if it was a hardware fault, or bump the setting if you ran a heap dry. You already know which from the exception line.
  4. If it crashes again right away, stop fighting it live. Drop the host into maintenance mode and pull the VMs off, so a crash loop isn’t dragging your workloads down with it every five minutes.
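For step 3, a plain grep shows you the driver's VIB, but comparing it against the version the compatibility guide lists is the part that matters. Here is a small sketch of that comparison. The names vib_version and vib_matches are made up, and it assumes esxcli software vib list prints the VIB name in the first column and its version in the second.

```shell
#!/bin/sh
# Sketch: compare an installed VIB version with the one you expect.
# ASSUMPTION: "esxcli software vib list" prints name in column 1 and
# version in column 2. vib_version and vib_matches are hypothetical names.

# Usage: esxcli software vib list | vib_version NAME
vib_version() {
    awk -v name="$1" '$1 == name { print $2; found = 1 }
                      END { exit(found ? 0 : 1) }'
}

# Usage: esxcli software vib list | vib_matches NAME EXPECTED_VERSION
# Returns 0 on a match, 1 on a mismatch, 2 if the VIB is not installed.
vib_matches() {
    actual=$(vib_version "$1") || { echo "$1: not installed"; return 2; }
    if [ "$actual" = "$2" ]; then
        echo "$1: OK ($actual)"
    else
        echo "$1: MISMATCH installed=$actual expected=$2"
        return 1
    fi
}
```

The expected version is whatever the VMware Compatibility Guide lists for your exact build; the script cannot look that up for you. For step 4, esxcli system maintenanceMode set --enable true puts the host into maintenance mode from the shell.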

Stop it happening again

  • Live on the HCL. Run only the driver and firmware versions VMware actually lists against your exact ESXi build. Not the newest one, the listed one. I keep harping on this because most of the PSODs I’ve chased trace right back to someone wandering off it.
  • Never run a host without a dump target. A PSOD that wrote nothing is an outage you paid for and learned nothing from, and that one stings. Set up a partition or a network dump collector on every single host, no exceptions.
  • Move firmware and drivers as a pair. Use your server vendor’s ESXi custom image or addon so the two stay in lockstep. Updating one and forgetting the other is basically how you manufacture your next purple screen.
  • Test the memory on any host that went down with a hardware NMI before you trust it with production again. A reboot that “fixes” it just means the bad DIMM is sitting there, waiting for you.
  • Keep ESXi reasonably current. Newer builds quietly nudge heap limits up and patch the nastier driver-interaction bugs, so a good chunk of the problems on this page just stop showing up.
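The "every single host" rule is easier to keep if something checks it for you. The sketch below loops over a list of hosts and flags any without a dump target. It assumes SSH is enabled on the hosts with key-based root login, which many environments do not allow. The names run_on_host and audit_dump_targets are invented, and the output patterns ("Active:", "Enabled: true") are assumptions about what esxcli prints.

```shell
#!/bin/sh
# Sketch: flag hosts that have no coredump target configured.
# ASSUMPTIONS: SSH with key-based root login is enabled on each host, and
# the esxcli "get" commands print "Active: <device>" / "Enabled: true".
# run_on_host and audit_dump_targets are hypothetical helper names.

# Run one command on one host. Swap this out if you use another transport.
run_on_host() {
    ssh -n -o BatchMode=yes "root@$1" "$2"
}

# Usage: audit_dump_targets HOST [HOST...]
# Prints one line per host; returns non-zero if any host has no target.
audit_dump_targets() {
    missing=0
    for host in "$@"; do
        if run_on_host "$host" "esxcli system coredump partition get" |
               grep -Eq '^ *Active: *[^ ]'; then
            echo "$host: dump partition active"
        elif run_on_host "$host" "esxcli system coredump network get" |
               grep -Eq '^ *Enabled: *true'; then
            echo "$host: network dump enabled"
        else
            echo "$host: NO DUMP TARGET"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Run it as audit_dump_targets esx01 esx02 esx03 with your own host names. A host that cannot be reached is also reported as having no dump target, so check those by hand.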

FAQ

Is a PSOD always a VMware bug?

Almost never, and I’ll happily take that bet. The vast majority come down to a third-party driver that doesn’t match the ESXi build, or hardware on its way out (usually memory). ESXi only halts to keep your data safe, and it drops the actual offender right there on the screen for you. So start by treating it as a driver or hardware problem, not a VMware one. You’ll be right far more often than you’re wrong.

What does #PF Exception 14 mean on a PSOD?

It’s a page fault. A kernel module reached for memory it had no right to touch. That’s the textbook fingerprint of a buggy or mismatched driver, and honestly it’s the one I run into most. Whatever module sits at the top of the backtrace is your target. Update it, or roll it back to the version the VMware Compatibility Guide says belongs on your build.

How do I find the cause if there is no coredump?

No dump means you’re working with less, but not nothing. You lean on the photo you took (exception type, plus that top backtrace frame) and on /var/log/vmkernel.log from just before it went down. Often that’s enough to name the family. Then, and I do mean today rather than “when there’s time”, set up a coredump partition or a network collector. Next time you get the full picture instead of a guessing game.

Can I recover the VMs that were running on the host?

They went down hard when the host froze. Think yanking the power cord, not a clean shutdown. If you’ve got vSphere HA running, it brings them up on another host. Without HA, you power them on again once the host is back (or autostart does, if you configured it). Either way they come back to their last written-out state. The only thing you actually lose is in-flight I/O that never made it to disk. So yeah, HA plus storage you trust quietly earns its keep on a bad day.

What is the difference between a PSOD and a host showing Not Responding?

Night and day, even though both feel like “the host is gone.” A PSOD is a hard kernel halt. Purple screen, console frozen, the VMkernel is done. “Not responding” in vCenter usually means the host is alive and fine, but its management agent lost the thread, which is a much gentler problem you can often clear without rebooting a thing. The tell is the console. Pull up the physical or remote one and you’ll know in about two seconds which of the two you’re staring at.

How do I configure a coredump target on ESXi?

Want it local? Run esxcli system coredump partition set --enable true --smart and you’re set. Prefer a central collector for a whole cluster? That’s esxcli system coredump network set with the collector’s IP and port, followed by a second esxcli system coredump network set --enable true. Either way, confirm it stuck: esxcli system coredump partition get for a local partition, esxcli system coredump network get (or network check) for a collector. Don’t just assume. And do this on every host while everything’s calm, because finding out it was never configured, mid-crash, is about the worst time to find out.



People Are Geek

I'm Stephane, a network and systems engineer with over 15 years of hands-on experience on production infrastructure, virtualization (ESXi, Proxmox), networking, and self-hosting. Earlier in my career I built and ran a Linux resource site that became a well-known reference for sysadmins. Today I focus on cybersecurity, and I also work as a technical trainer, teaching networking and security to people who do it for a living. Everything on People Are Geek comes from real-world practice, not theory. I build every tool on this site myself, and I write about what I've actually deployed, broken, and fixed. If it's here, I've used it.
