Windows 10 Ongoing Crashes

My new computer continues to experience unexplained freezes and reboots, and I’m running out of ideas for fixing it. A few weeks ago I wrote about crashes in the Intel integrated graphics driver. The symptoms have since changed, but the problems continue. I’m now looking at three options:

  1. reinstalling Windows and all my applications, which would be a huge undertaking and might not fix anything
  2. replacing the whole computer and reinstalling all my applications, which would also be a huge undertaking
  3. resigning myself to a computer that can’t go more than a few days without crashing.

My previous Windows 7 computer easily ran for weeks without trouble, so this kind of instability is disappointing.

The new computer is a Windows 10 laptop (HP EliteBook x360 1030 G2). I use it with the lid closed, connected to an external monitor, keyboard, and mouse. I suspect this closed-lid setup is the root cause of my problems. While it should work and does work most of the time, running a laptop with the lid closed doesn’t seem 100% robust. The computer sometimes seems to lose communication with the external monitor, or gets confused because the primary (internal) monitor is off. I see occasional event log entries like “A pointer device has no information about the monitor it is attached to”, window manager crashes, and other graphics-related errors. All of these problems happen while I’m away from the computer, when the external monitor is in power-save mode but the computer is not asleep. When I return to the computer later, I sometimes find that it has frozen or unexpectedly rebooted.

I should emphasize that the computer rarely sleeps and never hibernates, so I don’t believe this is a sleep-related problem.

 
History

The first problem was crashes in the Intel integrated graphics driver igdkmd64.sys while I was away from the computer. Nearly every day, I’d return to the computer in the morning to find that it had crashed and rebooted sometime since its last use. Here’s what’s happened since then:

June 15 – I forced an update to the latest Intel graphics driver version 26.20.100.6890. This required completely uninstalling the HP-provided graphics driver first. Since then, I have not seen any more crashes in igdkmd64.sys.

June 16 – The computer was still running since the day before, but the Start menu and Cortana did not work. I restarted the computer and Start/Cortana began working normally again.

June 17 – The computer fan was blowing 100%, but both the external and internal screens were dark. I could tell that Windows was still running, because I heard the Windows disconnection sound effect when I unplugged USB devices, but nothing I did would get an image to appear on any screen. I forced a reboot with the power button. The event log showed several prior errors about the embedded controller (EC) not responding. In light of this, I disabled the 3rd-party program Notebook Fan Control. After rebooting, the computer hung on the boot screen with the HP logo, and did not proceed into Windows. After rebooting a second time, all seemed normal.

June 18 – The computer was still running since the day before, but once again the Start menu did not work. I used the task manager to terminate WindowsShellExperienceHost.exe, which then automatically restarted and restored normal Start menu functionality. Afterwards I fully uninstalled Notebook Fan Control. Following a tip related to broken Start menus, I also turned off the Windows option for Settings -> Accounts -> Sign in Options -> Use my sign-in info to automatically finish setting up my device and reopen my apps after an update or restart.

June 19 – All OK. The computer was still running since the day before, with no problems.

June 20 – All OK.

June 21 – All OK.

June 22-23 – Didn’t use the computer.

June 24 – All OK.

June 25 – The computer was still running since the day before, but again the Start menu did not work. I terminated WindowsShellExperienceHost.exe to get the Start menu working again.

June 26 – All OK.

June 27 – Didn’t use the computer.

June 28 – The computer fan was blowing 100%, but both the external and internal screens were dark. Nothing I did would get an image to appear on any screen. I had to hold the power switch to reboot the computer. The event log showed lots of errors in the preceding hours, including multiple desktop window manager crashes, and an application hang error from Microsoft.Photos.exe every 15 minutes stretching on for hours, all during a time when I wasn’t using the computer.

June 29 – The computer fan was blowing 100%, but both the external and internal screens were dark. Nothing I did would get an image to appear on any screen. I had to hold the power switch to reboot the computer. Nothing interesting was found in the Windows event log.
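
To put a number on “can’t go more than a few days”, here is a rough Python sketch that tallies the incident days from the log above. The dictionary is transcribed by hand from the entries, and the labels `start` and `dark` are just shorthand invented for this example. Days when the computer wasn’t used (June 22-23 and June 27) can’t be distinguished from trouble-free days, so the gaps may overstate the real stability.

```python
from datetime import date

# Incident days transcribed from the log above (all June 2019).
# "start" = Start menu not working, "dark" = fan at 100% with both screens dark.
incidents = {
    date(2019, 6, 16): "start",
    date(2019, 6, 17): "dark",
    date(2019, 6, 18): "start",
    date(2019, 6, 25): "start",
    date(2019, 6, 28): "dark",
    date(2019, 6, 29): "dark",
}


def gaps_between_incidents(days):
    """Return the number of days between each pair of consecutive incidents."""
    ordered = sorted(days)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


gaps = gaps_between_incidents(incidents)
longest_gap = max(gaps)
dark_count = sum(1 for kind in incidents.values() if kind == "dark")

print("Days between incidents:", gaps)
print("Longest gap between incidents:", longest_gap, "days")
print("Dark-screen lockups:", dark_count)
```

The longest gap runs from June 18 to June 25, and that stretch includes two days when the computer wasn’t used at all.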

I don’t understand what’s going on here, but it’s a hot mess.

15 Comments so far

  1. Thomas Wolf June 29th, 2019 11:32 am

    Even when it seems application related, this kinda behavior to me warrants running the memtest86 application just to rule out the ram (even if new) being the culprit.

  2. ERIC A GUNNERSON June 29th, 2019 6:45 pm

    Weird behavior in a lot of different areas smells like a memory issue.

  3. Joe STROSNIDER June 30th, 2019 12:21 pm

    Oldschool PC Tech here. I can second the previous two users. Random crashing is indicative of “data path” issues. That is: CPU, RAM, Motherboard, Storage Device.

    Remove and reseat the RAM completely. Then Run a memory test (MemTest86) and let it run overnight. If you have a solid state drive, physically remove it (if it’s not one of the newfangled ones that’s soldered to the board), reinstall it, then run a manufacturer’s test on it. If the laptop has a “hard drive protection tool” on it that protects it from g-shock, uninstall that. It may be tripping due to a faulty accelerometer and causing the hard drive to park heads for no reason. (also, if you have solid state, you don’t need that app. I’ve seen many HPs with solid states have that app on them for no freaking reason, which will “park heads” and cause a blip in I/O or outright crashing.) Open up the laptop and check to see if your fans are clogged. If you get no-go with any of those, I would suspect a poor solder joint between the graphics card/CPU and the main board. That last one is usually a warranty fix. With CPU integrated graphics, could also be a faulty CPU, although that is the rarest possibility. That’s also a warranty fix, because the CPU is usually BGA soldered to the board.

  4. Steve June 30th, 2019 2:14 pm

    Thanks for the suggestions, I’ll try the mem test. This is a laptop with soldered-down RAM, so if the RAM is bad I’m out of luck. The computer was purchased used from eBay, so I’m past the return period and there’s no warranty. But it’s spotless and looks brand new – I think it was an open box or a return. Disk is an NVMe SSD. Aside from the Start menu trouble, all of the problems seem related to graphics hardware in some way.

  5. zardam July 1st, 2019 3:10 am

    You may not be out of luck. It is possible to prevent Windows from using specific memory blocks; see here for more details: https://docs.microsoft.com/fr-fr/windows-hardware/drivers/whea/how-to-manage-the-pfa-memory-list. Some conversion may be needed to create the required parameters.

    If you were running Linux, MemTest86+ can directly produce a “badram” pattern to prevent the kernel from using defective memory areas.
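
The conversion mentioned in this comment is probably from byte addresses to page numbers: as far as I understand it, the Windows bad memory list is stored as page frame numbers (PFNs), while memory testers usually report failing byte addresses. A minimal Python sketch of that step follows, assuming 4 KiB pages. The addresses are made up for illustration, and the function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual size on x64 Windows


def addresses_to_pfns(addresses):
    """Map physical byte addresses to sorted, de-duplicated page frame numbers."""
    return sorted({address // PAGE_SIZE for address in addresses})


# Hypothetical failing addresses, as a memory tester might report them.
# The first two fall inside the same 4 KiB page, so they yield one PFN.
failing_addresses = [0x12345678, 0x12345FFF, 0x20000000]

pfns = addresses_to_pfns(failing_addresses)
print(" ".join(f"0x{pfn:X}" for pfn in pfns))
```

How the resulting PFNs are then written into the boot configuration is covered on the linked Microsoft page and is not reproduced here.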

  6. Ricardo Menzer July 1st, 2019 4:31 am

    This is a blind shot: have you looked for updated “BIOS”?
    (HP only lets me enter a Serial Number for looking for updates! o.O)

  7. Steve July 1st, 2019 6:01 am

    I ran Memtest86 from a bootable USB stick, and it didn’t find any errors. But there’s definitely something odd about the dual display setup. With my external monitor connected but unused (everything happening on the internal display), I couldn’t get memtest to boot. The display would constantly flicker on and off during booting, then it would stop before ever reaching the memtest main menu.

    Even the UEFI boot menu has problems when the external display is connected. The UEFI menu appears for an instant, then the screen goes black. A few seconds later, the menu reappears scaled to 1/4 size in the upper-left corner of the display. Then it repeats a cycle of going black for a few seconds, reappearing for a moment, then going black again, over and over. It’s as if the video controller is constantly resetting. Good idea on checking for a BIOS update – I’ll look.

  8. Wesley July 1st, 2019 10:25 am

    Have you checked the serial number with HP to verify that it is out of warranty?

  9. Steve July 1st, 2019 12:01 pm

    I confirmed that I already have the latest BIOS. I’m going to try running with the laptop’s lid open for a while, with the display in mirror mode, and see if that helps. I could also try connecting the external monitor with DisplayPort instead of HDMI.

    HP shows there’s still 24 months of “HP Bundled Hardware Offsite Support”, which is a surprise to me. But I’m reluctant to go down that road. Maybe I’m too pessimistic, but I’ve never once had a good experience with manufacturer warranty support for electronics – it’s a giant hassle and typically requires shipping stuff and waiting weeks for a resolution that may not solve the problem anyway. Normally I will just return stuff to the store if it’s obviously faulty, assuming that’s an option. If anyone’s had a great experience with HP support that might change my mind, I would be interested to hear about it.

    My thinking is that I paid $600 for this computer and could probably resell it as-is for at least $200 for parts – assuming this is a hardware problem with the computer and not a software issue or even an external monitor issue. I still don’t have any great theories about what’s going wrong. So if I spend more than $400 worth of time trying to resolve this problem, then I would have been better off just replacing the computer. And if I’m being honest, I’ve probably already spent more than $400 worth of time on this.
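
The break-even reasoning above can be written out as a small Python sketch. The $600 and $200 figures come from the comment; the hourly rate is purely a placeholder, since no figure for the value of the time is given.

```python
def break_even_budget(purchase_price, resale_value):
    """Money lost by selling the machine as-is; troubleshooting beyond this costs more than replacing."""
    return purchase_price - resale_value


def hours_until_break_even(budget, hourly_rate):
    """Hours of troubleshooting that use up the budget at a given value of time."""
    return budget / hourly_rate


budget = break_even_budget(600, 200)

# Hypothetical value of an hour of time; not a number from the comment.
assumed_hourly_rate = 50
hours = hours_until_break_even(budget, assumed_hourly_rate)

print(f"Break-even budget: ${budget}")
print(f"Hours at ${assumed_hourly_rate}/hour: {hours}")
```

At any plausible hourly rate, a few evenings of debugging are enough to pass the break-even point.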

  10. Wesley July 1st, 2019 12:19 pm

    Quite honestly, it is sounding a bit to me like a hardware problem, or at least partially. You also have a business-class machine, which typically has better support folks to deal with than your usual home-user hardware. My experience with HP business-class support has been pretty good. If it were me, I would certainly give it a shot. My notes have HP Elite Support at 866-625-1175.

  11. TetrisMaxRules July 4th, 2019 7:34 am

    A note about memory testing, which you should definitely do: I have personally seen systems that took four days before they showed any errors. Others have reported weeks.

    Good luck either way, and never rule out bad hardware, especially bad RAM. 🙂

  12. jerry July 4th, 2019 11:14 am

    My favorite hardware stability test is to install Linux and continuously compile the Linux kernel. If it crashes, it’s a hardware problem. On one computer, I caught an issue only after letting it run for 2 days.

  13. Kris Jones July 4th, 2019 11:53 am

    It sounds like it might be over-heating. Have you tried opening it up and making sure the fan filter isn’t clogged with junk? That task is at the top of my yearly hardware maintenance routine. I’ve worked on dozens of laptops, and clogged fan filters are very common. The symptom of having the fan run at 100% for a long period of time could definitely be caused by a clogged fan exhaust.

    FYI, don’t expect any graphics driver support from an OEM. It’s an unfortunate fact, and one of the biggest problems I’ve run into with Windows laptops. If a Windows patch introduces instability with the graphics driver, your options are limited. Sometimes you can install new drivers for Intel on-board graphics, but if it’s discrete (AMD or Nvidia), the manufacturer’s generic drivers won’t work. It has to come from the OEM. And for low-end laptops, they often release an initial revision and never release another update. I’ve run into so many issues with this system that I go out of my way to NOT buy laptops with discrete graphics!

  14. Znep July 8th, 2019 4:53 am

    Hello,

    When it comes to memory testing, memtest86+ is my tool of choice only to determine if a RAM chip has severe defects. The problem with memtest is that it creates very little load on the memory controller. Btw, the other Windows-based memtest applications are even more unreliable.

    What I saw when troubleshooting a RAM-related problem on my machine is that the system behaves quite differently when the RAM is stressed under heavy load. For this purpose I used Prime95 FFT tests configured for RAM tests. memtest showed no errors after many hours of running, while Prime95 had memory faults after just a few minutes.

    Give this one a try, and if everything runs OK then RAM & memory controller are not causing the problem.

    Best Regards,
    Znep

  15. Peter July 9th, 2019 4:04 am

    For years, my business settled on a Dell model that was a couple of years old and bought one after the other from eBay as a standard issue laptop or desktop for employees. We had no issues and saved a lot of money doing this over new systems. And standardizing on one model allowed us to use failed ones for parts to keep others running.

    But then something changed. A significant percentage of systems we bought had some kind of hard-to-diagnose issue like what you are experiencing. Invariably, it was some piece of hardware that had failed or was failing and caused intermittent problems. But importantly, the systems mostly seemed to work and would pass an initial smell test, but were unreliable and caused a lot of frustration for the users.

    My theory is that sales of used systems on eBay became more and more dominated by people buying lots of systems from manufacturers that had been returned due to some issue, and could not be resold as refurbished by the OEM. They then passed them off as working systems on eBay.

    We eventually gave up on buying used systems from eBay because we just couldn’t seem to get reliable systems that way. Now, when we buy used, we buy refurbished directly from the manufacturer, with some kind of limited warranty.

    I can’t help but wonder if you had a full hardware test suite from the manufacturer, that it would identify something failed that needed replacing, such that it would exceed the used value of the laptop.
