mntmn,
@mntmn@mastodon.social

is it normal for vddgfx of amdgpu to fluctuate and drop to 731mV?
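
One way to put numbers on the fluctuation is to sample the sensor over time. A minimal sketch, assuming the amdgpu driver exposes the voltage through hwmon as an `in*_input` file (in millivolts) with a matching `in*_label` reading `vddgfx`; the `card0` path and both function names are illustrative assumptions and may differ on a given machine:

```python
from pathlib import Path
import time


def find_vddgfx_input(hwmon_root):
    """Return the in*_input file whose matching in*_label reads 'vddgfx', or None."""
    for label in sorted(Path(hwmon_root).glob("hwmon*/in*_label")):
        if label.read_text().strip() == "vddgfx":
            return label.with_name(label.name.replace("_label", "_input"))
    return None


def sample_millivolts(input_file, count=10, interval=0.5):
    """Read the sensor `count` times and return the samples in mV."""
    samples = []
    for i in range(count):
        samples.append(int(Path(input_file).read_text().strip()))
        if i + 1 < count:
            time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Assumed location; the card index may not be 0 on every system.
    path = find_vddgfx_input("/sys/class/drm/card0/device/hwmon")
    if path is None:
        print("no vddgfx sensor found")
    else:
        mv = sample_millivolts(path)
        print(f"vddgfx min={min(mv)} mV max={max(mv)} mV")
```

Logging the minimum and maximum alongside the time of a stutter would show whether the dips line up with the symptoms or are just idle power management.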

mntmn,
@mntmn@mastodon.social

trying to figure out why my work pc (intel i9-9900) has freezing issues once again (display stays on, sound stops, needs power cycle). before the freezing i get audio dropouts and cursor stuttering

gsuberland,
@gsuberland@chaos.social

@mntmn any MCA/MCE logs being generated?

gsuberland,
@gsuberland@chaos.social

@mntmn I ask because I've had three hardware issues that matched these symptoms, and each time the MCA/MCE logs generated by hardware were critical to tracking down the source.

first one was caused by PCIe ASPM; worked around by disabling ASPM, and fixed in a later BIOS.

second one was a bad DIMM with a very transient fault. ECC correction events told me which DIMM.

third case showed internal cache faults on one of the CPUs; the cause was motherboard failure.
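
For the ASPM case, it can help to confirm what the kernel is actually doing before and after a change. A rough sketch, assuming a Linux system where the active policy is shown in brackets in `/sys/module/pcie_aspm/parameters/policy` and where `pcie_aspm=off` on the kernel command line is the workaround in use; the helper names are made up for illustration:

```python
import re
from pathlib import Path


def active_aspm_policy(policy_text):
    """Return the bracketed (active) entry from the pcie_aspm policy string."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


def aspm_disabled_on_cmdline(cmdline):
    """True if the kernel command line contains pcie_aspm=off."""
    return "pcie_aspm=off" in cmdline.split()


if __name__ == "__main__":
    policy_file = Path("/sys/module/pcie_aspm/parameters/policy")
    if policy_file.exists():
        print("active ASPM policy:", active_aspm_policy(policy_file.read_text()))
    else:
        print("ASPM policy file not present")
    cmdline = Path("/proc/cmdline").read_text()
    print("pcie_aspm=off on cmdline:", aspm_disabled_on_cmdline(cmdline))
```

This only reports the kernel-side setting; a BIOS option for ASPM, where the board offers one, is separate and not visible here.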

mntmn,
@mntmn@mastodon.social

@gsuberland where would i find these logs?

manawyrm,
@manawyrm@chaos.social

@mntmn @gsuberland dmesg as well (that's why I mentioned the serial)

On consumer boards, that might not work very well though: shabby firmware development means MCA/MCE isn't reported properly on many, many boards.
(not to mention that you're not running ECC memory!)

Oh yes, I forgot a step: Run Memtest86+ for an hour, minimum! Ideally, keep the inside of the PC warm (50-60°C) with a space heater, etc.
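
The dmesg check mentioned above could be scripted roughly like this. It is a sketch, not a definitive filter: the pattern is a guess at what machine-check messages usually look like in the kernel log (`mce:` prefixes, `[Hardware Error]` tags), and reading the previous boot with `journalctl -k -b -1` assumes a systemd machine with a persistent journal, which matters here because the freeze needs a power cycle. `machine_check_lines` is a hypothetical helper name:

```python
import re
import subprocess

# Assumed markers for machine-check messages; adjust to what your kernel prints.
MCE_PATTERN = re.compile(r"\bmce:|\[Hardware Error\]|machine check", re.IGNORECASE)


def machine_check_lines(log_text):
    """Return the kernel log lines that look like MCA/MCE reports."""
    return [line for line in log_text.splitlines() if MCE_PATTERN.search(line)]


if __name__ == "__main__":
    # Kernel messages from the previous boot; may need root or journal group access.
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
    )
    hits = machine_check_lines(result.stdout)
    if hits:
        print("\n".join(hits))
    else:
        print("no machine-check lines found (or they were never reported)")
```

An empty result proves little on boards whose firmware does not report these events, as noted above.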

gsuberland,
@gsuberland@chaos.social

@manawyrm @mntmn yeah I was tracking these issues down on a workstation with a dual Xeon board that cost £700 so I may have had an easier time of it than is typical with consumer boards.

manawyrm,
@manawyrm@chaos.social

@gsuberland @mntmn You wouldn't believe how many enterprise boards don't properly report memory errors...

We have a little resistor on a stick now, which you can use to short out data pins on a running machine to cause a fault on purpose.

scy,
@scy@chaos.social

@manawyrm "So, what's your preferred debugging method?" – "Resistor on a stick."

manawyrm,
@manawyrm@chaos.social

@scy Someday I'll need to do a "hardware crimes" post with my favorite crimes against poor defenseless hardware :P

wanderingincode,
@wanderingincode@noc.social

@manawyrm @scy fun fact, you can drop pin headers into the outside edges of an isa slot and then push a couple of credit cards into the slot to push the pins outward to engage the headers.
