PowerColor Begins Collecting Serials of Overheating Radeon 7900 Series GPUs
The saga of the overheating Radeon 7900 series GPUs is ongoing, and sadly it still seems like a confusing mess. There have been a few new developments since it was initially reported last week. Some users have tried to RMA their cards due to hotspots hitting 110C, but have been denied. Amidst the confusion, one of AMD’s partners is now stepping in to try to add some organization to the situation.
To recap, the brouhaha started a week ago. A website in Germany noticed several users on its forums reporting radical temperature deltas on AMD-designed Radeon RX 7900 XTX GPUs. Those are the red and black “reference” designs. The issue does not appear to affect partner boards with different coolers. This hints that the issue is due to the way the cooler interacts with the chiplets on the PCB. In some cases, the delta between the main die and a hotspot was as high as 53C, which seems out of spec. That pushed cards to 110C, causing thermal throttling and fans running at full speed. AMD announced it was investigating the issue, which brings us to now.
A thread on the AMD subreddit is titled, “Do not buy a 7900 XTX, or anything else for that matter, directly from AMD.” The user is experiencing the overheating issue, and says AMD initially told them it’s all “in-spec.” However, the thread got enough attention that AMD agreed to RMA the board. Another Redditor with the same issue was also refused an RMA by AMD. In the thread, they report AMD told them, “The temperatures are normal.” This has caused quite an uproar in the r/AMD subreddit.
Finally, an AMD engineer waded into the thread to provide some clarity (image above). An engineering lead named Kevin says AMD is aware of the issue and actively investigating by collecting serials and trying to reproduce the problem. Additionally, Kevin says that a main die temperature of 90C with a 110C hotspot is “within spec.” However, he says a main die at 70C with a hotspot at 110C is “not ideal.” Kevin says they’re looking at whether it can be mitigated via firmware or drivers, but “it’s not clear yet.”
Since there was still some confusion among Redditors about who to speak with about the issue, a hero emerged. A user named PowerColorSteven waded into the choppy waters to offer PC component vendor PowerColor’s assistance. Steven said everyone should email him, or message him on Reddit, regardless of which card they have. He will begin to collect serials, and then hand that information off to AMD. When describing the number of people afflicted, he wrote “def more than a handful.”
It seems as if, for now, PowerColorSteven is the point man on this operation. All information should flow through him, and he will then pass it along to AMD and its partners. Remember that, unlike Nvidia’s partners, some of AMD’s partners make reference boards. Once he’s collected enough data, he says he’ll post all the info he has. The big question seems to be, why is it hitting 110C to begin with, and is that within spec like AMD says it is? Also, what is it about AMD’s design that is enabling this behavior?
One interesting ripple is that Wccftech dug up an old AMD blog post about the RX 5700 series from 2019. The blog discussed operating temperatures for the card. In it, AMD explained 110C is just fine. It reads, “Operating at up to 110C Junction Temperature during typical gaming usage is expected and within spec.” Of course, that’s a different architecture, but it would be surprising if that’s also expected behavior on the 7900 XTX. If AMD has to admit that publicly to defuse this crisis, it could make things worse. For reference, most GPUs are expected to run at about 70C or so, and most are able to do that too, including AMD’s cards. Our sister site PCMag detected no overheating in its review of the RX 7900 XTX (below).
For now, this is an ongoing investigation. However, at this point, it can be labeled a real problem, and not user error or a handful of bad cards. We don’t know how widespread it is yet, but hopefully, we’ll find out more soon. It’s a real bummer for AMD though, which has always been in a David vs. Goliath scenario with Nvidia. AMD also has a history of buggy drivers and has been working hard on regaining gamers’ confidence. This is surely a setback, but it’s not clear how it will impact AMD’s reputation with gamers.