Oxidisation of Seagate & WDC PCBs

I came across a reference to this Russian forum thread in a WDC forum:

formatting link

Here is Google's translator:

formatting link

The thread discusses oxidisation of contact pads in current Seagate and Western Digital hard drives. The drives were used in typical office and home environments, and are about a year old. The thread has several detailed photos. All except the older tinned PCB appear to show evidence of serious corrosion.

Is this the fallout from RoHS? Surely it's not the result of some cost saving measure?

- Franc Zabkar

--
Please remove one 'i' from my address when replying by email.
Reply to
Franc Zabkar

I've seen PCB pads oxidize on old Conner RLL 5.25 FF drives. Remove the Torx screws and pull the PCB, buff the pads, reassemble, and the drive was good as new.

Reply to
Meat Plow

The silver ones are not oxidized. Silver reacts with sulphur, not oxygen. It is normal and cannot really be prevented. It is also not a problem on contacts that are not used, as the process stops itself once a thin coating is reached.

The golden ones look like the same thing to me. Maybe they used a high-silver-content gold here. Sorry, I am not a chemist. But my parents used to deal in silver jewellery and the look is characteristic.

I suspect air pollution as the root cause. As I said, it is not a problem in this case; the sulphurisation (?) process will not eat through the traces. If anything, they are better protected with this.

It would be a problem on the connectors, though. But those will have better and thicker gold anyway.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno

p://maccentre.ru/board/viewtopic.php?t=70953&start=15

ru&tl=en

Maybe not. There are other known culprits, like the drywall (gypsum board, sheetrock... whatever it's called in your region) that outgasses hydrogen sulphide. Some US construction from a few years ago is so badly affected by this toxic and corrosive gas that demolition of nearly-new buildings has been called for.

Corrosion of nearby copper is one of the symptoms of the nasty product.

Reply to
whit3rd

Hi!

I've seen minor occurrences of it and wondered what it was, but only on the "one use" contact pads on the bottom of the drive's PCB. (My guess is that these are used to set the drive up for its first time use and do some basic tests to assure the new drive is functional.)

Some drives had more of this apparent oxidation than others, but all of the ones I've seen had it from the moment they were removed from the package. It hasn't gotten any worse and these drives continue to operate properly. I checked a few at random and did not find a similar effect on the contacts going to the spindle motor or headstack.

William

Reply to
William R. Walsh

Does this mean we should apply contact protector, such as De-Oxit, to the PCBs to prevent corrosion?

Reply to
larry moe 'n curly

No need.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno

On Thu, 8 Apr 2010 14:03:39 -0700 (PDT), whit3rd put finger to keyboard and composed:

It's not just Russia that has this problem. The same issue comes up frequently at the HDD Guru forums.

- Franc Zabkar

--
Please remove one 'i' from my address when replying by email.
Reply to
Franc Zabkar

On Thu, 8 Apr 2010 20:20:55 -0700 (PDT), "larry moe 'n curly" put finger to keyboard and composed:

One of the sticky threads at the HDD Guru forums recommends that the preamp contacts on WD drives be scrubbed clean with a soft white pencil eraser whenever they come in for data recovery.

- Franc Zabkar

--
Please remove one 'i' from my address when replying by email.
Reply to
Franc Zabkar

I'm right here in the US and I had 3 of 3 WD 1TB drives fail at the same time in a RAID1, making the entire array dead. It is not as if you can simply buff that dark stuff off and you're good to go. The drive itself tries to recover from failures by rewriting service info (remapping etc.), but the connection is unreliable and it trashes the entire disk beyond repair. Then you have that infamous "click of death"... BTW, it is not just WD; the others are just as bad.

They had good old gold-plated male/female headers on older drives and those were reliable. Newer drives have, sorry for the expression, "gold plated" pads and springy contacts from the drive heads. That would have saved them something like $0.001 per drive compared with those headers, and they took that road. The gold plating was also of the cheapest variety possible, probably immersion, so it wouldn't last long. The newest drives from Seagate also have that construction, but the pads look tin plated, no gold. I don't know how long that would last.

What we are looking at is an example of a brilliant design with a touch of genius--it DOES last long enough for the drives to work through their warranty period, and at the same time it will NOT last long enough to keep them working very long past the manufacturer's warranty. I don't know if it is just greed/incompetence or a deliberate design feature, but if it is the latter, my kudos to their engineers for a job well done :(

--
******************************************************************
*  KSI@home    KOI8 Net  < >  The impossible we do immediately.  *
*  Las Vegas   NV, USA   < >  Miracles require 24-hour notice.   *
******************************************************************
Reply to
Sergey Kubushyn

That sounds like BS to me. A soft pencil eraser cannot remove silver sulfide; it is quite resilient. There are special silver-cleaning cloths that will do the trick.

Still, I doubt that this is a problem. It should not crawl between working contacts, only unused ones.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno

It is extremely unlikely for a slow chemical process to achieve this level of synchronicity; so unlikely that it would be fair to call it impossible.

Your array died from a different cause that would affect all drives simultaneously, such as a power spike.

Tin lasts pretty long, unless you unplug/replug connectors. That is its primary weakness.

I think you are on the wrong trail here. Contact mechanics and chemistry is well understood and has been studied longer than modern electronics. So has metal plating technology in general.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno

A better way to do this is to drag the English translation tool to the browser's toolbar:

The browser will automagically ask if you want to translate any foreign language web page that you view.

It's not oxidation. Oxides of both tin and lead are white in color. My guess(tm) is lead sulphide (galena), as lead sulphate and tin sulphate are usually white.

It's difficult to tell from the photos if the PCB contacts are gold or tin-silver. It's also difficult to tell if there was a mix of contact materials. Mixing gold and tin contacts usually results in black crud and fretting:

(see Fig 3). Contact material galvanic mismatch is another possibility.

Another possible culprit is a contaminated or poorly washed PC board. The typical kitchen environment will also cause a problem. I see it on machines and drives fairly often. If necessary, I just clean the contacts with a pink pencil eraser, and reassemble. I've NEVER had a drive failure that was directly attributed to such contact corrosion. It's usually something else that kills the drive.

Nope. If the contacts were tin-silver, 5% lead, or one of the other low lead alloys, the corrosion would probably be white or light gray in color. The dark black suggests there's at least some lead involved or possibly dissimilar contact material.

--
Jeff Liebermann     jeffl@cruzio.com
150 Felker St #D    http://www.LearnByDestroying.com
Santa Cruz CA 95060 http://802.11junk.com
Skype: JeffLiebermann     AE6KS    831-336-2558
Reply to
Jeff Liebermann

I think people are jumping to conclusions, because the discoloration is what they can see (and think they understand). There is a posting in this thread from a person who has had a 3-way RAID1 fail and attributes it to the contact discoloration. Now, with a slow chemical process, the required level of synchronicity is so unlikely that calling it impossible is fair.

Actually, pure silver also sulfidizes (?) in this way. The look is very characteristic. I think this is silver plating we see. It is typically not a problem on contacts that are in use; it does not crawl between contact points.

I suspect that in the observed instances this is a purely aesthetic problem and has no impact on HDD performance or reliability whatsoever.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno

Yes, they did not die from contact oxidation at that very same moment. I cannot even tell whether they all died the same month--that array might've been running in degraded mode with one drive dead, then after some time a second drive died but it was still running on the one remaining drive. And only when the last one crossed the Styx did the entire array go dead. I don't use Windows, so my machines are never turned off unless there is a real need for it. And they are rarely updated once they are up and running, so there are no reboots. Typical uptime is more than a year.

I don't know though how I could miss a degradation alert if there was any.

All 3 drives in the array simply failed to start after reboot. There were some media errors reported before reboot but all drives somehow worked. Then the system got rebooted and all 3 drives failed with the same "click of death."

The mechanism here is not that the oxidation itself killed the drives. It never happens that way. It was the root cause of the failure, but the drives actually committed suicide, like a body's immune system killing the body when it overreacts to some kind of hemorrhagic fever or the like.

The probable sequence is something like this:

- Drives run for a long time with the majority of the files never accessed, so it doesn't matter whether the part of the disk where they are stored is bad or not

- When the system is rebooted, RAID array assembly is performed

- While this assembly is being performed, a number of sectors on a drive are found to be defective and the drive tries to remap them

- Such action involves rewriting service information

- Read/write operations are unreliable because of the failing head contacts, so the service areas become filled with garbage

- Once the vital service information is damaged, the drive is essentially dead because its controller cannot read the data it needs to even start the disk

- The only hope for the controller is to repeat the read in the hope that it might somehow succeed. This is that infamous "click of death" sound: the drive tries to read the info again and again. There is no way it can recover because the data are trashed.

- Drives do NOT fail while they run; the failure happens on the next reboot. The damage that kills the drives on that reboot happened way before that reboot, though.

That suicide can also happen when some old file that has not been accessed for ages is read. That attempt triggers the suicide chain.

--
******************************************************************
*  KSI@home    KOI8 Net  < >  The impossible we do immediately.  *
*  Las Vegas   NV, USA   < >  Miracles require 24-hour notice.   *
******************************************************************
Reply to
Sergey Kubushyn

In article , Arno writes

It's a technique that has been used on edge connectors for many years.

--
Mike Tomlinson
Reply to
Mike Tomlinson

Yup, and it works. I learned the technique when servicing Multibus I systems, and still use it to this day.

Reply to
JW

That's the real problem with RAID using identical drives. When one drive dies, the others are highly likely to follow. I had that experience in about 2003 with a Compaq something Unix server running SCSI RAID 1+0 (4 drives). One drive failed, and I replaced it with a backup drive, which worked. The failure was repeated a week later when a 2nd drive failed. When I realized what was happening, I ran a complete tape backup, replaced ALL the drives, and restored from the backup. That was just in time, as both remaining drives were dead when I tested them a few weeks later. I've experienced similar failures since then, and have always recommended replacing all the drives, if possible (which is impractical for large arrays).

--
Jeff Liebermann     jeffl@cruzio.com
150 Felker St #D    http://www.LearnByDestroying.com
Santa Cruz CA 95060 http://802.11junk.com
Skype: JeffLiebermann     AE6KS    831-336-2558
Reply to
Jeff Liebermann

Ah, I see. I did misunderstand that. It may still be something else, but the contacts are a possible explanation with that.

So your disks worked and then refused to restart? Or you are running a RAID1 without monitoring?

Well, if it is Linux with mdadm, it only sends one email per degradation event in the default settings.
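
If that single mail is easy to miss, a tiny periodic check of /proc/mdstat does the job. A minimal sketch, assuming Linux software RAID (the script and the idea of letting cron mail its output are just illustrative):

#!/usr/bin/env python3
# Minimal sketch: warn when any md array in /proc/mdstat looks degraded.
# Assumes Linux software RAID; run it from cron so the output gets mailed.
import re
import sys

with open("/proc/mdstat") as f:
    mdstat = f.read()

degraded = []
for line in mdstat.splitlines():
    # Status lines end in something like "[2/2] [UU]"; a missing
    # member shows up as "_" inside the second bracket pair.
    m = re.search(r"\[\d+/\d+\]\s+\[([U_]+)\]", line)
    if m and "_" in m.group(1):
        degraded.append(line.strip())

if degraded:
    print("Degraded md array(s):")
    print("\n".join(degraded))
    sys.exit(1)  # non-zero exit, so cron/monitoring notices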

I run a long SMART selftest on all my drives (RAID or no) every 14 days to prevent that. Works well.

Yes, that makes sense. However, you should do surface scans on RAIDed disks regularly, e.g. by long SMART selftests. This will catch weak sectors early, and other degradation as well.
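
Kicking those off on a schedule is trivial, by the way. A minimal sketch, assuming smartmontools is installed and the drives show up as /dev/sd* (device names and the cron interval are just examples):

#!/usr/bin/env python3
# Minimal sketch: queue a long SMART selftest on every /dev/sd? device.
# Assumes smartmontools is installed; meant to be run from cron, e.g.
# on the 1st and 15th of the month for a roughly 14-day cycle.
import glob
import subprocess

for dev in sorted(glob.glob("/dev/sd?")):
    # "smartctl -t long" only queues the test; the drive runs it in the
    # background, and the result shows up in "smartctl -l selftest <dev>".
    subprocess.run(["smartctl", "-t", "long", dev], check=False)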

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno

For high-reliability requirements it is also a good idea to use drives of different brands, to get better-distributed times between failures. Some people have reported the effect you see.

A second thing that can cause this effect is when the disks are not regularly surface scanned. I run a long SMART selftest on all disks, including the RAIDed ones, every 14 days for this. The remaining disks are under more stress during an array rebuild, especially if they have weak sectors. This additional load can cause the remaining drives to fail a lot faster, in the worst case during the array rebuild itself.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
Reply to
Arno
