Thursday, May 18, 2023

Synology DS218j Dies Mysteriously

I've enjoyed my Synology DS218j (NAS) for several dependable, trouble-free years, until yesterday.


The device began to lose connectivity with my Samsung smart TV, only once in a while at first, but lately almost daily. Rebooting the TV seemed to get around the worsening problem until finally the device was a ghost on my network, never to be seen again.

Some initial research suggested that yes, the device's ethernet adapter could fail, and yes, that particular part per Synology cannot be replaced.

With nothing to lose, I decided to crack open the unit to see if any components had come loose due to a cold solder joint. Right away, I noticed the circuit board has a button cell, specifically a KTS CR 1220 3-volt lithium battery.


I popped out the old battery installed at the factory and tested it with a battery tester like this one. Nothing, not even enough voltage to power the LCD to register a reading! To verify, I tested a fresh CR 1220 right out of the package and it read just under 3 V.


I discarded the old battery, installed its replacement, then reassembled the unit and started it up. Now, Advanced IP Scanner verifies it's back on my LAN and ready to go.
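If you don't have a scanner handy, a rough reachability check can be scripted. This is only a sketch: the address in the comment is a made-up placeholder, and port 5000 is assumed to be the HTTP port of the NAS's web interface (the usual DSM default, but yours may differ).

```python
import socket

def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical address, assumed port):
#   is_reachable("192.168.1.50", 5000)
```

A False result only means nothing answered on that port; the device could still be on the network with the web interface on a different port.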



On PCs, button cells have been used for quite a while (though seemingly less so nowadays) to keep the real-time clock running and to preserve a system's BIOS settings in battery-backed memory.

I'm curious whether this was Synology, on the down-low, building some planned obsolescence into their product design. Unless you're comfortable tinkering with hardware, as a longtime IT guy like me has become, you as a mere consumer would more likely seek a replacement than open up a device to troubleshoot.

Interestingly, in the product manual (PDF) you will note just one solitary reference to a "battery", in the form of an explosion hazard warning on page 5. Nothing about how to replace it, or that it might ever need replacing.
 

I will say, it does not inspire confidence when your product quietly includes a button cell. Conveniently for Synology, here it resides on the circuit board, which sits upside down beneath the drive assembly and is thus not visible when doing the designed task of adding or replacing hard drives.



Sunday, May 7, 2023

Hard Drive Replacement and USB

Just got a shiny new Western Digital Black 10TB hard drive to replace my somewhat ailing WD 6TB Blue with just over 10 years of solid 24/7 performance. 


Electromechanical hard drives, like many electronics in general, will likely go the distance if no failures develop in the first few days, weeks, or months of use, but over time they do simply begin to wear out. My venerable drive for games and media was beginning to develop subtle issues, like large copy operations interrupted by occasional retries before finally completing, or files that suddenly proved unreadable when I tried to access them.

S.M.A.R.T. software on board reported no anomalies other than some higher-than-normal temperatures (mostly due to a dead case fan I replaced along with the drive) and a couple hundred sector reallocations, indicating the drive had found some physically bad sectors and moved what data it could to spare ones.
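If your S.M.A.R.T. tool can dump the attribute table as text, the reallocation count can be pulled out with a few lines. The sketch below assumes the column layout printed by smartmontools' `smartctl -A`, which is not necessarily the tool I used, and the sample numbers are invented for illustration.

```python
from typing import Optional

# Invented sample in the assumed `smartctl -A` layout (not my drive's real data)
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       212
194 Temperature_Celsius     0x0022   108   095   000    Old_age   Always       -       42
"""

def reallocated_sectors(report: str) -> Optional[int]:
    """Return the raw Reallocated_Sector_Ct value, or None if the row is absent."""
    for line in report.splitlines():
        fields = line.split()
        # The raw value is the tenth whitespace-separated column
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None

print(reallocated_sectors(SAMPLE))
```

A count that keeps climbing between checks is usually a stronger warning sign than any single reading.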

I decided to go the relatively lazy route of taking the old drive out, installing the new one, and planning to transfer the data via USB 3.0. Lazy, mainly because having already installed the new drive and put my rig back together, I didn't want to crack the case open again, despite the significantly faster direct SATA transfer speeds. That prompted a series of irritating events, which I'm sharing here in case you, dear reader, have recently experienced the same yourself.

Upon rebooting my Windows 10 system, all was well until I opened Disk Management to initialize, partition, and format the drive, at which point my system suffered its first BSOD of the afternoon, citing a file, mrcbt.sys, which belongs to the backup software I use, Macrium Reflect. Simply restarting afterward seems to have gotten around that issue, and my setup booted normally with the newly partitioned and formatted 10TB drive online.

New drive up and running in Disk Management.

Here's where things got irritating. I installed my old drive in a decent USB 3 external hard drive bay by Sabrent, powered the enclosure up, then plugged it into my USB 3 hub. Immediate BSOD, citing "memory management" issues with no mention of specific files.

I stepped back to review my options after a couple more tries led to the same outcome. I could hook the enclosure up to my wife's PC on our LAN, share it from there, and transfer over the wire. This, though, would add a lot more time and overhead on top of asking a lot of my senior citizen hard drive in its time of subtle decline. I also could have hooked it up to a laptop, but that would take even longer to transfer the more than 4 TB of data and burden the old drive further.
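To put rough numbers on that trade-off, here is a back-of-the-envelope estimate. The throughput figures are assumptions picked for illustration, not measurements from my setup, and a tired drive retrying reads would do worse.

```python
def transfer_hours(size_tb: float, mb_per_s: float) -> float:
    """Hours to move size_tb terabytes at a sustained mb_per_s (decimal units)."""
    total_mb = size_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return total_mb / mb_per_s / 3600

# Hypothetical sustained rates in MB/s (assumed, not measured)
ASSUMED_RATES = {
    "direct SATA": 150,
    "USB 3.0 enclosure": 110,
    "gigabit LAN share": 100,
    "laptop over Wi-Fi": 30,
}

for label, rate in ASSUMED_RATES.items():
    print(f"{label}: about {transfer_hours(4, rate):.1f} hours for 4 TB")
```

Even under generous assumptions, every option is an hours-long job, so the slower routes mean many more hours of sustained reads from a drive that is already struggling.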

Ultimately I paid for my laziness and ended up cracking the case open, and as I type the old drive is resting a bit precariously but securely on the floor beside the case providing data to the new drive. Meanwhile I wondered why memory, in this case, was cited as the culprit.

Even after decades of supporting it, Windows to me is largely a black box, even with almost a decade of software and database development on Windows systems. I don't care to delve into the plumbing and avoid doing so whenever possible, but one thing I recalled is the fact that I had been using the old drive to host my system's page file as well as the go-to place for applications and the OS to store temporary files.

From my wife's PC I plugged the drive enclosure in with the old drive on board and tried to delete the TEMP folder. Curiously, Windows wouldn't allow me to do so, presumably because my user account on my PC was the owner of that folder. Okay, so I took ownership of the TEMP folder, and then tried to delete all the contents. Curiously, even this didn't quite work. As is typical when trying to delete files from a folder that held page and temp files, one file refused to be deleted from Windows Explorer.

I opened an elevated (administrator) Command Prompt, and curiously it not only wouldn't allow me to delete the file, but kept complaining the file could not be found. What? It's right there, what gives?? I found that even after taking ownership, and even after Explorer allowed me to rename the TEMP folder, that one file, with a GUID-style name resembling daf2743a-311f-4315-9272-be2dca1fa178.tmp, refused to die. Even deleting the folder failed, though renaming worked. Weird!

Sabrent USB 3 drive enclosure, for 2.5" and 3.5" drives / SSDs.


I'm used to situations where a Windows PC might BSOD if certain types of USB thumb drives or other peripherals are plugged in and powered on at boot, but that wasn't the case here, as the system BSOD'd whether the drive was connected via USB at boot or after the fact.
That Windows on a foreign system seemingly "respected" the other system's TEMP folder, as seems to be the case here, is curious, but perhaps it is something done for security reasons.

Maybe for Microsoft it was sort of a quick and dirty remediation for a security vulnerability involving bad actors trying to recover data from a stolen operating system drive, where that TEMP folder would typically reside. This, though, just happened to be a case where, instead of using my main SSD with the operating system, I was using my media drive to house the pagefile and temp files, so perhaps that's why events conspired to not allow that magic to happen via USB.

With the old drive simply connected directly via SATA to my rig, the transfer has been running for minutes now without issue, and given the old drive's TEMP folder still lingers, USB may be the extra ingredient that frustrated my initial, lazy attempts.