troubleshooting

By aaron.axvig, Thu, 12/12/2019 - 10:39

I left our Honda EU2200i generator out in the rain and it wouldn't start.  This wasn't a crazy torrential rain, and the outlets were on the leeward side, so I thought it would be OK with the extensive plastic shrouding.  I guess not, and I will try to keep it mostly dry in the future.

To troubleshoot I took off the big side panel (for oil changing and air cleaner access) and things looked pretty dry in there.  I also suspected the kill switch, so I managed to get my multimeter probes on the contacts there and it made and broke contact appropriately.  A peek in at the spark plug showed it was all dry (didn't actually pull the spark plug).  I pulled off the end panel on the outlet side and there was water pooled on a few components there...probably the problem even though many things are potted.  I blew into all of the various outlets.  Removal of the outlet plate itself showed very minimal water inside of there.

But still it wouldn't start.  Twice it sputtered for one second to taunt me.

Then I tried slamming the whole unit into the floor from 2-3 inches up.  It has pretty good shock absorbing feet so this didn't even jostle the generator that hard, but it fired right up after that!  So I guess a few drops were shorting out something (likely in the kill switch circuit) and the abrupt movement shook them loose.

BTW when we run the generator there the exhaust gasses can swirl around behind the dodger so we keep the companionway fully closed.  And there are some small cracks there but our dorades and the overall suction on the back of the dodger should keep air flowing outwards through those cracks, plus we would not sleep with it running.

By aaron.axvig, Sun, 12/08/2019 - 08:58

On our boat we have a Color Control GX that is mounted at the nav station.  And we have a MultiPlus inverter/charger that is under the couch.  We used to reach under the couch to turn the inverter on and off.  This is a minor hassle, and the switch feels sort of flimsy.  So it was exciting when I discovered one day that we could turn the inverter on and off using the CCGX.  So now we leave the physical switch On and control it using the CCGX.

That works great, except about once a month it does nothing when we choose On on the CCGX.  Troubleshooting the first few times this happened involved rebooting things, updating firmware, unplugging and replugging communications cables, and then finally just letting it sit.  And after a while it would eventually turn on.

Now I have discovered an easy workaround.  On the CCGX I choose Charger Only (which is a third setting in addition to On and Off) and that immediately takes effect.  Then I choose On and that immediately takes effect.
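The same two-step nudge could be scripted.  Here is a minimal sketch, not a tested integration: `send_mode` is a hypothetical callback standing in for however you actually talk to the CCGX, and the numeric codes are my assumption based on the values Victron documents for the VE.Bus mode setting, so check them against your own firmware before relying on them.

```python
import time
from typing import Callable

# Assumed VE.Bus mode codes; verify against your device documentation.
CHARGER_ONLY = 1
ON = 3
OFF = 4


def turn_on_with_nudge(send_mode: Callable[[int], None], pause_s: float = 2.0) -> None:
    """Switch to Charger Only first, then On, mirroring the manual workaround."""
    send_mode(CHARGER_ONLY)
    time.sleep(pause_s)  # give the first change a moment to take effect
    send_mode(ON)


# Usage with a stand-in sender that just records what it was asked to send.
sent = []
turn_on_with_nudge(sent.append, pause_s=0)
print(sent)
```

The pause length is a guess; on our system both changes take effect immediately, so it may not be needed at all.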

My best guess is that some part of the system thinks it is already on so does not accept the command to actually turn on.  But also I think it happens more often if I have a large AC load switched on, such as the electric kettle in the morning.

By aaron.axvig, Thu, 04/24/2008 - 03:00

Well I wouldn't say we are experts, but we definitely did have a near disaster.

It started 3 days ago when I installed an NVIDIA driver and program to monitor the RAID 5 running on our server's M2N-E motherboard.  It reported to me that the array was in a degraded state, but did not allow for repairing it from within Windows, requiring some BIOS-level maintenance.  So I planned to go investigate the situation the next day.

When I got to the server and hooked up a monitor and keyboard the video output was garbled.  We could tell it was going into BIOS but it was unreadable.  That's what we get for using a 10 year old PCI video card I guess.  So I planned to come back in a day or two with a different video card.

Of course it then decided to drop another drive from the array.  This took down our website because it was stored on the array.  By website, I mean Default Website, which has the misfortune of being linked to Outlook Web Access for Exchange 2007.  Which means the web.config no longer existed.  OWA would no longer run, and reportedly the only way to fix that is to do a complete re-install of IIS and Exchange 2007.
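Why a second dropped drive is fatal: RAID 5 stores one parity block per stripe, so any single missing block can be recomputed from the others, but two cannot.  Here is a minimal sketch of that idea using XOR parity on an illustrative four-member stripe (the block contents and member count are made up for the example, and a real controller does much more than this):

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_stripe(stripe: List[Optional[bytes]]) -> List[Optional[bytes]]:
    """Fill in at most one missing block (None) of a RAID 5 style stripe.

    The stripe holds the data blocks plus one parity block, so any single
    missing block equals the XOR of all the surviving blocks.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("two or more members missing: stripe cannot be rebuilt")
    rebuilt = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        rebuilt[missing[0]] = xor_blocks(survivors)
    return rebuilt


data = [b"web.", b"conf", b"ig!!"]
stripe = data + [xor_blocks(data)]  # three data blocks plus one parity block

degraded = [stripe[0], None, stripe[2], stripe[3]]
print(rebuild_stripe(degraded)[1])  # the lost block is recomputed

try:
    rebuild_stripe([stripe[0], None, None, stripe[3]])
except ValueError as err:
    print(err)
```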

Needless to say, we really wanted to get that RAID going again.  We rebooted between the RAID BIOS and Windows many times trying to rebuild it, but it would never show up in Windows.  After 10 fruitless repetitions of this we were about ready to call it a loss, wipe the array, and start over.

But then I thought of how the monitoring driver and utility I installed had been a relatively new version, and maybe it had incompatibilities with the older BIOS.  Updating the BIOS was worth a shot.

We downloaded the 1305 BIOS from the Asus website, which is still as horrible as it has ever been.  It wouldn't even load from the server, so we had to use another computer, network the downloaded files onto the server, and from there put them on the floppy.

After flashing the board from within BIOS we were greeted with a lockup immediately after the splash screen.  I spent half an hour unplugging things one-by-one trying to root out the problem, but it didn't help.  I tried unplugging all the cables and plugging them all back in (it sounds weird, but I have seen it fix many problems).  Visions of RMAing the board and going server-less for 2 weeks were running through my head, and I wasn't happy.  Finally I tried removing the battery for a minute, which somehow un-froze the BIOS.

From there it was a quick boot into Windows and then some big smiles as we saw that the array was back, and rebuilding.  No big re-installs this time.

But we weren't without issues.  Immediate attempts to copy files off of the drives were met by network timeouts--it seemed someone had forgotten to plug the network cable back in. :)

And this morning I saw that all the e-mails I received overnight reported being delivered on January 1st, 2008.  The clock needed to be set.  Also, OWA was still not running.  Some Google-ing of the error messages revealed that a few stopped services were probably the cause.  Starting them did get OWA running.  But outbound e-mails were not sending.  I figured this was due to the time issue still, and all the Exchange processes would need to be restarted, so I just rebooted the server.

20 minutes later the server was still not responding (I was at a remote location).  I quickly realized that it was probably halted at the BIOS screen because no keyboard was connected and the default of the new BIOS would be to halt on all errors.  Consider that lesson learned.

And now the happy ending.  Everything is running as it used to, no data was lost, e-mail is working, and I don't have to (get to?) spend my weekend re-doing a server!

By aaron.axvig, Sun, 03/23/2008 - 03:00

Yet another productive thing I tried to do tonight: get my old Mexico blog online.  As one might guess given that IIS7 is quite different from IIS6, Dasblog has some serious problems with Server 2008.  Those are resolvable, per Mr. Hanselman's instructions.

Unfortunately there are further problems.  This time with Registry changes.  According to Mr. Starr's analysis, Dasblog seems to think it's cool to pull time zone information from the Registry.  Seeing registry accesses in a web application does not make me happy:

 

[NullReferenceException: Object reference not set to an instance of an object.]
   newtelligence.DasBlog.Util.WindowsTimeZone.LoadTimeZonesFromRegistry() +283
   newtelligence.DasBlog.Util.WindowsTimeZone..cctor() +76

[TypeInitializationException: The type initializer for 'newtelligence.DasBlog.Util.WindowsTimeZone' threw an exception.]
   newtelligence.DasBlog.Web.SiteConfig.set_DisplayTimeZoneIndex(Int32 value) +23
   Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderSiteConfig.Read7_SiteConfig(Boolean isNullable, Boolean checkType) +4743
   Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderSiteConfig.Read8_SiteConfig() +76

By aaron.axvig, Wed, 03/05/2008 - 03:00

I just saw yesterday that FTP7 was out.  So of course I went off to install it.  When I was done, the Default Web Site on the server no longer worked, including Outlook Web Access.  Much panic ensued, as in the past OWA has been very finicky and once broken it stays broken.

It turns out that I had uninstalled the IIS6 components as part of cleaning up before doing FTP7 (I had installed the IIS 6 things for FTP6).  I'm not sure if IIS6 components and FTP7 can live alongside each other as I uninstalled FTP7 when trying to get the site working, but I would guess so.  Anyways, simply re-installing all of the IIS6 components fixed the problem.

By aaron.axvig, Mon, 10/22/2007 - 03:00

We had an issue at work where Word 2007 would run really slowly on any computer with Acrobat Professional installed.  Opening, closing, saving, and creating new documents all took 3-4 seconds, during which the entire computer would lock up and the processor would be at 100% utilization.  Even switching between programs would seem to do this.

A quick Google for Word 2007 lockups didn't turn up much other than a few references to the Add-ins menu, so I checked that out and found one for Acrobat PDFMaker Office COM Addin which didn't really seem necessary.  Turning it off instantly gave Word the excellent performance that I expected.  Here's how:

  • Click the Office button in the top left, and go to "Word Options" (bottom edge of the menu).
  • Select "Add-ins" on the left, and notice the "Acrobat PDFMaker Office COM Addin" under "Active Application Add-ins."
  • Towards the bottom of the dialogue you will see "Manage:" followed by a drop down menu. "COM Add-ins" should already be selected.
  • Click "Go" and on the next screen uncheck the box next to the "Acrobat PDFMaker Office COM Addin."
  • OK your way out of all the menus, and you should have improved performance without even restarting Word.

By aaron.axvig, Fri, 09/28/2007 - 03:00

I got this error when trying to install the Field Scoring component of the BEST scoring software.  After much angst and reading things on Google about how it's related to an incorrect directory inside the (InstallShield created) setup file, I finally stumbled upon a solution:

Earlier in the day I had renamed the Administrator account of the computer, and also changed its password (while logged in as Administrator).  I had not logged off between then and installing the software.  Logging off and then logging back on fixed the problem.

By aaron.axvig, Wed, 08/22/2007 - 03:00

Nitevision: a name which strikes fear into the heart of many a Medora call-center worker.

To be fair, it's not that bad of a program.  It's what Medora has used for motel reservations for many years.  Written by REMco Software out of Dickinson, ND, Nitevision is a client/server application which keeps track of motel reservations, who is checked in, which rooms are clean, etc.  As near as I can tell the client sends raw SQL queries to the server which then spits back some data for the client to display.  A workable strategy, I think.  However, there are some problems.

I started working in Medora's call center at the beginning of the summer of 2005, as a lowly Customer Service Representative (don't be fooled, I have really really really enjoyed all of my jobs in Medora).  I vaguely remember how the Nitevision server had to be restarted quite often because all the clients on the workstations would simply lock up.  I remember more closely how this also happened in the summer of 2006 when I was a team leader, supervisor of CSRs.  At one point I was even trained on how to restart the server because the IT guy wanted a day off.  I don't think I ended up having to restart it, but the Internet connection did go down at one point while he was gone, which is another story on its own (credit card processing requires an Internet connection).  Anyways, the restarts were so frequent that Nitevision got its own server so the ticketing system could stay up while it was rebooting.

Enter me (again), in the summer of 2007, as the IT assistant.  Now instead of crossing my fingers in hopes that it didn't crash, I had half of the workers asking me why Nitevision crashed on them all the time.  I didn't really know, but we two IT guys spent a lot of time thinking about it.  Many hours were spent on the phone with REMco support, and they even remoted into the server to delete some rows in a logging table that looked like they were taking a lot of space.  The problem went on though, with crashes becoming a daily occurrence, and often hourly during busy times of the day (early morning: lots of reservations, and mid afternoon: lots of checkins).  We pored over all the diagnostics we could find: CPU usage, RAM usage, HDD activity (which is actually difficult to monitor), network activity, and the Event Viewer.

Finally I cracked open the SQL Server logs.  I should have done this sooner, but SQL Server Management Studio wasn't installed on the server and I didn't have it on my desktop.  When I got it installed on my laptop though, I found the following error message repeated tens of times in the minutes leading up to each server crash: "This SQL Server has been optimized for 8 concurrent queries. This limit has been exceeded by xx queries and performance may be adversely affected."  xx would be a 1 or 2 for about 20 minutes (always spaced evenly exactly one minute apart) and then it would jump to 20 or 30 for the last few minutes before the crash.
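Spotting that pattern by eye took a while; pulling the numbers out of the log makes the ramp-up obvious.  Here is a minimal sketch, assuming the log can be read as plain text lines containing the message quoted above.  The timestamps and sample values below are invented for illustration, and the spike threshold of 10 is an arbitrary choice.

```python
import re
from typing import Iterable, List

# Matches the warning quoted above and captures the limit and the overflow count.
LIMIT_WARNING = re.compile(
    r"optimized for (\d+) concurrent queries\.\s+"
    r"This limit has been exceeded by (\d+) queries"
)


def exceeded_counts(lines: Iterable[str]) -> List[int]:
    """Return the 'exceeded by N' value from every matching log line, in order."""
    counts = []
    for line in lines:
        match = LIMIT_WARNING.search(line)
        if match:
            counts.append(int(match.group(2)))
    return counts


def first_spike(counts: List[int], threshold: int = 10) -> int:
    """Index of the first count at or above the threshold, or -1 if there is none."""
    for i, count in enumerate(counts):
        if count >= threshold:
            return i
    return -1


# Made-up sample lines, one per minute, ending in the jump seen before a crash.
MESSAGE = (
    "This SQL Server has been optimized for 8 concurrent queries. "
    "This limit has been exceeded by {n} queries and performance may be adversely affected."
)
sample_log = [f"08:{minute:02d} {MESSAGE.format(n=n)}" for minute, n in enumerate([1, 2, 1, 25, 30])]

counts = exceeded_counts(sample_log)
print(counts, first_spike(counts))
```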

Shortly thereafter we discovered that the server was running the Microsoft Data Engine, better known as MSDE, also well-known for being limited to 8 concurrent queries.  We have 10 call center computers, 6 front desk computers, call accounting, online reservations, accounting staff, and 3 group sales computers fighting for database access. REMco would not really acknowledge that this was the problem, and it's quite possible that they had no experience with this scenario, because judging by a list on their website of their customers, I suspect that we are their largest.  In the end though they did decide to help us move to a trial version of full-blown SQL Server 2000.

Migration day was quite exciting.  I arrived for work at 1:00pm to discover that they had taken down the server at 10:00am to start the migration.  And it still wasn't up.  I found a number of funny things going on:

  • They had backed up the databases and were then restoring them.  One backup was corrupted, and they were going to restore to the backup made during the night, losing an entire morning of new reservations.  So I taught the REMco tech how to detach and attach a database.

  • They were using "SQL Editor."  I had seen this tool before on the Nitevision server.  It seems like some watered down version of Management Studio.  I suspect the tool does not have functionality for attaching and detaching databases, which may be why they weren't doing that before.  I don't think it supports Windows authentication either, because they weren't able to connect to the new database engine...and that's because...


  • They installed the new engine with only Windows authentication.  Yes, the entire Nitevision program runs using SQL authentication.  SQL Editor uses SQL authentication also.  Upon pointing this out, it seemed that it wasn't merely an oversight on their part.  Rather, I think they genuinely did not know the difference between the two authentication methods.
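The difference between the two authentication methods shows up directly in the connection string.  Here is a minimal sketch using ODBC-style keywords; the server, database, and login names are hypothetical, and I have no idea what Nitevision's real connection settings look like.

```python
def odbc_connection_string(server: str, database: str,
                           user: str = "", password: str = "") -> str:
    """Build an ODBC connection string for SQL Server.

    With no user name, ask for Windows authentication (Trusted_Connection).
    Otherwise pass a SQL Server login (UID/PWD), which the engine only
    accepts when it was installed in mixed authentication mode.
    """
    parts = ["DRIVER={SQL Server}", f"SERVER={server}", f"DATABASE={database}"]
    if user:
        parts += [f"UID={user}", f"PWD={password}"]
    else:
        parts.append("Trusted_Connection=yes")
    return ";".join(parts)


# Hypothetical names, for illustration only.
print(odbc_connection_string("NVSERVER", "reservations"))
print(odbc_connection_string("NVSERVER", "reservations", user="nv_app", password="secret"))
```

An engine installed with only Windows authentication rejects the second form, which matches what we saw: a client built around SQL logins simply cannot connect.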

We finally got the thing running around 3:00pm.  Since then it has only required reboots every other week, as it gradually begins to freeze up more frequently for 30 seconds at a time.  End result?  Nitevision humming along acceptably, except for some annoying accessory apps running on the server that are poorly set up.  I'll elaborate on them some other time...along with several other interesting stories as I remember them.

By aaron.axvig, Tue, 08/21/2007 - 03:00

Well here we go; I'm going to detail the unpleasant experience of setting up our new server as best as I can remember it.

Problem 1:  Floppy disk with drivers needed for RAID functionality.  We actually had a floppy drive, and even a computer to connect it to, but no floppy disks could be found.  So we drove a couple miles to someone's house and found one floppy--an old Intel motherboard driver disk.  We fired up the ancient computer there, put the disk in, put the CD-ROM from Asus in the optical drive...and got stuck.  It wouldn't read the disk.  Closer examination revealed that it was actually a DVD disk, which the 5+ year old computer couldn't read.  We took the floppy home and made the disk there.

Problem 2:  Getting the computer to boot correctly.  Having not dealt with a floppy drive for several years, we were both unfamiliar with the cause of those cryptic "failure to find boot disk" messages, which were very vexing.  We initially blamed it on the RAID and how that fit into the boot order.

Problem 3:  Not having disk 2.  Server 2003 64-bit comes on two CDs.  We had 2 MSDN-iso burned disks, one labeled disk 1 and one labeled disk 2.  The second one was most certainly not disk 2.  Off to MSDN to download...and in the meantime we went ahead and installed updates and Service Pack 2.

Problem 4:  We ran disk 2, only to get a warning that Service Pack 2 had already been installed.  We proceeded on anyways.  Around this time we started getting random lockups.  Then a message popped up detailing that the RAID had entered a degraded state.  After messing around in the RAID software for a couple minutes, we decided that one of the drives was bad, and that we would have to reinstall on a RAID composed of the three remaining disks.

Problem 5:  Windows installed again, everything updated, RAID fails again.  So this time I backed up an image of everything we had set up to another computer, re-installed Windows on one of the SATA HDDs (not in a RAID) and restored from the backup.  This seemed to work alright, until we started to have a LOT of problems installing Exchange Server 2007.

So we re-installed again (fourth time if you're counting).  By now I figured that something was up and these disks weren't actually failing.  But we were also sick of the RAID idea so just installed Windows on a spare IDE HDD we had lying around.  In the meantime, we figured out that the disks probably hadn't been given adequate time to rebuild (although I'm still not sure why a new RAID with empty disks needs to be built).

This is the install we are currently running on, and it's working quite well.  After the RAID was given time to build (I went into the BIOS RAID control panel and told it to rebuild) it has been running fine.  We had quite a lot of trouble again with Exchange Server 2007, but that is another story altogether...

By aaron.axvig, Wed, 04/25/2007 - 03:00

So you've got a shiny new Vista install on your desktop, and you're happily installing all of your favorite programs.  One of them requires a reboot, and begrudgingly you comply.  On startup you get an error message reading "BOOTMGR is missing.  Press Ctrl+Alt+Del to restart."  That could be a problem.

As near as I've been able to figure out, Vista gets confused when installed while you have a SATA drive and an IDE drive plugged in and powered on.  In my case, I had one of each, and was installing to the SATA drive.  The IDE drive was already formatted as NTFS (not sure whether that matters, but it might).  So when I installed Vista, it put the Windows files on the SATA drive, but somehow decided to put the boot files on the IDE drive.  It worked like this for a couple of reboots, but then stopped working, giving me the mentioned error.

Vista repair to the rescue, right?  Reportedly it works quite well for some things, but not this case I guess.  I tried the auto repair function that specifically looks for startup problems, but it couldn't find any.  There is a command line option though, so in I dove...

The idea is that some files are placed on the wrong hard drive and they need to be on the other one.  It just so happened that the command line I got through the repair interface gave me C:\ as the drive that had the boot files (incorrectly), and D:\ as the main (SATA) drive that I wanted the files on.  To find out which is which in your case, just use the following commands (all quotes in this post are NOT part of the command, and the commands are NOT case sensitive):

  • "c:" to switch to the C drive.
  • "dir /a" to view all the files.  The "/a" switch is to show the hidden ones (the files you are looking for, listed below, are hidden).
  • "d:" to switch to the D drive.
  • "dir /a" to view all the files there.

Basically what you are looking for is the drive that has a BOOT folder with files in it, and also a file named BOOTMGR (not in the BOOT folder).  And you want to identify the drive that has the "WINDOWS" folder in it, as that is where you are going to copy those files.
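The check itself is simple enough to write down as logic.  Here is a minimal sketch of it in Python, purely to illustrate what the "dir /a" inspection is looking for; the repair command line will not have Python, so this is only something you could run from a working system with both drives attached, and the role names are my own labels.

```python
import os
from typing import Set


def drive_roles(root: str) -> Set[str]:
    """Report whether a drive root looks like the boot-file drive, the Windows drive, or both."""
    # Compare names case-insensitively, the way the Windows command line does.
    entries = {name.upper(): os.path.join(root, name) for name in os.listdir(root)}
    roles = set()

    boot_dir = entries.get("BOOT")
    has_boot_files = bool(boot_dir) and os.path.isdir(boot_dir) and bool(os.listdir(boot_dir))
    has_bootmgr = "BOOTMGR" in entries and os.path.isfile(entries["BOOTMGR"])
    if has_boot_files and has_bootmgr:
        roles.add("boot files")

    if "WINDOWS" in entries and os.path.isdir(entries["WINDOWS"]):
        roles.add("windows")
    return roles


# Only inspects the drive letters that actually exist on this machine.
for root in ("C:\\", "D:\\"):
    if os.path.isdir(root):
        print(root, drive_roles(root))
```

A healthy single-drive install would report both roles on the same drive; in my broken setup they were split across two.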

As far as copying the files, just use the following commands:

  • "c:" to switch to the C drive, or wherever you determined the boot files to be.
  • "xcopy /h bootmgr d:" to copy the BOOTMGR file to the D drive, or whichever drive you need to copy it to.  The "/h" switch makes sure that it sees hidden files.
  • "robocopy c:\boot d:\boot /mir" to copy the entire BOOT folder from the C drive to the D drive, again switching the drive letters as you deem necessary.  The "/mir" switch mirrors the entire directory structure, and is necessary because the BOOT folder contains some other folders with files.

That should do it, provided you just saw a bunch of files and copy commands scrolling by.  Just to be sure though, you should test to make sure this problem isn't going to rear its ugly head in the future.  What I'm thinking of is that maybe the files didn't get copied right and it is finding some weird way to use the files that are still on the wrong drive.  So I would recommend unplugging all the cables for the non-OS drive and making sure that you are able to boot Vista completely and that everything is in order.  Then you can plug that drive back in safe in the knowledge that it is not using the files on that drive and you can remove it in the future without repercussions.
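If you want an extra check that the files were copied intact, you can compare the two BOOT folders byte for byte.  Here is a minimal sketch using Python's standard `filecmp` module, meant to be run from some working system with both drives attached.  It does not replace the unplug-and-boot test above, and some files (the boot configuration store, for example) may be locked by a running Windows and fail to open.

```python
import filecmp
import os
from typing import List


def copy_differences(src: str, dst: str) -> List[str]:
    """List paths (relative to src) that are missing from dst or differ in content."""
    problems = []
    for folder, _dirs, files in os.walk(src):
        rel_folder = os.path.relpath(folder, src)
        for name in files:
            rel_path = os.path.normpath(os.path.join(rel_folder, name))
            target = os.path.join(dst, rel_path)
            if not os.path.isfile(target):
                problems.append(rel_path)  # never made it across
            elif not filecmp.cmp(os.path.join(folder, name), target, shallow=False):
                problems.append(rel_path)  # copied, but the bytes differ
    return sorted(problems)


# Drive letters as in my case; swap them to match yours.
if os.path.isdir("C:\\Boot") and os.path.isdir("D:\\Boot"):
    print(copy_differences("C:\\Boot", "D:\\Boot"))
```

An empty list means every file under the source folder exists on the destination with identical contents.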

(Note: Lots of information was derived from this Lifehacker post's comments, although my circumstances were a bit different and so my instructions are modified, mainly because those instructions copy the entire drive contents over, and I had data on the second drive that wouldn't fit on the OS drive.  Also, some thanks go out to this post over at Scott Hanselman's blog.)