Richard's Management Blog

Save the Drama for your Momma

May 2004 - Posts

Uggghhhh!!!!! My Sync Tool isn't working either!!!!

Man oh man, doesn't it just suck when you work so hard on automating your Patch Management infrastructure and it's someone else's misconfigured software that is causing you headaches? Again this month there are reports that people's Sync Tools aren't downloading the latest copy of the Security Update Bulletin Catalog (MSSECURE.CAB). This is due to ISPs caching the old one. Unfortunately the only way to fix it is to call up your ISP, tell them to clear the cache, and tell them (try to be nice) to stop caching it going forward.

Now....I know there are a lot of smart fellers up there in Redmond. A lot of them are buddies of mine. Perhaps there is something we can do in the future in the Sync Tool code to work around this issue (wink, wink, nudge, nudge). ;-)

Anyone in the community got any ideas? I'd love to hear them. How about renaming the cab to reflect the month? Send them to me or post them!
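
To make the "rename the cab" idea a bit more concrete, here is a rough Python sketch of two possible workarounds: a month-stamped file name, so a stale cached copy never matches the new name, and a request that asks caches along the way to revalidate. Everything in it is an assumption for illustration. The URL, the naming pattern, and both function names are made up, and this is not how the Sync Tool works today.

```python
from datetime import date
from urllib.request import Request

# Hypothetical URL for illustration only; the real download location is not given here.
CATALOG_URL = "http://example.invalid/mssecure.cab"


def monthly_cab_name(when, base="mssecure"):
    """Return a month-stamped file name, e.g. 'mssecure_2004_05.cab'."""
    return f"{base}_{when.year:04d}_{when.month:02d}.cab"


def no_cache_request(url):
    """Build a request whose headers ask intermediate caches to revalidate."""
    headers = {
        "Cache-Control": "no-cache",  # HTTP/1.1 caches
        "Pragma": "no-cache",         # older HTTP/1.0 caches
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    print(monthly_cab_name(date(2004, 5, 1)))
    print(no_cache_request(CATALOG_URL).header_items())
```

A badly behaved proxy can ignore those headers, which is why the renamed file is probably the more dependable of the two ideas.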

Hooray! An SMS guy blogging!

Craig from the SMS Product Group started a new blog. Great to see a member of the SMS team blogging. Maybe we can convince some more to come out of hiding <g>

Reaction Times

Thinking about feature sets in products, it's really evident that there will always be give and take on both ends. How much is too much though? Back in the day SMS Administrators used to have a pretty rough time creating big packages for patch management. There was detection of the OS, seeing which updates were needed, installing the proper ones, verifying that they were installed, etc, etc, etc. What a pain that was. I find myself asking though - are we better off with the simplicity Microsoft has built into the product if the accuracy is lower than what we got from our in-house developed tools?

Don't get me wrong, I LOVE the new Software Updates features of SMS 2.0 and especially the UI interaction in SMS2K3. But I can also throw down with the best of them when it comes to packaging and scripting, and my Hotfix packages never used to have the detection problems we see today with the MBSA. So basically you have a bunch of very technical scripters and packagers who used to write their own code and have now been handed a solution set from Microsoft because of product feedback. When there is an issue with detection, etc, they have to sit in front of management and explain that although their jobs are supposedly easier now, they have more hassles to deal with.

Not only that, but it's not always just an MBSA detection issue. Many times it flat out can't detect an update at all and you are no better off than you were before. How long has the SUSFP for 2.0 been out now? Over 1 1/2 years? Personally I'm a person that does things all or nothing and I expect the same from members of my team. It's very frustrating when I'm paying money for a product and it only does 85% of what it claims to do.

And I'm not just slamming MS or the MBSA here. Shavlik and other vendors have the same trouble (case in point, no one detects OE or IM updates, which is a joke since OE is really part of the Operating System). Yes, I realize that file checksums and registry checks must work together to make sure everything was installed fine, but what would rock would be a system where you could bypass one or the other and use the intrinsic SMS functions to detect needed updates. For example, instead of editing SMS_DEF.MOF, imagine if the Software Updates Scan Tool had an INI file where you could add lines for registry checks that would roll up into SMS.
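
Just to show what I mean by the INI idea, here is a minimal Python sketch. The file layout, the section and option names, and the registry paths are all invented for illustration, since no such INI file exists in the Scan Tool today. The registry lookup is passed in as a function so the sketch runs anywhere; on a real client it would be a real registry read.

```python
import configparser

# Hypothetical INI layout: one section per update, one registry check each.
SAMPLE_INI = r"""
[OE-Update-Example]
key = HKLM\SOFTWARE\Example\Updates\OE
value = Installed
expected = 1

[IM-Update-Example]
key = HKLM\SOFTWARE\Example\Updates\IM
value = Installed
expected = 1
"""


def load_checks(ini_text):
    """Parse the INI text into a list of registry checks."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return [
        {
            "update": section,
            "key": parser[section]["key"],
            "value": parser[section]["value"],
            "expected": parser[section]["expected"],
        }
        for section in parser.sections()
    ]


def missing_updates(checks, read_registry):
    """Return the updates whose registry value does not match what is expected.

    read_registry(key, value) returns the stored string, or None if absent.
    """
    return [
        check["update"]
        for check in checks
        if read_registry(check["key"], check["value"]) != check["expected"]
    ]
```

The list that comes back is the part that would need to roll up into SMS inventory. How that roll-up would actually happen is exactly the piece I would want the product to handle.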

I just hate having something that only works 85% of the time when I could do it myself (yes, with a bit more work) and be more accurate. Microsoft and other ISVs know about these issues plaguing the field. It's about time to react and fix them instead of us constantly hearing about roadmaps.

Bogus TCO

A lot of discussion goes into TCO for organizations. Choosing this Management product over that one because it does X, Y, and Z while the other only does X and Y will supposedly produce a lower TCO. Running this Operating System as opposed to that one will supposedly give you a lower TCO in this scenario. Who came up with this new age version of Voodoo Economics? IT departments need to constantly get feedback from their user community as to what they need to do their jobs better. They should actually be going out of their way to learn the daily processes and job functions of the customer base they are supporting so that they can make better informed decisions that will benefit the company to the greatest extent. Does having a snazzy web reporting module showing disk queue length for all your servers really benefit your customers? Does this justify paying more for a product when the other one does the same thing through an Administrator's Console? How about Patch Management features in your product? Does it allow you to customize patching, reboots, etc around your customers' schedules or does it abruptly interrupt work? Doesn't this work stoppage work against TCO and in fact raise it? It's all about the 'X' factor. Management of your enterprise is about having product sets that are flexible enough to adjust to your organization's needs. Leave the bells and whistles for Hallmark.

Throwing Water on the Fire

Looks like we've got another worm on our hands. Many Administrators are getting sleep tonight because they know their systems have been patched for a bit now. They have set up the proper processes and/or technologies throughout the enterprise to be proactive and patch before something like this happens. But what about those Administrators who haven't patched their systems yet (this isn't necessarily a failure on an Administrator's part but could have been caused by a political, business, or other issue) and are stuck fighting the worm?

There are good practices and procedures that should be followed for remediation across the Enterprise as well.  Similar to how you handle Management across your systems, if you aren't organized with a good plan and using the proper tools you will be spinning your wheels during remediation as well.

The first step that needs to be done is organization and ownership of resources. Many times large organizations have clients and servers spread across different buildings and groups that are assumed to be managed but are not. The best step in this process is to have a contact person or 'Incident Owner' who can designate which groups are responsible for which machines and when they need to address them by. Usually this person in a large company is a member of the Security Team and also owns other processes such as vulnerability scanning.

After organization and delegation of duties, the method of worm/virus cleaning and patching should be agreed upon as well. Most large Anti Virus vendors provide tools to clean systems, and a good move more recently has been made by Microsoft to provide tools for this process as well. Almost all come with silent command line switches so you can easily incorporate them into an SMS package and distribute remotely. Another option is to use a scripting language such as VBS or Perl to loop through systems, copy the tool locally and execute it (such as seen here).

It's important to note that a mistake made by many is to immediately disconnect or 'blackhole' vulnerable systems from the network when an outbreak occurs in order to contain it. Unless the number of unpatched systems is small, this is usually the most counterproductive move that can be made. By doing this you will not only impact business greatly but also cause your desktop support personnel and possibly your Administrative staff to 'Sneakernet' their way to possibly hundreds of systems to clean and patch them. A more effective means of containing a worm like this is to attempt cleaning and patching an infected or vulnerable system remotely (as stated previously with SMS or a script - the tools are there for you to do this!) before removing it from the network. Not only does this take workload off your staff but it can also resolve the problem much faster. Another strategy, depending on the size of the outbreak, is to only disable infected systems while still continuing remote patching efforts.
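
The "loop through systems, copy the tool, run it" approach can be sketched in a few lines. This is only a rough outline in Python rather than VBS or Perl, and it assumes you supply the two functions that actually copy and launch whatever cleaning tool you use. The names `copy_tool` and `run_tool` are made up for this sketch, and no real vendor switches are shown.

```python
def remediate_remotely(hosts, copy_tool, run_tool):
    """Try to clean each host over the network before anyone walks to it.

    copy_tool(host) copies the cleaning tool to the host (raises OSError on failure).
    run_tool(host) runs it silently and returns the tool's exit code (0 = success).
    Returns (cleaned, needs_visit) as two lists of host names.
    """
    cleaned, needs_visit = [], []
    for host in hosts:
        try:
            copy_tool(host)
            exit_code = run_tool(host)
        except OSError:
            # Unreachable or access denied: leave it for desktop support.
            needs_visit.append(host)
            continue
        if exit_code == 0:
            cleaned.append(host)
        else:
            needs_visit.append(host)
    return cleaned, needs_visit
```

The second list is the point of the exercise. Only the machines that could not be fixed remotely end up on the Sneakernet list, instead of every vulnerable box on the network.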

Eventually there will almost always need to be a physical visit to some systems when you have an outbreak. Usually your Desktop Support Staff will be dispatched to clean the virus and patch the system. I cannot stress enough how important it is to communicate instructions on how to do this to all vested parties. Many times Administrators will understand a worm and create tools to remediate those systems but never communicate this information to the support staff that may need it the most.

Finally I'd like to talk about scanning tools. If you are in a pinch here you may not have the proper tools to find vulnerable systems still left on your network. Foundstone has a great one out but it's not the best for reporting. There are also the new MBSA scripts which have awesome reporting but the performance isn't that great. Just remember to keep scanning your networks for a bit after remediation efforts are complete so that you know you've put this beast to bed.
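
Whatever scanner you use, the "keep scanning for a bit" part is easy to track if you save the list of vulnerable hosts from each pass. Here is a small Python sketch of that bookkeeping. The function names and the three-clean-scans threshold are my own assumptions for illustration, not something any of these tools does for you.

```python
def compare_scans(previous, latest):
    """Compare two sets of vulnerable host names from consecutive scans."""
    return {
        "fixed": previous - latest,      # vulnerable before, clean now
        "remaining": previous & latest,  # still vulnerable
        "new": latest - previous,        # newly found, e.g. a laptop back on the LAN
    }


def outbreak_closed(scan_history, clean_scans_required=3):
    """True once the most recent scans all came back with no vulnerable hosts."""
    recent = scan_history[-clean_scans_required:]
    return len(recent) == clean_scans_required and all(not scan for scan in recent)
```

The "new" bucket is the one to watch, since machines that were off the network during the outbreak tend to show up there days later.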

After all this, schedule a post-mortem-type meeting with all parties to discuss improvements for next time - especially why your Patch Management tool/process didn't cover all your systems!