Cluebat-man to the rescue

A weblog dedicated to Visual C++, interoperability and other stuff.

Planning the plant shutdown, part 2

The preparation of the plant shutdown continues unabated. More and more I can understand why this needs months of intense planning.

There are lots of dependencies between the jobs that need doing. For example, the floors of the cold room need a new coating. For this, the entire room needs to be brought to near ambient temperature, but this cannot happen too suddenly or the floor will crack.

Sanding away the old coating can be done while the temperature is still low. And during the first day, another team has access to the ceiling panels, but not during the final sanding or coating stages. Other work needs to wait until the coatings are dry. The other teams can then work in the cold room as the temperature is dropping again.
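To make the ordering concrete, here is a minimal sketch of the cold room jobs as a dependency graph, sorted with Python's standard `graphlib` module (Python 3.9 or later). The task names and the exact edges are my own reading of the description above, not an official work plan.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must be finished first.
# Names and edges are illustrative assumptions based on the text above.
depends_on = {
    "rough_sanding": set(),        # can start while the room is still cold
    "ceiling_panels": set(),       # other team, first day only
    "warm_up": set(),              # gradual, or the floor will crack
    "final_sanding": {"rough_sanding", "ceiling_panels"},
    "coating": {"final_sanding", "warm_up"},
    "drying": {"coating"},
    "cool_down": {"drying"},       # temperature drops again after drying
    "other_work": {"drying"},      # can run while the room cools down
}

# static_order() yields one valid order in which every task
# appears after all of its prerequisites.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

Any order it prints is only one of several valid ones; the point is that coating can never be scheduled before both the warm-up and the final sanding are done.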

But as the temperature is re-established, it is important that the controller loops can run uninterrupted. This means that from that moment on I can no longer disrupt the DCS network, because that would cause problems for the people whose work relies on the DCS network functioning correctly.

After a lot of discussion back and forth, I've been given one weekend (from Friday evening to Sunday evening) in which I can do with the system as I please. During that same weekend, all the utilities will be down, as well as the mains electrical distribution.

During the shutdown I need to do quite a few things. As luck would have it, we have bought additional server racks and servers. Several servers need replacing anyway, so this is a rough sequence of how we will do things:

  1. Create an additional Domain Controller, put it in the new rack, disconnect it from the network, and connect it to a new switch. That switch will be the backbone of a 'clone' of the DCS domain.
  2. Rename it to the same name as the old master server which has to be replaced.
  3. Issue a development freeze so that we can restore the databases to the new server without risking the loss of work.
  4. Restore the master databases and perform the software upgrade.
  5. Add new servers in the rack to replace the other machines that needed replacing. All new servers will be connected to the cloned domain.
  6. Move the engineering workstation to the cloned domain.
  7. Wait for official signoff of the DCS network. At this moment we have the major infrastructure running on a cloned domain.
  8. As soon as we have ownership of the network, we shut down the live domain controller and all other servers that have a replacement server running in the cloned domain.
  9. As soon as the domain controllers are down, the backbone switch of the cloned domain is connected to the live network so that the new domain controller can take over.
  10. The automation engineers can start their work as soon as that happens, because the engineering workstation is already back online.
  11. While the automation engineers do their thing, we can move the remaining servers into their new slots in the new server racks and reconfigure them.
  12. The old servers which were replaced can be taken out of the existing racks and put aside for repurposing.
  13. The additional servers which were bought for new functionality can be moved into the existing racks and configured.
  14. While all this is going on, the operator terminals can all be reconfigured one by one after their software upgrade.

This is a rough first draft of what I will do during the shutdown. I'll need to make a more detailed plan and see what can be done in parallel. The bulk of the work needs to be done beforehand so that the cloned domain is ready well before the shutdown. We only have 2 days of shutdown to do what we need, so everything needs to be done as efficiently as possible.
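To see what could run in parallel, the numbered steps above can be grouped into "waves" of steps that have no unfinished prerequisites. This is only a sketch: the helper `parallel_waves` is hypothetical, and the dependencies between the steps are my own assumptions (for example, that step 7, the signoff, depends on nothing I control), not a finished plan.

```python
from graphlib import TopologicalSorter

def parallel_waves(depends_on):
    """Group steps into waves; steps within a wave can run side by side."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()                      # raises CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything that can start now
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as finished
    return waves

# Step number -> steps that must be finished first (assumed, not confirmed).
steps = {
    1: set(),            # build the extra domain controller and clone backbone
    2: {1},              # rename it to the old master server's name
    3: set(),            # development freeze
    4: {2, 3},           # restore databases, upgrade software
    5: {1},              # new replacement servers in the cloned domain
    6: {4},              # move the engineering workstation
    7: set(),            # official signoff of the DCS network
    8: {4, 5, 6, 7},     # shut down the live domain controller and old servers
    9: {8},              # connect the clone backbone to the live network
    10: {9},             # automation engineers start
    11: {9},             # move and reconfigure the remaining servers
    12: {8},             # pull the replaced servers from the existing racks
    13: {12},            # additional servers into the existing racks
    14: {9},             # reconfigure the operator terminals
}

waves = parallel_waves(steps)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: steps {wave}")
```

Under these assumptions, everything up to and including step 6 can be prepared before the shutdown weekend, and the waves from step 8 onwards are what has to fit into the 2 days.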

Posted: Sep 29 2008, 12:25 AM by vanDooren | with no comments