Do you have a server reboot/restart policy?
  v3.0 Posted at 11/07/2017 2:32 AM by Tiago Araujo

If your servers are down, or have to go down during business hours, notify the users at least 15 minutes beforehand so that you do not get 101 people all asking you whether the computer is down.

For short outages that affect only a few people, IM is the best method. If you use Teams or Skype, open a new chat window, click the "Invite" button, select all the recipients, and then click OK to add them to the message window. IM works best for short outages with a small number of users, or for out-of-hours maintenance when you do not expect anyone to be using the affected systems. If users are not online in Teams or Skype, they can't complain that they were not warned.

For extended or planned outages, or if you have a larger number of users (50+), email is the suggested method.

If you send an email, it is a good idea to tell users how they can monitor the network themselves, e.g. with software solutions like SCOM or WhatsUp Gold.

Include a "To Myself" section. It gives visibility to others who are interested in what needs to be done to fix the problem, and it makes it easier to remember to send the "done" email, e.g. "done - CRM is alive again".

E.g.:

Subject: Network Outage
To: SSWALL

Planned/Unplanned: Planned
Change Description: MERMAID – install Windows Server 2008 SP2 at 9 PM
Risk (see table below): LOW RISK (LOW Probability and MEDIUM Impact)
Reason For Change: Windows 2008 SP2 is a prerequisite for TFS SharePoint integration
Uptime over last month: 94.059%
Planned Outage (mins): 150
Planned Start Time: 26/10/2009 9:00 PM
Planned Finish Time: 26/10/2009 11:30 PM
Affected Services:
\\Mermaid
http://sharepoint.ssw.com.au
http://intranet.ssw.com.au
http://projects.ssw.com.au


Risk Lookup Table by Probability and Impact:

                   Probability
Impact     Low          Medium       High         Unknown
Low        Low Risk     Low Risk     Low Risk     Medium Risk
Medium     Low Risk     Medium Risk  Medium Risk  High Risk
High       Medium Risk  High Risk    High Risk    High Risk
Unknown    Medium Risk  High Risk    High Risk    High Risk
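
The lookup can also be expressed in code. The following is a minimal Python sketch, assuming the table is read with Impact as rows and Probability as columns; the names LEVELS, RISK and lookup_risk are illustrative and not part of the original rule.

```python
# Risk matrix transcribed from the lookup table: RISK[impact][probability column].
# Assumption: rows are Impact, columns are Probability (Low, Medium, High, Unknown).
LEVELS = ("Low", "Medium", "High", "Unknown")

RISK = {
    "Low":     ("Low Risk", "Low Risk", "Low Risk", "Medium Risk"),
    "Medium":  ("Low Risk", "Medium Risk", "Medium Risk", "High Risk"),
    "High":    ("Medium Risk", "High Risk", "High Risk", "High Risk"),
    "Unknown": ("Medium Risk", "High Risk", "High Risk", "High Risk"),
}


def lookup_risk(probability: str, impact: str) -> str:
    """Return the risk rating for a probability/impact pair (case-insensitive)."""
    p = probability.strip().capitalize()
    i = impact.strip().capitalize()
    if p not in LEVELS or i not in LEVELS:
        raise ValueError(f"expected one of {LEVELS}, got {probability!r} / {impact!r}")
    return RISK[i][LEVELS.index(p)]
```

For the example email, lookup_risk("LOW", "MEDIUM") returns "Low Risk", which matches the "LOW RISK (LOW Probability and MEDIUM Impact)" line.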


Note: The following servers will be affected (if this is a Hyper-V host)

rule-outage-1.jpg
http://owl/NmConsole/Reports/Full/Group/Performance/RptGroupPingAvailability/RptGroupPingAvailability.asp?_nDeviceGroupID=-1&_sStartDate=2/11/2008&_sEndDate=3/11/2008&_nStartTime=1205154000000&_nEndTime=1205154000000&RptGroupPingAvailability.oTablePingAvail=&RptGroupPingAvailability.oTablePingAvailSumamry=&_nDeviceID=71&DeviceStatus.nWorkspaceID=10012&_nReportID=145&_oComboDateRange=Custom&_sStartTime=12:00%20AM&_sEndTime=12:00%20AM


rule-outage-2.jpg

To myself,

To show others who are interested in what needs to be done to fix the problem:

Detailed Change Plan:
1) Lock out users via IIS
2) Backup server
3) Install Service Pack (Windows Server 2008 SP2) 
4) Reboot server
5) Follow test plan
6) Based on result of test plan, follow backout plan if procedure failed
7) Procedure completed
Test Plan:
1) Check Event log for errors
2) Check each affected service is running
3) Call test users to start “Test Please” on the affected services
4) Get result of user “Test Please” by email by 11:15 PM
Backout Plan:
1) Restore server from backup
Note: <This is as per rule What is your server reboot/restart policy?>
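
The "Planned Outage (mins)" value in the example email follows directly from the planned start and finish times. As a rough sketch, the Python below derives it and formats a few of the header fields. The function names outage_minutes and format_notice are hypothetical, and the date format is assumed to be day/month/year as in the example.

```python
from datetime import datetime

# Assumption: timestamps look like "26/10/2009 9:00 PM" (day/month/year, 12-hour clock).
TIME_FORMAT = "%d/%m/%Y %I:%M %p"


def outage_minutes(start: str, finish: str) -> int:
    """Return the planned outage length in whole minutes."""
    begin = datetime.strptime(start, TIME_FORMAT)
    end = datetime.strptime(finish, TIME_FORMAT)
    if end <= begin:
        raise ValueError("finish time must be after start time")
    return int((end - begin).total_seconds() // 60)


def format_notice(description: str, start: str, finish: str, services: list[str]) -> str:
    """Format some of the notification header fields as plain text lines."""
    lines = [
        f"Change Description: {description}",
        f"Planned Outage (mins): {outage_minutes(start, finish)}",
        f"Planned Start Time: {start}",
        f"Planned Finish Time: {finish}",
        "Affected Services:",
    ]
    lines.extend(services)
    return "\n".join(lines)
```

With the times from the example, outage_minutes("26/10/2009 9:00 PM", "26/10/2009 11:30 PM") gives 150, the figure shown in the email.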


Immediately before the scheduled downtime, check for logged-in users, open files, and database connections.

Users

Open 'Windows Task Manager' (Run > taskmgr) and select the 'Users' tab. Check with users if they have active connections, then have them log off.

rule-outage-3.gif
Figure: Connected users can be viewed in Task Manager

Files

Open 'Computer Management' (Run > compmgmt.msc), then 'System Tools > Shared Folders'. Check 'Sessions' and 'Open Files' for user connections.

rule-outage-4.gif
Figure: Computer Management 'Open Files' View

Database

Open SQL Server Management Studio on the server. Connect to the local SQL Server. Expand 'Management' and double-click 'Activity Monitor'.

rule-outage-5.gif
Figure: SQL Management Studio 'Active Connections' View

Once these have been checked for active users, and users have logged off, maintenance can be carried out.
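
The user and file checks above can also be gathered from the command line. The sketch below is one possible approach, not part of the original rule: it runs the built-in Windows commands "query user", "net session" and "net file" from Python and collects their output. It assumes it is run on the server itself, from an elevated prompt where required, and it does not cover SQL Server connections. The names CHECKS, run_command and collect_checks are illustrative.

```python
import subprocess

# Built-in Windows commands; some of them need an elevated (administrator) prompt.
CHECKS = {
    "users": ["query", "user"],      # logged-on users, similar to Task Manager's Users tab
    "sessions": ["net", "session"],  # sessions connected to the server's shares
    "open files": ["net", "file"],   # files currently open through shares
}


def run_command(args):
    """Run one command and return its combined stdout and stderr as text."""
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def collect_checks(runner=run_command):
    """Run every check and return a dict of check name -> raw command output."""
    return {name: runner(args) for name, args in CHECKS.items()}


if __name__ == "__main__":
    for name, output in collect_checks().items():
        print(f"--- {name} ---")
        print(output)
```

The runner parameter exists only so the collection logic can be exercised without a Windows server; the raw output still has to be read by a person before anyone is asked to log off.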

Restarts should only be performed during the following time periods:

  1. Between 7am and 7:05am
  2. Between 1pm and 1:05pm
  3. Between 7pm and 7:05pm
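
A script that triggers restarts could guard itself with a check against these windows. The following Python sketch assumes server-local time and treats each window as including its start minute and excluding its end (7:05 itself counts as outside); that boundary choice is an assumption, as the rule does not specify it. RESTART_WINDOWS and in_restart_window are illustrative names.

```python
from datetime import datetime, time

# The three five-minute restart windows (start inclusive, end exclusive by assumption).
RESTART_WINDOWS = [
    (time(7, 0), time(7, 5)),
    (time(13, 0), time(13, 5)),
    (time(19, 0), time(19, 5)),
]


def in_restart_window(moment: datetime) -> bool:
    """Return True if the given local time falls inside a permitted restart window."""
    now = moment.time()
    return any(start <= now < end for start, end in RESTART_WINDOWS)
```

A maintenance script might call in_restart_window(datetime.now()) and refuse to reboot when it returns False.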

If a scheduled shutdown is required, use the PsShutdown utility from Microsoft's Sysinternals site.
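
As an illustration, the Python sketch below builds a PsShutdown command line for a scheduled reboot. It relies on the -r (reboot), -t (countdown in seconds or a 24-hour h:m time) and -m (message shown to logged-on users) switches as described in the Sysinternals documentation; confirm them against the usage text of your installed version before relying on this. The function name psshutdown_reboot_args and the example values are hypothetical.

```python
from typing import List, Optional


def psshutdown_reboot_args(at_time: str, message: str,
                           computer: Optional[str] = None) -> List[str]:
    """Build a psshutdown argument list that reboots at a 24-hour hh:mm time."""
    hours, minutes = at_time.split(":")
    if not (0 <= int(hours) <= 23 and 0 <= int(minutes) <= 59):
        raise ValueError(f"not a valid 24-hour time: {at_time!r}")
    args = ["psshutdown"]
    if computer:
        # Remote targets are given as \\computer; omit this for the local machine.
        args.append(f"\\\\{computer}")
    args += ["-r", "-t", f"{int(hours):02d}:{int(minutes):02d}", "-m", message]
    return args
```

The resulting list can be passed to subprocess.run on a machine where psshutdown is on the PATH, for example psshutdown_reboot_args("19:00", "Rebooting for SP2 install", "MERMAID").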

Reply Done when you finish the task.
