Thursday, February 21, 2019
Disaster Recovery
Disaster recovery is the planning and implementation of a process whereby a company can recover from a catastrophic information technology failure. The three main categories of disaster exposure are natural threats and hazards (including hurricanes, flooding, earthquakes and fire), technical and mechanical hazards (such as power outages, gas leaks, accidental or deliberate Halon discharges, or chemical spills) and human activities and threats (like computer error, loss of records, vandalism, sabotage or epidemic) (Rike, 2003). The goal of disaster recovery planning in information technology is to restore access to business data and system resources as quickly as possible, as well as to minimize data loss and physical resource loss. Disaster recovery must address each of the main categories of threat, assess the likely impact and chance of occurrence of each one and plan reactions and facilities accordingly. Disaster recovery is not only important for the IT-based company, but for any company which is vulnerable to natural disaster or malicious attack.

Proper planning of a disaster recovery framework will improve response time, minimize data loss and speed the recovery of access to data and computing resources. Disaster recovery planning for information technology includes data assurance with a proper backup and restore plan; network continuity; intrusion detection and response; proper facilities planning, including air conditioning, fire detection and control and environmental sensors; and personnel training to ensure a proper response. A business's disaster recovery framework may extend beyond its information technology into facilities management, human resources and other operations.

Disaster recovery is a comparatively new facet of information technology planning which has rapidly become more important as businesses have become more dependent on technology resources. Many modern businesses come to a standstill without their technology base, and this can be devastating to the business. Rike (2003) noted that 93% of companies which suffer a major data loss go out of business within five years of that loss. However, according to Rike, many companies are unprotected from this danger: two surveys noted that only 35% of small and midsize businesses have a disaster recovery framework in place, while only 36% of all businesses and government offices have such a framework.

Disaster Recovery Case Studies

One of the first discussions of disaster recovery in information technology occurred after the 1995 Kobe earthquake in Japan. Garland and Morimoto (1996) provide an account of the performance of Kobe University's disaster recovery framework, as well as the effects of the earthquake itself on the university's IT infrastructure. The Kobe earthquake, referred to as the Great Hanshin Earthquake Disaster, struck the Kobe area in the early morning hours of January 17, 1995. Aftershocks and fires worsened the damage caused by the earthquake, cutting off communications and electricity to the region. Transportation routes were completely blocked by collapsed roadways and damaged rail lines.
The earthquake, which measured 7.2 on the Richter scale and left nearly 5,400 dead as well as 400,000 homeless in its wake, was one of the worst disasters to occur in modern Japan. The university, where the authors were teaching at the time, lost two professors and thirty-nine students, as well as all of its laboratory animals. Data loss was extensive, and computing equipment loss was exacerbated by physical damage caused by falling furniture and books. The university's telephone and fax connections were completely cut off. However, despite the damage to the university's infrastructure and surrounding community, Internet connectivity was restored within a few hours of the earthquake. The resulting email access (there were no extensive Web-based resources at the time) gave students and staff external communication, a means to reassure loved ones and a connection to government disaster recovery resources. University personnel also used cellular phones, a then-nascent technology, to connect to the outside world.

Kobe University was using the best available technology at the time, which allowed for quick recovery of its lightweight machines. The IT personnel at the university noted specifically that the hardest-hit IT resources were the older-style, stationary, heavyweight servers and storage units, rather than the newer equipment which was designed to be moved and handled. Specific successes of the Kobe University disaster recovery included the use of alternate routes of communication, broadcast communication to all personnel involved (including students and staff), quick restoration of outside connectivity, setup of alternate email access points and gateways to continue providing communication, and the use of more robust, newer hardware resources. Some of the problems with the university's disaster recovery were the lack of a system-wide backup plan, leading to widespread data loss; unstable physical premises, leading to damage, including fall damage to computer equipment placed inappropriately close to various hazards; and environmental system failure, leading to the death of the lab animals.

Because Kobe University is the first instance of formalized study of disaster recovery in information technology, a number of questions arise from the planning and execution of the recovery. What are the priorities of the business or institution when planning? How do you put into place organization-wide policies, such as data backup, which reduce the risk of failure? How do you deal with facilities and functions (such as public utility infrastructure) that are out of your control?

A more recent demonstration of the importance of disaster preparedness and recovery was Hurricane Katrina, in 2005. Chenoweth, Peters and Naremore (2006) examine the disaster preparation and recovery response of a New Orleans hospital during the hurricane and the flooding that followed. East Jefferson General Hospital, located in Jefferson Parish, was one of three hospitals in New Orleans to remain open during and after the storm. The hospital planned for a two- to three-day emergency situation; staffers brought appropriate supplies for only a few days. There were over 3,000 people, including staff, patients and community members, as well as a handful of pets, sheltering at the hospital by the time the storm hit New Orleans on August 28.
The hospital's IT staff worked quickly to move vital equipment out of harm's way: they moved data center equipment to upper floors and PCs and other equipment away from windows, printed hard copies of patient records, contact information and other vital data, and set up a hospital command post with PCs, telephones and fax machines for outside connectivity. The hospital itself did not sustain a high degree of physical damage in the storm, in contrast with Kobe University. However, the infrastructure of the city itself was nearly destroyed, with electricity, telephone and water cut off, roads blocked, and food and drinking water supplies scarce. The hospital was isolated from the rest of the world for over a week as external recovery crews worked.

East Jefferson General Hospital did have a written disaster recovery framework in place prior to Hurricane Katrina. According to Chenoweth et al (2006), the IT department had a hot site arrangement with SunGard; weekly backups of the hospital's data were stored in a local record vault and occasionally retrieved for safe storage in SunGard's offsite facility in New Jersey. Unfortunately, the evacuation of the vault's staff left the tapes inaccessible. During the storm, the hospital lost first grid power and then generator power; communications were lost as the BellSouth central office (CO), then the onsite CO, and finally the hospital's Cox cable Internet connection went down. The rapidly changing situation, according to the authors, forced a reprioritization of IT resources and efforts from internal systems maintenance to restoring and maintaining communication with the outside world. The IT staff found a usable dialup line and set up email access on most of the PCs on site; they also leveraged spotty cellular voice and messaging services to maximize communications, which allowed them to coordinate with rescue teams and officials and to arrange for food, water and generator deliveries. The internal telephone system was also used to maintain communication throughout the hospital.

A secondary concern for the hospital, according to Chenoweth et al (2006), was its employees; in particular, circumventing the normal payroll system, which was inaccessible, in order to provide funds to employees who were facing high expenses due to evacuation. This was accomplished by using the Internet to send each employee a funds transfer approximating their last paycheck. Similar workarounds were created for accounts receivable, with employees manually entering charges and emailing them to the system provider for processing. The hospital's outsourced IT provider also had its own issues to deal with: it had to locate missing employees (accomplished within three days by using a broadcast approach of Internet connections and message boards and by contacting family and friends of the staffers; this is in contrast to many other companies, which were still struggling to locate employees by November) and prevent employee burnout by arranging for relief staffers.

East Jefferson General Hospital's IT infrastructure was back up and running only a week after the storm hit, and it began providing patient services immediately. Its disaster recovery framework, as well as quick thinking in reprioritizing that framework when it became clear that it did not match the profile of the disaster it was supposed to counter, was a clear factor in the hospital's swift recovery and return to service.
Following the experience of Katrina, the hospital's IT staff reviewed its disaster recovery framework and cited a number of changes that should be made, including increasing emergency communications capacity, maintaining high-speed Internet access and implementing an automatic switching mechanism should a generator go down again.

Disaster Recovery Framework Design

The experiences of Kobe University and East Jefferson General Hospital clearly indicate the need for robust disaster recovery planning. While disaster recovery is not always a matter of life and death as it was in these two cases, it can often mean the difference between a company that recovers successfully and one that is driven out of business by a critical failure. How can a company begin to develop a disaster recovery framework, and how extensive does this framework need to be?

Benton (2007) suggested that the disaster recovery framework must begin with a formal business impact assessment. This assessment draws on the knowledge and experience of the IT staff and the CIO to determine what the critical pieces of IT infrastructure are for a given company. A business impact analysis (BIA) is a way in which the contribution or importance of a given business resource can be analyzed and expressed in dollars-and-cents terms, in order to allow corporate officers to place the correct emphasis during disaster recovery. The BIA also includes subjective observations of each resource's importance, giving decision makers an overall view of the organization. The second piece of the decision-making process is the risk analysis. What kinds of disasters are likely, Benton asked, and how much damage are they likely to cause should they occur? Exactly how likely is a disaster to happen? Benton urged caution on this question; as he pointed out, the risk of being unprepared is potentially far greater than the cost of preparedness.

Rike (2003) discussed the risk analysis that should be performed before beginning a business inventory analysis and disaster recovery planning. Risks should be analyzed in three different dimensions: the type of risk, the likelihood of the risk and the magnitude of the risk. Rike divided risk types into three general categories: natural threats and hazards, technical and mechanical hazards, and human activities and threats. Rike noted that it is not always possible to predict some types of disasters, such as human activities, while others, such as common weather phenomena, can be planned for in advance. The third dimension of risk analysis is the magnitude of the potential risk. Rike identified three categories of magnitude: community-wide disasters, such as the Kobe earthquake and Hurricane Katrina discussed above; localized disasters, affecting a building or group of buildings, such as a water leak or electricity outage; and individual disasters, affecting only a single organization, department or worker. A disgruntled worker sabotaging data exemplifies this last situation.
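Rike's three dimensions lend themselves to a simple scoring exercise. The Python sketch below ranks a few hypothetical threats by combining likelihood and magnitude scores so that planning effort follows the highest-scoring risks first; the threat list, numeric scales and scoring rule are illustrative assumptions, not values drawn from Rike (2003).

```python
# Illustrative risk-scoring sketch based on Rike's three dimensions
# (type, likelihood, magnitude). The threats, scales and scoring rule
# are hypothetical examples, not values taken from Rike (2003).
from dataclasses import dataclass

LIKELIHOOD = {"certain": 4, "likely": 3, "unlikely": 2, "extremely unlikely": 1}
MAGNITUDE = {"community-wide": 3, "localized": 2, "individual": 1}

@dataclass
class Threat:
    name: str
    risk_type: str   # "natural", "technical" or "human"
    likelihood: str  # key into LIKELIHOOD
    magnitude: str   # key into MAGNITUDE

    def score(self) -> int:
        # Simple product; a real assessment would also weigh estimated cost.
        return LIKELIHOOD[self.likelihood] * MAGNITUDE[self.magnitude]

threats = [
    Threat("hurricane", "natural", "likely", "community-wide"),
    Threat("power outage", "technical", "certain", "localized"),
    Threat("insider sabotage", "human", "unlikely", "individual"),
]

# Rank threats so planning effort goes to the highest-scoring risks first.
for t in sorted(threats, key=Threat.score, reverse=True):
    print(f"{t.name:18} {t.risk_type:10} score={t.score()}")
```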
Rike (2003) outlined a proposed schedule and method for designing a disaster recovery framework. The first step, obtaining top management buy-in and support, is critical in order to fund and implement the disaster recovery framework. It is also necessary for top staff to be informed of disaster recovery procedures because they will be ultimately responsible for its implementation. The second step Rike suggested was to establish a planning committee, staffed with personnel from facilities, information technology and other critical departments, who will be responsible for planning and implementing the policy. The third step in Rike's method is to perform a risk assessment and conduct a BIA. The risk assessment should include determining the types of risk the organization is subject to and their likelihood, the consequences and estimated cost of each scenario, the replacement cost of data, equipment and staff recovery versus the cost of disaster framework implementation, and the potential for the worst-case scenario to occur. Rike's fourth step is the determination of critical business facilities: business equipment, connectivity through Internet and phone lines, the internal phone system, fire and fumigant systems and whatever other facilities are required to continue operating. This step also includes the determination of disaster recovery procedures and documentation, vital records and personnel. Step five is the procurement and preparation of disaster recovery facilities, including offsite storage facilities, an inventory of critical documents, policy and procedure manuals, master lists of staff contact information, vendor information, account numbers and other vital information, and a review of security and environmental systems. Step six is the preparation of a written framework, taking into account the information gathered in steps one through five. Rike recommended that a standard format and software package be used to write the framework, rather than a customized solution. The framework should then be reviewed on a frequent basis to ensure continued alignment with company business and goals as well as with changes to potential risk. The final step in Rike's methodology is to test the written framework in order to make sure it is feasible.

In order to begin developing a disaster preparedness framework, Benton suggested a company-wide IT inventory detailing application, storage and server assets. These assets can then be ranked into categories depending on the importance of the business application and the replacement cost of the equipment. There are two main ranking criteria. Recovery time objective (RTO) is the maximum acceptable amount of time between disaster and service resumption. Recovery point objective (RPO) is the maximum amount of allowable data loss. Benton recommended a multi-tier system: at the top level should be no data loss and minimal downtime, or an RTO and RPO of close to 0, reserved for mission-critical functions and business units that provide immediate revenue for the company. Business units should then be ranked in descending order according to their revenue-generating potential and criticality. At its lowest level, Benton suggested, the RTO could be extended out to 72-96 hours.
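As a rough illustration of such a tiered scheme, the following sketch assigns hypothetical business units to recovery tiers based on the downtime and data loss each can tolerate. The tier limits (apart from the 72-96 hour lowest level Benton mentions) and the example units are assumptions for illustration, not figures from Benton (2007).

```python
# Minimal sketch of a multi-tier RTO/RPO classification in the spirit of
# Benton's ranking. Tier limits and example business units are illustrative
# assumptions, except the 72-96 hour lowest tier mentioned in the text.

# Per-tier guarantees: (max RTO hours, max RPO hours), most demanding first.
TIERS = [
    (0, 0),    # tier 0: no data loss, minimal downtime (mission-critical)
    (8, 1),    # tier 1: back the same business day
    (24, 12),  # tier 2: back within a day
    (96, 24),  # tier 3: Benton's lowest level, out to 72-96 hours
]

def assign_tier(rto_limit_hours: float, rpo_limit_hours: float) -> int:
    """Return the cheapest (least demanding) tier whose guaranteed RTO and
    RPO still fall within what the business unit can tolerate."""
    for tier in range(len(TIERS) - 1, -1, -1):
        max_rto, max_rpo = TIERS[tier]
        if max_rto <= rto_limit_hours and max_rpo <= rpo_limit_hours:
            return tier
    return 0  # stricter than every tier: treat as mission-critical

# Hypothetical units with the downtime/data loss they can tolerate, in hours.
units = {"order entry": (0, 0), "payroll": (8, 1), "reporting": (96, 24)}
for name, (rto, rpo) in units.items():
    print(f"{name:12} -> tier {assign_tier(rto, rpo)}")
```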
Rike (2003) identified key questions to use when conducting the BIA, including "How would the department in question operate if online systems were not available?" and "What is the minimum space required for the department to operate?"

Benton prioritized two critical preplanning steps for disaster recovery. The first was data consolidation: optimizing the protection of data by assembling all critical data in a single location for ease of backup and recovery. This can be accomplished with a centralized file server in a small organization, or with a SAN or NAS device in a larger one. The second prerequisite, which can be more complicated than storage consolidation, is server consolidation. This step can be complicated because the performance profiles of servers vary, as do their processing and network access requirements.

Benton further discussed the complexities of disaster recovery of data. Among the problems he noted are difficulties with logical consistency and order of recovery. If standard file backup technologies are used, backups may not be logically consistent when they are recovered, because they will be recovered to slightly different points in time; newer snapshot technologies can alleviate this problem, however. Another consistency issue is data replication, which may be interrupted when the write heads lose power. Finally, the order of recovery is important because some applications and servers will depend on other servers being restored first in order to maintain logical consistency.
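One way to make such restore ordering mechanical is to record the dependencies as a graph and sort it topologically, so that every server is restored only after the servers it depends on. The approach and the example server names below are illustrative assumptions, not something Benton prescribes.

```python
# Sketch of dependency-ordered recovery: restore each server only after the
# servers it depends on are back. The dependency graph is a hypothetical
# example; graphlib is in the Python standard library (3.9+).
from graphlib import TopologicalSorter

# server -> set of servers that must be restored before it
dependencies = {
    "dns":      set(),
    "storage":  set(),
    "database": {"storage", "dns"},
    "app":      {"database"},
    "email":    {"dns"},
    "web":      {"app", "dns"},
}

restore_order = list(TopologicalSorter(dependencies).static_order())
print(restore_order)
# e.g. ['dns', 'storage', 'email', 'database', 'app', 'web']
```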
Benton also noted that disaster recovery resources should be maintained separately from periodic backups and archival procedures, because the data storage procedures used for backup and archival may not be adequate or appropriate for disaster recovery. Finally, Benton remarked that hardware designated for disaster recovery should be exercised in a non-emergency situation in order to ensure that it is properly configured and connected.

Rike (2003) recommended a course of action in the event that the disaster recovery framework needs to be put into action following a physical disaster. The first step in Rike's method is to perform a damage assessment in order to determine the extent and type of damage, the size of the area affected and which assets have been damaged. Rike's second step is damage control through environmental stabilization. In the event of physical damage, the damage can become permanent very quickly. Rike suggested that the physical environment must be stabilized by drying the air, removing water and dirt particles, restoring air conditioning and performing whatever other cleanup is possible. She suggested that materials such as power generators, sump pumps to remove standing water, high-powered fans, plastic sheeting, absorbent materials and other cleanup equipment be kept on hand in order to speed environmental stabilization. Once the environment is stable, Rike prioritized activation of the emergency team as defined in the disaster recovery framework, and then restoration and cleanup; this cleanup can in some cases be performed by business staff, but in others, such as a toxic spill or mold contamination, it should be handled by specially trained professionals.

While Rike discussed physical disaster recovery resulting primarily from natural or mechanical threats, Patnaik and Panda (2003) discussed data recovery from a malicious attack, addressing the human-threat perspective. Malicious attacks on data and application resources can come either from within the business (most often from a disgruntled employee) or from outside the business (hackers or industrial spies). As Patnaik and Panda noted, it is not necessarily possible to distinguish a malicious attack from a legitimate data transaction. According to the authors, the requirements for protecting data from malicious attack include protection from unauthorized users, detection of hostile activities and damage recovery. Unfortunately, as the authors noted, in the case of a database storage system it is not always possible, even with these precautions in place, to catch every malicious transaction. This is particularly problematic when the malicious actor is someone who has trusted access to the system. If a malicious transaction is committed to the database, it is then seen as legitimate and may be propagated to other areas of the database through normal interactions. In order to prevent this spread, a quick recovery is required. Unfortunately, the authors noted, the size of database logs often precludes a fast recovery, due to extended periods of time spent accessing and applying the logs. To remedy this, Patnaik and Panda proposed a partitioned, or segmented, log solution which allows recovery from a malicious transaction by accessing only one of the log segments rather than the full log. According to the authors, this improves recovery time by an order of magnitude over applying the full redo log.
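The toy sketch below illustrates the general intuition behind log segmentation: if records are partitioned (here, by the data item they touch), undoing a malicious transaction only scans the relevant segments instead of the entire log. This is a deliberate simplification for illustration, not Patnaik and Panda's actual algorithm, which must also handle transactions that read data written by the malicious one.

```python
# Toy illustration of segmented logging: undo records are partitioned by
# the data item they touch, so recovery from a bad transaction scans only
# the affected segments rather than the full log. A simplification for
# illustration, not Patnaik and Panda's actual algorithm.
from collections import defaultdict

log_segments = defaultdict(list)  # segment key -> list of undo records

def append(txn_id: int, item: str, before, after):
    """Write an undo/redo record into the segment for the item it touches."""
    log_segments[item].append({"txn": txn_id, "before": before, "after": after})

def undo_transaction(txn_id: int, items: list[str], db: dict):
    """Undo a malicious transaction by scanning only the affected segments."""
    for item in items:
        for record in reversed(log_segments[item]):
            if record["txn"] == txn_id:
                db[item] = record["before"]

db = {"balance_a": 100, "balance_b": 50}
append(1, "balance_a", 100, 90)  # legitimate update
db["balance_a"] = 90
append(2, "balance_b", 50, 0)    # malicious update
db["balance_b"] = 0
undo_transaction(2, ["balance_b"], db)  # only segment "balance_b" is read
print(db)  # {'balance_a': 90, 'balance_b': 50}
```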
Disaster recovery is a relatively inexpensive method of assuring business continuity in the wake of a natural, physical or human event or attack. The cost of not having a disaster recovery framework is, as Rike (2003) noted, extremely high: 93% of businesses which suffer a major data loss go out of business within five years. The experiences of Kobe University and East Jefferson General Hospital demonstrate the value of a disaster recovery framework, as well as the importance of examining priorities when deciding on the framework. While physical premises may be covered by insurance in some cases, the same is not typically true for data, institutional knowledge, continued business and personnel. In order to implement a disaster recovery framework, one can follow Rike's (2003) methodology, beginning with gaining the support of senior staff and appointing a disaster recovery planning committee, performing a risk analysis and a BIA, determining and putting in writing a disaster recovery framework, and finally testing the framework to ensure its viability. These steps will help to protect the business in the event of a disaster, whether it is natural, mechanical or human in origin, and whether it is localized or community-wide.

Research Proposal

A business needs a way to determine whether a disaster recovery framework is appropriate for it, and to weigh the relative risks and costs of implementing such a framework against the cost of replacing lost business assets and personnel in the event of a disaster. Following steps three and four of Rike's methodology will provide a determination of the utility of a disaster preparedness framework for a given business. In order to perform this analysis, the assent of senior staff members should be obtained. The analysis can then be conducted in the following manner.

First, perform Rike's third step, risk analysis and assessment. This assessment should evaluate the potential threats to the business and their effects in three dimensions: type of threat (natural, mechanical or human), magnitude of threat (individual, localized, community-wide) and likelihood (certain, likely, unlikely, extremely unlikely). Questions that should be asked during this risk assessment include: What is the natural environmental profile of the geographic area? Is the area subject to earthquakes, flooding, hurricanes or other natural phenomena? Are current environmental control provisions, such as Halon systems and fire detection systems, up to date? How likely is attack by a human threat? Does the company tend to have disgruntled workers or not? How much access does any individual worker have to the data and application servers? What is the replacement cost of data, equipment and staff versus the cost of disaster recovery framework implementation? What is the potential for the worst-case scenario to occur?

After the risk analysis is complete, Rike's fourth step, determination of critical business resources, should be carried out. This step includes asking the following questions: What is the minimum set of servers, Internet connectivity, communications capacity, space, documentation, data and staff with which the company can continue to operate? Who are the critical staff? What is the critical data? How many single points of failure are there?

The business impact analysis, or BIA, is the final method of analysis in determining the benefit of a disaster recovery framework to an individual organization. The BIA examines each aspect of a business's function and determines which functions are critical to the business's continued operation, as well as which functions can be brought back online after the most critical operations are stabilized. This examination should include all facets of a business, including seemingly unimportant functions such as facilities management, janitorial access and human resources records access. Business functions should be ranked on a matrix of direct and immediate benefit to the business, determined by their immediate monetary value as well as by subjective perceptions of their importance.

Using a combination of a risk and cost analysis to determine the likelihood of each risk and the cost of implementation versus non-implementation, a business needs analysis to determine critical business requirements, and a BIA to determine critical business functions, it will be possible to determine whether a disaster recovery framework makes sense for a given business, as well as what type of disaster recovery framework should be implemented. It is the author's contention that disaster recovery planning makes sense for every business, and that it should be implemented at a level that will ensure business continuity and hasten recovery should a disaster occur. Customization of disaster recovery planning should be done using the risk, cost and business needs analyses to create a framework that will allow the business to secure its own interests in the event of a small or large disaster.

No disaster recovery framework is perfect, and there can always be situations that remain unconsidered, as East Jefferson General Hospital's experience showed. However, having an initial disaster recovery plan in place made it easier to reprioritize resource allocation when unanticipated issues arose. As von Moltke remarked, no plan survives contact with the enemy; but that is no reason not to plan.