The Practice of System and Network Administration, Volume 1 · 3rd Edition, by Limoncelli, Thomas A.

Index

About This E-Book Title Page Copyright Page Contents at a Glance Contents Preface

Who Should Read This Book Basic Principles What Is an SA? System Administration Matters Organization of This Book What’s New in the Third Edition What’s Next

Acknowledgments

For the Third Edition For the Second Edition For the First Edition

About the Authors

Part I: Game-Changing Strategies

Chapter 1. Climbing Out of the Hole

1.1 Organizing WIP

1.1.1 Ticket Systems 1.1.2 Kanban 1.1.3 Tickets and Kanban

1.2 Eliminating Time Sinkholes

1.2.1 OS Installation and Configuration 1.2.2 Software Deployment

1.3 DevOps 1.4 DevOps Without Devs 1.5 Bottlenecks 1.6 Getting Started 1.7 Summary Exercises

Chapter 2. The Small Batches Principle

2.1 The Carpenter Analogy 2.2 Fixing Hell Month 2.3 Improving Emergency Failovers 2.4 Launching Early and Often 2.5 Summary Exercises

Chapter 3. Pets and Cattle

3.1 The Pets and Cattle Analogy 3.2 Scaling 3.3 Desktops as Cattle 3.4 Server Hardware as Cattle 3.5 Pets Store State 3.6 Isolating State 3.7 Generic Processes 3.8 Moving Variations to the End 3.9 Automation 3.10 Summary Exercises

Chapter 4. Infrastructure as Code

4.1 Programmable Infrastructure 4.2 Tracking Changes 4.3 Benefits of Infrastructure as Code 4.4 Principles of Infrastructure as Code 4.5 Configuration Management Tools

4.5.1 Declarative Versus Imperative 4.5.2 Idempotency 4.5.3 Guards and Statements

4.6 Example Infrastructure as Code Systems

4.6.1 Configuring a DNS Client 4.6.2 A Simple Web Server 4.6.3 A Complex Web Application

4.7 Bringing Infrastructure as Code to Your Organization 4.8 Infrastructure as Code for Enhanced Collaboration 4.9 Downsides to Infrastructure as Code 4.10 Automation Myths 4.11 Summary Exercises

Part II: Workstation Fleet Management

Chapter 5. Workstation Architecture

5.1 Fungibility 5.2 Hardware 5.3 Operating System 5.4 Network Configuration

5.4.1 Dynamic Configuration 5.4.2 Hardcoded Configuration 5.4.3 Hybrid Configuration 5.4.4 Applicability

5.5 Accounts and Authorization 5.6 Data Storage 5.7 OS Updates 5.8 Security

5.8.1 Theft 5.8.2 Malware

5.9 Logging 5.10 Summary Exercises

Chapter 6. Workstation Hardware Strategies

6.1 Physical Workstations

6.1.1 Laptop Versus Desktop 6.1.2 Vendor Selection 6.1.3 Product Line Selection

6.2 Virtual Desktop Infrastructure

6.2.1 Reduced Costs 6.2.2 Ease of Maintenance 6.2.3 Persistent or Non-persistent?

6.3 Bring Your Own Device

6.3.1 Strategies 6.3.2 Pros and Cons 6.3.3 Security 6.3.4 Additional Costs 6.3.5 Usability

6.4 Summary Exercises

Chapter 7. Workstation Software Life Cycle

7.1 Life of a Machine 7.2 OS Installation 7.3 OS Configuration

7.3.1 Configuration Management Systems 7.3.2 Microsoft Group Policy Objects 7.3.3 DHCP Configuration 7.3.4 Package Installation

7.4 Updating the System Software and Applications

7.4.1 Updates Versus Installations 7.4.2 Update Methods

7.5 Rolling Out Changes . . . Carefully 7.6 Disposal

7.6.1 Accounting 7.6.2 Technical: Decommissioning 7.6.3 Technical: Data Security 7.6.4 Physical

7.7 Summary Exercises

Chapter 8. OS Installation Strategies

8.1 Consistency Is More Important Than Perfection 8.2 Installation Strategies

8.2.1 Automation 8.2.2 Cloning 8.2.3 Manual

8.3 Test-Driven Configuration Development 8.4 Automating in Steps 8.5 When Not to Automate 8.6 Vendor Support of OS Installation 8.7 Should You Trust the Vendor’s Installation? 8.8 Summary Exercises

Chapter 9. Workstation Service Definition

9.1 Basic Service Definition

9.1.1 Approaches to Platform Definition 9.1.2 Application Selection 9.1.3 Leveraging a CMDB

9.2 Refresh Cycles

9.2.1 Choosing an Approach 9.2.2 Formalizing the Policy 9.2.3 Aligning with Asset Depreciation

9.3 Tiered Support Levels 9.4 Workstations as a Managed Service 9.5 Summary Exercises

Chapter 10. Workstation Fleet Logistics

10.1 What Employees See 10.2 What Employees Don’t See

10.2.1 Purchasing Team 10.2.2 Prep Team 10.2.3 Delivery Team 10.2.4 Platform Team 10.2.5 Network Team 10.2.6 Tools Team 10.2.7 Project Management 10.2.8 Program Office

10.3 Configuration Management Database 10.4 Small-Scale Fleet Logistics

10.4.1 Part-Time Fleet Management 10.4.2 Full-Time Fleet Coordinators

10.5 Summary Exercises

Chapter 11. Workstation Standardization

11.1 Involving Customers Early 11.2 Releasing Early and Iterating 11.3 Having a Transition Interval (Overlap) 11.4 Ratcheting 11.5 Setting a Cut-Off Date 11.6 Adapting for Your Corporate Culture 11.7 Leveraging the Path of Least Resistance 11.8 Summary Exercises

Chapter 12. Onboarding

12.1 Making a Good First Impression 12.2 IT Responsibilities 12.3 Five Keys to Successful Onboarding

12.3.1 Drive the Process with an Onboarding Timeline 12.3.2 Determine Needs Ahead of Arrival 12.3.3 Perform the Onboarding 12.3.4 Communicate Across Teams 12.3.5 Reflect On and Improve the Process

12.4 Cadence Changes 12.5 Case Studies

12.5.1 Worst Onboarding Experience Ever 12.5.2 Lumeta’s Onboarding Process 12.5.3 Google’s Onboarding Process

12.6 Summary Exercises

Part III: Servers

Chapter 13. Server Hardware Strategies

13.1 All Eggs in One Basket 13.2 Beautiful Snowflakes

13.2.1 Asset Tracking 13.2.2 Reducing Variations 13.2.3 Global Optimization

13.3 Buy in Bulk, Allocate Fractions

13.3.1 VM Management 13.3.2 Live Migration 13.3.3 VM Packing 13.3.4 Spare Capacity for Maintenance 13.3.5 Unified VM/Non-VM Management 13.3.6 Containers

13.4 Grid Computing 13.5 Blade Servers 13.6 Cloud-Based Compute Services

13.6.1 What Is the Cloud? 13.6.2 Cloud Computing’s Cost Benefits 13.6.3 Software as a Service

13.7 Server Appliances 13.8 Hybrid Strategies 13.9 Summary Exercises

Chapter 14. Server Hardware Features

14.1 Workstations Versus Servers

14.1.1 Server Hardware Design Differences 14.1.2 Server OS and Management Differences

14.2 Server Reliability

14.2.1 Levels of Redundancy 14.2.2 Data Integrity 14.2.3 Hot-Swap Components 14.2.4 Servers Should Be in Computer Rooms

14.3 Remotely Managing Servers

14.3.1 Integrated Out-of-Band Management 14.3.2 Non-integrated Out-of-Band Management

14.4 Separate Administrative Networks 14.5 Maintenance Contracts and Spare Parts

14.5.1 Vendor SLA 14.5.2 Spare Parts 14.5.3 Tracking Service Contracts 14.5.4 Cross-Shipping

14.6 Selecting Vendors with Server Experience 14.7 Summary Exercises

Chapter 15. Server Hardware Specifications

15.1 Models and Product Lines 15.2 Server Hardware Details

15.2.1 CPUs 15.2.2 Memory 15.2.3 Network Interfaces 15.2.4 Disks: Hardware Versus Software RAID 15.2.5 Power Supplies

15.3 Things to Leave Out 15.4 Summary Exercises

Part IV: Services

Chapter 16. Service Requirements

16.1 Services Make the Environment 16.2 Starting with a Kick-Off Meeting 16.3 Gathering Written Requirements 16.4 Customer Requirements

16.4.1 Describing Features 16.4.2 Questions to Ask 16.4.3 Service Level Agreements 16.4.4 Handling Difficult Requests

16.5 Scope, Schedule, and Resources 16.6 Operational Requirements

16.6.1 System Observability 16.6.2 Remote and Central Management 16.6.3 Scaling Up or Out 16.6.4 Software Upgrades 16.6.5 Environment Fit 16.6.6 Support Model 16.6.7 Service Requests 16.6.8 Disaster Recovery

16.7 Open Architecture 16.8 Summary Exercises

Chapter 17. Service Planning and Engineering

17.1 General Engineering Basics 17.2 Simplicity 17.3 Vendor-Certified Designs 17.4 Dependency Engineering

17.4.1 Primary Dependencies 17.4.2 External Dependencies 17.4.3 Dependency Alignment

17.5 Decoupling Hostname from Service Name 17.6 Support

17.6.1 Monitoring 17.6.2 Support Model 17.6.3 Service Request Model 17.6.4 Documentation

17.7 Summary Exercises

Chapter 18. Service Resiliency and Performance Patterns

18.1 Redundancy Design Patterns

18.1.1 Masters and Slaves 18.1.2 Load Balancers Plus Replicas 18.1.3 Replicas and Shared State 18.1.4 Performance or Resilience?

18.2 Performance and Scaling

18.2.1 Dataflow Analysis for Scaling 18.2.2 Bandwidth Versus Latency

18.3 Summary Exercises

Chapter 19. Service Launch: Fundamentals

19.1 Planning for Problems 19.2 The Six-Step Launch Process

19.2.1 Step 1: Define the Ready List 19.2.2 Step 2: Work the List 19.2.3 Step 3: Launch the Beta Service 19.2.4 Step 4: Launch the Production Service 19.2.5 Step 5: Capture the Lessons Learned 19.2.6 Step 6: Repeat

19.3 Launch Readiness Review

19.3.1 Launch Readiness Criteria 19.3.2 Sample Launch Criteria 19.3.3 Organizational Learning 19.3.4 LRC Maintenance

19.4 Launch Calendar 19.5 Common Launch Problems

19.5.1 Processes Fail in Production 19.5.2 Unexpected Access Methods 19.5.3 Production Resources Unavailable 19.5.4 New Technology Failures 19.5.5 Lack of User Training 19.5.6 No Backups

19.6 Summary Exercises

Chapter 20. Service Launch: DevOps

20.1 Continuous Integration and Deployment

20.1.1 Test Ordering 20.1.2 Launch Categorizations

20.2 Minimum Viable Product 20.3 Rapid Release with Packaged Software

20.3.1 Testing Before Deployment 20.3.2 Time to Deployment Metrics

20.4 Cloning the Production Environment 20.5 Example: DNS/DHCP Infrastructure Software

20.5.1 The Problem 20.5.2 Desired End-State 20.5.3 First Milestone 20.5.4 Second Milestone

20.6 Launch with Data Migration 20.7 Controlling Self-Updating Software 20.8 Summary Exercises

Chapter 21. Service Conversions

21.1 Minimizing Intrusiveness 21.2 Layers Versus Pillars 21.3 Vendor Support 21.4 Communication 21.5 Training 21.6 Gradual Roll-Outs 21.7 Flash-Cuts: Doing It All at Once 21.8 Backout Plan

21.8.1 Instant Roll-Back 21.8.2 Decision Point

21.9 Summary Exercises

Chapter 22. Disaster Recovery and Data Integrity

22.1 Risk Analysis 22.2 Legal Obligations 22.3 Damage Limitation 22.4 Preparation 22.5 Data Integrity 22.6 Redundant Sites 22.7 Security Disasters 22.8 Media Relations 22.9 Summary Exercises

Part V: Infrastructure

Chapter 23. Network Architecture

23.1 Physical Versus Logical 23.2 The OSI Model 23.3 Wired Office Networks

23.3.1 Physical Infrastructure 23.3.2 Logical Design 23.3.3 Network Access Control 23.3.4 Location for Emergency Services

23.4 Wireless Office Networks

23.4.1 Physical Infrastructure 23.4.2 Logical Design

23.5 Datacenter Networks

23.5.1 Physical Infrastructure 23.5.2 Logical Design

23.6 WAN Strategies

23.6.1 Topology 23.6.2 Technology

23.7 Routing

23.7.1 Static Routing 23.7.2 Interior Routing Protocol 23.7.3 Exterior Gateway Protocol

23.8 Internet Access

23.8.1 Outbound Connectivity 23.8.2 Inbound Connectivity

23.9 Corporate Standards

23.9.1 Logical Design 23.9.2 Physical Design

23.10 Software-Defined Networks 23.11 IPv6

23.11.1 The Need for IPv6 23.11.2 Deploying IPv6

23.12 Summary Exercises

Chapter 24. Network Operations

24.1 Monitoring 24.2 Management

24.2.1 Access and Audit Trail 24.2.2 Life Cycle 24.2.3 Configuration Management 24.2.4 Software Versions 24.2.5 Deployment Process

24.3 Documentation

24.3.1 Network Design and Implementation 24.3.2 DNS 24.3.3 CMDB 24.3.4 Labeling

24.4 Support

24.4.1 Tools 24.4.2 Organizational Structure 24.4.3 Network Services

24.5 Summary Exercises

Chapter 25. Datacenters Overview

25.1 Build, Rent, or Outsource

25.1.1 Building 25.1.2 Renting 25.1.3 Outsourcing 25.1.4 No Datacenter 25.1.5 Hybrid

25.2 Requirements

25.2.1 Business Requirements 25.2.2 Technical Requirements

25.3 Summary Exercises

Chapter 26. Running a Datacenter

26.1 Capacity Management

26.1.1 Rack Space 26.1.2 Power 26.1.3 Wiring 26.1.4 Network and Console

26.2 Life-Cycle Management

26.2.1 Installation 26.2.2 Moves, Adds, and Changes 26.2.3 Maintenance 26.2.4 Decommission

26.3 Patch Cables 26.4 Labeling

26.4.1 Labeling Rack Location 26.4.2 Labeling Patch Cables 26.4.3 Labeling Network Equipment

26.5 Console Access 26.6 Workbench 26.7 Tools and Supplies

26.7.1 Tools 26.7.2 Spares and Supplies 26.7.3 Parking Spaces

26.8 Summary Exercises

Part VI: Helpdesks and Support

Chapter 27. Customer Support

27.1 Having a Helpdesk 27.2 Offering a Friendly Face 27.3 Reflecting Corporate Culture 27.4 Having Enough Staff 27.5 Defining Scope of Support 27.6 Specifying How to Get Help 27.7 Defining Processes for Staff 27.8 Establishing an Escalation Process 27.9 Defining “Emergency” in Writing 27.10 Supplying Request-Tracking Software 27.11 Statistical Improvements 27.12 After-Hours and 24/7 Coverage 27.13 Better Advertising for the Helpdesk 27.14 Different Helpdesks for Different Needs 27.15 Summary Exercises

Chapter 28. Handling an Incident Report

28.1 Process Overview 28.2 Phase A—Step 1: The Greeting 28.3 Phase B: Problem Identification

28.3.1 Step 2: Problem Classification 28.3.2 Step 3: Problem Statement 28.3.3 Step 4: Problem Verification

28.4 Phase C: Planning and Execution

28.4.1 Step 5: Solution Proposals 28.4.2 Step 6: Solution Selection 28.4.3 Step 7: Execution

28.5 Phase D: Verification

28.5.1 Step 8: Craft Verification 28.5.2 Step 9: Customer Verification/Closing

28.6 Perils of Skipping a Step 28.7 Optimizing Customer Care

28.7.1 Model-Based Training 28.7.2 Holistic Improvement 28.7.3 Increased Customer Familiarity 28.7.4 Special Announcements for Major Outages 28.7.5 Trend Analysis 28.7.6 Customers Who Know the Process 28.7.7 An Architecture That Reflects the Process

28.8 Summary Exercises

Chapter 29. Debugging

29.1 Understanding the Customer’s Problem 29.2 Fixing the Cause, Not the Symptom 29.3 Being Systematic 29.4 Having the Right Tools

29.4.1 Training Is the Most Important Tool 29.4.2 Understanding the Underlying Technology 29.4.3 Choosing the Right Tools 29.4.4 Evaluating Tools

29.5 End-to-End Understanding of the System 29.6 Summary Exercises

Chapter 30. Fixing Things Once

30.1 Story: The Misconfigured Servers 30.2 Avoiding Temporary Fixes 30.3 Learn from Carpenters 30.4 Automation 30.5 Summary Exercises

Chapter 31. Documentation

31.1 What to Document 31.2 A Simple Template for Getting Started 31.3 Easy Sources for Documentation

31.3.1 Saving Screenshots 31.3.2 Capturing the Command Line 31.3.3 Leveraging Email 31.3.4 Mining the Ticket System

31.4 The Power of Checklists 31.5 Wiki Systems 31.6 Findability 31.7 Roll-Out Issues 31.8 A Content-Management System 31.9 A Culture of Respect 31.10 Taxonomy and Structure 31.11 Additional Documentation Uses 31.12 Off-Site Links 31.13 Summary Exercises

Part VII: Change Processes

Chapter 32. Change Management

32.1 Change Review Boards 32.2 Process Overview 32.3 Change Proposals 32.4 Change Classifications 32.5 Risk Discovery and Quantification 32.6 Technical Planning 32.7 Scheduling 32.8 Communication 32.9 Tiered Change Review Boards 32.10 Change Freezes 32.11 Team Change Management

32.11.1 Changes Before Weekends 32.11.2 Preventing Injured Toes 32.11.3 Revision History

32.12 Starting with Git 32.13 Summary Exercises

Chapter 33. Server Upgrades

33.1 The Upgrade Process 33.2 Step 1: Develop a Service Checklist 33.3 Step 2: Verify Software Compatibility

33.3.1 Upgrade the Software Before the OS 33.3.2 Upgrade the Software After the OS 33.3.3 Postpone the Upgrade or Change the Software

33.4 Step 3: Develop Verification Tests 33.5 Step 4: Choose an Upgrade Strategy

33.5.1 Speed 33.5.2 Risk 33.5.3 End-User Disruption 33.5.4 Effort

33.6 Step 5: Write a Detailed Implementation Plan

33.6.1 Adding Services During the Upgrade 33.6.2 Removing Services During the Upgrade 33.6.3 Old and New Versions on the Same Machine 33.6.4 Performing a Dress Rehearsal

33.7 Step 6: Write a Backout Plan 33.8 Step 7: Select a Maintenance Window 33.9 Step 8: Announce the Upgrade 33.10 Step 9: Execute the Tests 33.11 Step 10: Lock Out Customers 33.12 Step 11: Do the Upgrade with Someone 33.13 Step 12: Test Your Work 33.14 Step 13: If All Else Fails, Back Out 33.15 Step 14: Restore Access to Customers 33.16 Step 15: Communicate Completion/Backout 33.17 Summary Exercises

Chapter 34. Maintenance Windows

34.1 Process Overview 34.2 Getting Management Buy-In 34.3 Scheduling Maintenance Windows 34.4 Planning Maintenance Tasks 34.5 Selecting a Flight Director 34.6 Managing Change Proposals

34.6.1 Sample Change Proposal: SecurID Server Upgrade 34.6.2 Sample Change Proposal: Storage Migration

34.7 Developing the Master Plan 34.8 Disabling Access 34.9 Ensuring Mechanics and Coordination

34.9.1 Shutdown/Boot Sequence 34.9.2 KVM, Console Service, and LOM 34.9.3 Communications

34.10 Change Completion Deadlines 34.11 Comprehensive System Testing 34.12 Post-maintenance Communication 34.13 Reenabling Remote Access 34.14 Be Visible the Next Morning 34.15 Postmortem 34.16 Mentoring a New Flight Director 34.17 Trending of Historical Data 34.18 Providing Limited Availability 34.19 High-Availability Sites

34.19.1 The Similarities 34.19.2 The Differences

34.20 Summary Exercises

Chapter 35. Centralization Overview

35.1 Rationale for Reorganizing

35.1.1 Rationale for Centralization 35.1.2 Rationale for Decentralization

35.2 Approaches and Hybrids 35.3 Summary Exercises

Chapter 36. Centralization Recommendations

36.1 Architecture 36.2 Security

36.2.1 Authorization 36.2.2 Extranet Connections 36.2.3 Data Leakage Prevention

36.3 Infrastructure

36.3.1 Datacenters 36.3.2 Networking 36.3.3 IP Address Space Management 36.3.4 Namespace Management 36.3.5 Communications 36.3.6 Data Management 36.3.7 Monitoring 36.3.8 Logging

36.4 Support

36.4.1 Helpdesk 36.4.2 End-User Support

36.5 Purchasing 36.6 Lab Environments 36.7 Summary Exercises

Chapter 37. Centralizing a Service

37.1 Understand the Current Solution 37.2 Make a Detailed Plan 37.3 Get Management Support 37.4 Fix the Problems 37.5 Provide an Excellent Service 37.6 Start Slowly 37.7 Look for Low-Hanging Fruit 37.8 When to Decentralize 37.9 Managing Decentralized Services 37.10 Summary Exercises

Part VIII: Service Recommendations

Chapter 38. Service Monitoring

38.1 Types of Monitoring 38.2 Building a Monitoring System 38.3 Historical Monitoring

38.3.1 Gathering the Data 38.3.2 Storing the Data 38.3.3 Viewing the Data

38.4 Real-Time Monitoring

38.4.1 SNMP 38.4.2 Log Processing 38.4.3 Alerting Mechanism 38.4.4 Escalation 38.4.5 Active Monitoring Systems

38.5 Scaling

38.5.1 Prioritization 38.5.2 Cascading Alerts 38.5.3 Coordination

38.6 Centralization and Accessibility 38.7 Pervasive Monitoring 38.8 End-to-End Tests 38.9 Application Response Time Monitoring 38.10 Compliance Monitoring 38.11 Meta-monitoring 38.12 Summary Exercises

Chapter 39. Namespaces

39.1 What Is a Namespace? 39.2 Basic Rules of Namespaces 39.3 Defining Names 39.4 Merging Namespaces 39.5 Life-Cycle Management 39.6 Reuse 39.7 Usage

39.7.1 Scope 39.7.2 Consistency 39.7.3 Authority

39.8 Federated Identity 39.9 Summary Exercises

Chapter 40. Nameservices

40.1 Nameservice Data

40.1.1 Data 40.1.2 Consistency 40.1.3 Authority 40.1.4 Capacity and Scaling

40.2 Reliability

40.2.1 DNS 40.2.2 DHCP 40.2.3 LDAP 40.2.4 Authentication 40.2.5 Authentication, Authorization, and Accounting 40.2.6 Databases

40.3 Access Policy 40.4 Change Policies 40.5 Change Procedures

40.5.1 Automation 40.5.2 Self-Service Automation

40.6 Centralized Management 40.7 Summary Exercises

Chapter 41. Email Service

41.1 Privacy Policy 41.2 Namespaces 41.3 Reliability 41.4 Simplicity 41.5 Spam and Virus Blocking 41.6 Generality 41.7 Automation 41.8 Monitoring 41.9 Redundancy 41.10 Scaling 41.11 Security Issues 41.12 Encryption 41.13 Email Retention Policy 41.14 Communication 41.15 High-Volume List Processing 41.16 Summary Exercises

Chapter 42. Print Service

42.1 Level of Centralization 42.2 Print Architecture Policy 42.3 Documentation 42.4 Monitoring 42.5 Environmental Issues 42.6 Shredding 42.7 Summary Exercises

Chapter 43. Data Storage

43.1 Terminology

43.1.1 Key Individual Disk Components 43.1.2 RAID 43.1.3 Volumes and File Systems 43.1.4 Directly Attached Storage 43.1.5 Network-Attached Storage 43.1.6 Storage-Area Networks

43.2 Managing Storage

43.2.1 Reframing Storage as a Community Resource 43.2.2 Conducting a Storage-Needs Assessment 43.2.3 Mapping Groups onto Storage Infrastructure 43.2.4 Developing an Inventory and Spares Policy 43.2.5 Planning for Future Storage 43.2.6 Establishing Storage Standards

43.3 Storage as a Service

43.3.1 A Storage SLA 43.3.2 Reliability 43.3.3 Backups 43.3.4 Monitoring 43.3.5 SAN Caveats

43.4 Performance

43.4.1 RAID and Performance 43.4.2 NAS and Performance 43.4.3 SSDs and Performance 43.4.4 SANs and Performance 43.4.5 Pipeline Optimization

43.5 Evaluating New Storage Solutions

43.5.1 Drive Speed 43.5.2 Fragmentation 43.5.3 Storage Limits: Disk Access Density Gap 43.5.4 Continuous Data Protection

43.6 Common Data Storage Problems

43.6.1 Large Physical Infrastructure 43.6.2 Timeouts 43.6.3 Saturation Behavior

43.7 Summary Exercises

Chapter 44. Backup and Restore

44.1 Getting Started 44.2 Reasons for Restores

44.2.1 Accidental File Deletion 44.2.2 Disk Failure 44.2.3 Archival Purposes 44.2.4 Perform Fire Drills

44.3 Corporate Guidelines 44.4 A Data-Recovery SLA and Policy 44.5 The Backup Schedule 44.6 Time and Capacity Planning

44.6.1 Backup Speed 44.6.2 Restore Speed 44.6.3 High-Availability Databases

44.7 Consumables Planning

44.7.1 Tape Inventory 44.7.2 Backup Media and Off-Site Storage

44.8 Restore-Process Issues 44.9 Backup Automation 44.10 Centralization 44.11 Technology Changes 44.12 Summary Exercises

Chapter 45. Software Repositories

45.1 Types of Repositories 45.2 Benefits of Repositories 45.3 Package Management Systems 45.4 Anatomy of a Package

45.4.1 Metadata and Scripts 45.4.2 Active Versus Dormant Installation 45.4.3 Binary Packages 45.4.4 Library Packages 45.4.5 Super-Packages 45.4.6 Source Packages

45.5 Anatomy of a Repository

45.5.1 Security 45.5.2 Universal Access 45.5.3 Release Process 45.5.4 Multitiered Mirrors and Caches

45.6 Managing a Repository

45.6.1 Repackaging Public Packages 45.6.2 Repackaging Third-Party Software 45.6.3 Service and Support 45.6.4 Repository as a Service

45.7 Repository Client

45.7.1 Version Management 45.7.2 Tracking Conflicts

45.8 Build Environment

45.8.1 Continuous Integration 45.8.2 Hermetic Build

45.9 Repository Examples

45.9.1 Staged Software Repository 45.9.2 OS Mirror 45.9.3 Controlled OS Mirror

45.10 Summary Exercises

Chapter 46. Web Services

46.1 Simple Web Servers 46.2 Multiple Web Servers on One Host

46.2.1 Scalable Techniques 46.2.2 HTTPS

46.3 Service Level Agreements 46.4 Monitoring 46.5 Scaling for Web Services

46.5.1 Horizontal Scaling 46.5.2 Vertical Scaling 46.5.3 Choosing a Scaling Method

46.6 Web Service Security

46.6.1 Secure Connections and Certificates 46.6.2 Protecting the Web Server Application 46.6.3 Protecting the Content 46.6.4 Application Security

46.7 Content Management 46.8 Summary Exercises

Part IX: Management Practices

Chapter 47. Ethics

47.1 Informed Consent 47.2 Code of Ethics 47.3 Customer Usage Guidelines 47.4 Privileged-Access Code of Conduct 47.5 Copyright Adherence 47.6 Working with Law Enforcement 47.7 Setting Expectations on Privacy and Monitoring 47.8 Being Told to Do Something Illegal/Unethical 47.9 Observing Illegal Activity 47.10 Summary Exercises

Chapter 48. Organizational Structures

48.1 Sizing 48.2 Funding Models 48.3 Management Chain’s Influence 48.4 Skill Selection 48.5 Infrastructure Teams 48.6 Customer Support 48.7 Helpdesk 48.8 Outsourcing 48.9 Consultants and Contractors 48.10 Sample Organizational Structures

48.10.1 Small Company 48.10.2 Medium-Size Company 48.10.3 Large Company 48.10.4 E-commerce Site 48.10.5 Universities and Nonprofit Organizations

48.11 Summary Exercises

Chapter 49. Perception and Visibility

49.1 Perception

49.1.1 A Good First Impression 49.1.2 Attitude, Perception, and Customers 49.1.3 Aligning Priorities with Customer Expectations 49.1.4 The System Advocate

49.2 Visibility

49.2.1 System Status Web Page 49.2.2 Management Meetings 49.2.3 Physical Visibility 49.2.4 Town Hall Meetings 49.2.5 Newsletters 49.2.6 Mail to All Customers 49.2.7 Lunch

49.3 Summary Exercises

Chapter 50. Time Management

50.1 Interruptions

50.1.1 Stay Focused 50.1.2 Splitting Your Day

50.2 Follow-Through 50.3 Basic To-Do List Management 50.4 Setting Goals 50.5 Handling Email Once 50.6 Precompiling Decisions 50.7 Finding Free Time 50.8 Dealing with Ineffective People 50.9 Dealing with Slow Bureaucrats 50.10 Summary Exercises

Chapter 51. Communication and Negotiation

51.1 Communication 51.2 I Statements 51.3 Active Listening

51.3.1 Mirroring 51.3.2 Summary Statements 51.3.3 Reflection

51.4 Negotiation

51.4.1 Recognizing the Situation 51.4.2 Format of a Negotiation Meeting 51.4.3 Working Toward a Win-Win Outcome 51.4.4 Planning Your Negotiations

51.5 Additional Negotiation Tips

51.5.1 Ask for What You Want 51.5.2 Don’t Negotiate Against Yourself 51.5.3 Don’t Reveal Your Strategy 51.5.4 Refuse the First Offer 51.5.5 Use Silence as a Negotiating Tool

51.6 Further Reading 51.7 Summary Exercises

Chapter 52. Being a Happy SA

52.1 Happiness 52.2 Accepting Criticism 52.3 Your Support Structure 52.4 Balancing Work and Personal Life 52.5 Professional Development 52.6 Staying Technical 52.7 Loving Your Job 52.8 Motivation 52.9 Managing Your Manager 52.10 Self-Help Books 52.11 Summary Exercises

Chapter 53. Hiring System Administrators

53.1 Job Description 53.2 Skill Level 53.3 Recruiting 53.4 Timing 53.5 Team Considerations 53.6 The Interview Team 53.7 Interview Process 53.8 Technical Interviewing 53.9 Nontechnical Interviewing 53.10 Selling the Position 53.11 Employee Retention 53.12 Getting Noticed 53.13 Summary Exercises

Chapter 54. Firing System Administrators

54.1 Cooperate with Corporate HR 54.2 The Exit Checklist 54.3 Removing Access

54.3.1 Physical Access 54.3.2 Remote Access 54.3.3 Application Access 54.3.4 Shared Passwords 54.3.5 External Services 54.3.6 Certificates and Other Secrets

54.4 Logistics 54.5 Examples

54.5.1 Amicably Leaving a Company 54.5.2 Firing the Boss 54.5.3 Removal at an Academic Institution

54.6 Supporting Infrastructure 54.7 Summary Exercises

Part X: Being More Awesome

Chapter 55. Operational Excellence

55.1 What Does Operational Excellence Look Like? 55.2 How to Measure Greatness 55.3 Assessment Methodology

55.3.1 Operational Responsibilities 55.3.2 Assessment Levels 55.3.3 Assessment Questions and Look-For’s

55.4 Service Assessments

55.4.1 Identifying What to Assess 55.4.2 Assessing Each Service 55.4.3 Comparing Results Across Services 55.4.4 Acting on the Results 55.4.5 Assessment and Project Planning Frequencies

55.5 Organizational Assessments 55.6 Levels of Improvement 55.7 Getting Started 55.8 Summary Exercises

Chapter 56. Operational Assessments

56.1 Regular Tasks (RT)