The Practice of System and Network Administration by Limoncelli, Thomas A.

Index

Preface

Acknowledgments

About the Authors

Part I. Getting Started

1. What to Do When ...

1.1 Building a Site from Scratch 1.2 Growing a Small Site 1.3 Going Global 1.4 Replacing Services 1.5 Moving a Data Center 1.6 Moving to/Opening a New Building 1.7 Handling a High Rate of Office Moves 1.8 Assessing a Site (Due Diligence) 1.9 Dealing with Mergers and Acquisitions 1.10 Coping with Frequent Machine Crashes 1.11 Surviving a Major Outage or Work Stoppage 1.12 What Tools Should Every SA Team Member Have? 1.13 Ensuring the Return of Tools 1.14 Why Document Systems and Procedures? 1.15 Why Document Policies? 1.16 Identifying the Fundamental Problems in the Environment 1.17 Getting More Money for Projects 1.18 Getting Projects Done 1.19 Keeping Customers Happy 1.20 Keeping Management Happy 1.21 Keeping SAs Happy 1.22 Keeping Systems from Being Too Slow 1.23 Coping with a Big Influx of Computers 1.24 Coping with a Big Influx of New Users 1.25 Coping with a Big Influx of New SAs 1.26 Handling a High SA Team Attrition Rate 1.27 Handling a High User-Base Attrition Rate 1.28 Being New to a Group 1.29 Being the New Manager of a Group 1.30 Looking for a New Job 1.31 Hiring Many New SAs Quickly 1.32 Increasing Total System Reliability 1.33 Decreasing Costs 1.34 Adding Features 1.35 Stopping the Hurt When Doing “This” 1.36 Building Customer Confidence 1.37 Building the Team’s Self-Confidence 1.38 Improving the Team’s Follow-Through 1.39 Handling an Unethical or Worrisome Request 1.40 My Dishwasher Leaves Spots on My Glasses 1.41 Protecting Your Job 1.42 Getting More Training 1.43 Setting Your Priorities 1.44 Getting All the Work Done 1.45 Avoiding Stress 1.46 What Should SAs Expect from Their Managers? 1.47 What Should SA Managers Expect from Their SAs? 1.48 What Should SA Managers Provide to Their Boss?

2. Climb Out of the Hole

2.1 Tips for Improving System Administration

2.1.1 Use a Trouble-Ticket System 2.1.2 Manage Quick Requests Right 2.1.3 Adopt Three Time-Saving Policies 2.1.4 Start Every New Host in a Known State 2.1.5 Other Tips

2.2 Conclusion

Part II. Foundation Elements

3. Workstations

3.1 The Basics

3.1.1 Loading the OS 3.1.2 Updating the System Software and Applications 3.1.3 Network Configuration 3.1.4 Avoid Using Dynamic DNS with DHCP

3.2 The Icing

3.2.1 High Confidence in Completion 3.2.2 Involve Customers in the Standardization Process 3.2.3 A Variety of Standard Configurations

3.3 Conclusion

4. Servers

4.1 The Basics

4.1.1 Buy Server Hardware for Servers 4.1.2 Choose Vendors Known for Reliable Products 4.1.3 Understand the Cost of Server Hardware 4.1.4 Consider Maintenance Contracts and Spare Parts 4.1.5 Maintaining Data Integrity 4.1.6 Put Servers in the Data Center 4.1.7 Client Server OS Configuration 4.1.8 Provide Remote Console Access 4.1.9 Mirror Boot Disks

4.2 The Icing

4.2.1 Enhancing Reliability and Service Ability 4.2.2 An Alternative: Many Inexpensive Servers

4.3 Conclusion

5. Services

5.1 The Basics

5.1.1 Customer Requirements 5.1.2 Operational Requirements 5.1.3 Open Architecture 5.1.4 Simplicity 5.1.5 Vendor Relations 5.1.6 Machine Independence 5.1.7 Environment 5.1.8 Restricted Access 5.1.9 Reliability 5.1.10 Single or Multiple Servers 5.1.11 Centralization and Standards 5.1.12 Performance 5.1.13 Monitoring 5.1.14 Service Rollout

5.2 The Icing

5.2.1 Dedicated Machines 5.2.2 Full Redundancy 5.2.3 Dataflow Analysis for Scaling

5.3 Conclusion

6. Data Centers

6.1 The Basics

6.1.1 Location 6.1.2 Access 6.1.3 Security 6.1.4 Power and Cooling 6.1.5 Fire Suppression 6.1.6 Racks 6.1.7 Wiring 6.1.8 Labeling 6.1.9 Communication 6.1.10 Console Access 6.1.11 Workbench 6.1.12 Tools and Supplies 6.1.13 Parking Spaces

6.2 The Icing

6.2.1 Greater Redundancy 6.2.2 More Space

6.3 Ideal Data Centers

6.3.1 Tom’s Dream Data Center 6.3.2 Christine’s Dream Data Center

6.4 Conclusion

7. Networks

7.1 The Basics

7.1.1 The OSI Model 7.1.2 Clean Architecture 7.1.3 Network Topologies 7.1.4 Intermediate Distribution Frame 7.1.5 Main Distribution Frame 7.1.6 Demarcation Points 7.1.7 Documentation 7.1.8 Simple Host Routing 7.1.9 Network Devices 7.1.10 Overlay Networks 7.1.11 Number of Vendors 7.1.12 Standards-Based Protocols 7.1.13 Monitoring 7.1.14 Single Administrative Domain

7.2 The Icing

7.2.1 Leading Edge versus Reliability 7.2.2 Multiple Administrative Domains

7.3 Conclusion

7.3.1 Constants in Networking 7.3.2 Things That Change in Network Design

8. Namespaces

8.1 The Basics

8.1.1 Namespace Policies 8.1.2 Namespace Change Procedures 8.1.3 Centralizing Namespace Management

8.2 The Icing

8.2.1 One Huge Database 8.2.2 Further Automation 8.2.3 Customer-Based Updating 8.2.4 Leveraging Namespaces

8.3 Conclusion

9. Documentation

9.1 The Basics

9.1.1 What to Document 9.1.2 A Simple Template for Getting Started 9.1.3 Easy Sources for Documentation 9.1.4 The Power of Checklists 9.1.5 Documentation Storage 9.1.6 Wiki Systems 9.1.7 A Search Facility 9.1.8 Rollout Issues 9.1.9 Self-Management versus Explicit Management

9.2 The Icing

9.2.1 A Dynamic Documentation Repository 9.2.2 A Content-Management System 9.2.3 A Culture of Respect 9.2.4 Taxonomy and Structure 9.2.5 Additional Documentation Uses 9.2.6 Off-Site Links

9.3 Conclusion

10. Disaster Recovery and Data Integrity

10.1 The Basics

10.1.1 Definition of a Disaster 10.1.2 Risk Analysis 10.1.3 Legal Obligations 10.1.4 Damage Limitation 10.1.5 Preparation 10.1.6 Data Integrity

10.2 The Icing

10.2.1 Redundant Site 10.2.2 Security Disasters 10.2.3 Media Relations

10.3 Conclusion

11. Security Policy

11.1 The Basics

11.1.1 Ask the Right Questions 11.1.2 Document the Company’s Security Policies 11.1.3 Basics for the Technical Staff 11.1.4 Management and Organizational Issues

11.2 The Icing

11.2.1 Make Security Pervasive 11.2.2 Stay Current: Contacts and Technologies 11.2.3 Produce Metrics

11.3 Organization Profiles

11.3.1 Small Company 11.3.2 Medium-Size Company 11.3.3 Large Company 11.3.4 E-Commerce Site 11.3.5 University

11.4 Conclusion

12. Ethics

12.1 The Basics

12.1.1 Informed Consent 12.1.2 Professional Code of Conduct 12.1.3 Customer Usage Guidelines 12.1.4 Privileged-Access Code of Conduct 12.1.5 Copyright Adherence 12.1.6 Working with Law Enforcement

12.2 The Icing

12.2.1 Setting Expectations on Privacy and Monitoring 12.2.2 Being Told to Do Something Illegal/Unethical

12.3 Conclusion

13. Helpdesks

13.1 The Basics

13.1.1 Have a Helpdesk 13.1.2 Offer a Friendly Face 13.1.3 Reflect Corporate Culture 13.1.4 Have Enough Staff 13.1.5 Define Scope of Support 13.1.6 Specify How to Get Help 13.1.7 Define Processes for Staff 13.1.8 Establish an Escalation Process 13.1.9 Define “Emergency” in Writing 13.1.10 Supply Request-Tracking Software

13.2 The Icing

13.2.1 Statistical Improvements 13.2.2 Out-of-Hours and 24/7 Coverage 13.2.3 Better Advertising for the Helpdesk 13.2.4 Different Helpdesks for Service Provision and Problem Resolution

13.3 Conclusion

14. Customer Care

14.1 The Basics

14.1.1 Phase A/Step 1: The Greeting 14.1.2 Phase B: Problem Identification 14.1.3 Phase C: Planning and Execution 14.1.4 Phase D: Verification 14.1.5 Perils of Skipping a Step 14.1.6 Team of One

14.2 The Icing

14.2.1 Model-Based Training 14.2.2 Holistic Improvement 14.2.3 Increased Customer Familiarity 14.2.4 Special Announcements for Major Outages 14.2.5 Trend Analysis 14.2.6 Customers Who Know the Process 14.2.7 Architectural Decisions That Match the Process

14.3 Conclusion

Part III. Change Processes

15. Debugging

15.1 The Basics

15.1.1 Learn the Customer’s Problem 15.1.2 Fix the Cause, Not the Symptom 15.1.3 Be Systematic 15.1.4 Have the Right Tools

15.2 The Icing

15.2.1 Better Tools 15.2.2 Formal Training on the Tools 15.2.3 End-to-End Understanding of the System

15.3 Conclusion

16. Fixing Things Once

16.1 The Basics

16.1.1 Don’t Waste Time 16.1.2 Avoid Temporary Fixes 16.1.3 Learn from Carpenters

16.2 The Icing

16.3 Conclusion

17. Change Management

17.1 The Basics

17.1.1 Risk Management 17.1.2 Communications Structure 17.1.3 Scheduling 17.1.4 Process and Documentation 17.1.5 Technical Aspects

17.2 The Icing

17.2.1 Automated Front Ends 17.2.2 Change-Management Meetings 17.2.3 Streamline the Process

17.3 Conclusion

18. Server Upgrades

18.1 The Basics

18.1.1 Step 1: Develop a Service Checklist 18.1.2 Step 2: Verify Software Compatibility 18.1.3 Step 3: Verification Tests 18.1.4 Step 4: Write a Back-Out Plan 18.1.5 Step 5: Select a Maintenance Window 18.1.6 Step 6: Announce the Upgrade as Appropriate 18.1.7 Step 7: Execute the Tests 18.1.8 Step 8: Lock out Customers 18.1.9 Step 9: Do the Upgrade with Someone Watching 18.1.10 Step 10: Test Your Work 18.1.11 Step 11: If All Else Fails, Rely on the Back-Out Plan 18.1.12 Step 12: Restore Access to Customers 18.1.13 Step 13: Communicate Completion/Back-Out

18.2 The Icing

18.2.1 Add and Remove Services at the Same Time 18.2.2 Fresh Installs 18.2.3 Reuse of Tests 18.2.4 Logging System Changes 18.2.5 A Dress Rehearsal 18.2.6 Installation of Old and New Versions on the Same Machine 18.2.7 Minimal Changes from the Base

18.3 Conclusion

19. Service Conversions

19.1 The Basics

19.1.1 Minimize Intrusiveness 19.1.2 Layers versus Pillars 19.1.3 Communication 19.1.4 Training 19.1.5 Small Groups First 19.1.6 Flash-Cuts: Doing It All at Once 19.1.7 Back-Out Plan

19.2 The Icing

19.2.1 Instant Rollback 19.2.2 Avoiding Conversions 19.2.3 Web Service Conversions 19.2.4 Vendor Support

19.3 Conclusion

20. Maintenance Windows

20.1 The Basics

20.1.1 Scheduling 20.1.2 Planning 20.1.3 Directing 20.1.4 Managing Change Proposals 20.1.5 Developing the Master Plan 20.1.6 Disabling Access 20.1.7 Ensuring Mechanics and Coordination 20.1.8 Deadlines for Change Completion 20.1.9 Comprehensive System Testing 20.1.10 Post-maintenance Communication 20.1.11 Re-enable Remote Access 20.1.12 Be Visible the Next Morning 20.1.13 Postmortem

20.2 The Icing

20.2.1 Mentoring a New Flight Director 20.2.2 Trending of Historical Data 20.2.3 Providing Limited Availability 20.2.4 High-Availability Sites

20.3 Conclusion

21. Centralization and Decentralization

21.1 The Basics

21.1.1 Guiding Principles 21.1.2 Candidates for Centralization 21.1.3 Candidates for Decentralization

21.2 The Icing

21.2.1 Consolidate Purchasing 21.2.2 Outsourcing

21.3 Conclusion

Part IV. Providing Services

22. Service Monitoring

22.1 The Basics

22.1.1 Historical Monitoring 22.1.2 Real-Time Monitoring

22.2 The Icing

22.2.1 Accessibility 22.2.2 Pervasive Monitoring 22.2.3 Device Discovery 22.2.4 End-to-End Tests 22.2.5 Application Response Time Monitoring 22.2.6 Scaling 22.2.7 Metamonitoring

22.3 Conclusion

23. Email Service

23.1 The Basics

23.1.1 Privacy Policy 23.1.2 Namespaces 23.1.3 Reliability 23.1.4 Simplicity 23.1.5 Spam and Virus Blocking 23.1.6 Generality 23.1.7 Automation 23.1.8 Basic Monitoring 23.1.9 Redundancy 23.1.10 Scaling 23.1.11 Security Issues 23.1.12 Communication

23.2 The Icing

23.2.1 Encryption 23.2.2 Email Retention Policy 23.2.3 Advanced Monitoring 23.2.4 High-Volume List Processing

23.3 Conclusion

24. Print Service

24.1 The Basics

24.1.1 Level of Centralization 24.1.2 Print Architecture Policy 24.1.3 System Design 24.1.4 Documentation 24.1.5 Monitoring 24.1.6 Environmental Issues

24.2 The Icing

24.2.1 Automatic Failover and Load Balancing 24.2.2 Dedicated Clerical Support 24.2.3 Shredding 24.2.4 Dealing with Printer Abuse

24.3 Conclusion

25. Data Storage

25.1 The Basics

25.1.1 Terminology 25.1.2 Managing Storage 25.1.3 Storage as a Service 25.1.4 Performance 25.1.5 Evaluating New Storage Solutions 25.1.6 Common Problems

25.2 The Icing

25.2.1 Optimizing RAID Usage by Applications 25.2.2 Storage Limits: Disk Access Density Gap 25.2.3 Continuous Data Protection

25.3 Conclusion

26. Backup and Restore

26.1 The Basics

26.1.1 Reasons for Restores 26.1.2 Types of Restores 26.1.3 Corporate Guidelines 26.1.4 A Data-Recovery SLA and Policy 26.1.5 The Backup Schedule 26.1.6 Time and Capacity Planning 26.1.7 Consumables Planning 26.1.8 Restore-Process Issues 26.1.9 Backup Automation 26.1.10 Centralization 26.1.11 Tape Inventory

26.2 The Icing

26.2.1 Fire Drills 26.2.2 Backup Media and Off-Site Storage 26.2.3 High-Availability Databases 26.2.4 Technology Changes

26.3 Conclusion

27. Remote Access Service

27.1 The Basics

27.1.1 Requirements for Remote Access 27.1.2 Policy for Remote Access 27.1.3 Definition of Service Levels 27.1.4 Centralization 27.1.5 Outsourcing 27.1.6 Authentication 27.1.7 Perimeter Security

27.2 The Icing

27.2.1 Home Office 27.2.2 Cost Analysis and Reduction 27.2.3 New Technologies

27.3 Conclusion

28. Software Depot Service

28.1 The Basics

28.1.1 Understand the Justification 28.1.2 Understand the Technical Expectations 28.1.3 Set the Policy 28.1.4 Select Depot Software 28.1.5 Create the Process Manual 28.1.6 Examples

28.2 The Icing

28.2.1 Different Configurations for Different Hosts 28.2.2 Local Replication 28.2.3 Commercial Software in the Depot 28.2.4 Second-Class Citizens

28.3 Conclusion

29. Web Services

29.1 The Basics

29.1.1 Web Service Building Blocks 29.1.2 The Webmaster Role 29.1.3 Service-Level Agreements 29.1.4 Web Service Architectures 29.1.5 Monitoring 29.1.6 Scaling for Web Services 29.1.7 Web Service Security 29.1.8 Content Management 29.1.9 Building the Manageable Generic Web Server

29.2 The Icing

29.2.1 Third-Party Web Hosting 29.2.2 Mashup Applications

29.3 Conclusion

Part V. Management Practices

30. Organizational Structures

30.1 The Basics

30.1.1 Sizing 30.1.2 Funding Models 30.1.3 Management Chain’s Influence 30.1.4 Skill Selection 30.1.5 Infrastructure Teams 30.1.6 Customer Support 30.1.7 Helpdesk 30.1.8 Outsourcing

30.2 The Icing

30.2.1 Consultants and Contractors

30.3 Sample Organizational Structures

30.3.1 Small Company 30.3.2 Medium-Size Company 30.3.3 Large Company 30.3.4 E-Commerce Site 30.3.5 Universities and Nonprofit Organizations

30.4 Conclusion

31. Perception and Visibility

31.1 The Basics

31.1.1 A Good First Impression 31.1.2 Attitude, Perception, and Customers 31.1.3 Priorities Aligned with Customer Expectations 31.1.4 The System Advocate

31.2 The Icing

31.2.1 The System Status Web Page 31.2.2 Management Meetings 31.2.3 Physical Visibility 31.2.4 Town Hall Meetings 31.2.5 Newsletters 31.2.6 Mail to All Customers 31.2.7 Lunch

31.3 Conclusion

32. Being Happy

32.1 The Basics

32.1.1 Follow-Through 32.1.2 Time Management 32.1.3 Communication Skills 32.1.4 Professional Development 32.1.5 Staying Technical

32.2 The Icing

32.2.1 Learn to Negotiate 32.2.2 Love Your Job 32.2.3 Managing Your Manager

32.3 Further Reading

32.4 Conclusion

33. A Guide for Technical Managers

33.1 The Basics

33.1.1 Responsibilities 33.1.2 Working with Nontechnical Managers 33.1.3 Working with Your Employees 33.1.4 Decisions

33.2 The Icing

33.2.1 Make Your Team Even Stronger 33.2.2 Sell Your Department to Senior Management 33.2.3 Work on Your Own Career Growth 33.2.4 Do Something You Enjoy

33.3 Conclusion

34. A Guide for Nontechnical Managers

34.1 The Basics

34.1.1 Priorities and Resources 34.1.2 Morale 34.1.3 Communication 34.1.4 Staff Meetings 34.1.5 One-Year Plans 34.1.6 Technical Staff and the Budget Process 34.1.7 Professional Development

34.2 The Icing

34.2.1 A Five-Year Vision 34.2.2 Meetings with Single Point of Contact 34.2.3 Understanding the Technical Staff’s Work

34.3 Conclusion

35. Hiring System Administrators

35.1 The Basics

35.1.1 Job Description 35.1.2 Skill Level 35.1.3 Recruiting 35.1.4 Timing 35.1.5 Team Considerations 35.1.6 The Interview Team 35.1.7 Interview Process 35.1.8 Technical Interviewing 35.1.9 Nontechnical Interviewing 35.1.10 Selling the Position 35.1.11 Employee Retention

35.2 The Icing

35.2.1 Get Noticed

35.3 Conclusion

36. Firing System Administrators

36.1 The Basics

36.1.1 Follow Your Corporate HR Policy 36.1.2 Have a Termination Checklist 36.1.3 Remove Physical Access 36.1.4 Remove Remote Access 36.1.5 Remove Service Access 36.1.6 Have Fewer Access Databases

36.2 The Icing

36.2.1 Have a Single Authentication Database 36.2.2 System File Changes

36.3 Conclusion

Epilogue

Appendixes

Appendix A. The Many Roles of a System Administrator

Appendix B. Acronyms

Bibliography

Index
