Hour 3

Getting to Know Your Data

What You’ll Learn in This Hour:

Welcome to Hour 3! Now you get to know the data you will use for all the exercises and examples in the rest of the book. In this hour, you look at the data itself, attributes of the data, the way data might be used, potential customers, and more. You also check out T relationship diagrams that focus on the tables used in this book and the columns in those tables; these diagrams illustrate the relationships between the tables. Finally, a list of data is provided from each table so that you can see what the data actually looks like. You get a copy of the scripts to create these tables and insert the data into the database. As you progress through the book, you will build upon this database. Remember that anytime you need to visually refer to the data used in his book or remember how the data is related, refer to this hour. Now let’s get started.

The BIRD Database: Examples and Exercises in This Book

The database in this book focuses on birds. Virtually any database could have worked for the examples and exercises, but I wanted to use a different and interesting topic, to make the experience of learning SQL a little more fun. This first part of the hour accomplishes the following goals:

  •    Introduces you to the basic data

  •    Discusses organizations and users of this data

  •    Speculates how end users and customers might use the data

  •    Envisions opportunities for database expansion and growth down the road

  •    Envisions other data that might be integrated with this database

How to Talk About the Data

Before you continue your journey through SQL fundamentals, you need an introduction to the tables and data you will use throughout the book. The lessons all refer to an example database for a fictitious organization called Waterbird Photography.

Waterbird Photography data has been generated to create real-world scenarios for the examples and exercises in this book. The following sections provide an overview of the specific tables (the database) used, their relationship to one another, their structure, and examples of the data contained.

You can envision this database as one designed by a hobbyist/semipro wildlife photographer that, over time, evolves into something bigger. Within this database, you can look at any data you’d like and design anything you want. You can try to make sense of any information as usable data and convert it into a database for your organization that is not only usable, but makes your organization competitive in today’s market, regardless of your industry.

Keep in mind that, in this hour, we are only talking about the data. Before you begin to design or create a database, you need to look at the potential information and determine how that information can become usable data in an organization. So here, we are simply talking about the data openly so that you can understand it, see how the data is related to other data, and think about how potential organizations might use that data down the road. This chapter is all about the data itself, not database structures or SQL. In future hours, you will use SQL to manipulate and query this data in great detail.

Note

Know Your Data

Every person and every organization has a goal to succeed in some manner. In a data-driven world, SQL and data management are critical contributors to those goals. Knowing your data is mandatory. If you don’t know your data and maximize it, your competitors will.

Bird Information

Now let’s talk about birds. Before looking at entity diagrams, tables, and data, it helps to simply discuss the data. A lot of information about birds exists, and it might or might not be stored in a database. This is a simple database—it does not include all birds or all information about birds. So consider the following basic information to get you started:

  •    Each bird category includes some standard information about the bird itself.

  •    Each bird may have photographs taken from different locations by different photographers.

  •    Information about each bird’s diet will be stored.

  •    Information about each bird’s migration habits will be stored.

  •    Information about the types of nests each bird builds will be stored.

  •    Each bird may have one or more nicknames.

Organizations and Users

A variety of organizations, users, and customers can benefit from the information in this bird database. Let’s say that this database was created by a hobbyist wildlife photographer for personal use, but it evolved over time into something larger and multiple photographers now have access to it. With this in mind, you can speculate that the following types of organizations might use a database such as this:

  •    Photography and media

  •    Rescue organizations

  •    Parks

  •    Publications

  •    Online stores

  •    Government

  •    Birding groups

A few examples of some of the users within these organizations include the following:

  •    Individual photographers

  •    Volunteers

  •    Editors

  •    Park staff

  •    Retail customers

Opportunities for Expansion

As with any database, opportunities for expansion always exist. For example, if bird rescues start using the data within this database, you might create a set of tables about rescues that are integrated into the existing database. You also might end up storing more information about photographers and the equipment used in photographing various birds, such as cameras, lenses, and editing software. You can certainly think about other information that you could eventually integrate into a bird database, based on the previous information about organizations and users.

In addition to expansion, you need to consider the growth of data over time. Specific hardware is beyond the scope of this book, but the size and future growth of the database greatly influence hardware and software products that are used to manage the data. The size of the database also influences the database management system used.

Entity Relationship Diagrams

This section includes two entity relationship diagrams (ERDs). Entity relationship diagrams illustrate entities within a database, as the attributes within each entity, and the relationships between entities. From a logical standpoint, the entities and attributes in a diagram will eventually become the tables and columns in a database when you move on to using SQL to create your tables. Diagrams such as these are important to consider during the design phase of a database and also provide a reference when using SQL to query a database and manage the data within.

Entities and Relationships

Figure 3.1 shows the first ERD, which illustrates all the basic entities in the bird database. Notice the line between each entity. This line represents some sort of relationship between entities or tables. For example, all the tables in this database will revolve around the birds table. A bird is related to food because a bird might eat many different types of food; conversely, each food item might be eaten by many different types of birds. Again, this is a simple diagram, to get you started.

An Entity Relationship diagram showing the basic entities and basic relationships in the BIRDS database.

FIGURE 3.1
An ERD showing entities (tables) and basic relationships

Detailed Entities/Tables

In the ERD in Figure 3.2, you can see that attributes have been added to the entities. Attributes will become the columns when you build the tables using SQL. You can also see lines between the entities, illustrating that some sort of relationship exists. Later, you will look at the different types of relationships in a relational database and see how to illustrate those relationships more specifically in an ERD. You might also notice some extra entities in this figure that were not present in Figure 3.1. These entities are specifically used to facilitate a relationship between other entities, or tables, in the database. For example, a bird might eat many different types of foods, and each food item might be eaten by many different types of birds. The BIRDS_FOOD table simply facilitates a relationship between BIRDS and FOOD. We discuss relationships like this one in greater detail in Hour 6. You will also learn to use SQL to query the database and join tables.

An Entity Relationship diagram showing a BIRDS database displaying Entities, attributes, and relationships. It contains list of fields within each group of data and each list of fields comprises the attributes of each entity.

FIGURE 3.2
An ERD with attributes (columns)

Attribute (Column) Definition

Following is a list of all the entities and attributes that currently exist in the example database for this book. As previously stated, entities will become tables and attributes will become columns when you create the database. A description is provided for each attribute, although some are self-explanatory.

BIRDS

 

BIRD_ID

Unique identifier for each bird

 

BIRD_NAME

Name of the bird

 

HEIGHT

Height of the bird, in inches

 

WINGSPAN

Wingspan of the bird, in inches

 

EGGS

Number of eggs typically laid per brood (one hatching)

 

BROODS

Number of broods per year or season

 

INCUBATION

Egg incubation period, in number of days

 

FLEDGING

Number of days a young bird is raised before it can fly

 

NEST_BUILDER

Who builds the nest (male, female, both, neither)

PHOTOS

 

PHOTO_ID

Unique identifier for each photo

 

PHOTO_FILE

Name of the file associated with a photo

 

PHOTO_DATE

Date a photo was taken

 

PHOTO_LOCATION_ID

Identifier for the location where a photo was taken

 

BIRD_ID

Identifier of a particular bird in a photo

LOCATIONS

 

LOCATION_ID

Unique identifier for a photo location

 

LOCATION_NAME

Location where a photo was taken

BIRDS_FOOD

 

BIRD_ID

Unique identifier for a bird

 

FOOD_ID

Unique identifier for a food item

FOOD

 

FOOD_ID

Unique identifier for a food item

 

FOOD_NAME

Name of a food item

BIRDS_MIGRATION

 

BIRD_ID

Unique identifier for a bird

 

MIGRATION_ID

Unique identifier for a migration location

MIGRATION

 

MIGRATION_ID

Unique identifier for a migration location

 

MIGRATION_LOCATION

Name of a migration location

BIRDS_NESTS

 

BIRD_ID

Unique identifier for a bird

 

NEST_ID

Unique identifier for a nest type

NESTS

 

NEST_ID

Unique identifier for a nest type

 

NEST_NAME

Name or type of nest

NICKNAMES

 

BIRD_ID

Identifier for each bird

 

NICKNAME

Nickname for a bird

Note

Use This Hour As a Reference for Hands-on Exercises

Remember to refer back to this hour when performing hands-on exercises in this book.

Table Naming Standards

As with any standard within a business, table naming standards are critical to maintaining control. After studying the tables and data in the previous sections, you probably noticed consistency in the way entities (tables) and attributes (columns) are named. Naming standards, or naming conventions, exist almost exclusively for overall organization and are a big help in administering any relational database. Remember that, when naming objects in a relational database, you need to be consistent throughout and adhere to your organization’s naming standards. Many approaches for naming standards exist; you just have to decide what works for you and your organization and then stick with it.

Note

Naming Standards

Not only should you adhere to the object naming syntax of any SQL implementation, but you also need to follow local business rules and create names that are both descriptive and related to the data groupings for the business. Consistent naming standards make it easier to manage databases with SQL.

Examples and Exercises

The exercises in this book primarily use the Oracle databases to generate the examples. The book concentrates on this database implementation for consistency and because Oracle is a dominant leader in the RDBMS market. Although Oracle offers a plethora of enhancements to the SQL standard, it also closely follows the SQL standard, making it easy for this book to show simplified examples with Oracle throughout the text. Microsoft SQL Server, MySQL, and other popular vendor implementations would have been adequate as well.

We also show examples of syntax from various implementations throughout, to illustrate how syntax might slightly vary from vendor to vendor. Many vendors provide freely distributed versions of their database for personal learning and development purposes. Whatever implementation you decide to use, you likely will find it easy to apply any examples in this book.

Finally, note that because most databases are not 100% compliant with the SQL standard, the exercises might vary slightly from the ANSI standard or other SQL implementations. However, learning the basics of the ANSI standard generally enables you to easily translate your skills among different database implementations.

Summary

In this hour, you took a look at the database that you will be using throughout this book for examples and hands-on exercises. You learned about the data itself and how it can be used. Then you looked at entity relationship diagrams (ERDs) and, after studying them, got a solid understanding of the data used in this book and the relationships between that data. Remember that entities and tables are basically the same: An entity is the name used during the design phase of a database. A table is the physical object that is created based on a concept or entity. Likewise, attributes become columns when the database is created. Use this hour as a reference for the hands-on exercises in the rest of the book, and feel free to come up with your own ideas to enhance this database as you progress.

Q&A

Q. How difficult is it to add entities to a database after it is well established?

A. If the database is well designed, adding entities to an existing database is simple: You define the data sets being added and define relationships with existing entities. Any data that you add to the new tables eventually must follow any constraints, or rules, that are already in place, such as primary and foreign keys.

Q. Can an entity be directly related to more than one other entity in a database?

A. Yes, an entity can be directly related to multiple entities. In the Bird database, each entity is directly related to only one other entity in the database. However, as you build upon this database, you will unveil more relationships.

Workshop

The following workshop consists of a series of quiz questions and practical exercises. The quiz questions are designed to test your overall understanding of the current material. The practical exercises give you the opportunity to apply the concepts discussed during the current hour, as well as build on the knowledge you acquired in previous hours of study. Be sure to complete the quiz questions and exercises before continuing to the next hour. Refer to Appendix C, “Answers to Quizzes and Exercises,” for answers.

Quiz

  1. 1. What is the difference between entities and tables?

  2. 2. What is the purpose of the BIRD_FOOD entity?

  3. 3. How might photograph locations somehow be related to the food that birds eat?

  4. 4. What does the abbreviation ERD stand for?

  5. 5. How many direct relationships exist between entities in the Birds database?

  6. 6. What is another name for a naming standard?

Exercises

  1. 1. Give an example of an entity or attributes that might be added to this database.

  2. 2. Give some examples of candidates for primary keys, based on Figure 3.2.

  3. 3. Give some examples of candidates for foreign keys, based on Figure 3.2.