Lesson 2: Preview the data
YOU NEED THE RIGHT DATA to solve the problem of where to locate a park. You explored the study area in lesson 1. Now you can proceed more systematically. What data do you have? How useful is it? Is there data that you need but don’t have? Has the problem been stated clearly enough for you to know what data you need?
Acquiring, evaluating, and organizing data is a big part of an analysis project. This book doesn’t fully re-create the complexity of the real world, because all the basic data required is provided. But much of the data isn’t project-ready, and that need for further preparation reflects the real world of GIS.
The first thing you’ll do in this lesson is draw up a planning document to help keep your tasks in focus. You’ll use this document to list the guidelines for the new park and translate them into specific needs for spatial and attribute data.
After you itemize your data requirements in general terms (park data, river data, and so on), you’ll take stock of your source data and investigate its spatial and attribute properties. You’ll also familiarize yourself with metadata, which is the data you have about your data. Before you decide to use a particular dataset, you may want to know things such as who made the data, when, and to what standard of accuracy.
Once you have a better working knowledge of your data, you’ll reframe the problem statement. GIS is a quantitative technology: you can’t analyze a problem until it’s been stated in measurable terms. Wherever you find the city council’s guidelines to be vague, you’ll replace them with hard numbers.
Exercise 2a: List the data requirements
You must relate the guidelines for the new park to data requirements for the project.
Open the data requirements table
A table has been made in advance to help you keep track of your requirements. It’s an informal document, but it will still be helpful. You can refer to it as the data requirements table.
1)Open Windows Explorer and navigate to C:\EsriPress\UGIS4\ParkSite\MapsAndMore.
2)Double-click the file DataRequirementsTable.doc to open it in Microsoft® Word.
If you don’t have Microsoft Word, open the RTF version of the document in another app, or print the PDF version and fill it out with a pencil.
List the requirements
In this section, you’ll review the city council’s guidelines (refer to “Park guidelines” in lesson 1, under “Frame the problem”) and describe in a general way the data needed to satisfy them. The specifics of choosing datasets are presented in lesson 3.
The first guideline was to find a vacant piece of land at least one-quarter acre in size. You can break this down into three requirements:
•Land parcel
•Vacancy
•Size
The requirement for a land parcel is already listed in the table. You need spatial data representing parcels so that you can see candidate sites on the map.
The second requirement is vacancy, which is a characteristic, or attribute, of a parcel. In a GIS dataset, vacancy is often listed with other descriptions of land use (commercial, residential, industrial, and so on). In general terms then, you’re looking for a land-use attribute.
1)In row 2 of the table, under Attribute Data, type (or write) land use.
The third requirement is that the park be one-quarter acre or larger. Like vacancy, acreage is an attribute, although it is one that can be calculated by the software. Because ArcGIS Pro can convert one unit of area to another, you don’t even have to start with acres—any measurement of parcel size will suffice.
2)In row 3, under Requirement, type a quarter acre or more. Under Attribute Data, enter area.
The second guideline under “Park guidelines” is that the park be within the Los Angeles city limits. This sounds like spatial data, and you’ll treat it that way for now. (It could be an attribute, too, because a field in a table might store the name of the city in which each parcel is recorded.)
3)Fill in row 4 as you think it should look, and then check the figure.
The third guideline is that the park be as close as possible to the Los Angeles River.
4)In row 5, for the requirement, put near LA River. Under Spatial Data, put rivers.
Using spatial datasets of parcels and rivers, you can measure the distance from any given parcel to the river.
The fourth guideline is that the park not be in the vicinity of an existing park.
5)Fill out row 6 as you think it should look.
The fifth guideline also needs to be broken down. You need a neighborhood (spatial data) that has the following:
•high population density (attribute data) and
•lots of children (attribute data).
Neighborhoods tend not to have formal boundaries, so you’re probably not going to find them as such in a spatial dataset. As a proxy, or substitute, you’ll use a set of small, standardized areas defined by the US Census Bureau: either the tracts or block groups you looked at in lesson 1.
6)In row 7, enter in a neighborhood as the requirement. Enter census unit for the spatial data.
7)In row 8, enter densely populated for the requirement and population density for the attribute data.
8)In row 9, enter lots of kids for the requirement. For the attribute data, enter age.
The sixth guideline is that the park be in a lower-income neighborhood. You don’t need to repeat the spatial requirement for a neighborhood from step 6.
9)In row 10, enter lower income for the requirement and income for the attribute data.
The last park guideline is to serve as many people as possible. For this guideline, you need a population attribute.
10)In row 11, enter serving the most people as the requirement and population as the attribute data.
Eventually, you’ll want to make a map of potential sites, and you may need some additional data for cartographic purposes. For example, political boundaries and roads put maps in a familiar context. Physical relief creates texture, and imagery provides realistic detail.
11)In rows 12 to 15, enter final map for the requirement. Under Spatial Data, list the four examples just mentioned: political boundaries, roads, physical relief, and imagery.
12)Save and minimize the table. You’ll continue to use it in the next exercise.
Exercise 2b: Examine the data
Now you can see what data you actually have on hand. To do this, you’ll work in the Catalog pane. In lesson 1, you used the Catalog pane to manage your maps and data folders. The Catalog pane is great for going back and forth between your map and your data (which is what you do most of the time).
Get started
1)Start ArcGIS Pro, if necessary, and open your LARiverParkSite project. You’ll continue working with your project document from exercise 1b.
2)Display the Catalog pane. The Catalog pane is generally docked to the right of the map (sometimes hidden as a tab).
If you close the Catalog pane or can’t find it, you can always open it by clicking the Catalog Pane button on the View tab.
Insert a new map from the Catalog pane
Recall that you can insert a map from the ribbon, but this time you’ll add it from the Catalog pane.
1)In the Catalog pane, expand the Maps folder. Note that the maps you’ve created in previous lessons are listed here.
2)Right-click Maps in the Catalog pane and click New Map in the context menu. A new map is added to the list and opens to the map view.
3)Right-click the new map item and click Rename. Name the new map Lesson2.
4)In the Catalog pane, expand Folders by clicking the arrow to the left of it (or double-click Folders). Here you should see the project folder that was created when you started the project (LARiverParkSite) as well as the ParkSite folder that you connected to in lesson 1.
Survey the SourceData folder
1)Expand the ParkSite folder.
2)Expand the SourceData folder.
3)Expand everything you can under SourceData.
It’s a long list of items. You may have to scroll down or maximize the application to see everything. Each item is a piece of geographic data or a data container. The icons signify the type of data, as illustrated in the sidebar “Representing the real world as data.”
Under the SourceData folder are three folders and a geodatabase:
•The census folder contains three feature classes of census data in shapefile (.shp) format.
•The City of LA folder contains three shapefiles and a stand-alone table in dBASE (.dbf) format.
•The ParkData folder contains a shapefile.
•The geodatabase contains 10 feature classes in geodatabase format. The feature classes are thematically organized in containers called feature datasets.
In the next sections, you’ll preview a lot of this data to make sure you have the features and attributes you listed in the data requirements table.
Representing the real world as data
How would you create an information system to organize and manage the huge variety of geographic stuff in the world? One approach is to think of all that stuff in terms of discrete objects.
The discrete-object view of the world
If you conceive of geography in terms of objects, you can sort these objects by similarities. Shape is a fundamental sorting principle: every object can be drawn—in two dimensions—as either a point, a line, or a polygon. Theme, or type, is another principle: every object can be classified as a school, a road, a park, or something else.
Applying these sorting principles of shape and theme, you can come up with collections of things you would recognize on a map: schools represented as points, roads represented as lines, parks represented as polygons, and so on.
Each object in a collection has a unique location, specified by a pair of spatial coordinates (for points) or a list of coordinate pairs (for lines and polygons).
Besides a unique location, every object has a set of facts that pertain to it: a name, a description, or whatever bits of information have been gathered about it. These facts are the object’s attributes.
In ArcGIS, a collection of such objects—with a common shape, common theme, and common attributes—is called a feature class. An individual object in the collection is a feature. The feature class is the basic storage unit for GIS data created according to the discrete-object view of the world, commonly called the vector data model.
Feature classes can be stored in various file formats, notably the geodatabase and the shapefile. The geodatabase format is newer and more highly developed.
The continuous-surface view of the world
Although it’s a powerful model, the discrete-object view is not an intuitive way to think of certain kinds of geographic information, such as elevation or temperature, that don’t have shapes or boundaries and that cover the world everywhere. It’s quite possible to represent these phenomena as features (for example, contour lines represent elevation on topographic maps), but a more natural way to think of them is in terms of continuous expanses, or surfaces.
The most common way to model a geographic surface is with a matrix of square cells, or pixels. Each cell represents a unit of area, such as a square meter, and stores a single piece of geographic information—typically, a measured or estimated value—at that location.
This way of modeling surfaces is called the raster data model. It’s commonly used for elevation and its derivatives (slope, aspect); for temperature, precipitation, and land cover; for statistical data, such as densities and means; and especially for imagery.
The raster dataset is the basic storage unit for GIS data created according to the continuous-surface view of the world. Raster datasets can be stored in geodatabases or in various standard image file formats, such as TIFF and JPEG.
Feature classes and raster datasets are complementary. In many maps, raster datasets are used for background display, whereas feature classes are used for foreground display and analysis.
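The two data models in this sidebar can be sketched in a few lines of Python. This is only an illustration of the ideas, not how ArcGIS stores data; the class names, the school names, and the elevation values are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One discrete object: a shape, its coordinates, and its attributes."""
    geometry_type: str   # "point", "line", or "polygon"
    coordinates: list    # one (x, y) pair for a point, several otherwise
    attributes: dict = field(default_factory=dict)


@dataclass
class Raster:
    """A continuous surface stored as rows of square cells."""
    origin_x: float      # x coordinate of the upper-left corner
    origin_y: float      # y coordinate of the upper-left corner
    cell_size: float     # width and height of one cell, in map units
    cells: list          # list of rows; each row is a list of cell values

    def value_at(self, x, y):
        """Return the value of the cell that contains the location (x, y)."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((self.origin_y - y) // self.cell_size)
        if row < 0 or col < 0 or row >= len(self.cells) or col >= len(self.cells[row]):
            raise ValueError("location is outside the raster")
        return self.cells[row][col]


# A feature class is a collection of features with a common shape and theme.
schools = [
    Feature("point", [(2.0, 3.0)], {"name": "Example School A"}),
    Feature("point", [(7.5, 1.0)], {"name": "Example School B"}),
]

# A tiny elevation raster: 2 rows by 3 columns of 10-unit cells (invented values).
elevation = Raster(origin_x=0.0, origin_y=20.0, cell_size=10.0,
                   cells=[[120, 125, 131],
                          [118, 122, 127]])
```

Each feature carries its own location and attributes, whereas the raster answers the question "what is the value here?" for any location it covers, for example `elevation.value_at(5.0, 15.0)`.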
Acquiring data
In any GIS project, acquiring good data is a big part of your job. ArcGIS℠ Online provides datasets representing many types of geography. You can access this data from a web browser (www.arcgis.com) or directly from ArcGIS Pro by switching from the Project tab on the Catalog pane to the Portal tab. You can search for particular types of data by keyword; you can also search for ArcGIS Online groups, such as the Living Atlas, that curate a variety of authoritative content. Organizations also share spatial data publicly through ArcGIS® Open Data sites, and spatial data is widely available from government agencies, educational institutions, and commercial vendors. All these sources may supplement data collected and managed by your own organization.
Preview parcels
First, you’ll preview the parcels data, which comes from the County of Los Angeles.
1)Switch to the Imagery with Labels basemap using the Basemap button on the Map ribbon.
2)In the Catalog pane, under the City of LA folder, click Parcels.shp to highlight it.
The polygon icon signifies a polygon feature class in shapefile format.
3)Right-click Parcels.shp and click View Metadata.
A new Catalog tab is displayed and shows metadata, or data documentation. What you see here is the Details view, an overview of the dataset. Complete metadata includes a description of the data, when and how it was created, attribute information, and so on.
4)Scroll down through the Details. From the top, you see the following:
•The dataset name (City of Los Angeles Parcels) and file type (shapefile)
•A thumbnail image
•Tags that make the data searchable
•A summary of the intended use of the data
•A description of the data content
•Credits attributing where the data came from
•Use limitations related to the license agreement
•Extent of the dataset in latitude and longitude
•Scale range of the dataset from maximum to minimum
5)Switch from the Details view to the Preview view in the lower-left corner of the Parcels metadata. This view allows a visual inspection of the data. The view can be panned and zoomed.
6)Close the Catalog tab (currently displaying the metadata).
7)Add the Parcels shapefile to the Lesson2 map. The map will zoom to the extent of the shapefile.
8)Zoom in until you can easily distinguish features.
The Parcels dataset is large, so the response may be a bit slow. Shapefiles can also be considerably slower than other formats.
This is the spatial data you need, showing individual parcel boundaries.
9)With the Explore tool enabled on the Map tab, click any parcel to identify it. The results will be displayed in a pop-up window showing all the attributes of the feature.
Depending on exactly where you click, you may identify one or more features. The number of identified features will be displayed on the lower left of the pop-up window. You can use the arrow buttons at the bottom to move through the features.
The vacancy attribute you need isn’t here, but you do have an area attribute in unspecified units. Once you find out what the units are (which you’ll do in lesson 3), you can convert them to acres. The figure provides some visual context for the size of a one-quarter-acre parcel compared with a five-acre park.
10)Close the pop-up window.
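The unit conversion mentioned above is simple arithmetic: one acre is 43,560 square feet, or about 4,046.86 square meters. The following sketch assumes the parcel area turns out to be in one of those two units (you won’t know until lesson 3); the function names are made up for illustration.

```python
SQUARE_FEET_PER_ACRE = 43_560.0
SQUARE_METERS_PER_ACRE = 4_046.8564224
MIN_PARK_ACRES = 0.25   # the council's minimum parcel size


def to_acres(area, units):
    """Convert an area in square feet or square meters to acres."""
    if units == "square_feet":
        return area / SQUARE_FEET_PER_ACRE
    if units == "square_meters":
        return area / SQUARE_METERS_PER_ACRE
    raise ValueError(f"unsupported units: {units}")


def is_big_enough(area, units):
    """True if the parcel is one-quarter acre or larger."""
    return to_acres(area, units) >= MIN_PARK_ACRES
```

A quarter acre works out to 10,890 square feet, so `is_big_enough(10_890, "square_feet")` is true and `is_big_enough(10_000, "square_feet")` is false.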
Preview the table of vacant parcels
Since you don’t have a vacancy attribute in the parcels shapefile, you’ll look for it elsewhere.
1)In the Catalog pane, under the City of LA folder, right-click VacantParcels.dbf and click Add to Current Map.
This stand-alone table has information about parcels, but no polygons or any other association with spatial features. Stand-alone tables are added to the Contents pane in a special group labeled Standalone Tables.
2)Open the table by right-clicking it in the Contents pane and clicking Open. Each of the 29,461 records represents a vacant parcel within the city of Los Angeles.
3)Read the column headings.
OID (object identifier) is a sequential number created and managed by ArcGIS automatically. AIN (assessor identification number) is a user-managed identifier. The UseCode field identifies the parcel use. The CityCode field identifies the city in which the parcel is located.
4)Close the VacantParcels attribute table.
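A stand-alone table becomes useful once its records can be matched to features through a shared identifier. The sketch below shows the idea with plain Python dictionaries. It assumes that the parcel features also carry an AIN value, which this lesson has not confirmed; the AREA field name and all the values are invented.

```python
def flag_vacant(parcels, vacant_rows, key="AIN"):
    """Return copies of the parcel records with a True/False 'vacant' entry."""
    vacant_ids = {row[key] for row in vacant_rows}
    return [{**parcel, "vacant": parcel[key] in vacant_ids} for parcel in parcels]


# Invented parcel records (hypothetical AIN and AREA values).
parcels = [
    {"AIN": "0000000001", "AREA": 12000.0},
    {"AIN": "0000000002", "AREA": 8500.0},
    {"AIN": "0000000003", "AREA": 15250.0},
]

# Invented rows standing in for the stand-alone table of vacant parcels.
vacant_table = [
    {"AIN": "0000000002"},
]

flagged = flag_vacant(parcels, vacant_table)
```

Only the parcel whose identifier appears in the table is flagged as vacant; the other records are unchanged apart from the new entry.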
Preview cities
Row 4 of the data requirements table lists cities as needed spatial data. You have this data: you used it in lesson 1, when you made a definition query on Los Angeles.
1)In the Contents pane, turn off the Parcels layer.
2)In the Catalog pane, under Folders > SourceData > ESRI.gdb, under Boundary, add the City_ply feature class to the Lesson2 map.
3)Right-click the City_ply layer in the Contents pane and click Zoom To Layer. You zoom out to a view of all the features in the feature class, which covers the entire United States.
4)Turn off the City_ply layer.
5)Add City_pt from the Catalog pane (Folders > SourceData > ESRI.gdb > Boundary).
6)Zoom to the City_pt layer.
The main difference between the two feature classes is that City_pt represents cities as points rather than polygons (hence the names City_pt and City_ply). There’s also a difference in spatial extent, because City_pt includes a feature for Attu Station in Alaska, a place that is so far “west” that it is actually in the Eastern Hemisphere.
7)Zoom in on the data to see individual features.
Why would the same features, such as these cities, be represented with two different types of shapes (points here and polygons in City_ply)? It’s because each shape is appropriate for maps of different scales. For a national map, you would show cities as points. For a local map, you might show them as polygons. For your requirement, which is to make sure that potential park sites lie inside the city limits, you need features that have boundaries—and hence, polygons rather than points.
8)Turn off the City_pt layer.
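The reason polygons are needed for the city-limits requirement can be seen in a standard point-in-polygon test (ray casting). The test needs a boundary to count crossings against, and a city stored as a point has none. This is a generic sketch with invented coordinates, not the method ArcGIS Pro uses internally.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses.

    An odd number of crossings means the point is inside. `ring` is a list of
    (x, y) vertices; the last vertex is joined back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# An invented square "city limit" and two candidate site locations.
city_limit = [(0, 0), (10, 0), (10, 10), (0, 10)]
site_inside = (5, 5)
site_outside = (15, 5)
```

Here `point_in_polygon(*site_inside, city_limit)` is true and `point_in_polygon(*site_outside, city_limit)` is false.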
Preview the LA River
In row 5 of the data requirements table, you have rivers as needed spatial data. In lesson 1, you added the River feature class from the ESRI geodatabase. You also have another feature class to look at: LARiver.shp in the City of LA folder.
1)Add the LARiver shapefile to the Lesson2 map.
2)Zoom to the layer (right-click LARiver and click Zoom To Layer).
3)Open the attribute table, and scroll to the bottom of the table.
In fact, the Los Angeles River is the only river in the feature class, but it’s composed of 265 separate features. (The last FID, or feature identifier, is 264 because numbering starts at 0.) Why so many? As noted in lesson 1, the answer has to do with attributes.
4)Review the table to see the attributes, and then scroll up and notice the different values in the Capacity, Discharge, and Protection fields. Many of the rows at the top even have empty attribute values.
For your purposes, it doesn’t really matter what these attributes mean. The point is, if the river were represented as a single feature, it would also have just one row in the attribute table. That would mean that only a single value could be stored for each attribute. A single value is fine for the river name (which doesn’t change), but it’s a problem for anything you might want to measure or describe at different locations along the river: flow, depth, water chemistry, navigability, or anything else. The creators of this data wanted to gather facts about the river at different places. To do that, they had to define the river as a spatially connected series of individual features.
You don’t have that need. All you want is the spatial data. If you end up using this feature class in your analysis (rather than the River feature class in ESRI.gdb), you’ll probably combine the 265 features into one.
5)Turn off LARiver and close its attribute table.
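The trade-off described in this section can be sketched with a few invented records. Each segment can hold its own attribute values, but once the segments are combined into one feature, only the geometry survives. The Capacity field name comes from the LARiver table; the values, the coordinates, and the `dissolve` function are made up for illustration.

```python
# Three invented river segments, listed in downstream order.
segments = [
    {"FID": 0, "Capacity": None, "path": [(0, 0), (1, 0)]},
    {"FID": 1, "Capacity": 5200, "path": [(1, 0), (2, 1)]},
    {"FID": 2, "Capacity": 8700, "path": [(2, 1), (3, 1)]},
]


def dissolve(ordered_segments):
    """Combine connected segments into a single feature that keeps only geometry."""
    path = list(ordered_segments[0]["path"])
    for seg in ordered_segments[1:]:
        if seg["path"][0] != path[-1]:
            raise ValueError("segments are not connected end to start")
        path.extend(seg["path"][1:])   # skip the shared vertex
    return {"FID": 0, "path": path}


river = dissolve(segments)
capacities = {seg["Capacity"] for seg in segments}
```

The three segments hold three different Capacity values, but the combined `river` record has a single row with no room for them. That loss is acceptable here because the analysis needs only the river’s location.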
Preview parks
In row 6 of the data requirements table, you need spatial data representing parks. You already know you have parks data: in lesson 1, you symbolized and labeled the Parkland feature class. There’s also a shapefile named Parks in the City of LA folder.
1)Add the Parks layer to the Lesson2 map.
2)Zoom to the layer.
3)Open the attribute table.
The table preview shows that there are 337 records (features) and just a few attributes.
Software-managed attributes
Every shapefile feature class has FID and Shape attributes that are created and managed by the software. The FID attribute stores a unique number for each feature. The Shape attribute stores the geometry type. Behind the scenes, it also links each feature to coordinates that define its spatial location. Measurement attributes, such as length and area, can be calculated for shapefiles, but if the values change—because of a spatial edit, for example—the software doesn’t update them automatically.
A geodatabase feature class has up to four software-managed attributes. Like a shapefile, it has a feature identifier (called OBJECTID instead of FID) and a Shape attribute. The Shape_Length attribute stores the lengths of line and polygon features. This attribute doesn’t exist for point features. The Shape_Area attribute stores the internal areas of polygon features. It doesn’t exist for point or line features. Shape_Length and Shape_Area are automatically kept up to date by the software.
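The difference between a stored measurement and a software-managed one can be shown by analogy. In the sketch below, one class calculates its area once and stores the number, and the other recalculates the area whenever it is asked. The class names are invented, and this is only an analogy for the behavior described above, not a description of how either format is implemented.

```python
def shoelace_area(ring):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


class StoredAreaFeature:
    """Shapefile-like behavior: area is calculated once and kept as a plain value."""

    def __init__(self, ring):
        self.ring = ring
        self.area = shoelace_area(ring)


class ManagedAreaFeature:
    """Geodatabase-like behavior: area always reflects the current geometry."""

    def __init__(self, ring):
        self.ring = ring

    @property
    def shape_area(self):
        return shoelace_area(self.ring)


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
wider = [(0, 0), (20, 0), (20, 10), (0, 10)]

stored = StoredAreaFeature(square)
managed = ManagedAreaFeature(square)

# A spatial edit: both features are stretched to twice the width.
stored.ring = wider
managed.ring = wider
```

After the edit, `stored.area` still reports the old value of 100, while `managed.shape_area` reports 200. The stored value has to be recalculated by hand to stay correct.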
4)Turn off the Parks layer and close its attribute table.
5)In the Catalog pane under the ParkData folder, add NewParks.shp to the Lesson2 map.
6)Zoom to the NewParks.shp layer.
7)Open the attribute table.
This shapefile has just two features. One is Los Angeles State Historic Park, and the other is Rio de Los Angeles State Recreation Area. Note the absence of length and area attributes. You can create them if you want (the Parks shapefile has them, for example), but they don’t exist by default in a shapefile.
8)View the metadata for NewParks.shp (right-click it in the Catalog pane) and read the summary.
The data represents two newly developed parks. In lesson 3, when you choose a parks feature class for the analysis, you’ll have to make sure that it incorporates these two parks.
9)Turn off the NewParks.shp layer and close its attribute table.
We are also aware of a third park, Vista Hermosa Park, which has been completed but is not in NewParks.shp. It’s located just north of downtown and is one more park to keep track of.
10)Click the Locate tool on the Map tab, in the Inquiry group, and search for Vista Hermosa Park in the Locate pane. Then select Vista Hermosa Park, Los Angeles.
The park location will be marked on the map as a symbol.
11)Zoom in for a closer look at the park.
12)Bookmark it as Vista Hermosa Park.
13)Close the Locate pane.
Preview census units
In row 7 of the data requirements table, you decided to use census units as a proxy for neighborhoods.
1)In the Catalog pane, under the census folder, add tracts.shp to the map and zoom to the layer.
The data covers Los Angeles County. The tracts to the north are much bigger than the ones to the south. That’s because census tracts are designed to have a fairly consistent population range, and the northern part of the county, with mountains and desert, is less densely populated.
2)Zoom in somewhere on the southern part of the city.
3)Add block_groups.shp to the map.
4)Add block_centroids.shp to the map. If necessary, zoom in further to see individual points.
Tracts are subdivided into block groups, and block groups are subdivided into blocks. A block centroid is a block represented spatially as a point rather than a polygon. (That’s not so strange—you’ve seen the same thing with cities.) You can learn more about census units in the sidebar “Fundamentals of US Census geography.”
Either block groups or tracts will satisfy your spatial data requirement for a neighborhood. Because of their point geometry, block centroids won’t.
Thus far, you’ve confirmed that you have the spatial and attribute data listed in rows 1–7 of the data requirements table. In three cases (rivers, parks, and census units), you’ll have to choose between feature classes. You’ll tackle that problem in lesson 3. You still have more data requirements to consider and more data to preview. You must also review your requirements for specificity. You’ll do this in the next exercise.
5)Save your ArcGIS Pro project.
Fundamentals of US Census geography
The US Census Bureau reports data by various geographic units. The top-to-bottom relationship shown here represents containment: the nation contains states, states contain counties, counties contain tracts, tracts contain block groups, and block groups contain blocks. Many other nonnesting reporting units are not shown.
Census tracts are relatively small subdivisions of a county. They typically have between 1,000 and 8,000 inhabitants and vary in size. They are designed to be fairly homogeneous with respect to demographic and economic conditions.
A block group is a cluster of blocks within a tract. A block group typically has between 600 and 3,000 inhabitants.
A census block (commonly an ordinary city block) is an area bounded by visible features, such as streets or railroad tracks, or by invisible boundaries, such as city limits. A block centroid is a census block represented as a point rather than a polygon. A centroid is located in the geographic center of the block it represents and has the attributes of that block.
The Census Bureau conducts a new census every 10 years. The latest one was conducted in 2010. Professional demographers estimate values for the intervening years.
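The nesting described in this sidebar is also built into the identifiers the Census Bureau assigns. A 15-digit block identifier (GEOID) consists of a 2-digit state code, a 3-digit county code, a 6-digit tract code, and a 4-digit block code whose first digit is the block group. Identifiers are not covered in this lesson, and the tract and block codes below are invented; only the state and county codes (06 for California, 037 for Los Angeles County) are real.

```python
def split_block_geoid(geoid):
    """Split a 15-digit census block GEOID into the units that contain the block."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError("a block GEOID is exactly 15 digits")
    return {
        "state": geoid[:2],          # 2 digits
        "county": geoid[:5],         # state + 3-digit county
        "tract": geoid[:11],         # county + 6-digit tract
        "block_group": geoid[:12],   # tract + first digit of the block code
        "block": geoid,              # tract + 4-digit block code
    }


# Invented tract (123456) and block (1000) in Los Angeles County, California.
parts = split_block_geoid("060371234561000")
```

Each identifier begins with the identifier of the unit that contains it, which mirrors the containment relationship: block within block group, block group within tract, and so on up to the state.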
Exercise 2c: Reframe the problem statement
Some of the city council’s park guidelines are specific and measurable:
•On a vacant land parcel one-quarter acre or larger
•Within the LA city limits
Others are vague:
•As close as possible to the LA River (Is there a maximum allowed distance from the river? If so, what is it?)
•Not in the vicinity of an existing park (How close is “in the vicinity”?)
•In a densely populated neighborhood with lots of children (How densely populated? How many children?)
•In a lower-income neighborhood (How is “lower income” defined?)
•Serving as many people as possible (How big an area does a park “serve”?)
You can’t do the analysis until you eliminate the vagueness.
Define “proximity to the LA River”
Unless you set a maximum distance limit, every vacant parcel in Los Angeles becomes a potential park candidate. That’s absurd and could waste a lot of data processing time. You’ll set one-half of a mile as an arbitrary outer limit. That stretches the idea of proximity somewhat, but it’s just a cutoff point. Hopefully, you’ll find some good locations that are closer than that.
1)From Windows Explorer, navigate to C:\EsriPress\UGIS4\ParkSite\MapsAndMore and open the DataRequirementsTable.doc (or the .rtf file if you don’t have Microsoft Word).
2)In row 5 of the data requirements table, in the Defined As column, enter <= 0.5 miles.
The symbol <= means “less than or equal to.”
Define “away from other parks”
What minimum distance should a candidate site have to be from existing parks? In open-space planning, a quarter mile is often used to define a convenient walking distance. (That’s typically about a five-minute walk.) Following that standard, you can say that a site is not in the vicinity of an existing park if the site’s border is at least a quarter mile away from the border of the nearest park.
1)In row 6 of the data requirements table, in the Defined As column, enter >= 0.25 miles.
This measure is a simplification because it’s based on straight-line distance.
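The two distance definitions can be sketched as border-to-border, straight-line measurements. The example assumes coordinates in feet in a projected coordinate system and shapes that don’t cross or overlap; the function names and coordinates are invented, and in practice ArcGIS Pro tools would do this work.

```python
import math

FEET_PER_MILE = 5280.0
MAX_RIVER_MILES = 0.5    # "near the LA River", as defined in row 5
MIN_PARK_MILES = 0.25    # "away from other parks", as defined in row 6


def point_segment_distance(p, a, b):
    """Straight-line distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                      # a and b are the same point
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the result to the segment's ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def segments(vertices, closed):
    """Consecutive vertex pairs; a closed shape also joins last to first."""
    pairs = list(zip(vertices, vertices[1:]))
    if closed:
        pairs.append((vertices[-1], vertices[0]))
    return pairs


def shortest_distance(verts_a, closed_a, verts_b, closed_b):
    """Shortest distance between two outlines that do not cross or overlap."""
    candidates = [point_segment_distance(p, a, b)
                  for p in verts_a for a, b in segments(verts_b, closed_b)]
    candidates += [point_segment_distance(p, a, b)
                   for p in verts_b for a, b in segments(verts_a, closed_a)]
    return min(candidates)


def near_river(site_ring, river_path):
    miles = shortest_distance(site_ring, True, river_path, False) / FEET_PER_MILE
    return miles <= MAX_RIVER_MILES


def away_from_park(site_ring, park_ring):
    miles = shortest_distance(site_ring, True, park_ring, True) / FEET_PER_MILE
    return miles >= MIN_PARK_MILES


# Invented outlines, in feet.
site = [(0, 0), (100, 0), (100, 100), (0, 100)]
park = [(1420, 0), (1520, 0), (1520, 100), (1420, 100)]
river = [(-2000, -500), (-2000, 500)]
```

In this example the park’s border is 1,320 feet (exactly a quarter mile) from the site’s border, so the site just passes the parks test, and the river is 2,000 feet away, well inside the half-mile limit.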
Define a “densely populated” neighborhood
As you make the rest of the requirements concrete, you’ll also make sure that you have the appropriate data.
1)Open the attribute table of the tracts.shp layer and scroll across its attributes.
As noted in lesson 1 and shown in the figure, the population density is in the POPDENS_CY field. This attribute stores population per square mile for the current year (CY), in this case 2015. Close the attribute table when finished.
2)Open the attribute table of block_groups.shp and scroll across its attributes.
The table has many of the same fields as the tracts.shp table, but there is no population density attribute. This won’t be a problem because population density is just total population divided by area. Because you have total population, you need only the area of each block group, which is something that ArcGIS Pro can calculate automatically. So either shapefile satisfies your need: the tracts file already has population density, and you can derive it from block_groups.
You still need a definition of “densely populated.” To keep it simple, you can call a neighborhood densely populated if its population density meets or exceeds that of the city of Los Angeles as a whole. The population density of Los Angeles as of the year 2015 was 8,474.7 people per square mile. Rounding up to a round number, use 8,500 people per square mile as your threshold.
Threshold values
To find the population density of Los Angeles, as well as other threshold values for the analysis, we used online US Census Bureau data, especially the QuickFacts page at http://www.census.gov/quickfacts.
3)In row 8 of the data requirements table, in the Defined As column, enter >= 8,500 per sq mi.
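The density calculation for block groups, and the threshold test from row 8, come down to one division and one comparison. The sketch below assumes the area has already been expressed in square miles; the function names and sample numbers are invented.

```python
DENSITY_THRESHOLD = 8_500   # people per square mile, from row 8


def population_density(total_population, area_sq_mi):
    """People per square mile: total population divided by area."""
    if area_sq_mi <= 0:
        raise ValueError("area must be greater than zero")
    return total_population / area_sq_mi


def is_densely_populated(total_population, area_sq_mi):
    """True if the density meets or exceeds the threshold."""
    return population_density(total_population, area_sq_mi) >= DENSITY_THRESHOLD
```

For example, a block group with 1,500 people in 0.125 square miles has a density of 12,000 and qualifies, whereas one with 2,000 people in 0.25 square miles has a density of 8,000 and does not.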
Define “lots of children”
Again, you’ll look for attributes and then set a threshold.
1)Open the attribute table of the block_groups layer and locate the POP18UP_CY attribute.
This is the population age 18 or older for the year 2015. If you define a child as a person under 18 (which is reasonable), you can subtract the values in this field (POP18UP_CY) from those in TOTPOP_CY to get the number of children.
Because neighborhoods vary in size and population, you can make more valid comparisons if you base your threshold on a ratio rather than on an absolute number. In Los Angeles, 22.2 percent of the population is under 18 years old. You’ll therefore define a neighborhood as having “lots of children” if it meets or exceeds this value. (You can derive the percentage of children from your data using simple arithmetic.)
2)In row 9 of the data requirements table, in the Defined As column, enter >= 22%.
3)In the Attribute Data column, replace the entry “age” with age under 18.
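The “simple arithmetic” for the percentage of children can be sketched as follows. The two arguments correspond to the TOTPOP_CY and POP18UP_CY fields; the function names and sample numbers are invented.

```python
CHILD_SHARE_THRESHOLD = 0.22   # 22 percent, from row 9


def share_under_18(total_population, population_18_and_up):
    """Fraction of the population under 18: (total - adults) / total."""
    if total_population <= 0:
        raise ValueError("total population must be greater than zero")
    return (total_population - population_18_and_up) / total_population


def has_lots_of_children(total_population, population_18_and_up):
    """True if the share of children meets or exceeds the threshold."""
    return share_under_18(total_population, population_18_and_up) >= CHILD_SHARE_THRESHOLD
```

A block group with 1,000 residents, 750 of them adults, has a 25 percent share of children and qualifies. A real workflow would also need to decide how to treat block groups with no residents, which this sketch simply rejects.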
Define “lower income”
Now look at your income attributes.
1)In the attribute table of block_groups, scroll all the way to the right. The last two attributes might be income measures. To find out, you’ll look at the metadata—not the item descriptions you looked at before, but the full data documentation.
2)Click the Project tab (on the far left end of the ribbon), and click Options.
3)Click Metadata in the left column and change the Metadata style to North American Profile of ISO19115 2003.
4)Click OK and return to the project using the Back button.
Changing the metadata style gives you access to the full set of metadata from the dataset. See the sidebar “Metadata” for more information.
5)If necessary, open the Catalog pane and browse to ParkSite > SourceData > census.
6)Right-click block_groups.shp and click View Metadata.
The metadata is divided into three sections that can be expanded or collapsed. (They are expanded by default.)
Metadata
Metadata is a description of what is known about a dataset. It serves two important purposes. First, it vouches for the integrity of data by explaining things such as how, when, and by whom the data was created. Second, it makes the data searchable. Metadata includes tags that identify essential properties of the data (for example, “rivers,” “Los Angeles,” and “2010”) and other descriptions that make it possible to find specific datasets among large inventories of spatial data.
Metadata may be kept according to one of various official standards. Data created by government agencies, commercial data vendors, and many large enterprises typically conforms to one of these standards. Data created by small organizations or by individuals commonly does not. In ArcGIS, metadata can be displayed in a style that is suited to a particular standard. The default style is the Item Description, which displays a thumbnail image of the data and a small amount of important information. This style is suited to metadata that is not kept to an official standard. It can also be used to provide a filtered, summary view of metadata that is kept to an official standard. Anyone who creates and shares data should at least maintain metadata at the Item Description level.
To see the full metadata for a dataset that is kept to an official standard, you must change the metadata style in ArcGIS Pro, under Project Options. All the styles, apart from Item Description, are similar, and all afford access to the full set of metadata—no matter what standard they conform to—although they may present the information slightly differently.
7)Confirm that block_groups.shp is selected on the left side of the Catalog tab. Click Topics and Keywords to collapse it.
8)Collapse the next several headings (Citation, Citation Contacts, and so on) until you come to the Fields heading.
This heading contains the metadata that you’re interested in.
9)Scroll through the Fields data.
10)Scroll down until you see the MEDHINC_CY and AVGHINC_CY fields and read their descriptions.
Note that MEDHINC_CY is described as 2015 Median Household Income, and AVGHINC_CY is 2015 Average Household Income.
Both are good possibilities. Median income is a statistical midpoint: it marks the value that half the households are above and half are below. You’ll adopt this measure because it’s less sensitive to extreme values. (A millionaire in a low-income neighborhood might significantly change the average income but not the median income.)
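To see why the median is less sensitive to extreme values, consider the short Python sketch that follows. The income values are hypothetical and chosen only for illustration; they do not come from the block_groups data.

```python
from statistics import mean, median

# Hypothetical household incomes for one neighborhood (illustrative values only).
incomes = [28_000, 31_000, 35_000, 40_000, 46_000]

# The same neighborhood after one very high-income household moves in.
with_outlier = incomes + [1_000_000]

for label, values in (("before", incomes), ("after", with_outlier)):
    # The median shifts only slightly; the mean jumps by a large amount.
    print(f"{label}: median = {median(values):,.0f}, mean = {mean(values):,.0f}")
```

With these made-up numbers, the median stays in the same range as the typical household, while the average no longer describes any household in the neighborhood.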
According to the US Census Bureau, the median household income for the city of Los Angeles for the years 2011–2015 is $50,205. Rounding off, you’ll call a neighborhood “lower income” if the median household income is $50,000 or less.
11)In row 10 of the data requirements table, in the Defined As column, enter <= $50,000.
12)In the Attribute Data column, replace “income” with median hh income.
Define “serving the most people”
Finally, you want to know which potential site serves the most people. Anyone can come to a park, so for this criterion you want to count all the people nearby, regardless of their demographic profile. The attribute you need for this element is total population, which you have in both the tract and block_group feature classes.
You’ll treat this guideline as a preference. If half a dozen sites meet your other requirements, you’ll prefer those serving more people overall to those serving fewer. Eventually, this preference may have to be subjectively weighed against others. For example, which is better: a park closer to the river that serves fewer people or a park farther from the river that serves more people? (How much closer? How many more people?)
To define the size of the area served by a park, you’ll apply the standard of easy walking distance discussed earlier, and say that a park serves anyone who lives within a quarter mile of it. “Serving the most people” therefore means having the largest population within a quarter-mile radius.
1)Open the block_centroids table and browse across its attributes.
This table also has a total population attribute (POP2010). You can’t use block centroids as your spatial data for neighborhoods—you need polygons rather than points—but you can conveniently use them to sum population. Given a distance of a quarter mile around the park, ArcGIS Pro can count the block centroids, or points, that fall within this distance and add their population values.
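The idea of summing population from nearby block centroids can be sketched in plain Python. This is only an illustration of the logic, not the ArcGIS Pro workflow you will use later. It assumes that coordinates are in a projected coordinate system measured in feet, and the function name, the site location, and the centroid values are all hypothetical.

```python
from math import hypot

QUARTER_MILE_FT = 1320.0  # 0.25 mile expressed in feet (5,280 ft per mile)

def population_served(site_xy, centroids, radius=QUARTER_MILE_FT):
    """Sum the population of every centroid within `radius` of the site.

    `centroids` is a list of (x, y, population) tuples in the same
    planar units as `site_xy` (assumed here to be feet).
    """
    sx, sy = site_xy
    return sum(pop for x, y, pop in centroids
               if hypot(x - sx, y - sy) <= radius)

# Hypothetical block centroids: (x, y, POP2010-style population value).
centroids = [
    (0.0, 500.0, 120),      # 500 ft away: counted
    (1000.0, 800.0, 85),    # about 1,281 ft away: counted
    (1300.0, 0.0, 40),      # 1,300 ft away: counted
    (2000.0, 2000.0, 300),  # about 2,828 ft away: not counted
]

served = population_served((0.0, 0.0), centroids)
print(served)
```

Because each block's population is attached to a single point, a block is counted either entirely or not at all, which is why this is a convenient approximation rather than an exact count.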
2)In row 11 of the data requirements table, in the Defined As column, enter <= 0.25 miles.
Your data requirements table should look like the figure.
You can now state the problem in measurable terms that allow you to solve it with GIS tools. Someone might take issue with your interpretation of the city council’s park guidelines, but that’s fine—you’ll always be happy to improve your methodology. For now, you can state your project analysis requirements as follows.
You want to locate a site for a new park on a land parcel that is
•vacant,
•a quarter acre or more in size,
•within the LA city limits,
•within a half mile of the LA River (preferring closer sites),
•more than a quarter mile from the nearest park,
•in a census unit in which
  •the population density is 8,500 or more people per square mile,
  •at least 22 percent of the population is under 18 years old, and
  •the median household income is $50,000 or less, and
•among sites that satisfy all other conditions, serving the largest total population within a quarter-mile radius.
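Stated this way, the requirements amount to a set of pass/fail tests followed by a ranking. The Python sketch below illustrates that structure. The Parcel class, its field names, and the sample parcels are hypothetical; in the project, these values come from the spatial and attribute data listed in your data requirements table. The preference for sites closer to the river is left out of the ranking here because it has to be weighed subjectively.

```python
from dataclasses import dataclass

@dataclass
class Parcel:
    parcel_id: str
    vacant: bool
    acres: float
    in_city: bool
    river_dist_mi: float     # distance to the LA River, in miles
    park_dist_mi: float      # distance to the nearest existing park, in miles
    pop_density: float       # people per square mile in the census unit
    pct_under_18: float      # percent of census unit population under 18
    median_hh_income: float  # median household income of the census unit
    pop_quarter_mile: int    # total population within a quarter mile

def meets_requirements(p: Parcel) -> bool:
    """Apply the pass/fail thresholds from the problem statement."""
    return (p.vacant
            and p.acres >= 0.25
            and p.in_city
            and p.river_dist_mi <= 0.5
            and p.park_dist_mi > 0.25
            and p.pop_density >= 8500
            and p.pct_under_18 >= 22
            and p.median_hh_income <= 50000)

# Hypothetical candidate parcels (illustrative values only).
parcels = [
    Parcel("A", True, 0.40, True, 0.30, 0.60, 9200, 25.0, 42000, 3200),
    Parcel("B", True, 1.10, True, 0.10, 0.50, 12000, 24.0, 62000, 5000),  # income too high
    Parcel("C", True, 0.30, True, 0.45, 0.35, 8800, 23.5, 48000, 4100),
    Parcel("D", True, 0.50, True, 0.20, 0.20, 9900, 27.0, 39000, 4500),   # too close to a park
]

# Keep the parcels that pass every test, then prefer those serving more people.
ranked = sorted((p for p in parcels if meets_requirements(p)),
                key=lambda p: p.pop_quarter_mile, reverse=True)
print([p.parcel_id for p in ranked])
```

Keeping the thresholds in one function makes it easy to revise them if someone takes issue with your interpretation of the guidelines.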
You’ve formalized the data requirements in a table and confirmed that the essential data is available. In some cases, multiple datasets contain suitable features and attributes. In the next lesson, you’ll compare these datasets and choose which ones to use in the analysis.
3)Close the Lesson2 map and any open tables.
4)Save your project.
5)Continue to the next lesson or close ArcGIS Pro. Save your changes if prompted.