3.4 Land Use Data Prep

From NFTPO Model
Revision as of 20:46, 17 October 2016 by Gwineman (talk | contribs)


Parcel Allocation

One of the distinguishing features of NERPM-AB is the use of parcels as the basic spatial unit for generating demand. In order to run the ABM, it is necessary to develop parcel level estimates of employment by industrial sector, households, and enrollment. The parcel estimates are typically derived from TAZ-level information obtained from the TPO.

To facilitate the preparation of parcel information, a parcel allocation tool has been developed. This tool requires a number of inputs: the TAZ-level employment and household controls, block-level employment and household information, school locations and enrollment by grade, and correspondences between the TAZ, Census block, and parcel geographies. Note that in addition to performing spatial allocation from TAZs to parcels, the tool can also allocate employment from a more aggregate categorization of employment sectors to the nine industrial sectors required by DaySim.

This sector allocation is performed by first disaggregating the employment for each of the broad sectors into twenty more detailed employment sectors using 2-digit NAICS code employment totals from the Census Bureau’s Longitudinal Employer-Household Dynamics (LEHD) data source, and then aggregating these twenty sectors back to the nine sectors used in the NERPM-AB DaySim activity-based model components.

Table 3‑1 summarizes the employment sectors used in the activity-based model and their associated 2-digit NAICS codes.

TABLE 3-1 DAYSIM ACTIVITY-BASED MODEL EMPLOYMENT SECTORS

DAYSIM SECTOR           2-DIGIT NAICS CODE
Industrial              22, 31-33, 42, 48-49
Retail Trade            44-45
Office                  51-56
Educational Services    61
Health / Medical        62
Government              92
Food                    72
Services                71, 81
Other                   11, 21, 23
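The aggregation from 2-digit NAICS codes to the nine DaySim sectors can be sketched as a simple lookup. This is only an illustration of Table 3-1, not the allocation tool's own code (the tool is written in C#); the names DAYSIM_SECTORS, NAICS_TO_SECTOR, and aggregate_to_daysim are hypothetical, and ranges such as 31-33 are expanded into individual codes.

```python
# Table 3-1 as a lookup: DaySim sector -> 2-digit NAICS codes.
# Ranges in the table (e.g. 31-33) are expanded into individual codes.
DAYSIM_SECTORS = {
    "Industrial": [22, 31, 32, 33, 42, 48, 49],
    "Retail Trade": [44, 45],
    "Office": [51, 52, 53, 54, 55, 56],
    "Educational Services": [61],
    "Health / Medical": [62],
    "Government": [92],
    "Food": [72],
    "Services": [71, 81],
    "Other": [11, 21, 23],
}

# Inverted lookup: 2-digit NAICS code -> DaySim sector.
NAICS_TO_SECTOR = {
    code: sector
    for sector, codes in DAYSIM_SECTORS.items()
    for code in codes
}


def aggregate_to_daysim(naics_employment):
    """Sum employment keyed by 2-digit NAICS code into the nine DaySim sectors.

    Raises KeyError for a code that does not appear in Table 3-1.
    """
    totals = {sector: 0.0 for sector in DAYSIM_SECTORS}
    for code, jobs in naics_employment.items():
        totals[NAICS_TO_SECTOR[code]] += jobs
    return totals


if __name__ == "__main__":
    example = {31: 10, 33: 5, 44: 7, 72: 3}  # made-up employment counts
    print(aggregate_to_daysim(example))
```

Raising an error on an unknown code, rather than silently dropping it, is a design choice of this sketch so that employment is never lost without notice.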

The allocation tool allocates households from TAZs to parcels using Census block-level information on household locations. Block-level household information can be obtained from the 2010 Census. In addition to allocating households and employment information to parcels, the allocation tool also attaches school enrollment information to parcels. DaySim uses the following three school categories:

  • STUGRD – students in kindergarten through grade 8
  • STUHGH – students in grade 9 through grade 12
  • STUUNI – students in universities

The allocation tool uses US Department of Education National Center for Education Statistics information on the point location of schools and their associated enrollment by grade (http://nces.ed.gov/ccd/elsi/tableGenerator.aspx). Post-secondary information can be obtained from the US Department of Education’s Integrated Postsecondary Education Data System (http://nces.ed.gov/ipeds/datacenter/).

The allocation tool is programmed in C#, and includes some key features such as:

  • A graphical user interface (GUI) that allows users to easily reconfigure the employment sector allocation scheme
  • The option to use either NAICS or SIC codes
  • The option to perform a “base” allocation, in which exogenous TAZ-level controls are used, or a “forecast” allocation, in which exogenous growth is provided
  • The ability to be executed from within the GUI, or to be called as part of a batch process using a prepared control file

Four inputs are required to run the tool:

  • TAZ file - TAZ level totals for employment, household, and enrollment.
  • Block file - Employment in 2-digit NAICS codes at block detail. The data can be prepared using 2010 LEHD information. Updates to the LEHD information can be downloaded from here: http://lehd.ces.census.gov/data/
  • Parcel/Microzone correspondence file – parcel/microzone geometry file that contains Parcel/Microzone ID (MAZID), Census block ID (BLOCKID) and the model TAZ ID (TAZID). The geometry file is prepared by the intersection of block and TAZ boundaries. This file does not need to be modified unless the underlying model geography is revised.
  • School file - School enrollments with MAZID. The data are provided by the agency, based on detailed point location information.

The allocation tool performs the following sequential steps:

  1. Disaggregate block data to parcel/microzone level
  2. Aggregate parcel/microzone level data to TAZ level
  3. Disaggregate TAZ employment categories to NAICS codes (using the aggregate TAZ level parcel/microzone data from Step 2).
  4. Disaggregate TAZ data to parcel/microzone (factoring parcel/microzone data to match TAZ totals)
  5. Aggregate NAICS codes to DaySim employment categories
  6. Add school enrollment data to parcels/microzones.
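Step 4 above (factoring parcel/microzone data to match TAZ totals) amounts to a proportional allocation. The following is a minimal sketch under stated assumptions, not the tool's actual implementation: the function name and the dictionary-based inputs are hypothetical, and the even split used when a TAZ has no seed data is an assumption, since the text does not say how the tool handles that case.

```python
from collections import defaultdict


def factor_to_taz_controls(parcel_seed, parcel_taz, taz_control):
    """Scale seed parcel values so they sum to the control total in each TAZ.

    parcel_seed: {parcel_id: seed value, e.g. block-derived jobs}
    parcel_taz:  {parcel_id: taz_id}
    taz_control: {taz_id: control total for the same variable}
    """
    seed_by_taz = defaultdict(float)
    count_by_taz = defaultdict(int)
    for pid, seed in parcel_seed.items():
        seed_by_taz[parcel_taz[pid]] += seed
        count_by_taz[parcel_taz[pid]] += 1

    allocated = {}
    for pid, seed in parcel_seed.items():
        taz = parcel_taz[pid]
        control = taz_control.get(taz, 0.0)
        if seed_by_taz[taz] > 0:
            # Keep each parcel's share of the TAZ seed total.
            allocated[pid] = control * seed / seed_by_taz[taz]
        else:
            # Assumption: with no seed data in the TAZ, spread the control evenly.
            allocated[pid] = control / count_by_taz[taz]
    return allocated


if __name__ == "__main__":
    seeds = {1: 10.0, 2: 30.0, 3: 0.0, 4: 0.0}   # made-up seed values
    tazs = {1: "A", 2: "A", 3: "B", 4: "B"}
    controls = {"A": 80.0, "B": 10.0}
    print(factor_to_taz_controls(seeds, tazs, controls))
```

The same function could be applied once per variable (households, or employment in each sector), since each has its own TAZ control.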

The process is illustrated in Figure 3‑2.

FIGURE 3-2 LAND USE ALLOCATION TOOL FLOW


It must be noted that this tool was developed after the NERPM-AB model was in place. For the model itself, various data sources such as infoUSA were used to develop employment and enrollment data at the parcel level.

Buffering & Transit Access Preparation

In DaySim, it is important to have measures not only of what lies within a particular parcel, but also of what lies in the area immediately surrounding each parcel. These measures are created by defining a “buffer” area around each parcel and counting what lies inside the buffer. These variables can then be used in DaySim in a way similar to how zonal land use and density variables are used in TAZ-based models, with the advantage that the buffer is defined in exactly the same way for each parcel. The buffer variables that DaySim uses include:

  • The number of households in the buffer;
  • Employment (number of jobs) in the buffer in various employment sectors;
  • Enrollment in schools in the buffer, segmented by grade schools and colleges;
  • The number of spaces and average price of paid off-street parking in the buffer;
  • The number of transit stops within the buffer (segmented by sub-mode, if relevant); and
  • The number of street intersections in the buffer, segmented by 1-node (dead-end or cul-de-sac), 3-node (T-junction), and 4+node intersections.

In addition, DaySim uses the distance from the parcel centroid to the nearest transit stop (by transit sub-mode, if relevant), as well as the distance to the nearest open space area, in its simulation models.

DaySim Buffering Tool

A tool to perform the buffer calculations has been developed, and includes a user-friendly GUI. The use of the GUI is described in a subsequent chapter of this document. This tool calculates all the buffer and transit access variables that DaySim needs, using the following inputs:

  • Base parcel file (obtained from employment and enrollment allocation process)
  • Street intersections file
  • Transit stops file
  • Open spaces/parks file (optional)
  • Network nodes file
  • Node-to-node shortest path distance file

The input files and their corresponding structures are described in detail in subsequent sections.

Note that it is essential that the buffer measures used in application are consistent with those used for the original model estimation. Thus, when preparing new buffer measures, users should not modify the settings in the buffering tool control file.

Buffer Calculations

As mentioned earlier, buffer variables for a parcel are calculated by summing land use variables of all parcels within a certain buffer distance of the particular parcel. In the past, buffer calculations have used a “flat” buffer, using a certain radius, e.g. ¼ mile, and counting everything within that radius and nothing outside the radius. That approach is simple, but has the disadvantages that (a) it weights all opportunities within the buffer the same, whereas in reality the land use that is very close by will tend to have more influence on behavior than the land use at the edge of the buffer, and (b) there can be “cliff effects” if a large development is located near the edge of the buffer. In the latter case, the measures become sensitive to the somewhat arbitrary specification of the buffer size, and to the rules used to deal with parcels that straddle the buffer boundary. This tends to add random “noise” to the buffer measures.
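The “cliff effect” of a flat buffer can be illustrated with a small sketch. The parcel coordinates and job counts below are made up for illustration, and flat_buffer is a hypothetical helper, not a function of the buffering tool.

```python
import math


def flat_buffer(origin, parcels, radius_ft):
    """Sum the attribute of every parcel whose centroid lies within radius_ft.

    origin:  (x, y) of the origin parcel centroid, in feet
    parcels: list of (x, y, value) tuples, coordinates in feet
    """
    ox, oy = origin
    return sum(
        value
        for x, y, value in parcels
        if math.hypot(x - ox, y - oy) <= radius_ft
    )


QUARTER_MILE_FT = 1320.0

# Two small employers near the origin, plus one large development placed
# either just inside or just outside the quarter-mile radius.
small_shops = [(200.0, 0.0, 15), (0.0, 500.0, 25)]
mall_inside = small_shops + [(1310.0, 0.0, 2000)]
mall_outside = small_shops + [(1330.0, 0.0, 2000)]

if __name__ == "__main__":
    print(flat_buffer((0.0, 0.0), mall_inside, QUARTER_MILE_FT))
    print(flat_buffer((0.0, 0.0), mall_outside, QUARTER_MILE_FT))
```

Moving the large development by 20 feet changes the buffer total from 2,040 jobs to 40, which is the sensitivity to the buffer boundary described above.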

The buffering tool can be set to use flat buffering, or a distance-decay buffer, in which each buffered item is weighted according to the distance from the origin parcel centroid. There are two options provided for the weighting function: a logistic function and a negative exponential function. The logistic form is recommended because its shape is more representative of typical behavioral models that use logistic functions.

The buffering program simultaneously calculates all the buffer variables for two different buffer sizes. The reason for this is that the DaySim choice models use smaller buffers for some variables (e.g. those that represent attractiveness of walk trips), and larger buffers for some other variables (e.g. those that represent attractiveness of bike trips, or more general neighborhood effects).

For distance decay buffering, the user specifies three parameters for each buffer: (1) the distance parameter, (2) the offset parameter, and (3) the slope parameter (the latter two are used only for logistic buffering). The parameters and equation for the logistic curves used for DaySim model estimation and calibration are listed below. It is necessary that these same parameters be used for model application.

Parameter     Buffer 1              Buffer 2
Inflection    BDIST1 = 660 (ft)     BDIST2 = 1320 (ft)
Offset        BOFFS1 = 2640 (ft)    BOFFS2 = 2640 (ft)
Slope         DECAY1 = 0.76         DECAY2 = 0.76

The equation is:

Weight = MIN(1, (1 + Exp(DECAY - 0.5 + BOFFS/5280)) / (1 + Exp(DECAY * (Distance/BDIST - 0.5 - BOFFS/5280))))

Distance is the distance, in feet, from the origin parcel to another parcel. The ways in which this distance can be calculated are explained below.
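The logistic weight can be sketched in a few lines. This transcribes the equation as printed above, term by term, so any ambiguity in the printed form carries over; the function name, the dictionary layout of the parameters, and the overflow guard are assumptions of this sketch and not part of the buffering tool.

```python
import math

# Parameters from the table above (distances in feet).
BUFFER1 = {"bdist": 660.0, "boffs": 2640.0, "decay": 0.76}
BUFFER2 = {"bdist": 1320.0, "boffs": 2640.0, "decay": 0.76}


def logistic_weight(distance_ft, bdist, boffs, decay):
    """Distance-decay weight, following the printed equation.

    Weight = MIN(1, (1 + Exp(DECAY - 0.5 + BOFFS/5280))
                  / (1 + Exp(DECAY * (Distance/BDIST - 0.5 - BOFFS/5280))))
    """
    offset = boffs / 5280.0
    numerator = 1.0 + math.exp(decay - 0.5 + offset)
    exponent = decay * (distance_ft / bdist - 0.5 - offset)
    # Assumption: cap the exponent so very long distances do not overflow
    # math.exp; the weight is effectively zero there anyway.
    denominator = 1.0 + math.exp(min(exponent, 700.0))
    return min(1.0, numerator / denominator)


if __name__ == "__main__":
    for d in (0, 660, 1320, 2640, 5280):
        w1 = logistic_weight(d, **BUFFER1)
        w2 = logistic_weight(d, **BUFFER2)
        print(f"{d:5d} ft  buffer1={w1:.3f}  buffer2={w2:.3f}")
```

Running the loop prints the weights at a few distances, which can be compared against Figure 3-3 as a check on the transcription.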

FIGURE 3-3 BUFFER1 AND BUFFER2 DISTANCE DECAY WEIGHTS

The buffering program also gives the user three options as to how distances are calculated within the buffering program:

  1. Use crow-fly distance between the XY coordinates
  2. Use interpolation with a “circuity surface” around each parcel.
  3. Use shortest-path distance between the nearest all-streets network nodes.

Note that in option 1, because the buffer distance is calculated using XY coordinates from centroid to centroid for parcels, buffering may not be very accurate for parcels that are very large compared to the size of the buffer.

Option 3 provides the most accurate distances, because it takes into account obstacles and directness in the street network, and is preferable if the required data exist. The following steps are involved in buffering using distance decay weights and all-streets network distances:

  1. The buffering tool first associates each parcel with the nearest network node and creates a parcel-node correspondence.
  2. Multiple parcels may be associated with a single node and so the base parcel file is reduced to node level by aggregating data from all parcels that are associated with the same node.
  3. Other items such as open spaces/parks and transit stops are also associated with the closest network nodes.
  4. At this point, all buffering calculations are done at the node level, since node-to-node all-streets network distances are available. For node pairs that are not within 3 miles of each other, the Euclidean distance calculated from XY coordinates is used. Buffer variables for a particular node are calculated as the weighted sum of the land-use variables of all nodes within the chosen buffer distance. The calculation of distance weights has been described earlier.
  5. Once the buffer calculations at the node level are complete, the buffer variables are then transferred to parcels using the parcel-node correspondence created in step 1.
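The node-based steps above can be sketched as follows for a single land-use variable. This is a simplified illustration, not the tool's code: all names are hypothetical, step 3 (transit stops and open spaces) is omitted, the network distance input is assumed to contain only node pairs within 3 miles, and including a node's own value at distance zero is an assumption.

```python
import math
from collections import defaultdict


def nearest_node(point, nodes):
    """Return the id of the node closest (crow-fly) to point."""
    px, py = point
    return min(nodes, key=lambda n: math.hypot(nodes[n][0] - px, nodes[n][1] - py))


def buffer_by_node(parcels, nodes, net_dist, weight_fn):
    """Buffer one variable using node-to-node distances.

    parcels:   {parcel_id: (x, y, value)}
    nodes:     {node_id: (x, y)}
    net_dist:  {(from_node, to_node): shortest-path distance in feet};
               assumed to hold only pairs within 3 miles of each other
    weight_fn: function mapping a distance in feet to a weight
    """
    # Step 1: parcel-node correspondence.
    parcel_node = {pid: nearest_node((x, y), nodes) for pid, (x, y, _) in parcels.items()}

    # Step 2: aggregate parcel values to their nodes.
    node_value = defaultdict(float)
    for pid, (_, _, value) in parcels.items():
        node_value[parcel_node[pid]] += value

    # Step 4: weighted sums at the node level.
    node_buffer = {}
    for a in node_value:
        total = 0.0
        for b, value in node_value.items():
            if a == b:
                dist = 0.0  # assumption: a node's own value counts in full
            elif (a, b) in net_dist:
                dist = net_dist[(a, b)]
            else:
                # No network distance available: fall back to Euclidean.
                dist = math.hypot(nodes[a][0] - nodes[b][0], nodes[a][1] - nodes[b][1])
            total += weight_fn(dist) * value
        node_buffer[a] = total

    # Step 5: transfer node results back to parcels.
    return {pid: node_buffer[parcel_node[pid]] for pid in parcels}
```

Passing the weight as a function keeps the sketch independent of whether flat, logistic, or negative exponential weighting is chosen.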

It should also be noted that, in the case of option 3, the buffering process outputs two binary files containing node-to-node shortest-path network distances, so that DaySim can use them for the simulation of short trips.

The following steps are involved in buffering using distance decay weights and XY/Euclidean distance:

  1. Calculate distance weights using the logistic decay equation described earlier.
  2. Calculate buffer variables for each parcel by counting land-use attributes of the surrounding parcels by getting their centroid distances (Euclidean) from that of the parcel under consideration and weighting by the corresponding distance weights.
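The two Euclidean steps above reduce to a weighted sum over parcel centroids. The sketch below assumes a single land-use variable and a brute-force loop over all parcel pairs; the function name is hypothetical, counting the origin parcel's own value is an assumption, and the text does not describe how the actual tool limits the search.

```python
import math


def buffer_euclidean(parcels, weight_fn):
    """Distance-weighted buffer totals using centroid-to-centroid distance.

    parcels:   {parcel_id: (x, y, value)}, coordinates in feet
    weight_fn: function mapping a distance in feet to a weight
    """
    result = {}
    for pid, (x, y, _) in parcels.items():
        total = 0.0
        for other_x, other_y, value in parcels.values():
            # Includes the origin parcel itself at distance 0 (assumption).
            dist = math.hypot(other_x - x, other_y - y)
            total += weight_fn(dist) * value
        result[pid] = total
    return result


if __name__ == "__main__":
    demo = {"a": (0.0, 0.0, 4.0), "b": (50.0, 0.0, 6.0), "c": (1000.0, 0.0, 20.0)}
    # Made-up weight: full weight within 100 ft, half weight beyond.
    print(buffer_euclidean(demo, lambda d: 1.0 if d <= 100.0 else 0.5))
```

The loop is quadratic in the number of parcels, so it is suitable only for small illustrations.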

It must be noted that option 2 (interpolation with a circuity surface) was used when developing the model. However, going forward it is recommended that the node-to-node method (option 3) be used for calculating buffer variables.