Note: This chapter was
composed by Gonçalo Correia and Mark
Bradley, with sections of text taken
from Chapter 13 of the FHWA Travel Survey Manual (FHWA, 1996).
21.1 Introduction
This chapter explains the basic
theory of Stated-Preference (SP) surveys, with emphasis on transport mode
choices, and provides some examples from recent practice. The chapter begins by
distinguishing SP surveys from related survey experiments. Then, the four main
stages of SP experiment design are described, focusing on the construction of
choice alternatives and how these are presented to the respondents. The chapter
continues with discrete choice model theory and calibration using SP data sets,
with emphasis on the binary logit model, and concludes with some specific
examples.
In general
transportation planning, surveys are meant to capture travelers’ current travel
behavior. For instance, one is interested in knowing the actual mode a traveler
is using, actual travel times, destinations, and so forth. This is known as
Revealed Preference (RP) data, as the traveler is currently experiencing that
behavior and making a choice based on his or her knowledge of the available
travel options. Another type of data is based on Stated Responses (SR), in
which hypothetical situations are presented to the respondents, who are then
asked to choose based on the given attributes for each alternative, without
necessarily experiencing them in real situations. SP is a very popular
sub-class of SR methods, focused on estimating the utility function for
alternatives (Lee-Gosselin 1995). Other
question types in this class are stated intentions, stated tolerance, stated
adaptation, and stated prospect, as discussed later in this chapter.
SR surveys offer a
great advantage for overcoming the problem of the “new option”, whereby an
analyst seeks to forecast the use of a new alternative (such as high-speed
rail), particularly when the new option is very different from existing
alternatives with which the respondent is familiar. The use of a new
alternative is not reflected in RP data
collected on choices made in real markets.
Another aspect where
RP data often fails is that one is not able to correctly identify the
alternatives that were not chosen. The decision maker faces options while
having imperfect information, and not knowing all his/her alternatives. Other
times the decision maker will only have access to attribute information on the
chosen alternatives, e.g. having information on the current automobile trip and
no information on all public transport alternatives possible in the area.
As noted above, the
most common type of SR question is the SP variety, where the respondent is
asked to choose, rank or rate different alternatives based on their attributes
(e.g., travel time, travel cost, and wait time), thus giving information on the
way the choice is made. In a choice-based SP survey the respondent simply
selects the preferred alternative. In a ranking experiment the respondent has
to rank the alternatives in order of preference. In a rating experiment each
alternative must be scored on a scale that measures its attractiveness
to the respondent. An example of a choice-based experiment is shown in Figure 21.1,
where the objective is to understand how respondents make the choice between
the automobile and bus alternatives. Because each alternative is identified with
its name, Automobile and Bus, this is called a “labeled” experiment and allows
analysts to calibrate alternative-specific constants for such clear labels in
discrete choice models (DCMs). An alternative type of “non-labeled” experiment might simply seek to
study bus users’ preferences among various service features, and ask
respondents to choose between various bus service descriptions that differ only
in terms of the service attributes, but have no overall distinguishing feature.
Figure 21.1 Example of a Labeled Stated-Preference
Experiment
As noted earlier,
there are at least four other types of SR surveys: Stated Intentions, Stated
Tolerance, Stated Adaptation and Stated Prospect, as described below (and as extracted
from Lee-Gosselin (1995)):
Stated Intentions: This is perhaps the simplest form of SR
question. Typically one or two new
choice alternatives are described, and respondents are asked if they would use
the new alternative or not, in the form of a binary yes or no question (or they
may be asked to rate how likely they would be to use the alternative).
Such questions may be useful to get a simple
overall indication of the demand for a new alternative, but do not contain
enough detail to model the demand for alternatives with different attribute
levels or under different scenarios. Also, a very simple type of choice
question may not be sufficient to get respondents to consider their likely
choice behavior very carefully.
Stated Tolerance: Techniques included
in this class do not ask respondents to respond to alternative behavioral
outcomes represented by specific attributes and attribute levels. Instead,
respondents are asked to identify the conditions under which they would take a
particular action or accept a particular behavioral outcome. The basic type of
information sought is responses to questions such as: “Under what circumstances
could you imagine yourself doing the following?” One form of this approach that
received much attention in transportation planning in the 1980s was the
“transfer price” (TP) method. The
respondent would consider two choice alternatives, and then be asked to imagine
that the cost of one of the alternatives changes and indicate at what level he
or she would switch to the other alternative. For example, auto commuters
could be asked to consider their best
transit option and indicate how high fuel prices would have to rise before they
would switch to commuting by transit. This method thus gives a direct
quantitative measure of the difference
in utility between two alternatives, but it has been questioned whether or not
travelers can respond very accurately to such questions. A related method
that is popular in the field
of environmental economics is that of “contingent valuation” (CV), in which
people are asked to directly value a “good” that is not actually available for
purchase. For example, people can be
asked how much they would be willing to pay in additional taxes if it could
ensure that everyone in their city has access to good public transportation.
This could be thought of in the context of stated tolerance—how much of such a
tax would people be willing to tolerate?
Stated Adaptation: Techniques
included in this class ask respondents to indicate in a relatively open-ended
manner how they would respond when faced with a particular set of constraints.
The information sought consists of responses to questions such as: “What
would you do differently if you were faced with the following specific
constraints?”
Stated Prospect: With these
techniques, neither the list of possible behavioral outcomes nor a detailed set
of constraints is predetermined. Instead, respondents are typically presented
with some sort of general scenario (e.g., an energy shortage) as a way of
initiating the process of eliciting behavioral outcomes and constraints.
Measurement methods for these techniques involve the use of simulation gaming
techniques. The information sought consists of responses to questions
such as: “Under what circumstances would you be likely to change your travel
behavior and how would you go about it?”
Of the classes of SR techniques, SP surveys are
the most important source of data for developing a Discrete Choice Model (DCM)
to represent traveler decisions when faced with new travel alternatives and
transportation policy actions. DCMs have played an important role in
transportation modeling for the last 25 years. “They are namely used to provide
a detailed representation of the complex aspects of transportation demand,
based on strong theoretical justifications” (Bierlaire, 1997).
In this chapter of the Travel Survey Manual we
focus on SP experiments. The following section explains the design and
deployment of the experiments, focusing on the attributes and their
levels as well as the media and the way the choices are presented to the
respondents. The section after that presents the main analyses that can be conducted
through the calibration of DCMs based on SP information.
21.2 Designing SP Experiments
The design of SP experiments involves the
following stages:
·Designing the experiment in terms of the alternatives and attribute levels;
·Choosing a survey medium;
·Defining the context for the exercise; and
·Designing the sampling plan.
Each of these stages
is described in turn below.
21.2.1 Designing the Experiment in Terms of Alternatives and Attribute Levels
“An experiment defined in scientific terms
involves the observation of the effect upon one variable, a response variable,
given the manipulation of the levels of one or more other variables” (Hensher, et al., 2005). This is a general definition that can be
applied to any science or field of research and to any problem that involves a
stimulus and a response.
In developing an
experimental design, the first step is to specify the types of choice
alternatives, the choice attributes, and the attribute levels to be included in
the analysis. Consider the example of a binary choice where the respondent is
asked to choose between driving or riding a bus based on the following
attributes:
·Travel time difference between the auto and the bus;
·Percentage cost difference between the auto and the bus;
·Number of bus transfers.
In general, a
minimum of three attributes is usually needed to provide a realistic context
for the SP exercise. The attributes associated with a particular SP exercise
should represent as much as possible those factors that are important in the
choice process. Experience suggests that the number of attributes presented to
a respondent should be limited to six or seven. Presenting respondents with
more attributes makes the exercise increasingly difficult for respondents to
deal with and may in some instances limit the usefulness of the data. (Note:
Researchers tend to have different opinions in this regard, and some SP
experiments have included over a dozen choice attributes—see Louviere, et al.
(2000) for examples, and Jones and Bradley (2006) for further discussion.)
The classical
approach, and the one most widely used and tested for building these
experiments, establishes attribute levels for the explanatory variables and
then uses these levels to build a design that is presented to the
respondents.
The full factorial
design that results from all possible combinations of these levels usually
contains too many choices. For instance, if one defines three levels
for each of the attributes described in the example above, the full factorial
design would result in 3 x 3 x 3 = 27 different treatment combinations, too many to
be answered by one respondent (Table 21.1).
Table 21.1 Full Factorial Design
             Attributes
Alternative  Travel Time Difference  Cost Difference  Nº of Bus Transfers
1            The same                -20%             0
2            The same                -20%             1
3            The same                -20%             2
4            The same                -40%             0
5            The same                -40%             1
6            The same                -40%             2
7            The same                -60%             0
8            The same                -60%             1
9            The same                -60%             2
10           10 min less             -20%             0
11           10 min less             -20%             1
12           10 min less             -20%             2
13           10 min less             -40%             0
14           10 min less             -40%             1
15           10 min less             -40%             2
16           10 min less             -60%             0
17           10 min less             -60%             1
18           10 min less             -60%             2
19           20 min less             -20%             0
20           20 min less             -20%             1
21           20 min less             -20%             2
22           20 min less             -40%             0
23           20 min less             -40%             1
24           20 min less             -40%             2
25           20 min less             -60%             0
26           20 min less             -60%             1
27           20 min less             -60%             2
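The enumeration in Table 21.1 can be reproduced mechanically. The following is a minimal Python sketch, assuming the three attributes and levels listed in the table; the variable names are illustrative only and do not come from the original manual.

```python
from itertools import product

# Attribute levels as listed in Table 21.1 (variable names are illustrative).
time_levels = ["The same", "10 min less", "20 min less"]
cost_levels = ["-20%", "-40%", "-60%"]
transfer_levels = [0, 1, 2]

# Every combination of one level per attribute; the last attribute varies
# fastest, which matches the row order of the table.
full_factorial = list(product(time_levels, cost_levels, transfer_levels))

for number, (time_diff, cost_diff, transfers) in enumerate(full_factorial, start=1):
    print(number, time_diff, cost_diff, transfers)
```

Adding a fourth three-level attribute would triple the list to 81 treatment combinations, which is why the reduction techniques discussed next are usually needed.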
A note on the
number of levels to use for each attribute: a minimum of
three levels is required to detect non-linear relationships between attributes
and preferences, because two levels can only identify a linear effect.
Therefore, when non-linear relationships are thought to exist,
at least three levels should be used.
There are several ways to reduce the number of different
treatment combinations required. These include the following:
·Use “fractional-factorial” designs;
·Remove options that will “dominate” or be “dominated” by all other options
in the choice set;
·Separate the alternatives into “blocks,” so that the full factorial design
(or a larger fractional factorial) is
completed, with different groups of respondents each responding to a different
sub-set of options; and
·Carry out a series of experiments with each individual, offering different
attributes, but with at least one attribute common to all.
Use of a fractional factorial design - A fractional orthogonal
design can be generated with many statistical software packages (SAS, SPSS, etc.).
These programs search for a combination of the levels that results in zero
correlation between the attributes, a property known as orthogonality,
meaning independence between variables (Table 21.2). By guaranteeing that the
attributes in the alternatives are uncorrelated, one helps to ensure that the
effect of each attribute can be estimated as independently from the others as
possible. (Recently, however, so-called “d-optimal” design approaches have been
introduced that are more statistically efficient than orthogonal designs under
certain conditions. See Rose et al.
(2008) and Bliemer and Rose (2008) for more details.)
Table 21.2 Fractional Orthogonal Design
             Attributes
Alternative  Travel Time Difference  Cost Difference  Nº of Bus Transfers
1            The same                +20%             0
2            The same                +40%             2
3            The same                +60%             1
4            10 min less             +20%             1
5            10 min less             +40%             0
6            10 min less             +60%             2
7            20 min less             +20%             2
8            20 min less             +40%             1
9            20 min less             +60%             0
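The orthogonality of the nine-row design in Table 21.2 can be checked directly. The sketch below is illustrative only: it assumes the levels are coded 0, 1, 2 in the order they appear in the table and computes pairwise Pearson correlations with a small helper function (the coding and the helper are assumptions, not part of the original manual).

```python
from itertools import combinations

# Rows of Table 21.2 coded 0/1/2 (assumed coding):
#   time:      0 = the same, 1 = 10 min less, 2 = 20 min less
#   cost:      0 = +20%,     1 = +40%,        2 = +60%
#   transfers: number of bus transfers
design = [
    (0, 0, 0), (0, 1, 2), (0, 2, 1),
    (1, 0, 1), (1, 1, 0), (1, 2, 2),
    (2, 0, 2), (2, 1, 1), (2, 2, 0),
]
names = ["time", "cost", "transfers"]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


columns = list(zip(*design))  # one tuple of nine codes per attribute
correlations = {
    (names[i], names[j]): pearson(columns[i], columns[j])
    for i, j in combinations(range(len(names)), 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 6))
```

All three pairwise correlations come out as zero, which is what allows the main effect of each attribute to be estimated independently of the others.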
Orthogonality among
the attributes allows estimating the main effect of one variable on choice,
independently of the effects that the other variables may have. For instance, in
the Auto/Bus mode choice example, if the choices presented to the
respondents always had the same level for Travel Time Difference and Cost
Difference, meaning total collinearity between the two vectors, it would not be
possible to estimate the main effect of each of these variables on choice,
because one cannot distinguish whether the response is due to one
attribute or the other. (This problem tends to occur in RP data, as one does not
control the attribute levels. Although perfect collinearity is very unlikely to
appear in real-life data, correlations
can sometimes be problematic; for instance, travel time and travel distance can
be highly correlated.)
While the fractional
factorial approach can significantly reduce the number of treatments needed for
a SP exercise, it typically does so by ignoring some or all interaction
effects. If interactions among attributes are, in fact, significant, their
effects will be loaded onto the individual main effects, which will bias the
estimate of the relative importance of individual attributes on response. The
degree of bias will depend on the significance of the interaction effects. If
this bias occurs, the main effects are said to be “confounded” with interaction
effects. If interactions are expected to be important (e.g. the effect of
real-time information for transit services may be highly related to the level
of service frequency or waiting time), then a fractional design should be
selected that allows unbiased estimation of those specific interaction terms.
Removing Dominant/Dominated Options - This approach applies primarily to SP exercises
presented as choice experiments. With this approach, those choice alternatives
that dominate or are dominated in each attribute by every other alternative
included in the choice set can be excluded. One potential drawback with
this approach is that any respondents choosing alternatives at random or
illogically will not be easily identified based on an analysis of their
responses. Note also that this approach can disrupt the orthogonality of the
statistical design and introduce correlations between parameter estimates.
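A dominance check of this kind can be sketched in a few lines of Python. The example below is a hypothetical illustration: it assumes each alternative is described by absolute attribute values (travel time in minutes, cost, number of transfers) and that a lower value is better on every attribute. Neither the function names nor the numbers come from the original manual.

```python
def dominates(a, b):
    """True if a is at least as good as b on every attribute and strictly
    better on at least one (assumption: lower values are always better)."""
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def find_dominant(choice_set):
    """Return the name of an alternative that dominates every other
    alternative in the choice set, or None if there is no such alternative."""
    for name, attributes in choice_set.items():
        others = [v for k, v in choice_set.items() if k != name]
        if others and all(dominates(attributes, other) for other in others):
            return name
    return None


# Hypothetical choice sets: (travel time in minutes, cost, bus transfers).
set_with_dominance = {"auto": (30, 5.0, 0), "bus": (40, 6.0, 1)}
set_with_trade_off = {"auto": (30, 5.0, 0), "bus": (40, 3.0, 1)}

print(find_dominant(set_with_dominance))  # auto is better on every attribute
print(find_dominant(set_with_trade_off))  # bus is cheaper, so nothing dominates
```

Choice sets flagged in this way could then be dropped or redrawn, keeping in mind the loss of orthogonality noted above.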
Block Design - Another approach involves dividing the total
number of treatment combinations included in an experimental design into
sub-sets (or blocks). The sample of respondents is randomly divided into
groups, with each group receiving a different block. This approach can be
implemented by including the block number as an additional “attribute” in the
design so that block membership is orthogonal to the choice attribute
levels. In that way, each respondent
will face all of the levels of the various choice attributes in a balanced way,
which increases the efficiency of parameter estimation. Research by Hess, et al. (2008) has shown
that the use of proper blocking of a large fractional factorial design is very
important to ensure efficient parameter estimation.
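As an illustration of orthogonal blocking, the sketch below splits the 27-row full factorial of the auto/bus example into three blocks of nine. The modular-arithmetic rule used here is one standard construction and is an assumption made for this example, not necessarily the procedure used in the studies cited above. Levels are coded 0, 1, 2.

```python
from collections import Counter
from itertools import product

levels = range(3)  # coded levels 0, 1, 2 for each of the three attributes

# Assign each treatment combination to a block with a modular rule
# (an assumed construction): block = (time + cost + transfers) mod 3.
blocks = {0: [], 1: [], 2: []}
for time, cost, transfers in product(levels, repeat=3):
    blocks[(time + cost + transfers) % 3].append((time, cost, transfers))

# Within each block, count how often each level of each attribute appears.
balance = {
    block: [Counter(row[k] for row in rows) for k in range(3)]
    for block, rows in blocks.items()
}

for block, rows in blocks.items():
    print("block", block, "size", len(rows), [dict(c) for c in balance[block]])
```

Each block contains nine treatments, and every level of every attribute appears exactly three times in each block, so a respondent who receives any one block still sees all levels in a balanced way.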
Common Attributes - With this approach the attributes to be
evaluated are divided among two or more experimental designs. At least one
common attribute must appear in each design to allow comparison of relative
preferences over all the attributes included. In practical examples, cost is
often used as a common “linking” variable,
since it has a metric, quantitative meaning that tends to be transferable
across choice contexts. The question of how best to divide the attributes into
different sets should be settled mainly from the respondents’ point of view:
which variables make the most sense to trade off against one another. For
example, if one wishes to evaluate many
transit service attributes, then one could separate out the attributes related
to the station/stop from those related to the trip inside the vehicle, using
fare as a common linking attribute.
When generating
choice sets via a factorial design, some alternatives that are generated may
not be plausible and may affect the respondents’ confidence in the survey. For
instance, a respondent might find a
“+60%” cost difference between the auto and the bus mode to be strange and
unrealistic when he usually drives only a few minutes from home to work and has
no parking expenses. The respondent may try to imagine that situation, but his
routine experience may keep him from understanding the hypothetical situation.
A common solution is
to create designs that are built (“pivoted”) around the actual reported experiences
of the respondents (Hensher, 2004). This is done by using information gathered in
an earlier stage, where the respondent is asked about his or her actual experience and
behavior (an RP observation), characterized by the same attributes that are
later used to specify the alternatives. Rose et al. (2008) state that “The use of a respondent’s
experience, embodied in a reference alternative, to derive the attribute levels
of the experiment has come about in recognition of a number of supporting
theories in behavioral and cognitive psychology, and economics, such as prospect
theory, case based decision theory and minimum regret theory”. (They also warn, however, that care should be
taken when using such customized designs in combination with d-optimal design
approaches.)
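A pivoted design can be sketched as follows. This is a minimal Python illustration under stated assumptions: the reported trip values and the percentage changes are hypothetical, and the helper name pivot_levels is invented for this example rather than taken from any of the cited studies.

```python
def pivot_levels(reference, relative_changes):
    """Build attribute levels around a respondent's reported (RP) value.

    reference        -- value reported by the respondent for the attribute
    relative_changes -- fractional changes defining the levels, e.g. -0.2 = 20% less
    """
    return [round(reference * (1 + change), 2) for change in relative_changes]


# Hypothetical reported trip of one respondent (illustrative values only).
reported_trip = {"time_min": 30.0, "cost": 4.00}

# Three levels per attribute, centered on what the respondent actually experiences.
time_levels = pivot_levels(reported_trip["time_min"], [-0.2, 0.0, 0.2])
cost_levels = pivot_levels(reported_trip["cost"], [-0.25, 0.0, 0.25])

print(time_levels)
print(cost_levels)
```

Because the levels scale with each respondent's own trip, a short commute produces small absolute differences and a long one produces larger differences, which helps keep the hypothetical alternatives plausible.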
21.2.2 Choosing the Survey Media
Unless the SP
exercise is very simple, some sort of visual presentation of the alternatives
and attribute levels will be necessary in order to allow respondents to
understand what is being presented to them. This is particularly
true for choice and rating exercises, in which the respondent must compare two
or more alternatives. This limits the usefulness of telephone interviews,
unless the respondent has received survey materials in advance. Reading a large
set of variables over the telephone would make it impossible for the respondent
to remember and compare the alternatives.
The format and
layout of the instrument used for the exercise will depend to some extent on
the type of response sought (i.e., choice, ranking or rating). For choice
exercises, respondents will be comparing two or more alternatives at the same
time. The alternatives comprising the choice set should appear together on a
card, sheet of paper or computer screen. For ranking exercises, having each
alternative on a separate card is very useful, since this approach allows the
respondent to spread them out and physically arrange them in their order of
preference. However, this can also be done on a computer screen with more modern
software. With rating data, it is usually only necessary to consider one
alternative at a time independently from other alternatives. Therefore, a wide
range of layouts is possible for these responses.
It is always useful
and in some cases essential (e.g., when respondents are expected to complete
the exercises on their own) to provide materials describing the alternatives,
attributes, and attribute levels included in the exercise. This could include
drawings or pictures of new travel modes (e.g., high-speed trains) or sample
schedules and route maps for new transit services.
When SP designs are
customized based on respondents’ actual reported choice situations, the
survey is handled most efficiently using computer-based technology:
customized branching obtains a clearer picture of each respondent’s
actual choices, and realistic alternative scenarios can then be constructed to
understand the respondent’s behavior. Although the survey itself is simple and
straightforward for the respondent, there is significant behind-the-scenes
programming used to resolve this complexity. The ability to survey respondents
effectively using sophisticated methods allows the researcher to obtain the
critical data he or she needs while making the survey experience simple and
clear for the respondent.
One important advantage
of computer-based surveys is the ability to immediately geocode the
respondent’s origin and destination, and search databases to obtain realistic
attribute levels for the O-D pair, allowing the construction of more realistic
SP experiments for the respondent later in the survey (TRB, 2006).
For many transit
researchers, online survey services are a good way to develop a survey at
low cost and to learn firsthand about web-based surveys and how the process
works. However, researchers often find that online services and generic survey
software do not meet their needs. For example, longitudinal surveys cannot be
created that track one respondent over time using such tools. Nor can SP
surveys for mode choice studies be produced effectively using less expensive
online survey services, although there is much more expensive software that
does allow for advanced online mode choice surveys to be created. Features such
as online geocoding and linking transit schedules are typically not
incorporated into these surveys. Advanced validation cannot be accomplished, as
these tools are not capable of, for example, comparing a zip code with a data
table of zip codes to confirm whether a respondent’s answer is an existing zip code.
For general market research
purposes, the most popular software for computer-based SP surveys is sold by
Sawtooth Software. In the field of
travel demand modeling, however, SP surveys are typically designed and fielded
by consulting firms, often in collaboration with survey firms. In the US,
Resource Systems Group has been conducting web-based SP surveys of travel
behavior since the mid-1990s.
21.2.3 Defining the Context for the Exercise
A key objective in
the design of SP exercises is to establish as much realism as possible. The
following points noted by Jones (1989) are particularly relevant to building realism
into the context of the exercise, the options that are presented and the
responses that are permitted:
·Focus on very specific rather than general behavior, i.e., ask respondents
how they would respond to a particular product or service under a specific set
of conditions rather than in general;
·Use a realistic choice context that respondents have actually experienced
or one that they feel they could be placed into;
·Use existing or realistic levels of attributes within the experimental
design so that the alternatives are built around these levels;
·Limit the range over which attribute levels are varied to those values that
respondents perceive to be possible;
·Wherever possible, incorporate checks on the answers given;
·Allow for the effect of day-to-day variability on choices;
·Make sure that all variables relevant to the choice process are included in
the analysis;
·Where possible, simplify the presentation of choice exercises (e.g., by
highlighting the attribute levels that are different between alternatives);
·Make sure that constraints on choice are taken into account (e.g., fixed
arrival times at work); and
·Allow respondents to opt for a response outside the set of the experimental
alternatives (e.g., if all alternatives in a mode choice exercise are too
expensive, the respondent may choose not to make the trip, so “neither” should
be included as a possible response).
Because of the
nature of this type of survey, where the respondent is asked to state his or her
action according to attributes of alternatives which he or she has not
experienced, it is extremely important to ask the right questions so as not to implicitly
induce a specific answer.
The FHWA Travel
Survey Manual (1996) provides some examples of confusing SP survey-related
questions to avoid, and how they can be improved upon:
Questions Outside Respondent’s Experience:
Problem: “The agency
is considering building a rail transit system similar to the one in Washington,
DC.”
Improvement: “The
agency is considering building a rail transit system.”
Technical Terms:
Problem: “Did you
use an HOV lane for any part of your work trip?”
Improvement: “For
any part of your trip from home to work, did you use a carpool lane that
requires autos to have more than two people in them?”
Uncommon Idiom:
Problem: “With which
mode did you make the trip?”
Improvement: “How
did you get there?” List of modes provided by interviewer or questionnaire.
Omit Names of Alternatives:
Problem: “Under
these circumstances would you choose to take the maglev system described above
or would you choose to take the other alternative?”
Improvement: “Under
these circumstances, would you choose to take choice A or choice B?”
Vary Descriptions of Alternatives:
Problem: An SP
question refers to a two-page description of a proposed new mode developed by
the equipment manufacturer, and asks respondents to select between it and the
mode they use now for different combinations of travel times and costs.
Improvement: The
description of the new mode should be minimized and well-balanced with positive
and negative attributes. All alternatives should receive similar descriptions.
Link Personalities to Questions:
Problem: “Governor
Williamson has proposed increases in transit service in the Mudville area. How
do you feel about this proposal? Do you strongly agree, agree, disagree, or
strongly disagree with it?”
Improvement: “How do
you feel about the proposal to increase transit service in the Mudville area? Do
you strongly agree, agree, disagree, or strongly disagree with it?”
Link Institutions to Questions:
Problem: “Please
rate the bus service offered by the public transit agency, City Transit: excellent,
good, fair, or poor?”
Improvement: “Please
rate the bus service in your area. Is it excellent, good, fair or poor?”
21.2.4 Designing the Sampling Plan
The same sampling
issues associated with other surveys also apply to SP data. The difference with
SP surveys is that each respondent typically provides responses to more than
one choice exercise, whereas an RP survey typically yields a single observed choice per respondent. For example, if
50 respondents each complete 5 choice exercises, this would result in 250 data
records. It is important to note that even with 250 responses, the sample size
from the standpoint of assessing statistical precision is still 50. The fact
that there are five data records for each respondent (i.e., five “repeated
measures”) provides more information about each respondent, but not necessarily
more about the population as a whole. Only an adequately sized random sample
can do this. However, because SP survey
sample sizes tend to be smaller than the sample sizes that are typical for RP
household travel surveys (500-1500 respondents for a typical SP survey, as
compared to 3,000-15,000 households for a typical regional travel survey),
purely random sampling is rarely adequate, and it is often necessary to set
quotas for specific sub-samples to ensure that there is sufficient data to
estimate separate models and/or parameters for key market segments. In practice, sample size quotas or targets
are often set for variables such as trip purpose (e.g. commute, non-commute),
time of day (e.g. AM peak, PM peak, off peak), actual mode used (e.g. auto,
transit), and vehicle occupancy (e.g. drive alone, drive with passenger(s)).
Obviously, the sample size and quotas to be used in any specific context need
to be tailored to the research purpose, the study area, and the survey budget.
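The distinction between data records and respondents, and the tracking of quotas, can be illustrated with a small Python sketch. All segment names, quota targets and interview records below are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter

# Hypothetical quota targets per segment (illustrative values only).
quota_targets = {"commute": 3, "non-commute": 2}

# Hypothetical completed interviews: (respondent id, segment, choice tasks completed).
completed = [(1, "commute", 5), (2, "commute", 5), (3, "non-commute", 5)]

# Number of respondents: this is the sample size for statistical precision.
n_respondents = len(completed)

# Number of data records: one per completed choice task ("repeated measures").
n_records = sum(tasks for _, _, tasks in completed)

# Interviews still needed in each segment to reach its quota.
filled = Counter(segment for _, segment, _ in completed)
remaining = {
    segment: max(target - filled[segment], 0)
    for segment, target in quota_targets.items()
}

print(n_respondents, n_records, remaining)
```

Here three respondents yield fifteen data records, but the sample size for judging precision is still three, and one more interview is needed in each segment.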
21.3 Validity of Stated-Preference Results
A concern often
voiced about the use of SP data is that people do not necessarily do what they
say they will do. Therefore a key issue associated with SP data is validity.
Pearmain, et al. (1991) have reviewed a number of studies in which the
validity of predictions of choice behavior based on SP techniques was
investigated. Based on this review, they concluded that the results of most of
these studies seemed encouraging, suggesting that SP techniques can predict
choice behavior for the sample being studied with a reasonable degree of
accuracy. However, they noted that most of the reported studies of validity had
the following shortcomings:
·The research was not done in a systematic way;
·The research was carried out as a by-product of a practically-oriented
study;
·Some of the studies were based on incorrectly applied prediction methods;
and
·Typically the reported research only concerned the reproduction of existing
behavior of the sample being studied; few studies deal with the generalization
of predictions to entire populations, and very few look at the ability to
predict behavioral changes in response to changed circumstances.
They concluded that
additional systematic validity research is needed before definitive findings
and general guidelines can be given. And, indeed, much validation research has
been done since the early 1990s, some of it reported by Louviere, et al.
(2000). It is difficult to make general statements about the validity of SP
research, however, because the validity of any particular study depends mainly
on the quality and care with which the experiment is conceived, designed, and
administered. This is partly a science,
but also an art to some extent, and often there is no substitute for experience
when it comes to surveys and market research. Therefore, it is very difficult
to design and carry out meaningful SP
research without obtaining at least some advice and input from persons and/or
firms who have experience with this specific type of survey.
It is sometimes claimed that SP methods tend to produce
mode-specific biases (alternative-specific constants) for new transit modes
that are unrealistically high, particularly when they are divided by the time
or cost coefficients and interpreted in terms of dollars or minutes advantage.
One possible reason for “optimistic” forecasts of transit use with SP methods
is “non-commitment” bias—the fact that respondents in hypothetical situations
can imagine the attractive features of having a new transit alternative
available, and can indicate that they would use the new service without having
to actually commit themselves to the inconvenience and uncertainty of shifting
to a new mode of travel. In actual situations, people may find it more
difficult to change their habitual travel patterns. This potential problem with
SP methods has long been recognized, and has been a major impetus for
improvements in survey and design methods to make the experiments as realistic
as possible. As described earlier, this can be done by customizing the choice
context and attribute levels to mirror trips that respondents have actually
made, and by prompting respondents to recall any choice constraints or
circumstances of their actual trip that would make it difficult for them to
change to another mode. Even with the most realistic, customized surveys,
however, it is likely that some potential for non-commitment bias still exists.
Furthermore, combined analysis of SP data with RP data, while useful in many
ways, does not address this particular issue, because the RP data does not
provide any information about modes that do not already exist.
One possible method for investigating this issue further is
the “cheap talk” method, tested by Lu, Fowkes and Wardman (2006). In an SP
survey to measure the demand for improved rail service, the authors included
the following instructions (so-called “cheap talk”) for a random subsample of
the respondents:
“Previous surveys have sometimes
found that people say they would be happy to pay extra for improved trains but
when the fare is raised and the improved trains are provided, people say they
would prefer the cheaper fare with the old trains. Bearing this in mind, as you
read through the following choices, please imagine you will actually have to
pay the fare stated.”
(This
particular example was not for a completely new rail mode, but one can imagine
analogous text for that situation.) The
authors found that including this additional text did significantly influence
the results, lowering the estimated constant term for shifting to the improved
type of train.
In general terms, it would seem
that researchers should make more use of the split-sample approach described in
the preceding example. By including well-designed variations in the survey
protocol and instructions for different subsamples of respondents, one can
measure the influences that certain aspects of the hypothetical choice context
have on the respondents. Although one cannot eliminate biases in this way, one
can at least gain some idea of their possible magnitude, and then select the
most conservative result for forecasting.
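As a minimal sketch of how such a split-sample comparison might be summarized, the snippet below takes the alternative-specific constant estimated on each subsample and computes the shift and an approximate z-statistic. It assumes the two models were estimated separately on independent random subsamples; the function name and all numbers are hypothetical and are not taken from any cited study.

```python
import math

def constant_shift(asc_control, se_control, asc_treated, se_treated):
    """Difference between two alternative-specific constants and its z-statistic.

    Assumes the constants come from models estimated separately on two
    independent random subsamples, so their sampling errors are uncorrelated.
    """
    diff = asc_treated - asc_control
    se_diff = math.sqrt(se_control ** 2 + se_treated ** 2)
    return diff, diff / se_diff

# Invented numbers for illustration only.
diff, z = constant_shift(asc_control=0.80, se_control=0.15,
                         asc_treated=0.35, se_treated=0.15)
print(f"shift in constant = {diff:.2f}, z = {z:.2f}")
```

Because logit coefficients estimated on different subsamples may also differ in scale, it is usually safer to compare the constants after dividing each by the cost or time coefficient from the same model.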
Another validity issue for SP surveys arises in the context
of road pricing policies—the possibility that some respondents will state their
choices as a sort of protest against the policy of introducing new tolls,
regardless of the toll level. This is a type of strategic response bias that is
analogous to the positive bias described for transit SP studies above. In this case, however, the bias will tend to
be negative, with SP respondents perhaps less receptive to using tolled
facilities in the hypothetical contexts than they would be in reality. It may
also be more difficult to avoid or measure using a method such as the “cheap
talk” approach described earlier, as it would seem more difficult to influence
what tends to be a political or attitudinal phenomenon using simple verbal
instructions. It may be more effective, in fact, to include a series of
attitudinal questions to identify respondents who are most fervently opposed to
the introduction of tolls, and then to estimate a different bias constant for
such individuals. The difference between that estimate and a bias constant
estimated on the remaining part of the sample would give an estimate of the
size of this strategic bias.
21.4 Combining Stated- and Revealed-Preference Data
The results of
choice-oriented SP techniques are analogous to the RP choice data collected as part
of travel surveys. This gives rise to the possibility of combining these two
types of data for model development and forecasting (Ben-Akiva et al., 1994).
One approach would be simply to pool these two types of data. It has been
shown, however, that this naïve pooling
of SP and RP choice data can lead to seriously biased models. The key problem,
noted by Bates (1988), Bradley and Kroes (1992), and others, is that these two types of data are
subject to different types of errors, making it unlikely that they share a common
distribution of unobservables.
Specifically, SP
choice experiments use carefully controlled scenarios to eliminate, as much as
possible, the influence of extraneous variables, so that the choice can be
analyzed only as a function of the specific attributes and covariates that the
researcher intends. That means that SP data tends to have a much higher “signal
to noise ratio” than RP data, where the researcher has no control over the choice
context. When estimating discrete choice models (DCMs), however, the scale of
the coefficients is directly proportional to the amount of explained variation
relative to the residual, unexplained variation (i.e., the “signal to noise
ratio”). This means that models based on
SP data tend to have coefficients with higher scale relative to RP-based
models. In a predictive context, this means
that SP-based models will tend to be more sensitive and have higher
elasticities. Therefore, although the relative
values of SP-based parameter estimates are suitable for forecasting, the absolute
levels may not be. It is advisable to
re-scale SP-based parameter estimates using RP data before a model is used in
forecasting. If the SP model includes a new alternative, then obviously there
is no RP data for that particular alternative. However, there may be RP data
available for choices among two or more existing alternatives, and that data is
usually adequate for re-scaling purposes.
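The re-scaling idea can be illustrated with a minimal sketch. It assumes a binary choice between two existing alternatives, grouped RP data (the observed share choosing the first alternative in each segment), and utility differences already computed from the SP coefficients. The function names and all numbers are hypothetical, and this is a simplified stand-in rather than the specific procedure of any paper cited in this section. A binary logit with one constant c and one scale factor mu is fitted to the RP shares by Newton's method; an estimate of mu below 1 suggests that the SP coefficients should be scaled down before forecasting.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def estimate_rp_scale(sp_utility_diff, rp_share_first, n_iter=25):
    """Fit P(first) = logistic(c + mu * dV_sp) to grouped RP shares.

    Maximizes the binary logit log-likelihood over (c, mu) by Newton's method
    and returns the pair (c, mu).
    """
    c, mu = 0.0, 0.0
    for _ in range(n_iter):
        g_c = g_mu = h_cc = h_cm = h_mm = 0.0
        for dv, share in zip(sp_utility_diff, rp_share_first):
            p = logistic(c + mu * dv)
            w = p * (1.0 - p)
            g_c += share - p            # gradient with respect to c
            g_mu += (share - p) * dv    # gradient with respect to mu
            h_cc += w                   # information matrix entries
            h_cm += w * dv
            h_mm += w * dv * dv
        det = h_cc * h_mm - h_cm * h_cm
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        c += (h_mm * g_c - h_cm * g_mu) / det
        mu += (h_cc * g_mu - h_cm * g_c) / det
    return c, mu

# Synthetic illustration: RP shares generated with a known scale of 0.5.
sp_utility_diff = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
rp_share_first = [logistic(0.2 + 0.5 * dv) for dv in sp_utility_diff]
c_hat, mu_hat = estimate_rp_scale(sp_utility_diff, rp_share_first)
print(f"constant = {c_hat:.3f}, scale = {mu_hat:.3f}")
```

In this sketch the re-scaled coefficients would be mu times the SP estimates, while the ratios between coefficients (and hence the implied trade-offs) are left unchanged.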
A number of
approaches have been developed to combine SP data and RP data for model
estimation in a way that accounts for differences in error components. A
sequential estimation procedure, described in Ben-Akiva and Morikawa (1990), can be carried out using readily available
software. A more statistically efficient simultaneous approach has also been
developed, but it requires specialized software. This simultaneous approach has
been adapted to use a form of nested logit estimation possible with existing
software packages (Bradley and Daly, 1994).
21.5 Discrete Choice Models (DCMs) and the Willingness to Pay (WTP)
As noted earlier, DCMs have been used extensively
for transport mode choice studies. They may be calibrated using RP and/or SP
data. The objective of this chapter on SP surveys is not to explain DCMs. For
that, we recommend, for example, Ben-Akiva and Lerman’s (1985) Discrete Choice Analysis and Train’s (2002) Discrete Choice Methods with Simulation, which are recognized as
two of the best textbooks on these models. DCMs may be applied to
all consumer choices, and in fact research on them started in the psychology and
marketing fields before they were used in engineering.
One of the most interesting by-products of DCMs is
the possibility of extracting trade-offs among the attributes of the alternatives.
Among these trade-offs, one of the most important is the willingness to pay (WTP),
the amount of money that respondents are willing to
pay in order to obtain some benefit. In linear utility specifications,
WTP measures are calculated simply by dividing one parameter estimate by another. The
parameter in the denominator must belong to an attribute measured in monetary units in order to
produce a financial indicator.
One important WTP measure is the value of travel time
savings (VTTS), in which the travel time parameter is divided by the cost parameter,
producing a trade-off between the two measured in monetary units per unit of travel time. Because the utility is linear and
compensatory in its parameters, the value of this ratio measures the amount of
money an individual is willing to pay in order to save one unit of time spent
travelling, holding all other attributes constant.
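$$\mathrm{VTTS} = \frac{\beta_{time}}{\beta_{cost}} \qquad \text{(Eq. 1)}$$
where $\beta_{time}$ and $\beta_{cost}$ denote the estimated travel time and cost parameters of the utility function.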
These measurements are very useful for pricing the use of road
infrastructure or for valuing attributes that have no market price, such as
air or water quality. The latter is extremely important for measuring the
value of environmental externalities and incorporating them in a cost-benefit
analysis. One should note that both parameters must be statistically
significant in order to produce reliable measures of the VTTS.
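The calculation can be sketched as follows. The coefficient values and standard errors are invented for illustration, the function name is hypothetical, and the standard error uses the usual delta-method approximation for a ratio of two estimates, which is an addition to what the text describes. Time is assumed to be in minutes and cost in dollars.

```python
import math

def vtts_per_hour(beta_time, beta_cost, se_time, se_cost, cov_time_cost=0.0):
    """VTTS in money per hour and its approximate standard error.

    beta_time is the utility per minute of travel time and beta_cost the
    utility per unit of money. The standard error is a delta-method
    approximation for the ratio beta_time / beta_cost.
    """
    ratio = beta_time / beta_cost  # money per minute
    rel_var = ((se_time / beta_time) ** 2
               + (se_cost / beta_cost) ** 2
               - 2.0 * cov_time_cost / (beta_time * beta_cost))
    se_ratio = abs(ratio) * math.sqrt(rel_var)
    return 60.0 * ratio, 60.0 * se_ratio

# Hypothetical estimates: -0.05 per minute and -0.25 per dollar.
vtts, se = vtts_per_hour(beta_time=-0.05, beta_cost=-0.25,
                         se_time=0.01, se_cost=0.05)
print(f"VTTS = {vtts:.2f} $/hour (approx. s.e. {se:.2f})")
```

The relative precision of the ratio is worse than that of either coefficient alone, which is why an imprecisely estimated time or cost parameter leads to an unreliable VTTS.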
Another example of an interesting WTP measure is the
value of the time spent waiting for a bus or metro service. When comparing this with the value of travel
time, one often observes a statistically and practically significant
difference, with waiting time being more costly than time spent moving. In other cases, the
number of transfers may be included as an attribute shown to the respondent, resulting in an
estimate of the average monetary value of a transfer inside a terminal.
In several European countries, SP methods have been
used to derive measures of the monetary values of various types of travel time
to be used in cost-benefit analysis and other forms of economic evaluation—see
Bates, et al. (1987) and Bradley and Gunn (1991), for example. Methods have
also been developed to explicitly estimate the parameters of a log-normal
distribution of VTTS across the population (Ben-Akiva, et al. 1993), and these
methods are now more accessible with the introduction of software packages to
perform “mixed logit” estimation.
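To illustrate what a log-normal VTTS distribution implies, the sketch below assumes that ln(VTTS) is normally distributed with hypothetical parameters mu and sigma (not values from any cited study) and reports the median, the mean, and a simulated share of travelers above a threshold.

```python
import math
import random

def lognormal_vtts_summary(mu, sigma):
    """Median and mean of VTTS when ln(VTTS) ~ Normal(mu, sigma^2)."""
    median = math.exp(mu)
    mean = math.exp(mu + 0.5 * sigma ** 2)
    return median, mean

# Hypothetical parameters: median VTTS of 10 $/hour, sigma of 0.6.
mu, sigma = math.log(10.0), 0.6
median, mean = lognormal_vtts_summary(mu, sigma)

# Simulated share of travelers whose VTTS exceeds 20 $/hour.
rng = random.Random(0)
draws = [rng.lognormvariate(mu, sigma) for _ in range(100_000)]
share_above_20 = sum(d > 20.0 for d in draws) / len(draws)
print(f"median = {median:.2f}, mean = {mean:.2f}, share above 20 = {share_above_20:.3f}")
```

The mean exceeds the median because the distribution is skewed to the right, so a single average VTTS can hide a minority of travelers with a much higher willingness to pay.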
21.6 Examples
As mentioned above, study of the demand for public
transportation has traditionally been the most common applied context for SP
methods, particularly in Europe, but also in other parts of the world. Many of
these studies have focused on the demand for (and willingness to pay for)
improvements to existing transit services. The studies can include a large
number of service attributes, including not only basic service levels such as
fare, journey time, frequency, and number of transfers, but also various
physical amenities in the vehicles and station environments, ticket purchase
options, differences between classes of service, and a variety of other service
features. Another common use of SP methods has been to predict the demand for a
new type of public transportation service, for instance a new rail line where
none exists at present. SP experiments
have been used for many such projects, including major projects such as
Eurotunnel and other high-speed rail corridors around the world.
The first example below was administered as part of the
Seattle Household Activity Survey, carried out for the Puget Sound Regional
Council (PSRC) and Washington State DOT in 2006 by MORPACE International,
teamed with Cambridge Systematics and
Mark Bradley Research and Consulting. Full documentation is available in the
“PSRC 2006 Household Activity Survey Analysis Report” (http://psrc.org/assets/1979/hhactivitysurvey.pdf). This
example used the phone-mail-phone survey approach that is typical for household
travel surveys. Respondents who reported a trip in a relevant transit corridor
as part of their travel diary data were then sent a customized follow-up SP
questionnaire with hypothetical mode choice scenarios. They were then contacted once again by
telephone to retrieve their answers. The introductory text and an example
choice scenario are shown below. Note
that the choice scenarios included an auto alternative and two different transit
alternatives, bus and rail. Also note that, in addition to typical time and cost
variables, the experiment includes a seat availability variable and a
service reliability variable.
Another mode choice example from a web-based survey is shown
below. This example was from a 2006
study carried out by Resource Systems Group (RSG) to model mode choice between
JFK airport and lower Manhattan in New York City. Note how the use of graphics
and shading can make the choice scenarios clearer and more attractive for
respondents.
Another common use of SP methods is to study road pricing
for funding road construction and reducing congestion. In the U.S. over the
last decade, this has been by far the most common context for using SP
methods. The types of pricing most often
depicted in the studies are new tolled lanes alongside existing lanes, such as
high occupancy toll (HOT) lanes, or else totally new highways on which all
lanes are tolled. A few SP studies have also looked at downtown area pricing
and cordon-based pricing.
The first example shown below was carried out as a follow-up
to the 2006 PSRC Household and Activity Survey, the same as for the mode choice
example above. Respondents who had reported a trip in a relevant highway
corridor were selected to participate in a follow-on Stated Preference
survey. Customized SP scenarios were
created based on the reported trip and mailed to respondents. A sample SP scenario is shown below. In
addition to the toll and travel time variables, which are included in all SP
experiments of this type, this experiment has two additional variables of
interest:
• Distance traveled: Because the free route may be
an entirely different road than the tolled route, there may be a
significant difference in terms of distance. In typical RP data, distance is so
highly correlated with travel time that it is not feasible to estimate
separate time and distance coefficients. This SP data allows us to estimate such an effect.
• Reliability of travel time: Here, a significant
extra delay was defined as “more than 15 minutes late” (beyond the usual
travel time), and the scenarios were varied in terms of how often such a
delay occurs, allowing us to estimate the effect of the frequency of
delay.
A second example is from a study carried out for San
Francisco County (SFCTA) to study the possibility of implementing cordon
pricing around specific areas of downtown San Francisco. An SP survey was carried out in 2007 to aid
in modeling the effects of such a policy and setting effective levels of cordon
charge to influence traffic levels at different times of day. Auto travelers to downtown were intercepted
and participated in a web-based SP interview. The web-based experiment was designed and carried out by Resource
Systems Group (RSG). An example choice
screen from the survey is shown below.
In contrast to the previous SP example from Seattle, this experiment
did not include a non-tolled auto alternative, because, in the context of
cordon pricing, that would mean not traveling to downtown San Francisco at all.
(Additional survey questions about that possibility were asked after the SP
questions.) However, a transit option was included, because transit to downtown
San Francisco is a very viable alternative, and because part of the stated
reason for cordon pricing would be to provide funding to maintain and improve
transit services. For the auto alternatives,
the variables used for this study were similar to those used for the Seattle SP
described in the previous section, except that:
(a) The definition of the peak period used for a
given respondent was customized based on their actual departure time, and the
duration and timing of the peak pricing period were varied across respondents,
allowing a more detailed analysis of departure time shifting behavior.
(b) For a given respondent, the effect of reliability
was measured by fixing the frequency of delay and varying the length of the
delay across the alternatives, the opposite of how it was presented in the
Seattle SP. (Note that frequency was
varied randomly across respondents, with ‘1 out of 10 trips’ used for half of
the sample and ‘1 out of 5 trips’ used for the other half.)
(c) All three auto alternatives involve using the same
route, so there is no difference in distance.
A third example below is from an SP survey on HOT lane
options carried out in the Minneapolis region to study the use of the MnPASS
dynamically priced managed lane. SP questions were
developed to measure willingness to pay for use of the HOT lane. SP tradeoff questions were asked of all
respondents who reported making a reference trip as a solo driver on
I-394. An interesting feature of this SP
experiment is that it was carried out completely via telephone with no visual
aids. This was possible because the
choice context was very simple, with just two choice alternatives and two
attributes: travel time and toll. The
tradeoff questions were asked using the wording shown below. The value T in
brackets was the time reported by the respondent as the fastest reasonable time
in which they could make the trip if there were no congestion. The values X and Y were
varied using an experimental design, with several sets of values used per
respondent to reflect different time/cost tradeoffs. For the last series of questions, the time
savings Y was held constant and the toll level X was varied adaptively to find
the point where the respondent would switch between using the toll lane and the
free lane. This is an adaptation of the
“transfer price” method described earlier in this chapter, and provides an
individual-level estimate of the willingness to pay for each respondent. For
more information on this study, see Bradley and Zmud (2006).
Now assume you’re making the same
trip in the future that you just told me about. It’s a trip on the same day of the week, at the same time of day, for
the same purpose, and you’re under the same time pressures. You enter the freeway, I-394, and find out
that you can make this trip using a toll lane and paying via electronic toll
collection if you want to.
If you were to use the general traffic lanes on
I-394, your trip would take [T + Y minutes] and be free. If you were to use the toll lane you would pay $X
and your trip would take [T minutes], saving Y minutes. Now under these conditions, which would you
choose to do?
• Use the toll lane, pay $X and save Y minutes
• Use the general lane for free.
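The adaptive variation of the toll level described above can be sketched as a simple search. The source does not state the rule actually used in the Minneapolis survey, so the bisection below is only an assumed illustration; the function names, toll range, tolerance, and simulated respondent are all hypothetical.

```python
def find_switch_toll(chooses_toll_lane, low=0.0, high=10.0, tolerance=0.25):
    """Bisection search for the toll at which a respondent switches lanes.

    chooses_toll_lane(toll) returns True if the respondent would pay that toll
    for a fixed time saving. Returns the midpoint of the final bracket, or a
    bound if the respondent never or always chooses the toll lane.
    """
    if not chooses_toll_lane(low):
        return low   # refuses the toll lane even at the lowest toll
    if chooses_toll_lane(high):
        return high  # still accepts at the highest toll offered
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if chooses_toll_lane(mid):
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Simulated respondent who accepts any toll up to $3.40 to save 10 minutes.
def simulated_respondent(toll):
    return toll <= 3.40

minutes_saved = 10
switch_toll = find_switch_toll(simulated_respondent)
implied_vtts_per_hour = 60.0 * switch_toll / minutes_saved
print(f"switch point = ${switch_toll:.2f}, implied VTTS = ${implied_vtts_per_hour:.2f}/hour")
```

In a telephone interview, each call to the response function would correspond to one question, so a coarser tolerance keeps the number of questions per respondent small.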
A third
example context for SP research is the choice of where to park. Parking price
and supply can be among the most influential policy levers for influencing
travel behavior. The first example below is from a study of using on-campus
parking versus using a shuttle bus from remote parking near a university in
Mexico City. This survey was carried out during personal interviews with
students and staff who were intercepted on campus, and the choice situations
were presented on individual cards. Note
the use of graphics to help clarify the meaning of the attributes.
The second parking example below is also for
a choice between on-site parking and remote parking with a shuttle bus, and was
also carried out during interviews with people intercepted on-site, this time
with visitors to Muir Woods in the Bay Area.
Option A: Park at Muir Woods
• Driving time from the Route 101 exit is 45 minutes
• Parking costs $6.00, reservation required at least 1 week in advance
• A 5-10 minute search/wait to find a parking space
• Parking lot is a 10 minute walk from the site
Option B: Park and take a shuttle to Muir Woods
• The shuttles are the same quality as school buses with on-board taped narration over the speaker system
• The shuttle goes from a parking lot near the Route 101 exit and also from the Sausalito Ferry Dock
• Shuttle parking costs $6.00, reservation required at least 1 day in advance
• Shuttle fare is free (or $4.00 from Sausalito) (children age 5-12 half price; under 5 free)
• Shuttles run every 20 minutes
• Shuttle riding time from Route 101 to the site is 45 minutes
• Can also take shuttle to other Marin Parkland sites at no extra fare
The final example below is from a web-based survey of
vehicle purchase choice, from an SP survey designed and carried out by Resource
Systems Group (RSG). Vehicle choice is a
context that requires the use of several choice attributes.
21.7 Conclusions
This chapter on Stated Preference surveys examines
how to build experiments which allow for calibration of DCMs to better
appreciate and anticipate travel behavior, particularly in the presence of new
alternatives. Due to the stated nature
of the responses (rather than verifiable, revealed behavior), it is very
important to build the right context for the experiment and ask the right
questions.
While demand model specification and estimation were
not the focus of this chapter, the usefulness of SP data for DCM
calibration was demonstrated through the calculation of trade-offs. One of these
trade-offs is the willingness to pay in order to save travel time, a key
measure in enhancing existing transport services and designing new ones.
References
Bates, J. (1988). Econometric Issues in Stated-Preference Analysis. Journal of Transport Economics and Policy, XXII(1), 59-69.
Bates, J.J., M. Bradley, M. Wardman, A. Fowkes, H. Gunn and several others. (1987). The Value of Travel Time Saving. A report of research undertaken for the U.K. Department of Transport. Policy Journals, Newbury, UK.
Ben-Akiva, M. and Morikawa, T. (1990). Estimation of Switching Models from Revealed-Preferences and Stated Intentions. Transportation Research, 24A(6), 485-495.
Ben-Akiva, M. E. and Lerman, S. R. (1985). Discrete Choice Analysis: Theory and Application to Travel Demand. MIT Press, Cambridge.
Ben-Akiva, M., D. Bolduc, and M. Bradley. (1993). Estimation of travel choice models with randomly distributed values of time. Transportation Research Record, 1413, 88-97. Transportation Research Board, Washington, D.C.
Ben-Akiva, M., M. Bradley, T. Morikawa, J. Benjamin, T. Novak, H. Oppewal and V. Rao. (1994). Combining revealed and stated preferences data. Marketing Letters, 5(2), 335-349. Springer, Amsterdam.
Bierlaire, M. (1997). Discrete Choice Models. Available from: http://roso.epfl.ch/mbi/papers/discretechoice/paper.html. Access date: 11 October 2009.
Bliemer, M.C.J. and J.M. Rose. (2008). “Construction of Experimental Designs for Mixed Logit Models Allowing For Correlation Across Choice Observations”. Paper presented at the Transportation Research Board Conference, Washington, DC, January 2008.
Bradley, M. and H. Gunn. (1991). A stated preference analysis of values of travel time in the Netherlands. Transportation Research Record, 1285. Transportation Research Board, Washington, D.C.
Bradley, M. and Kroes, E. (1992). Forecasting Issues in Stated-Preference Research. In Selected Readings in Transport Survey Methodology. E. Ampt, A. Richardson and A. Meyburg, Ed. Eucalyptus Press, Melbourne, pp. 89-107.
Bradley, M. and A. Daly. (1994). Estimation of logit choice models using mixed stated preference and revealed preference information. In Understanding Travel Behavior in an Era of Urban Change. P. Stopher and M. Lee-Gosselin, Ed. Pergamon Press, Oxford.
Bradley, M. and J. Zmud. (2006). Validating Willingness to Pay Estimates for Tolled Facilities through Panel Survey Methods. Paper presented at the 11th International Conference on Travel Behavior Research, Kyoto.
Lee-Gosselin, M. (1995). The Scope and Potential of Interactive Stated-Response Data Collection Methods. TRB’s Conference on Household Travel Surveys, Irvine, CA.
Hensher, D. A. (2004). Identifying the influence of stated choice design dimensionality on willingness to pay for travel time savings. Journal of Transport Economics and Policy, 38, 425-446.
Hensher, D. A., Rose, J. M. and Greene, W. H. (2005). Applied Choice Analysis - A Primer. Cambridge University Press, Cambridge.
Hess, S., C. Smith, S. Falzarano, and J. Stubits. (2008). “Measuring the Effects of Different Experimental Designs and Survey Administration Methods using an Atlanta Managed Lanes Stated Preference Survey”. Paper presented at the Transportation Research Board Conference, Washington, DC, January 2008.
Jones, P. (1989). An Overview of Stated-Preference Techniques.
Jones, P.M. and M. Bradley. (2006). “Stated Preference Surveys: An Assessment”. In Travel Survey Methods: Quality and Future Directions. P. Stopher and C. Stecher, Ed. Elsevier Science.
Louviere, J.J., D.A. Hensher and J.D. Swait. (2000). Stated Choice Methods: Analysis and Applications. Cambridge University Press.
Lu, H., Fowkes, A. S. and Wardman, M. R. (2006). “The influence of stated preference (SP) design on the incentive to bias in responses”. Paper presented at the European Transport Conference, Strasbourg, October 2006.
Pearmain, D., Swanson, J., Kroes, E. and Bradley, M. (1991). Stated-Preference Techniques: A Guide to Practice. Steer Davies Gleave and Hague Consulting Group.
Rose, J. M., Bliemer, M. C. J., Hensher, D. A. and Collins, A. T. (2008). Designing efficient stated choice experiments in the presence of reference alternatives. Transportation Research Part B, 42(4), 395-406.
Sawtooth Software. http://www.sawtoothsoftware.com/
Train, K. E. (2002). Discrete Choice Methods with Simulation. Cambridge University Press.
TRB (2006). Web-Based Survey Techniques.
The opinions and conclusions expressed or implied are those of the authors and are not necessarily those of the Transportation Research Board, its sponsors, or Cambridge Systematics, Inc.