Test Analysis - Smart Charging for PEVs

Quantifying the Benefits and Constraints of Plug-In Electric Vehicle Smart Charging Adoption

Authors

Pingfan Hu

Bharath Ravindra

Sampada Dhakal

Vedanth Surendra Hegde

Published

November 5, 2023

Abstract

The product being studied in the project proposal is “Smart Charging Solutions for Plug-in Electric Vehicles (PEVs).” The initiative aims to investigate the reception and potential adoption of these technologies among users who own or might consider owning PEVs. The focus is on understanding consumer attitudes, preferences, and potential barriers regarding the adoption of smart charging solutions for electric vehicles. The project intends to delve into aspects such as awareness, perceived benefits, and potential hindrances associated with smart charging technologies for PEVs, exploring attributes like compensation, electricity pricing, charging speed, and more.

The investigation of consumer inclinations and considerations within the project centered on a strategic analysis of distinct product attributes and their corresponding design decisions. These key attributes were Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and a Guaranteed Range if smart charged for 8 hours. The design decisions for these attributes were structured with deliberate variations: the Upfront Incentive ranged across $100, $300, and $500; the Free Level 2 Charger was offered as ‘Yes’ or ‘No’; the Electricity Price Discount was presented at three rates of 10%, 25%, and 50%; the Override Window spanned 0.5 to 4 hours; and the Guaranteed Range if smart charged for 8 hours offered levels of 25%, 50%, and 75% of the vehicle’s maximum range. These design decisions were aimed at assessing consumer responsiveness to diverse incentive structures, charging infrastructure availability, pricing models, and guaranteed range, thereby offering comprehensive insight into the dynamics of smart charging solution adoption among consumers.

Introduction

The global drive for sustainable energy has propelled Plug-in Electric Vehicles (PEVs) to the forefront in the fight against climate change. Smart charging technologies, integrating advanced grid systems and digital connectivity, offer a transformative opportunity to enhance PEVs’ role in a sustainable future by optimizing charging processes, improving grid stability, and reducing costs.

This project proposal aims to gauge user acceptance of smart charging solutions for PEVs. It delves into factors influencing consumer choices, uncovering awareness levels, benefits, and barriers associated with smart charging among PEV owners. Using surveys, interviews, and data analysis, the project aims to reveal user attitudes, preferences, and concerns about smart charging.

The comprehensive approach considers attributes such as upfront compensation, electricity price discounts, charging speed, the override window, and the guaranteed range percentage, exploring user attitudes toward smart charging. It also evaluates the impact of compensation on adoption rates, offering valuable insights for stakeholders in the electric vehicle industry.

In an era focused on environmental awareness and sustainable transportation, this project accelerates the shift toward cleaner and more efficient mobility solutions. It aligns with global energy sustainability goals, emphasizing the need to reduce carbon emissions and create a smarter, more connected, and environmentally responsible landscape for electric vehicle charging.

Survey Design

The survey is divided into three sections. Section 1 covers PEV Ownership and Charging Habits questions, Section 2 addresses Smart Charging Programs, and Section 3 gathers Demographic Information.

Before starting the actual survey, participants will see a consent page and a 30-second educational video to introduce them to Smart Charging. The video explains smart charging and how it allows the utility to control the PEV charging process, including the time window and charging amount.

Those who neither have nor intend to have a PEV will be screened out at the end of Section 1, as these respondents are unlikely to provide much valuable information. This screening is not implemented in the pilot version but will be included in the final version.

Section 1 begins by asking participants about their interest in PEVs and whether they own one, lease one, plan to have one in the future, or do not have one. It also inquires about their preferred PEV model, preferred mileage, and charging habits at home and work. Respondents who have a PEV or intend to have one meet our eligibility criteria.

Section 2, focusing on Smart Charging Programs, consists of five questions per respondent. This section addresses five main attributes of smart charging programs: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hours. Upfront Incentive refers to the instant payment for joining the smart charging program, ranging from $100 to $500. Free Level 2 Charger is a faster charging option, with respondents choosing between “Yes” or “No.” Electricity Price Discount offers a discount ranging from 10% to 50% for joining the smart charging program. Override Window refers to the longest daily period during which users can override to regular charging, with values ranging from 0.5 to 4 hours. Guaranteed Range if charged for 8 hours specifies the range achieved if the car is charged for 8 hours during the smart charging period. The value shown ranges from 25% to 75% of the maximum range in miles, based on the user’s input.

This part of our survey involves a small twist: the respondent’s answer to a Section 1 question determines the percentage attribute. For the guaranteed range, we calculate mileage based on the user’s maximum-range input from Section 1, with the guaranteed range spanning 25% to 75% of that value. Each question has three alternatives, with the third option being “I am not interested.” Since these attributes are randomly generated, some options may seem illogical or overly obvious. To ensure respondent understanding, we’ve included a sentence asking them to choose between the two options as if those were the only choices available. Additionally, the third option, “I am not interested,” is provided for those who are not interested in either option. To ensure clarity, Section 2 starts with a description of the attributes, their values, and a sample question to guide respondents. Below is an example of our conjoint question.

Code
# Packages used throughout this report (assumed loaded once here)
library(tidyverse)   # dplyr, tidyr, ggplot2, readr, tibble, forcats
library(lubridate)   # date-time parsing (ymd_hms)
library(here)        # project-relative file paths
library(knitr)       # kable()
library(kableExtra)  # table styling
library(janitor)     # clean_names()
library(logitr)      # logit model estimation
library(cbcTools)    # cbc_profiles(), cbc_design(), cbc_choices(), cbc_power()

alts <- tibble(
    `Options:` = c('Option 1', 'Option 2'),
    `Upfront Incentive:` = scales::dollar(c(100, 500)),
    `Free Level 2 Charger:` = c('No', 'Yes'), 
    `Electricity Price Discount:` = c('25%', '50%'), 
    `Override Window:` = c('0.5 hrs', '4 hrs'),
    `Guaranteed Range if charged for 8 hrs:` = c('100 miles', '300 miles'))

alts_t <- as.data.frame(t(alts)) %>% 
  rownames_to_column(var = 'Attribute')

colnames(alts_t) <- NULL

kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>% 
  kable_styling(bootstrap_options = c('striped', 'hover'),
                full_width = FALSE, 
                position = 'center') %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
Options: Option 1 Option 2
Upfront Incentive: $100 $500
Free Level 2 Charger: No Yes
Electricity Price Discount: 25% 50%
Override Window: 0.5 hrs 4 hrs
Guaranteed Range if charged for 8 hrs: 100 miles 300 miles
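
To make the percentage-to-mileage conversion concrete, below is a minimal sketch of the logic; the helper function and the 400-mile figure are hypothetical illustrations, not part of the survey code.

Code
# Hypothetical helper: convert a guaranteed-range percentage into miles,
# using the respondent's stated maximum range from Section 1
guaranteed_miles <- function(pct, max_range_miles) {
    round(pct / 100 * max_range_miles)
}

# For a hypothetical PEV with a 400-mile maximum range:
guaranteed_miles(c(25, 50, 75), 400)  # 100, 200, and 300 miles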

Section 3 collects demographic information, including birth year, gender identity, race, educational background, household income, interest in the environment and climate change, political inclination, current state of residence, commute method, number of cars in their household, and the presence of backup cars, if any, along with their quantity.

Pilot Data Analysis

This section describes the analysis that the team carried out using our pilot data. It contains 5 subsections: sample description, data cleaning, summary statistics, modeling, and power analysis.

The team performed the data analysis in 2 steps:

  1. Step 1: We generated a randomized full-factorial design of CBC choice questions using cbc_design() and fed it into the survey. The survey results were collected, cleaned, and stored as a CSV file, then evaluated using the logitr package. This process aims to establish a reusable workflow for future surveys, data collection, and analysis; the survey results themselves are less important at this stage.
  2. Step 2: We generated a randomized full-factorial design as well as a Bayesian D-efficient fractional factorial design using cbc_design(), each simulated using cbc_choices() for a set of random choice results. The results are then fed into cbc_power() for power analysis.

Step 1 generates the following results:

  1. Summary statistics of vehicle info, charging preferences, demographic info, etc.
  2. The utility model equation containing the 6 beta values.

Step 2 aims to generate a better survey design, including the number of respondents, the number of questions per person, the number of alternatives, and, most importantly, how the different attribute levels are distributed across the survey questions.

The team should continue to iterate on Step 2 to generate a feasible survey design, and then re-perform Step 1 to analyze and visualize the survey data. In this analysis, however, the team has only made several attempts at power analysis; it will continue working on power analysis to generate the most feasible survey design.

The analysis breakdown is described in the following 5 subsections, and the data dictionary is shown in the appendix (see 2. Data Dictionary).

1. Sample Description

As described previously, we generated a randomized full-factorial design of CBC choice questions using cbc_design() and fed it into the survey. This initial design is generated using the following code:

Code
# Define profiles with attributes and levels
profiles <- cbc_profiles(
    upfront      = c(100, 300, 500), # USD
    lv_2_charger = c('No', 'Yes'),
    discount     = c(10, 25, 50),    # percentage
    window       = c(0.5, 1, 2, 4),  # hrs
    guaranteed   = c(25, 50, 75)     # percentage
)

# Make a basic survey using the full factorial of all profiles
design_pilot <- cbc_design(
    profiles  = profiles,
    n_resp    = 500,  # Number of respondents
    n_alts    = 2,    # Number of alternatives per question
    n_q       = 5,    # Number of questions per respondent
    no_choice = TRUE
)

# Save design
write_csv(design_pilot, here('pilot', 'data', 'cbc_questions_pilot.csv'))

In the code above, the cbc_profiles() function defines the attributes and levels, matching the attribute table described in our pilot survey.

The cbc_design() function defines the conjoint questions: 500 respondents, 2 alternatives per question, and 5 questions per respondent, giving 2,500 question responses. A “No Choice” option is also included. These values were chosen arbitrarily.
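
As a quick sanity check on the design dimensions (a sketch, assuming the design_pilot object above), each question contributes one row per alternative plus one “no choice” row:

Code
# 500 respondents × 5 questions × (2 alternatives + 1 no-choice row) = 7500
stopifnot(nrow(design_pilot) == 500 * 5 * (2 + 1))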

The summary statistics of our sample survey results will be shown after the Data Cleaning process.

2. Data Cleaning

In fact, the team did not have 500 participants. The team collected 30 sample surveys in total, of which 25 were successful responses, which means 125 question responses. The data cleaning process is shown in the code chunks below.

Firstly, we download the survey data from formr and import it as data frames p1, p2, and p3.

Code
p1 <- read_csv(here('pilot', 'data', 'pevConjoint1.csv'))
p2 <- read_csv(here('pilot', 'data', 'pevConjoint2.csv'))
p3 <- read_csv(here('pilot', 'data', 'pevConjoint3.csv'))

Then, we format and combine the 3 dataframes:

Code
p1 <- p1 %>% 
    mutate(
        created = ymd_hms(created, tz = 'EST'),
        ended =  ymd_hms(ended, tz = 'EST'),
        time_sec_p1 = as.numeric(ended - created, units = 'secs')) %>%
    # Select important columns
    select(session, time_sec_p1, starts_with('mc'))

p2 <- p2 %>% 
    mutate(
        created = ymd_hms(created),
        ended =  ymd_hms(ended),
        time_sec_p2 = as.numeric(ended - created, units = 'secs')) %>%
    # Select important columns
    select(session, time_sec_p2, respondentID, cbc_all_same, starts_with('mc'))

p3 <- p3 %>% 
    mutate(
        created = ymd_hms(created),
        ended =  ymd_hms(ended),
        time_sec_p3 = as.numeric(ended - created, units = 'secs')) %>%
    # Select important columns
    select(session, time_sec_p3, completion_code, starts_with('mc'))

# Join all parts together using the session variable
choice_data_wider <- p1 %>% 
    left_join(p2, by = 'session') %>% 
    left_join(p3, by = 'session') %>% 
    # No longer need session variable
    select(-session)

Here the 3 parts of the survey data are stored as choice_data_wider. It is in wide format and will later be pivoted into various long formats according to our purposes.

Then, we drop the unnecessary parts:

Code
# Drop anyone who didn't complete all choice questions
choice_data_wider <- choice_data_wider %>% 
    filter(!is.na(mc_cbc_1)) %>% 
    filter(!is.na(mc_cbc_2)) %>% 
    filter(!is.na(mc_cbc_3)) %>% 
    filter(!is.na(mc_cbc_4)) %>% 
    filter(!is.na(mc_cbc_5))

# Drop respondents who went too fast
choice_data_wider <- choice_data_wider %>% 
    mutate(
        # First replace NA values with 0 seconds
        time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
        time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
        time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
        # Now compute the total time
        time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
    )

# Drop anyone who finished in under the 10th percentile of completion times
time_10 <- quantile(choice_data_wider$time_min_total, 0.1)
choice_data_wider <- choice_data_wider %>% 
    filter(time_min_total >= time_10)

# Drop respondents that got the attention check question wrong
choice_data_wider <- choice_data_wider %>% 
    filter(mc_cbc_practice == 'option_2')

The team followed Prof. Helveston’s instructions on dropping criteria, dropping respondents who were:

  • Incomplete (did not answer all choice questions)
  • Too fast (below the 10th percentile of total completion time)
  • Wrong on the attention check question
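
As a quick check (a sketch), the rows remaining after these filters should match the 25 successful respondents reported earlier:

Code
# One row per respondent remains in the wide-format data after cleaning
stopifnot(nrow(choice_data_wider) == 25)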

After we have the wide-format data, it’s time to generate the choice data. We convert the wide data into long format according to the 5 conjoint questions, wrangle it into the desired shape, and then save it as a CSV:

Code
# First convert the data to long format
choice_data <- choice_data_wider %>% 
    pivot_longer(
        cols = mc_cbc_1:mc_cbc_5,
        names_to = 'qID',
        values_to = 'choice') %>% 
    # Convert the qID variable to a number
    mutate(qID = parse_number(qID))

# Read in choice questions and join it to choice_data
survey <- read_csv('https://raw.githubusercontent.com/pingfan-hu/My-Resources/main/data/cbc_test.csv')
choice_data <- choice_data %>% 
    rename(respID = respondentID) %>% 
    left_join(survey, by = c('respID', 'qID'), relationship = 'many-to-many')

# Convert choice column to 1 or 0 based on if the alternative was chosen 
choice_data <- choice_data %>% 
    mutate(
        choice = case_when(
            str_detect(choice, '^option_') ~
                as.numeric(str_replace(choice, 'option_', '')),
            choice == 'not_interested' ~ 3,
            TRUE ~ as.numeric(choice)
        ),
        choice = ifelse(choice == altID, 1, 0)
    ) %>% 
    select(-mc_cbc_practice, -cbc_all_same)

# Create new values for respID & obsID
nRespondents <- nrow(choice_data_wider)
nAlts <- max(survey$altID)
nQuestions <- max(survey$qID)
choice_data$respID <- rep(seq(nRespondents), each = nAlts*nQuestions)
choice_data$obsID <- rep(seq(nRespondents*nQuestions), each = nAlts)

# Reorder columns - it's nice to have the 'ID' variables first
choice_data <- choice_data %>% 
    select(ends_with('ID'), 'choice', 'upfront',
           'lv_2_charger_Yes', 'lv_2_charger_No',
           'discount', 'window', 'guaranteed',
           'no_choice', everything())

# Clean up names for choice_data
choice_data <- clean_names(choice_data)

# Save cleaned data for modeling
write_csv(choice_data, here('pilot', 'data', 'choice_data_pilot.csv'))
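
As a final shape check (a sketch), the long-format choice data should contain one row per alternative per question per respondent:

Code
# 25 respondents × 5 questions × 3 alternatives (incl. "not interested") = 375
stopifnot(nrow(choice_data) == 25 * 5 * 3)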

3. Summary Statistics

These summary statistics were instructed to be part of the Sample Description subsection, but in practice they can only be produced after data cleaning. They are shown in 3 parts:

  1. Vehicle Info Summary
  2. Charging Preference Summary
  3. Demographic Info Summary

We utilize the choice_data_wider data frame for all 3 parts.

3.1 Vehicle Info Summary

For vehicle info summary, we firstly pivot the data:

Code
choice_data_vehicle <- choice_data_wider %>% 
    pivot_longer(cols = c(mc_own_ev, `mc_max_range.x`, mc_commute,
                          mc_car_number, mc_backup),
                 names_to = 'category',
                 values_to = 'value')

Then, we reorder and relabel the values based on category:

Code
choice_data_vehicle <- choice_data_vehicle %>% 
    mutate(
        category = case_when(
            category == 'mc_own_ev' ~ 'Own EV',
            category == 'mc_max_range.x' ~ 'Max Range',
            category == 'mc_car_number' ~ 'Car Number',
            category == 'mc_backup' ~ 'Backup Car',
            category == 'mc_commute' ~ 'Commute',
            TRUE ~ category
        ),
        value = case_when(
            category == 'Own EV' ~
                factor(value, levels = c('own', 'lease', 'plan'), 
                       labels = c('Own', 'Lease', 'Plan')),
            category == 'Max Range' ~
                factor(value, levels = c(
                    'below_100', 'range_100_to_149', 'range_150_to_199',
                    'range_200_to_249', 'range_250_to_299', 'over_300'
                ),
                labels = c('100', '150', '200', '250', '300', '>300')),
            category == 'Car Number' ~
                factor(value, levels = c('zero', 'one', 'two', 'three', 'four'),
                       labels = c('0', '1', '2', '3', '4')),
            category == 'Backup Car' ~
                factor(value, levels = c('yes_ev', 'yes_gas', 'yes_both', 'no'),
                       labels = c('EV', 'Gas', 'Both', 'No')),
            category == 'Commute' ~
                factor(value, levels = c('on_foot', 'by_bus', 'by_train_subway',
                                         'self_driving'),
                       labels = c('On Foot', 'By Bus', 'By Train', 'Self-driving')),
            TRUE ~ as.factor(value)
        )
    )

Finally, here is the plot for vehicle info summary:

Code
choice_data_vehicle %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(breaks = seq(0, 12, by = 2)) +
    facet_wrap(~fct_relevel(category, 'Own EV', 'Max Range',
                            'Car Number', 'Backup Car', 'Commute'),
               scales = 'free_x', nrow = 3) +
    labs(x = NULL,
         y = 'Count',
         title = 'Vehicle Info Summary Plots',
         subtitle = 'Summary Statistics Part 1') +
    theme_bw(base_family = 'Ubuntu')

In the above plot, we can see the distributions of:

  1. EV Ownership
  2. EV Max Range
  3. Household Car Number
  4. Household Backup Car Type
  5. Commute Method

3.2 Charging Preference Summary

For charging preference summary, we firstly pivot the data:

Code
choice_data_charging <- choice_data_wider %>% 
    pivot_longer(cols = c(mc_at_home, mc_at_work),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ',\\s*')

Then, we reorder and relabel the values based on category:

Code
choice_data_charging <- choice_data_charging %>% 
    mutate(
        category = case_when(
            category == 'mc_at_home' ~ 'At Home',
            category == 'mc_at_work' ~ 'At Work',
            TRUE ~ category
        ),
        value = case_when(
            category == 'At Home' ~
                factor(value, levels = c(
                    'home_morning', 'home_afternoon', 'home_evening',
                    'home_night', 'no_home'), 
                       labels = c('Morning', 'Afternoon', 'Evening',
                                  'Night', 'No')),
            category == 'At Work' ~
                factor(value, levels = c(
                    'work_morning', 'work_noon', 'work_afternoon',
                    'work_evening', 'work_night', 'no_work'), 
                    labels = c('Morning', 'Noon', 'Afternoon',
                               'Evening', 'Night', 'No')),
            TRUE ~ as.factor(value)
        )
    )

Finally, here is the plot for charging preference summary:

Code
choice_data_charging %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(breaks = seq(0, 16, by = 2)) +
    facet_wrap(~fct_relevel(category, 'At Home', 'At Work'),
               scales = 'free_x', nrow = 2) +
    scale_x_discrete(limits = c('Morning', 'Noon', 'Afternoon',
                                'Evening', 'Night', 'No')) + 
    labs(x = NULL,
         y = 'Count',
         title = 'Charging Preference Summary Plots',
         subtitle = 'Summary Statistics Part 2') +
    theme_bw(base_family = 'Ubuntu')

There is no “Noon” option for “At Home”, but it is shown here for ease of comparison between home and work charging.

3.3 Demographic Info Summary

The demographic info summary is a little complicated. The team decided to split it into 2 sub-parts.

The 1st part contains gender, race, climate caring, and party. We firstly pivot the data:

Code
choice_data_demo <- choice_data_wider %>% 
    mutate(across(c(mc_gender, mc_race, mc_climate, mc_party), as.character)) %>%
    pivot_longer(cols = c(mc_gender, mc_race, mc_climate, mc_party),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ',\\s*')

Then, we reorder and relabel the values based on category:

Code
choice_data_demo <- choice_data_demo %>% 
    mutate(
        category = case_when(
            category == 'mc_gender' ~ 'Gender',
            category == 'mc_race' ~ 'Race',
            category == 'mc_climate' ~ 'Climate Caring',
            category == 'mc_party' ~ 'Party',
            TRUE ~ category
        ),
        value = case_when(
            category == 'Gender' ~
                factor(value, levels = c(
                    'male', 'female', 'transMale', 'transFemale',
                    'genderNonconform', 'others', 'prefer_not_say'), 
                    labels = c('Male', 'Female', 'Trans Male', 'Trans Female',
                               'Genderqueer', 'Others', 'Not Say')),
            category == 'Race' ~
                factor(value, levels = c(
                    'asian', 'black', 'white', 'hispanic', 'native',
                    'pacific', 'others', 'prefer_not_say'), 
                    labels = c('Asian', 'Black', 'White', 'Hispanic', 'Native',
                               'Pacific', 'Others', 'Not Say')),
            category == 'Party' ~
                factor(value, levels = c(
                    'democratic', 'republican', 'no_vote', 'prefer_not_say'), 
                    labels = c('Democratic', 'Republican', 'No', 'Not Say')),
            TRUE ~ as.factor(value)
        )
    )

Then, we plot the 1st part of demographic info summary:

Code
choice_data_demo %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(breaks = seq(0, 14, by = 2)) +
    facet_wrap(~fct_relevel(category, 'Gender', 'Race', 'Climate Caring', 'Party'),
               scales = 'free_x', nrow = 2) +
    labs(x = NULL,
         y = 'Count',
         title = 'Demographic Summary Plots Part 1',
         subtitle = 'Summary Statistics Part 3-1') +
    theme_bw(base_family = 'Ubuntu')

The 2nd part contains education and income. We firstly pivot the data:

Code
choice_data_demo_2 <- choice_data_wider %>% 
    mutate(across(c(mc_education, mc_income), as.character)) %>%
    pivot_longer(cols = c(mc_education, mc_income),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ',\\s*')

Then, we reorder and relabel the values based on category:

Code
choice_data_demo_2 <- choice_data_demo_2 %>% 
    mutate(
        category = case_when(
            category == 'mc_education' ~ 'Education',
            category == 'mc_income' ~ 'Income',
            TRUE ~ category
        ),
        value = case_when(
            category == 'Education' ~
                factor(value, levels = c(
                    'no_hs', 'hs', 'college_some', 'vocational', 'degree_associate',
                    'degree_bs', 'degree_grad', 'others', 'prefer_not_say'), 
                    labels = c('< High School', 'High School', 'Some College',
                               'Vocational', 'AA/AS', 'BA/BS', 'Grad',
                               'Others', 'Not Say')),
            category == 'Income' ~
                factor(value, levels = c(
                    'under60', 'inc_60to80', 'inc_80to100', 'inc_100to150',
                    'inc_150to200', 'inc_200to250', 'inc_250to300',
                    'inc_300to350', 'inc_350to400', 'inc_over400',
                    'prefer_not_say'), 
                    labels = c('< $60K', '$60K to $80K', '$80K to $100K',
                               '$100K to $150K', '$150K to $200K',
                               '$200K to $250K', '$250K to $300K',
                               '$300K to $350K', '$350K to $400K',
                               '> $400K', 'Not Say')),
            TRUE ~ as.factor(value)
        )
    )

Then, we plot the education distribution:

Code
choice_data_demo_2 %>% 
    filter(category == 'Education') %>% 
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(breaks = seq(0, 12, by = 2)) +
    labs(x = NULL,
         y = 'Count',
         title = 'Demographic Summary Plots Part 2 - Education',
         subtitle = 'Summary Statistics Part 3-2') +
    theme_bw(base_family = 'Ubuntu')

Finally, the income distribution:

Code
choice_data_demo_2 %>% 
    filter(category == 'Income') %>% 
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(breaks = seq(0, 12, by = 1)) +
    labs(x = NULL,
         y = 'Count',
         title = 'Demographic Summary Plots Part 3 - Income',
         subtitle = 'Summary Statistics Part 3-3') +
    coord_flip() +
    theme_bw(base_family = 'Ubuntu')

4. Modeling

4.1 Model Summary

To estimate the utility model, we use the logitr package:

Code
# Load the data set:
choice_data <- read_csv(here('pilot', 'data', 'choice_data_pilot.csv'))

# Estimate the model
model <- logitr(
    data    = choice_data,
    outcome = 'choice',
    obsID   = 'obs_id',
    pars    = c('upfront', 'lv_2_charger_yes', 'lv_2_charger_no',
                'discount', 'window', 'guaranteed'))

The summary of results is obtained with summary(model); the full output is reproduced in the appendix (see 3.1 Summary of Model).

4.2 Model Evaluation

Before we move on to the equation, we evaluate the model. Firstly, we work out the 1st order condition: Is the gradient at the solution zero?

Code
model$gradient
#>                          [,1]
#> upfront           0.063132417
#> lv_2_charger_yes  0.001400323
#> lv_2_charger_no  -0.001418025
#> discount         -0.008386328
#> window            0.003765479
#> guaranteed       -0.040964975

The gradient values are close to zero, which is within our expectation.

Secondly, we work out the 2nd order condition: Is the Hessian negative definite?

Code
eigen(model$hessian)$values
#> [1] -8.743880e+05 -2.478647e+04 -8.713279e+03 -4.380717e+01 -1.138431e+01
#> [6] -1.742082e-01

All eigenvalues are negative, meaning the Hessian is negative definite and the log-likelihood is locally concave at the solution. This is also within our expectation.

The evaluation process ends here; we can continue to the model equation.

4.3 Model Equation

From the summary, we can derive the estimated values and standard errors of the 6 coefficients:

Code
coeff_table <- data.frame(
    Coefficient = c('&beta;<sub>1</sub>', '&beta;<sub>2</sub>',
                    '&beta;<sub>3</sub>', '&beta;<sub>4</sub>',
                    '&beta;<sub>5</sub>', '&beta;<sub>6</sub>'),
    Estimate = c(0.0163208, -7.5792292, -9.8322095,
                 0.0451776, 0.2911476, 0.0827197),
    StdError = c(0.0027778, 1.5417186, 1.8565463,
                 0.0127580, 0.1642019, 0.0170261),
    Levels = c('100, 300, 500', '(Nil)', '(Nil)',
               '10, 25, 50', '0.5, 1, 2, 4', '25, 50, 75'),
    Meaning = c('Upfront', 'Lv 2 Charger', 'No Lv 2 Charger',
                'Discount', 'Window', 'Guaranteed')
    )
coeff_table$Estimate <- round(coeff_table$Estimate, 4)
coeff_table$StdError <- round(coeff_table$StdError, 4)
coeff_table %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c('Coefficient', 'Estimate', 'Std Error',
                        'Levels', 'Meaning'),
          align = c('c', 'c', 'c', 'l', 'l'),
          caption = 'Summary of Model Coefficients') %>% 
    kable_styling(bootstrap_options = c('striped', 'hover'))
Summary of Model Coefficients
Coefficient Estimate Std Error Levels Meaning
β1 0.0163 0.0028 100, 300, 500 Upfront
β2 -7.5792 1.5417 (Nil) Lv 2 Charger
β3 -9.8322 1.8565 (Nil) No Lv 2 Charger
β4 0.0452 0.0128 10, 25, 50 Discount
β5 0.2911 0.1642 0.5, 1, 2, 4 Window
β6 0.0827 0.0170 25, 50, 75 Guaranteed

The equation can be written as:

\[ v_j = \beta_1 x_j^{upfront} + \beta_2 \delta_j^{charger} + \beta_3 \delta_j^{nocharger} + \beta_4 x_j^{discount} + \beta_5 x_j^{window} + \beta_6 x_j^{guaranteed} \]

Plugging in the \(\boldsymbol{\beta}\) values, we have:

\[ v_j = 0.0163 x_j^{upfront} - 7.5792 \delta_j^{charger} - 9.8322 \delta_j^{nocharger} + 0.0452 x_j^{discount} + 0.2911 x_j^{window} + 0.0827 x_j^{guaranteed} \]

The above equation only considers the observed utility. There should also be an error term so that:

\[ \tilde u_j=v_j+\tilde \epsilon_j \]

where \(v_j\) is the observed utility that contains \(\beta\) and \(x\) values, and \(\tilde \epsilon_j\) is the error part.
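
As a worked example (a sketch using the rounded estimates above, with the guaranteed range expressed in percent as modeled), we can compute the observed utilities of the two options from the sample question and the implied logit choice probabilities:

Code
# Rounded coefficient estimates from the table above
beta <- c(upfront = 0.0163, charger = -7.5792, no_charger = -9.8322,
          discount = 0.0452, window = 0.2911, guaranteed = 0.0827)

# Option 1: $100 upfront, no charger, 25% discount, 0.5 hr window, 25% range
v1 <- beta[['upfront']] * 100 + beta[['no_charger']] + beta[['discount']] * 25 +
    beta[['window']] * 0.5 + beta[['guaranteed']] * 25

# Option 2: $500 upfront, free charger, 50% discount, 4 hr window, 75% range
v2 <- beta[['upfront']] * 500 + beta[['charger']] + beta[['discount']] * 50 +
    beta[['window']] * 4 + beta[['guaranteed']] * 75

# Logit probabilities of choosing each option
exp(c(v1, v2)) / sum(exp(c(v1, v2)))

Since Option 2 dominates on every attribute, it receives essentially all of the choice probability, as expected.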

4.4 Model Figures

Then we generate the summary figures. The figures are in 2 parts:

  • Part 1: Upfront, Discount, Guaranteed
  • Part 2: Lv2 Charger, No Lv2 Charger, Window

Let’s do Part 1 first. We start with data frame construction:

Code
coeff_figure_1 <- data.frame(
    Attribute = factor(
        c('Upfront', 'Discount', 'Guaranteed'),
        levels = c('Guaranteed', 'Discount', 'Upfront')
        ),
    Estimate = c(0.0163208, 0.0451776, 0.0827197),
    StdError = c(0.0027778, 0.0127580, 0.0170261)
)

coeff_figure_1$LowerBound =
    coeff_figure_1$Estimate - 1.96 * coeff_figure_1$StdError
coeff_figure_1$UpperBound =
    coeff_figure_1$Estimate + 1.96 * coeff_figure_1$StdError

Then, we create the plot for Part 1:

Code
ggplot(coeff_figure_1, aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = 'Coefficient Estimates Part 1',
         subtitle = 'With Uncertainty Bounds',
         x = 'Attribute',
         y = 'Estimate') +
    coord_flip() +
    theme_bw(base_family = 'Ubuntu')

Then, the data frame construction for Part 2:

Code
coeff_figure_2 <- data.frame(
    Attribute = factor(c('Charger', 'No Charger', 'Window'),
                       levels = c('Window', 'No Charger', 'Charger')),
    Estimate = c(-7.5792292, -9.8322095, 0.2911476),
    StdError = c(1.5417186, 1.8565463, 0.1642019)
)

coeff_figure_2$LowerBound =
    coeff_figure_2$Estimate - 1.96 * coeff_figure_2$StdError
coeff_figure_2$UpperBound =
    coeff_figure_2$Estimate + 1.96 * coeff_figure_2$StdError

The plot for Part 2:

Code
ggplot(coeff_figure_2, aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = 'Coefficient Estimates Part 2',
         subtitle = 'With Uncertainty Bounds',
         x = 'Attribute',
         y = 'Estimate') +
    coord_flip() +
    theme_bw(base_family = 'Ubuntu')

The interpretations for these plots are:

  1. Since “Upfront”, “Discount”, and “Guaranteed” take numerically large attribute levels, their per-unit coefficients are small, and they are plotted together in Part 1.
  2. Since “Charger”, “No Charger”, and “Window” take small or dummy-coded values, their coefficients are larger in magnitude, and they are plotted together in Part 2.
  3. Attributes measured on larger numeric scales have proportionally smaller per-unit coefficients and tighter uncertainty bounds in these plots.
  4. With only 25 successful questionnaires, the uncertainty is considerably high. More samples are required to reduce it.

Important notice: The analysis of the 25 successful questionnaires ends here. In the Power Analysis section, we will simulate the results and figure out a proper way to redesign the survey.

5. Power Analysis

In this section, we will attempt to redesign the survey by simulating survey results. Below is the step-by-step digest:

  1. We use the cbc_profiles() function to generate a profile for our 5 attributes.
  2. Then, we use cbc_design() to create a design for the survey. The team proposes two designs: randomized full-factorial and Bayesian D-efficient fractional factorial.
  3. Use cbc_choices() to simulate the random choices. This is a critical point: we use only random answers for the surveys, without collecting real data from human users.
  4. Use cbc_power() to perform power analysis.
  5. Generate plots of the power analysis, and fine tune the cbc_design().

The purpose of the power analysis is to determine:

  1. The most feasible survey design, including the number of questions, respondents, and alternatives.
  2. The point at which diminishing marginal returns set in, i.e., where additional samples no longer justify the effort.

5.1 Randomized Full Factorial Design

To start, we define profiles with attributes and levels:

Code
profiles <- cbc_profiles(
    upfront      = c(100, 300, 500), # USD
    lv_2_charger = c('No', 'Yes'),
    discount     = c(10, 25, 50),    # percentage
    window       = c(0.5, 1, 2, 4),  # hrs
    guaranteed   = c(25, 50, 75)     # percentage
)

Then, we generate a randomized full-factorial design:

Code
design_rand <- cbc_design(
    profiles  = profiles,
    n_resp    = 500,  # Number of respondents
    n_alts    = 2,    # Number of alternatives per question
    n_q       = 8     # Number of questions per respondent
)

We use cbc_balance() to show counts of each level of each attribute, and cbc_overlap() to show overlap across attributes; the output is reproduced in the appendix (see 3.2 Randomized Design).

Simulate the random choices:

Code
choice_rand <- cbc_choices(
    design = design_rand,
    obsID = 'obsID'
)

Run power analysis:

Code
power_rand <- cbc_power(
    data    = choice_rand,
    pars    = c('upfront', 'lv_2_charger', 'discount',
                'window', 'guaranteed'),
    obsID   = 'obsID',
    outcome = 'choice',
    nbreaks = 10,
    n_q     = 8
)

Show head and tail:

Code
head(power_rand)
#>   sampleSize            coef           est           se
#> 1         50         upfront -0.0003702309 0.0004316450
#> 2         50 lv_2_chargerYes  0.3360325610 0.1425752162
#> 3         50        discount -0.0079059940 0.0045513375
#> 4         50          window  0.0540157358 0.0530679088
#> 5         50      guaranteed -0.0003908104 0.0034182330
#> 6        100         upfront -0.0002695202 0.0003050828
Code
tail(power_rand)
#>    sampleSize            coef           est           se
#> 45        450      guaranteed -7.215745e-04 0.0011389828
#> 46        500         upfront -7.837905e-05 0.0001341594
#> 47        500 lv_2_chargerYes  9.273578e-03 0.0451510715
#> 48        500        discount -4.697513e-04 0.0013650028
#> 49        500          window -5.003860e-03 0.0164844426
#> 50        500      guaranteed -5.463983e-04 0.0010773817

Show the plot:

Code
ggplot(power_rand) +
  geom_hline(yintercept = 0.05, color = 'red', linetype = 2) +
  geom_point(aes(x = sampleSize, y = se, color = coef)) +
  expand_limits(y = 0) +
  scale_y_continuous(limits = c(0, 0.125)) +
  theme_bw(base_family = 'Ubuntu') + 
  labs(
    x = 'Sample size', 
    y = 'Standard error', 
    color = 'Coefficient',
    title = 'Randomized Design Plot'
  )

5.2 Bayesian D-efficient Fractional Factorial Design

To start, we define profiles with attributes and levels (same as the randomized design):

Code
profiles <- cbc_profiles(
    upfront      = c(100, 300, 500), # USD
    lv_2_charger = c('No', 'Yes'),
    discount     = c(10, 25, 50),    # percentage
    window       = c(0.5, 1, 2, 4),  # hrs
    guaranteed   = c(25, 50, 75)     # percentage
)

Then, we generate a Bayesian D-efficient fractional factorial design and save it as RData:

Code
design_db_eff <- cbc_design(
    profiles  = profiles,
    n_resp    = 500, # Number of respondents
    n_alts    = 2,   # Number of alternatives per question
    n_q       = 8,   # Number of questions per respondent
    n_start   = 5,   # Number of random starts for D-efficient design
    n_blocks = 3,    # More blocks increases overall efficiency
    priors = list(
        upfront          = 0.02,
        lv_2_charger     = 1,
        discount         = 0.05,
        window           = 0.3,
        guaranteed       = 0.08
    ),
    method = 'Modfed' # The Modified Federov Algorithm
)

save(design_db_eff,
     file = here('pilot', 'data', 'design_db_eff_pilot.RData'))

The Modfed algorithm is computationally expensive. The chunk above is set to #| eval: false to avoid repeated processing; instead, we read the Bayesian D-efficient design from the RData file:

Code
load(here('pilot', 'data', 'design_db_eff_pilot.RData'))

We use cbc_balance() to show counts of each level of each attribute, and cbc_overlap() to show overlap across attributes; the output is reproduced in the appendix (see 3.3 Bayesian D-efficient Design).

Simulate the random choices:

Code
choice_db_eff <- cbc_choices(
    design = design_db_eff,
    obsID = 'obsID'
)

Run power analysis:

Code
power_db_eff <- cbc_power(
    data    = choice_db_eff,
    pars    = c('upfront', 'lv_2_charger', 'discount',
                'window', 'guaranteed'),
    obsID   = 'obsID',
    outcome = 'choice',
    nbreaks = 10,
    n_q     = 8
)

Show head and tail:

Code
head(power_db_eff)
#>   sampleSize            coef           est           se
#> 1         50         upfront -0.0004136294 0.0009252123
#> 2         50 lv_2_chargerYes -0.0092983447 0.3343511117
#> 3         50        discount -0.0020561084 0.0047309824
#> 4         50          window -0.1008509864 0.0894250497
#> 5         50      guaranteed  0.0086475139 0.0095303544
#> 6        100         upfront -0.0001340599 0.0006524451
Code
tail(power_db_eff)
#>    sampleSize            coef           est           se
#> 45        450      guaranteed -3.903126e-04 0.0012911649
#> 46        500         upfront -6.946008e-05 0.0001759873
#> 47        500 lv_2_chargerYes  2.817442e-03 0.0430160003
#> 48        500        discount  3.026918e-04 0.0014049661
#> 49        500          window -1.562871e-02 0.0130511790
#> 50        500      guaranteed -1.034237e-03 0.0011635472

Show the plot:

Code
ggplot(power_db_eff) +
  geom_hline(yintercept = 0.05, color = 'red', linetype = 2) +
  geom_point(aes(x = sampleSize, y = se, color = coef)) +
  expand_limits(y = 0) +
  scale_y_continuous(limits = c(0, 0.125)) +
  theme_bw(base_family = 'Ubuntu') + 
  labs(
    x = 'Sample size', 
    y = 'Standard error', 
    color = 'Coefficient',
    title = 'Bayesian D-efficient Design Plot'
  )
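
Since the two designs are meant to be compared, a side-by-side view helps; below is a sketch (assuming the power_rand and power_db_eff data frames above) that plots both power curves in one figure:

Code
# Combine the two power analyses and facet by design
bind_rows(
    mutate(power_rand, design = 'Randomized'),
    mutate(power_db_eff, design = 'Bayesian D-efficient')
) %>%
    ggplot(aes(x = sampleSize, y = se, color = coef)) +
    geom_hline(yintercept = 0.05, color = 'red', linetype = 2) +
    geom_point() +
    facet_wrap(~design) +
    theme_bw(base_family = 'Ubuntu') +
    labs(x = 'Sample size', y = 'Standard error', color = 'Coefficient')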

5.3 Interpretation of Power Analysis

Since both analyses contain a certain level of randomization, the output results and plots are not unique. The standard errors of each attribute gradually decrease and level off at a sample size of about 400, which indicates the expected survey sample size.
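
To make the “about 400” judgment concrete, the sketch below (assuming the power_rand data frame above) finds the smallest simulated sample size at which every coefficient’s standard error falls below the 0.05 threshold marked in the plots:

Code
power_rand %>%
    group_by(sampleSize) %>%
    summarise(max_se = max(se)) %>%  # worst-case SE across coefficients
    filter(max_se < 0.05) %>%        # below the 0.05 threshold line
    slice_min(sampleSize, n = 1)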

Here are the modifications that the team will apply to the final design:

  1. Increase the number of questions per person to 8.
  2. Remove the “Not Interested” option.
  3. Set the sample size to about 400, pending further discussion and judgment.
  4. Both cbc_balance() and cbc_overlap() show detailed distributions of attributes and options; these will be investigated further.

The above covers only the conjoint question design. For a more comprehensive improvement plan, please see the Changes to Final Survey section below.

Changes to Final Survey

Data and feedback from the pilot survey have played a pivotal role in refining our survey and analysis. In this section, we’ll highlight the key adjustments made to our final survey based on insights and observations gathered during the pilot phase. These adjustments not only help us enhance the survey’s comprehensiveness and precision but also guide us in designing and planning our analysis.

The pilot analysis provided us with profound insights into how users engage with our conjoint questions. The choice data obtained from our pilot survey is instrumental in enhancing and fine-tuning our final survey. This ensures that the data collected in the final survey will hold greater significance and precision for our analysis.

1. Current Version

Our survey is structured into three distinct sections:

Section 1: This section seeks to determine whether the participant is a PEV owner or a prospective PEV buyer. It collects essential data about the vehicle’s range, the preferred charging window, and the charging location.

Section 2: Here, we provide a comprehensive explanation of the attributes crucial for making informed choices. Additionally, we’ve incorporated a set of five conjoint questions, refined based on insights from our pilot survey.

Section 3: This section contains demographic questions that offer valuable insights into the participant’s demographic profile.

2. Improvements

Since this survey is no longer a pure “pilot” version, there are improvements that the team has already made and improvements it intends to make. This section covers both.

Below are the improvements that we have already made:

  1. For section 1: We’ve relocated the question about the maximum range of the PEV from section 2 to section 1.
  2. For section 1: An educational video has been included to explain smart charging.
  3. For section 2: A helpful link to a page explaining the attributes has been included for all questions, allowing users to clarify any doubts they may have about the survey’s attributes.
  4. For section 3: We’ve enhanced the robustness of demographic questions to collect more detailed information about the participants.

Below are the improvements that we intend to make:

  1. (Important) For Section 2: We plan to improve the conjoint survey design. Firstly, we will increase the number of conjoint questions from 5 to 8; secondly, we are considering removing the “Not Interested” response option. After that, we will perform further power analysis to determine a reasonable distribution of overlap and balance in the revised design.
  2. For Section 2: We are considering simplifying the contents of the smart charging program attribute table and adding images to enhance understanding.
  3. For Section 3: Instead of asking which party they voted for, ask the more general political question of which party they intend to vote for or support.
  4. For Section 3: Change the state to ZIP code.
  5. For Section 3: Put all travel-related questions on a single page, with the demographic questions on a separate page.

Attribution

This pilot analysis report was completed by all members of the PEV Team.

Appendix

1. Conjoint Survey Documentation

1.1 Section 2 - Smart Charging Programs

1.2 Start of Section 2

Now that we know the basics about your EV ownership and charging habits, let’s move on to Smart Charging program selections!

A Smart Charging program consists of 5 attributes: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hrs.

Here is a brief explanation of each attribute:

Attribute Range Explanation
Upfront Incentive: $100, $300, $500 The amount of money you will be instantly paid if you join the smart charging program. The unit is USD.
Free Level 2 Charger: No or Yes A "Yes" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger.
Electricity Price Discount: 10%, 25%, 50% If you choose to join this smart charging program, this will be your electricity price discount.
Override Window: 0.5 ~ 4 hrs The longest time window per day during which you can override smart charging and force regular charging; during this window your EV is not controlled by the smart charging program.
Guaranteed Range if charged for 8 hrs: 25%, 50%, 75% The guaranteed range if your car is charged for 8 hrs during smart charging. It is shown as a percentage here but will be converted to miles.

The above table can be accessed via this link. It is available at the top of each question so that you can refer to it at any time.

The next page provides a sample question to enhance your understanding.

1.3 Section 2 - Sample Question

You are provided with 2 options of smart charging programs, and an “I am not interested” option. You need to treat them as if they were your only options and choose the one that you most prefer. You can choose the “I am not interested” option if you are not interested in either.

You can access the attribute table using this link.

Options: Option 1 Option 2
Upfront Incentive: $100 $500
Free Level 2 Charger: No Yes
Electricity Price Discount: 25% 50%
Override Window: 0.5 hrs 4 hrs
Guaranteed Range if charged for 8 hrs: 100 miles 300 miles
  • Option 1
  • Option 2
  • I am not interested

1.4 Section 2 - Ready to Start the Questions

The sample question is quite straightforward. Since Option 2 is better than Option 1 on every attribute, I bet you have chosen it.

However, this simplicity is not always the case. There will be trade-offs between the 2 options, and you need to make your own decision. For example, one option might have a large upfront incentive but a low discount, while the other is the opposite. You then need to decide based on your own preference.

There are 5 choice questions in total. For each question, you have 2 valid options to choose from. Again, you make your choice as if they were your ONLY options. If you are really not interested in either of them, you can choose “I am not interested”.

Click on “Next Page” to proceed.

1.5 Question 1 of 5

(1 of 5) If your utility offers you these 2 smart charging programs, which one do you prefer?

You can access the attribute table using this link.

Options: Option 1 Option 2
Upfront Incentive: $100 $100
Free Level 2 Charger: No No
Electricity Price Discount: 25% 10%
Override Window: 1 hr 4 hrs
Guaranteed Range if charged for 8 hrs: 25% 25%
  • Option 1
  • Option 2
  • I am not interested

The other 4 questions follow the same format and are thus omitted.

2. Data Dictionary

Dictionary of Files:

Code
dict_file <- data.frame(
    name = c('cbc_questions',
             'choice_data',
             'pevConjoint1',
             'pevConjoint2',
             'pevConjoint3'),
    type = c('CSV',
             'CSV',
             'CSV',
             'CSV',
             'CSV'),
    description = c('Conjoint questions for the pilot survey.',
                    'Cleaned results from the pilot survey.',
                    'Part 1 of pilot survey result from formr.',
                    'Part 2 of pilot survey result from formr.',
                    'Part 3 of pilot survey result from formr.')
) %>% 
    mutate(count = row_number()) %>%
    select(count, everything())
dict_file %>% 
    kable(format = 'html',
          escape = FALSE,
          align = c('c', 'l', 'c', 'l'),
          col.names = c('No.', 'Name', 'Type', 'Description'),
          caption = 'Dictionary of Files') %>%
    kable_styling(bootstrap_options = c('striped', 'hover'))
Dictionary of Files
No. Name Type Description
1 cbc_questions CSV Conjoint questions for the pilot survey.
2 choice_data CSV Cleaned results from the pilot survey.
3 pevConjoint1 CSV Part 1 of pilot survey result from formr.
4 pevConjoint2 CSV Part 2 of pilot survey result from formr.
5 pevConjoint3 CSV Part 3 of pilot survey result from formr.

Dictionary of Data Frames:

Code
dict_df <- data.frame(
    name = c('choice_data',
             'choice_data_charging',
             'choice_data_demo',
             'choice_data_demo_2',
             'choice_data_vehicle',
             'choice_data_wider',
             'choice_db_eff',
             'choice_rand',
             'coeff_figure_1',
             'coeff_figure_2',
             'coeff_table',
             'design_db_eff',
             'design_pilot',
             'design_rand',
             'model',
             'p1',
             'p2',
             'p3',
             'power_db_eff',
             'power_rand',
             'profiles',
             'survey'),
    type = c('A tibble: 375 × 41',
             'A tibble: 70 × 36',
             'A tibble: 100 × 34',
             'A tibble: 50 × 36',
             'A tibble: 125 × 33',
             'A tibble: 25 × 36',
             'A tibble: 8000 × 12',
             'A tibble: 8000 × 11',
             'A data frame: 3 × 5',
             'A data frame: 3 × 5',
             'A data frame: 6 × 5',
             'A tibble: 8000 × 11',
             'A tibble: 7500 × 12',
             'A tibble: 8000 × 10',
             'A list: 32',
             'A tibble: 29 × 11',
             'A tibble: 28 × 11',
             'A tibble: 28 × 11',
             'A data frame: 50 × 4',
             'A data frame: 50 × 4',
             'A data frame: 216 × 6',
             'A tibble: 1,500 × 12'),
    description = c(
        'Conjoint questions for the pilot survey.',
        'Longer version of choice data about charging.',
        'Longer version of choice data about main demography.',
        'Longer version of choice data about education and income.',
        'Longer version of choice data about vehicle.',
        'Original wider version of choice data.',
        'Simulated random choices for Bayesian D-efficient design.',
        'Simulated random choices for randomized design.',
        'Coefficients used for model summary figure 1.',
        'Coefficients used for model summary figure 2.',
        'Coefficients used for model summary table.',
        'Survey design in Bayesian D-efficient fractional factorial.',
        'Survey design for pilot survey.',
        'Survey design in Randomized factorial.',
        'Utility model generated by logitr.',
        'Part 1 of pilot survey result from formr.',
        'Part 2 of pilot survey result from formr.',
        'Part 3 of pilot survey result from formr.',
        'Power analysis for Bayesian D-efficient design.',
        'Power analysis for randomized design.',
        'Profiles with attributes and levels (same for all designs).',
        'Conjoint questions for the pilot survey (online version).'
        )
) %>% 
    mutate(count = row_number()) %>%
    select(count, everything())
dict_df %>% 
    kable(format = 'html',
          escape = FALSE,
          align = c('c', 'l', 'l', 'l'),
          col.names = c('No.', 'Name', 'Type', 'Description'),
          caption = 'Dictionary of Data Frames') %>%
    kable_styling(bootstrap_options = c('striped', 'hover'))
Dictionary of Data Frames
No. Name Type Description
1 choice_data A tibble: 375 × 41 Conjoint questions for the pilot survey.
2 choice_data_charging A tibble: 70 × 36 Longer version of choice data about charging.
3 choice_data_demo A tibble: 100 × 34 Longer version of choice data about main demography.
4 choice_data_demo_2 A tibble: 50 × 36 Longer version of choice data about education and income.
5 choice_data_vehicle A tibble: 125 × 33 Longer version of choice data about vehicle.
6 choice_data_wider A tibble: 25 × 36 Original wider version of choice data.
7 choice_db_eff A tibble: 8000 × 12 Simulated random choices for Bayesian D-efficient design.
8 choice_rand A tibble: 8000 × 11 Simulated random choices for randomized design.
9 coeff_figure_1 A data frame: 3 × 5 Coefficients used for model summary figure 1.
10 coeff_figure_2 A data frame: 3 × 5 Coefficients used for model summary figure 2.
11 coeff_table A data frame: 6 × 5 Coefficients used for model summary table.
12 design_db_eff A tibble: 8000 × 11 Survey design in Bayesian D-efficient fractional factorial.
13 design_pilot A tibble: 7500 × 12 Survey design for pilot survey.
14 design_rand A tibble: 8000 × 10 Survey design in Randomized factorial.
15 model A list: 32 Utility model generated by logitr.
16 p1 A tibble: 29 × 11 Part 1 of pilot survey result from formr.
17 p2 A tibble: 28 × 11 Part 2 of pilot survey result from formr.
18 p3 A tibble: 28 × 11 Part 3 of pilot survey result from formr.
19 power_db_eff A data frame: 50 × 4 Power analysis for Bayesian D-efficient design.
20 power_rand A data frame: 50 × 4 Power analysis for randomized design.
21 profiles A data frame: 216 × 6 Profiles with attributes and levels (same for all designs).
22 survey A tibble: 1,500 × 12 Conjoint questions for the pilot survey (online version).

3. Miscellaneous

3.1 Summary of Model

Code
summary(model)
#> =================================================
#> 
#> Model estimated on: Sun Oct 06 23:53:11 2024 
#> 
#> Using logitr version: 1.1.2 
#> 
#> Call:
#> logitr(data = choice_data, outcome = "choice", obsID = "obs_id", 
#>     pars = c("upfront", "lv_2_charger_yes", "lv_2_charger_no", 
#>         "discount", "window", "guaranteed"))
#> 
#> Frequencies of alternatives:
#>     1     2     3 
#> 0.464 0.432 0.104 
#> 
#> Exit Status: 3, Optimization stopped because ftol_rel or ftol_abs was reached.
#>                                 
#> Model Type:    Multinomial Logit
#> Model Space:          Preference
#> Model Run:                1 of 1
#> Iterations:                   18
#> Elapsed Time:        0h:0m:0.01s
#> Algorithm:        NLOPT_LD_LBFGS
#> Weights Used?:             FALSE
#> Robust?                    FALSE
#> 
#> Model Coefficients: 
#>                    Estimate Std. Error z-value  Pr(>|z|)    
#> upfront           0.0163208  0.0027778  5.8754 4.218e-09 ***
#> lv_2_charger_yes -7.5792292  1.5417186 -4.9161 8.829e-07 ***
#> lv_2_charger_no  -9.8322095  1.8565463 -5.2960 1.184e-07 ***
#> discount          0.0451776  0.0127580  3.5411 0.0003984 ***
#> window            0.2911476  0.1642019  1.7731 0.0762109 .  
#> guaranteed        0.0827197  0.0170261  4.8584 1.183e-06 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>                                     
#> Log-Likelihood:          -56.7669262
#> Null Log-Likelihood:    -137.3265361
#> AIC:                     125.5338524
#> BIC:                     142.5037000
#> McFadden R2:               0.5866281
#> Adj McFadden R2:           0.5429367
#> Number of Observations:  125.0000000

3.2 Randomized Design

Counts of each level of each attribute:

Code
cbc_balance(design_rand)
#> =====================================
#> Individual attribute level counts
#> 
#> upfront:
#> 
#>  100  300  500 
#> 2781 2576 2643 
#> 
#> lv_2_charger:
#> 
#>   No  Yes 
#> 4013 3987 
#> 
#> discount:
#> 
#>   10   25   50 
#> 2676 2640 2684 
#> 
#> window:
#> 
#>  0.5    1    2    4 
#> 1963 2023 1965 2049 
#> 
#> guaranteed:
#> 
#>   25   50   75 
#> 2681 2610 2709 
#> 
#> =====================================
#> Pairwise attribute level counts
#> 
#> upfront x lv_2_charger:
#> 
#>            No  Yes
#>       NA 4013 3987
#> 100 2781 1408 1373
#> 300 2576 1288 1288
#> 500 2643 1317 1326
#> 
#> upfront x discount:
#> 
#>            10   25   50
#>       NA 2676 2640 2684
#> 100 2781  902  923  956
#> 300 2576  853  849  874
#> 500 2643  921  868  854
#> 
#> upfront x window:
#> 
#>           0.5    1    2    4
#>       NA 1963 2023 1965 2049
#> 100 2781  647  705  712  717
#> 300 2576  649  654  607  666
#> 500 2643  667  664  646  666
#> 
#> upfront x guaranteed:
#> 
#>            25   50   75
#>       NA 2681 2610 2709
#> 100 2781  879  936  966
#> 300 2576  865  841  870
#> 500 2643  937  833  873
#> 
#> lv_2_charger x discount:
#> 
#>            10   25   50
#>       NA 2676 2640 2684
#> No  4013 1353 1326 1334
#> Yes 3987 1323 1314 1350
#> 
#> lv_2_charger x window:
#> 
#>           0.5    1    2    4
#>       NA 1963 2023 1965 2049
#> No  4013  986  998  985 1044
#> Yes 3987  977 1025  980 1005
#> 
#> lv_2_charger x guaranteed:
#> 
#>            25   50   75
#>       NA 2681 2610 2709
#> No  4013 1338 1273 1402
#> Yes 3987 1343 1337 1307
#> 
#> discount x window:
#> 
#>          0.5    1    2    4
#>      NA 1963 2023 1965 2049
#> 10 2676  636  672  685  683
#> 25 2640  671  700  633  636
#> 50 2684  656  651  647  730
#> 
#> discount x guaranteed:
#> 
#>           25   50   75
#>      NA 2681 2610 2709
#> 10 2676  924  854  898
#> 25 2640  878  888  874
#> 50 2684  879  868  937
#> 
#> window x guaranteed:
#> 
#>            25   50   75
#>       NA 2681 2610 2709
#> 0.5 1963  661  646  656
#> 1   2023  697  658  668
#> 2   1965  661  635  669
#> 4   2049  662  671  716

Overlap across attributes:

Code
cbc_overlap(design_rand)
#> ==============================
#> Counts of attribute overlap:
#> (# of questions with N unique levels)
#> 
#> upfront:
#> 
#>    1    2 
#> 1304 2696 
#> 
#> lv_2_charger:
#> 
#>    1    2 
#> 2033 1967 
#> 
#> discount:
#> 
#>    1    2 
#> 1337 2663 
#> 
#> window:
#> 
#>    1    2 
#> 1002 2998 
#> 
#> guaranteed:
#> 
#>    1    2 
#> 1270 2730

3.3 Bayesian D-efficient Design

Counts of each level of each attribute:

Code
cbc_balance(design_db_eff)
#> =====================================
#> Individual attribute level counts
#> 
#> blockID:
#> 
#>    1    2    3 
#> 2672 2672 2656 
#> 
#> upfront:
#> 
#>  100  300  500 
#> 2999 2503 2498 
#> 
#> lv_2_charger:
#> 
#>   No  Yes 
#> 4500 3500 
#> 
#> discount:
#> 
#>   10   25   50 
#> 1832 2333 3835 
#> 
#> window:
#> 
#>  0.5    1    2    4 
#> 3000  499 1335 3166 
#> 
#> guaranteed:
#> 
#>   25   50   75 
#> 4002 1499 2499 
#> 
#> =====================================
#> Pairwise attribute level counts
#> 
#> blockID x upfront:
#> 
#>         100  300  500
#>     NA 2999 2503 2498
#> 1 2672 1169 1336  167
#> 2 2672  668  835 1169
#> 3 2656 1162  332 1162
#> 
#> blockID x lv_2_charger:
#> 
#>          No  Yes
#>     NA 4500 3500
#> 1 2672 1336 1336
#> 2 2672 1670 1002
#> 3 2656 1494 1162
#> 
#> blockID x discount:
#> 
#>          10   25   50
#>     NA 1832 2333 3835
#> 1 2672  668  835 1169
#> 2 2672  334  668 1670
#> 3 2656  830  830  996
#> 
#> blockID x window:
#> 
#>         0.5   1    2    4
#>     NA 3000 499 1335 3166
#> 1 2672  835 167 1002  668
#> 2 2672 1169   0  167 1336
#> 3 2656  996 332  166 1162
#> 
#> blockID x guaranteed:
#> 
#>          25   50   75
#>     NA 4002 1499 2499
#> 1 2672 1169  334 1169
#> 2 2672 1837  501  334
#> 3 2656  996  664  996
#> 
#> upfront x lv_2_charger:
#> 
#>            No  Yes
#>       NA 4500 3500
#> 100 2999 1667 1332
#> 300 2503 1335 1168
#> 500 2498 1498 1000
#> 
#> upfront x discount:
#> 
#>            10   25   50
#>       NA 1832 2333 3835
#> 100 2999  832  834 1333
#> 300 2503  668  500 1335
#> 500 2498  332  999 1167
#> 
#> upfront x window:
#> 
#>           0.5   1    2    4
#>       NA 3000 499 1335 3166
#> 100 2999 1167 166  167 1499
#> 300 2503  668 167  835  833
#> 500 2498 1165 166  333  834
#> 
#> upfront x guaranteed:
#> 
#>            25   50   75
#>       NA 4002 1499 2499
#> 100 2999  999  499 1501
#> 300 2503 1836  334  333
#> 500 2498 1167  666  665
#> 
#> lv_2_charger x discount:
#> 
#>            10   25   50
#>       NA 1832 2333 3835
#> No  4500  834 1333 2333
#> Yes 3500  998 1000 1502
#> 
#> lv_2_charger x window:
#> 
#>           0.5   1    2    4
#>       NA 3000 499 1335 3166
#> No  4500 1666 166  834 1834
#> Yes 3500 1334 333  501 1332
#> 
#> lv_2_charger x guaranteed:
#> 
#>            25   50   75
#>       NA 4002 1499 2499
#> No  4500 2335  667 1498
#> Yes 3500 1667  832 1001
#> 
#> discount x window:
#> 
#>          0.5   1    2    4
#>      NA 3000 499 1335 3166
#> 10 1832  500 333  334  665
#> 25 2333  999 166  167 1001
#> 50 3835 1501   0  834 1500
#> 
#> discount x guaranteed:
#> 
#>           25   50   75
#>      NA 4002 1499 2499
#> 10 1832 1001  332  499
#> 25 2333  834  499 1000
#> 50 3835 2167  668 1000
#> 
#> window x guaranteed:
#> 
#>            25   50   75
#>       NA 4002 1499 2499
#> 0.5 3000 1334  499 1167
#> 1    499  167  166  166
#> 2   1335 1169    0  166
#> 4   3166 1332  834 1000

Overlap across attributes:

Code
cbc_overlap(design_db_eff)
#> ==============================
#> Counts of attribute overlap:
#> (# of questions with N unique levels)
#> 
#> blockID:
#> 
#>    1 
#> 4000 
#> 
#> upfront:
#> 
#>    1    2 
#> 2503 1497 
#> 
#> lv_2_charger:
#> 
#>    1    2 
#> 1502 2498 
#> 
#> discount:
#> 
#>    1    2 
#> 1667 2333 
#> 
#> window:
#> 
#>    1    2 
#>  835 3165 
#> 
#> guaranteed:
#> 
#>    1    2 
#> 1669 2331