Quantifying the Benefits and Constraints of Plug-In Electric Vehicle Smart Charging Adoption
Authors
Pingfan Hu
Bharath Ravindra
Sampada Dhakal
Vedanth Surendra Hegde
Published
November 5, 2023
Abstract
The product being studied in the project proposal is “Smart Charging Solutions for Plug-in Electric Vehicles (PEVs).” The initiative aims to investigate the reception and potential adoption of these technologies among users who own or might consider owning PEVs. The focus is on understanding consumer attitudes, preferences, and potential barriers regarding the adoption of smart charging solutions for electric vehicles. The project intends to delve into aspects such as awareness, perceived benefits, and potential hindrances associated with smart charging technologies for PEVs, exploring attributes like compensation, electricity pricing, charging speed, and more.
The investigation of consumer inclinations within the project centered on a strategic analysis of distinct product attributes and their corresponding design decisions. The key attributes were Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and a Guaranteed Range if smart charged for 8 hours. The design decisions for these attributes were structured with deliberate variations: the Upfront Incentive ranged across $100, $300, and $500; the Free Level 2 Charger was offered as 'Yes' or 'No'; the Electricity Price Discount was presented at three rates (10%, 25%, and 50%); the Override Window ranged from 0.5 to 4 hours; and the Guaranteed Range if smart charged for 8 hours offered 25%, 50%, or 75% of the vehicle's maximum range. These design decisions were aimed at assessing consumer responsiveness to diverse incentive structures, charging infrastructure availability, pricing models, and the potential impact of guaranteed range, thereby offering comprehensive insight into the dynamics of smart charging adoption among consumers.
Introduction
The global drive for sustainable energy has propelled Plug-in Electric Vehicles (PEVs) to the forefront in the fight against climate change. Smart charging technologies, integrating advanced grid systems and digital connectivity, offer a transformative opportunity to enhance PEVs’ role in a sustainable future by optimizing charging processes, improving grid stability, and reducing costs.
This project proposal aims to gauge user acceptance of smart charging solutions for PEVs. It delves into factors influencing consumer choices, uncovering awareness levels, benefits, and barriers associated with smart charging among PEV owners. Using surveys, interviews, and data analysis, the project aims to reveal user attitudes, preferences, and concerns about smart charging.
The comprehensive approach considers attributes such as compensation, electricity price, charging speed, override window, and guaranteed range percentage, exploring user attitudes toward smart charging. It also evaluates the impact of compensation on adoption rates, offering valuable insights for stakeholders in the electric vehicle industry.
In an era focused on environmental awareness and sustainable transportation, this project accelerates the shift toward cleaner and more efficient mobility solutions. It aligns with global energy sustainability goals, emphasizing the need to reduce carbon emissions and create a smarter, more connected, and environmentally responsible landscape for electric vehicle charging.
Survey Design
The survey is divided into three sections. Section 1 covers PEV Ownership and Charging Habits Questions, Section 2 addresses Smart Charging Problems, and Section 3 gathers Demographic Information.
Before starting the actual survey, participants will see a consent page and a 30-second educational video to introduce them to Smart Charging. The video explains smart charging and how it allows the utility to control the PEV charging process, including the time window and charging amount.
Respondents who neither own nor intend to own a PEV will be screened out at the end of Section 1, as they are unlikely to provide much useful information. This screening is not implemented in the pilot version but will be included in the final version.
Section 1 begins by asking participants about their interest in PEVs and whether they own, lease, plan to have one in the future, or do not have one. It also inquires about their preferred PEV model, preferred mileage, and charging habits at home and work. Respondents who own a PEV, or intend to own one, are eligible for the full survey.
Section 2, focusing on Smart Charging Problems, consists of five questions per respondent. This section covers the five main attributes of a smart charging program: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hours. Upfront Incentive refers to the instant payment when joining the smart charging program, ranging from $100 to $500. Free Level 2 Charger is a faster charging option, with respondents choosing between "Yes" or "No." Electricity Price Discount offers a discount ranging from 10% to 50% for joining the smart charging program. Override Window refers to the longest daily period during which users can override to regular charging, with values ranging from 0.5 to 4 hours. Guaranteed Range if charged for 8 hours specifies the range achieved if the car is charged for 8 hours during the smart charging period; the value shown ranges from 25% to 75% of the maximum range in miles, computed from the user's input in Section 1.
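For concreteness, this percentage-to-mileage conversion can be sketched in base R. This is an illustrative helper only: the function name is hypothetical, and the actual survey performs the calculation inside formr.

```r
# Convert a guaranteed-range percentage level into miles, given the maximum
# range the respondent reported in Section 1. (Hypothetical helper; the
# survey performs this calculation inside formr.)
guaranteed_miles <- function(max_range_miles, pct_level) {
  stopifnot(pct_level %in% c(25, 50, 75))
  max_range_miles * pct_level / 100
}

# e.g., a respondent reporting a 400-mile maximum range
guaranteed_miles(400, 25)  # 100
guaranteed_miles(400, 75)  # 300
```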
This part of our survey involves a small twist: the respondent’s choice in Section 1 question determines the percentage attribute. In the guaranteed range section, we’ll calculate mileage based on the user’s input from Section 1. The guaranteed range in percentage ranges from 25% to 75% for the user’s input. Each question has three alternatives, with the third option being “I am not interested.” Since these attributes are randomly generated, some options may seem illogical or overly obvious. To ensure respondent understanding, we’ve included a sentence asking them to choose between two options if those were the only choices available. Additionally, a third option, “I am not interested,” is provided for those who do not show interest in the available options. Similarly, to ensure clarity, Section 2 starts with a description of the attributes, their values, and a sample question to guide respondents. Below is an example of our conjoint question.
Section 3 collects demographic information, including birth year, gender identity, race, educational background, household income, interest in the environment and climate change, political inclination, current state of residence, commute method, number of cars in their household, and the presence of backup cars, if any, along with their quantity.
Pilot Data Analysis
This section describes the analysis that the team carried out using our pilot data. It contains 4 subsections: sample description, data cleaning, modeling, and power analysis.
The team performed the data analysis in 2 steps:
Step 1: We generated a randomized full-factorial design of the CBC choice question set using cbc_design() and fed it into the survey. The survey results were collected, cleaned, and stored as a CSV file, then evaluated using the logitr package. This step aims to establish a reusable workflow for future surveys, data collection, and analysis; the survey results themselves are less important at this stage.
Step 2: We generated both a randomized full-factorial design and a Bayesian D-efficient fractional factorial design using cbc_design(), simulated a set of random choice results for each using cbc_choices(), and fed the results into cbc_power() for power analysis.
Step 1 generates the following results:
Summary statistics of vehicle info, charging preferences, demographic info, etc.
The utility model equation containing the 6 beta values.
Step 2 is to generate a better survey design, including the number of respondents, the number of questions per person, the number of alternatives, and, most importantly, how the different attribute levels are distributed across the survey questions.
The team will continue to iterate on Step 2 to produce a feasible survey design, and then re-perform Step 1 to analyse and visualize the survey data. In this analysis, however, the team has only made several initial attempts at power analysis; this work will continue until the most feasible survey design is reached.
The analysis breakdown is described in the following 4 subsections, and the data dictionary is shown in the appendix (see 2. Data Dictionary).
1. Sample Description
As described previously, we generated a randomized full-factorial design of the CBC choice question set using cbc_design() and fed it into the survey. This initial design is generated using the following code:

```r
# Define profiles with attributes and levels
profiles <- cbc_profiles(
  upfront      = c(100, 300, 500),  # USD
  lv_2_charger = c('No', 'Yes'),
  discount     = c(10, 25, 50),     # percentage
  window       = c(0.5, 1, 2, 4),   # hrs
  guaranteed   = c(25, 50, 75)      # percentage
)

# Make a basic survey using the full factorial of all profiles
design_pilot <- cbc_design(
  profiles  = profiles,
  n_resp    = 500,  # Number of respondents
  n_alts    = 2,    # Number of alternatives per question
  n_q       = 5,    # Number of questions per respondent
  no_choice = TRUE
)

# Save design
write_csv(design_pilot, here('pilot', 'data', 'cbc_questions_pilot.csv'))
```
In the code above, the cbc_profiles() function defines the attributes, matching the attribute table described in our pilot survey.

The cbc_design() function defines the conjoint questions: 500 respondents, 2 alternatives per question, and 5 questions per respondent, which yields 2,500 question responses. A "no choice" option is also included. These values were chosen arbitrarily for the pilot.
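As a quick sanity check on these dimensions (simple arithmetic, not part of the original pipeline): the design yields n_resp × n_q question observations, and with the no-choice row each question contributes n_alts + 1 rows.

```r
n_resp <- 500  # respondents
n_q    <- 5    # questions per respondent
n_alts <- 2    # alternatives per question (a no-choice row is added to each)

n_choice_obs  <- n_resp * n_q                 # question responses
n_design_rows <- n_resp * n_q * (n_alts + 1)  # rows in design_pilot

n_choice_obs   # 2500
n_design_rows  # 7500
```

The 7,500 rows are consistent with the dimensions of design_pilot reported in the data dictionary.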
The summary statistics of our sample survey results will be shown after the Data Cleaning process.
2. Data Cleaning
In practice, the team did not have 500 participants: 30 sample surveys were collected in total, of which 25 were successful responses, yielding 125 question responses. The data cleaning process is shown in the code chunks below.
First, we download the survey data from formr and import it as data frames p1, p2, and p3:

```r
p1 <- p1 %>%
  mutate(
    created = ymd_hms(created, tz = 'EST'),
    ended = ymd_hms(ended, tz = 'EST'),
    time_sec_p1 = as.numeric(ended - created, units = 'secs')
  ) %>%
  # Select important columns
  select(session, time_sec_p1, starts_with('mc'))

p2 <- p2 %>%
  mutate(
    created = ymd_hms(created),
    ended = ymd_hms(ended),
    time_sec_p2 = as.numeric(ended - created, units = 'secs')
  ) %>%
  # Select important columns
  select(session, time_sec_p2, respondentID, cbc_all_same, starts_with('mc'))

p3 <- p3 %>%
  mutate(
    created = ymd_hms(created),
    ended = ymd_hms(ended),
    time_sec_p3 = as.numeric(ended - created, units = 'secs')
  ) %>%
  # Select important columns
  select(session, time_sec_p3, completion_code, starts_with('mc'))

# Join all parts together using the session variable
choice_data_wider <- p1 %>%
  left_join(p2, by = 'session') %>%
  left_join(p3, by = 'session') %>%
  # No longer need session variable
  select(-session)
```
Here the 3 parts of the survey data is stored as choice_data_wider. It is a wider format, and will later be pivoted into different longer formats according to our purposes.
Then, we drop the unnecessary parts:
```r
# Drop anyone who didn't complete all choice questions
choice_data_wider <- choice_data_wider %>%
  filter(!is.na(mc_cbc_1)) %>%
  filter(!is.na(mc_cbc_2)) %>%
  filter(!is.na(mc_cbc_3)) %>%
  filter(!is.na(mc_cbc_4)) %>%
  filter(!is.na(mc_cbc_5))

# Drop respondents who went too fast
choice_data_wider <- choice_data_wider %>%
  mutate(
    # First replace NA values with 0 seconds
    time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
    time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
    time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
    # Now compute the total time
    time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
  )

# Drop anyone who finished in under the 10th percentile of completion times
time_10 <- quantile(choice_data_wider$time_min_total, 0.1)
choice_data_wider <- choice_data_wider %>%
  filter(time_min_total >= time_10)

# Drop respondents that got the attention check question wrong
choice_data_wider <- choice_data_wider %>%
  filter(mc_cbc_practice == 'option_2')
```
The team followed Prof. Helveston's dropping criteria, removing respondents who were:

Incomplete (did not answer all choice questions)
Too fast (under the 10th percentile of completion time)
Wrong on the attention check question
After we have the wide-formatted data, it's time to generate the choice data. We convert the wide data into a long format according to the 5 conjoint questions, wrangle it into the desired shape, and save it as a CSV file:

```r
# First convert the data to long format
choice_data <- choice_data_wider %>%
  pivot_longer(
    cols = mc_cbc_1:mc_cbc_5,
    names_to = 'qID',
    values_to = 'choice'
  ) %>%
  # Convert the qID variable to a number
  mutate(qID = parse_number(qID))

# Read in choice questions and join them to choice_data
survey <- read_csv('https://raw.githubusercontent.com/pingfan-hu/My-Resources/main/data/cbc_test.csv')
choice_data <- choice_data %>%
  rename(respID = respondentID) %>%
  left_join(survey, by = c('respID', 'qID'), relationship = 'many-to-many')

# Convert choice column to 1 or 0 based on whether the alternative was chosen
choice_data <- choice_data %>%
  mutate(
    choice = case_when(
      str_detect(choice, '^option_') ~ as.numeric(str_replace(choice, 'option_', '')),
      choice == 'not_interested' ~ 3,
      TRUE ~ as.numeric(choice)
    ),
    choice = ifelse(choice == altID, 1, 0)
  ) %>%
  select(-mc_cbc_practice, -cbc_all_same)

# Create new values for respID & obsID
nRespondents <- nrow(choice_data_wider)
nAlts <- max(survey$altID)
nQuestions <- max(survey$qID)
choice_data$respID <- rep(seq(nRespondents), each = nAlts * nQuestions)
choice_data$obsID <- rep(seq(nRespondents * nQuestions), each = nAlts)

# Reorder columns - it's nice to have the 'ID' variables first
choice_data <- choice_data %>%
  select(
    ends_with('ID'), 'choice', 'upfront',
    'lv_2_charger_Yes', 'lv_2_charger_No',
    'discount', 'window', 'guaranteed',
    'no_choice', everything()
  )

# Clean up names for choice_data
choice_data <- clean_names(choice_data)

# Save cleaned data for modeling
write_csv(choice_data, here('pilot', 'data', 'choice_data_pilot.csv'))
```
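The heart of the encoding step above, which turns a recorded answer (`option_1`, `option_2`, or `not_interested`) into a 0/1 choice indicator across the three alternative rows of a question, can be sketched in isolation with base R (illustrative values only; the real pipeline applies this logic across the joined survey data):

```r
# One conjoint question occupies 3 rows: altID 1, altID 2, and the
# "not interested" alternative (altID 3)
altID  <- c(1, 2, 3)
answer <- 'option_2'  # the answer string recorded for this question

# Map the answer string to the ID of the chosen alternative
chosen_alt <- if (grepl('^option_', answer)) {
  as.numeric(sub('option_', '', answer))
} else if (answer == 'not_interested') {
  3
} else {
  NA_real_
}

# 1 on the chosen alternative's row, 0 elsewhere
choice <- ifelse(altID == chosen_alt, 1, 0)
choice  # 0 1 0
```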
3. Summary Statistics
These summary statistics were originally planned for the Sample Description subsection, but they can only be produced after data cleaning. They are shown in 3 parts:
Vehicle Info Summary
Charging Preference Summary
Demographic Info Summary
We utilize the choice_data_wider data frame for all 3 parts.
3.1 Vehicle Info Summary
For the vehicle info summary, we first pivot the data:
Since "Upfront", "Discount", and "Guaranteed" have large attribute-level values, their coefficients are plotted together in Part 1. Since "Charger", "No Charger", and "Window" have small attribute-level values, their coefficients are plotted together in Part 2. The larger an attribute's level values, the wider the spread of its coefficient distribution.
With only 25 successful questionnaires, the uncertainty is considerably high. More samples are required to reduce it.
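The scale of this uncertainty follows the usual square-root law: assuming standard errors shrink roughly as 1/√n (a reasonable approximation for logit coefficient estimates), a quick back-of-the-envelope calculation shows what larger samples buy relative to the pilot:

```r
# Standard error at sample size n, relative to the pilot's n = 25
# (assumes se is proportional to 1 / sqrt(n))
rel_se <- function(n, n_ref = 25) sqrt(n_ref / n)

rel_se(100)  # 0.50: 4x the data halves the standard errors
rel_se(400)  # 0.25: 16x the data cuts them to a quarter
```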
Important notice: The analysis of the 25 successful questionnaires ends here. In the Power Analysis section, we will simulate the results and figure out a proper way to redesign the survey.
5. Power Analysis
In this section, we will attempt to redesign the survey by simulating survey results. Below is the step-by-step digest:
We use the cbc_profiles() function to generate profiles for our 5 attributes.
Then, we use cbc_design() to create a design for the survey. The team proposes two designs: a randomized full factorial and a Bayesian D-efficient fractional factorial.
Use cbc_choices() to simulate random choices. This is a critical point: the power analysis uses only randomly generated answers, without collecting real data from human users.
Use cbc_power() to perform power analysis.
Generate plots of the power analysis results and fine-tune the cbc_design() settings.
The purpose of the power analysis is to figure out:
The most feasible survey design, including the number of questions, respondents, and alternatives.
Where diminishing returns set in: additional samples yield progressively smaller gains in precision, so at what point is further effort no longer worthwhile?
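One concrete way to judge that point: cbc_power() returns a data frame with columns including coef, sampleSize, and se (the same columns used in the plots later in this section). For each coefficient, find the smallest sample size whose standard error falls below a chosen precision target, then take the worst case across coefficients. A base-R sketch on mock results (the numbers below are invented purely for illustration):

```r
# Mock power-analysis output; real values come from cbc_power()
power <- data.frame(
  coef       = rep(c('upfront', 'discount'), each = 4),
  sampleSize = rep(c(100, 200, 300, 400), times = 2),
  se         = c(0.090, 0.060, 0.045, 0.040,  # upfront
                 0.120, 0.080, 0.055, 0.048)  # discount
)

target_se <- 0.05  # precision target (the dashed red line in the plots)

# Smallest sample size meeting the target, per coefficient
needed <- aggregate(sampleSize ~ coef,
                    data = subset(power, se < target_se),
                    FUN = min)

# The survey must satisfy the worst case across coefficients
max(needed$sampleSize)  # 400
```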
5.1 Randomized Full Factorial Design
To start, we define profiles with attributes and levels:
Then, we generate a randomized full-factorial design:
```r
design_rand <- cbc_design(
  profiles = profiles,
  n_resp   = 500,  # Number of respondents
  n_alts   = 2,    # Number of alternatives per question
  n_q      = 8     # Number of questions per respondent
)
```
Use cbc_balance() to show counts of each level of each attribute:
Then, we generate a Bayesian D-efficient fractional factorial design and save as RData:
```r
design_db_eff <- cbc_design(
  profiles = profiles,
  n_resp   = 500,  # Number of respondents
  n_alts   = 2,    # Number of alternatives per question
  n_q      = 8,    # Number of questions per respondent
  n_start  = 5,    # Number of random starts for D-efficient design
  n_blocks = 3,    # More blocks increase overall efficiency
  priors = list(
    upfront      = 0.02,
    lv_2_charger = 1,
    discount     = 0.05,
    window       = 0.3,
    guaranteed   = 0.08
  ),
  method = 'Modfed'  # The Modified Fedorov algorithm
)

# Save design
save(design_db_eff, file = here('pilot', 'data', 'design_db_eff_pilot.RData'))
```
The Modfed algorithm is computationally expensive, so the chunk above is set to #| eval: false to prevent repeated processing. We instead read the Bayesian D-efficient design from the saved RData file:
```r
ggplot(power_db_eff) +
  geom_hline(yintercept = 0.05, color = 'red', linetype = 2) +
  geom_point(aes(x = sampleSize, y = se, color = coef)) +
  expand_limits(y = 0) +
  scale_y_continuous(limits = c(0, 0.125)) +
  theme_bw(base_family = 'Ubuntu') +
  labs(
    x = 'Sample size',
    y = 'Standard error',
    color = 'Coefficient',
    title = 'Bayesian D-efficient Design Plot'
  )
```
5.3 Interpretation of Power Analysis
Since both analyses involve randomization, the output results and plots are not unique across runs. The standard error of each attribute gradually decreases and levels off at a sample size of about 400, which indicates the expected survey sample size.
Here are the modifications the team will apply to the final design:
Increase the number of questions per person to 8.
Remove the “Not Interested” option.
Set the target sample size to 400 (subject to further discussion and judgment).
Both cbc_balance() and cbc_overlap() show the detailed distribution of attributes and options; this will be investigated further later.
The above covers only the conjoint question design. For a more comprehensive improvement plan, see the Changes to Final Survey section below.
Changes to Final Survey
Data and feedback from the pilot survey have played a pivotal role in refining our survey and analysis. In this section, we’ll highlight the key adjustments made to our final survey based on insights and observations gathered during the pilot phase. These adjustments not only help us enhance the survey’s comprehensiveness and precision but also guide us in designing and planning our analysis.
The pilot analysis provided us with profound insights into how users engage with our conjoint questions. The choice data obtained from our pilot survey is instrumental in enhancing and fine-tuning our final survey. This ensures that the data collected in the final survey will hold greater significance and precision for our analysis.
1. Current Version
Our survey is structured into three distinct sections:
Section 1: This section seeks to determine whether the participant is a PEV owner or a prospective PEV buyer. It collects essential data about the vehicle's range, the preferred charging window, and the charging location.
Section 2: Here, we provide a comprehensive explanation of the attributes crucial for making informed choices. Additionally, we’ve incorporated a set of five conjoint questions, refined based on insights from our pilot survey.
Section 3: This section contains demographic questions that offer valuable insights into the participant’s demographic profile.
2. Improvements
Since this survey is no longer a pure "pilot" version, this section lists both the improvements the team has already made and those it intends to make.
Below are the improvements that we have already made:
For section 1: We’ve relocated the question about the maximum range of the PEV from section 2 to section 1.
For section 1: An educational video has been included to explain smart charging.
For section 2: A helpful link to a page explaining the attributes has been included for all questions, allowing users to clarify any doubts they may have about the survey’s attributes.
For section 3: We’ve enhanced the robustness of demographic questions to collect more detailed information about the participants.
Below are the improvements that we intend to make:
(Important) For Section 2: We are improving the conjoint survey design. First, we will increase the number of conjoint questions from 5 to 8; second, we are considering removing the "Not Interested" response option. We will then re-run the power analysis to verify a reasonable distribution of overlap and balance in the revised design.
For Section 2: We are considering simplifying the contents of the smart charging program attributes table and adding images to enhance understanding.
For Section 3: Instead of asking which party respondents voted for, we will ask the more general question of which party they intend to vote for or support.
For Section 3: Ask for ZIP code instead of state of residence.
For Section 3: Put all travel-related questions on one page and the demographic questions on another.
Attribution
This pilot analysis report was completed by all members of the PEV Team:
Now that we know the basics about your EV ownership and charging habits, let's move on to the Smart Charging program selections!
A Smart Charging program consists of 5 attributes: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hrs.
Here is a brief explanation of each attribute:
| Attribute | Range | Explanation |
|---|---|---|
| Upfront Incentive | $100, $300, $500 | The amount of money you will be instantly paid if you join the smart charging program. The unit is in USD. |
| Free Level 2 Charger | No or Yes | A "Yes" means you will get a free Level 2 charger. A Level 2 charger enables faster charging than a standard home charger. |
| Electricity Price Discount | 10%, 25%, 50% | If you choose to join this smart charging program, this will be your electricity price discount. |
| Override Window | 0.5 ~ 4 hrs | The longest time window per day during which you can force an override to regular charging; during this time your EV is not controlled by the smart charging program. |
| Guaranteed Range if charged for 8 hrs | 25%, 50%, 75% | The guaranteed range if your car is charged for 8 hrs during smart charging. It is shown here as a percentage, but will be converted to mileage. |
The above table can be accessed via this link. It is available at the top of each question so that you can refer to it at any time.
The next page provides a sample question to enhance your understanding.
1.3 Section 2 - Sample Question
You are provided with 2 options of smart charging programs, and an “I am not interested” option. You need to treat them as if they were your only options and choose the one that you most prefer. You can choose the “I am not interested” option if you are not interested in either.
You can access the attribute table using this link.
Options:

|  | Option 1 | Option 2 |
|---|---|---|
| Upfront Incentive | $100 | $500 |
| Free Level 2 Charger | No | Yes |
| Electricity Price Discount | 25% | 50% |
| Override Window | 0.5 hrs | 4 hrs |
| Guaranteed Range if charged for 8 hrs | 100 miles | 300 miles |

Choices: Option 1 / Option 2 / I am not interested
1.4 Section 2 - Ready to Start the Questions
The sample question is quite straightforward: since Option 2 is better than Option 1 on every attribute, we expect you chose it.
However, this simplicity is not always the case. There will be trade-offs between the 2 options, and you need to make your own decision. For example, one option might have a large upfront incentive but a low discount, while the other is the inverse. You will then decide based on your own preference.
There are 5 choice questions in total. For each question, you have 2 valid options to choose from. Again, make your choice as if they were your ONLY options. If you are really not interested in either of them, you can choose "I am not interested".
Click on “Next Page” to proceed.
1.5 Question 1 of 5
(1 of 5) If your utility offers you these 2 smart charging programs, which one do you prefer?
You can access the attribute table using this link.
Options:

|  | Option 1 | Option 2 |
|---|---|---|
| Upfront Incentive | $100 | $100 |
| Free Level 2 Charger | No | No |
| Electricity Price Discount | 25% | 10% |
| Override Window | 1 hr | 4 hrs |
| Guaranteed Range if charged for 8 hrs | 25% | 25% |

Choices: Option 1 / Option 2 / I am not interested
The other 4 questions follow the same format and are thus omitted.
2. Data Dictionary
Dictionary of Files:
```r
dict_file <- data.frame(
  name = c(
    'cbc_questions',
    'choice_data',
    'pevConjoint1',
    'pevConjoint2',
    'pevConjoint3'
  ),
  type = c('CSV', 'CSV', 'CSV', 'CSV', 'CSV'),
  description = c(
    'Conjoint questions for the pilot survey.',
    'Cleaned results from the pilot survey.',
    'Part 1 of pilot survey result from formr.',
    'Part 2 of pilot survey result from formr.',
    'Part 3 of pilot survey result from formr.'
  )
) %>%
  mutate(count = row_number()) %>%
  select(count, everything())

dict_file %>%
  kable(
    format = 'html',
    escape = FALSE,
    align = c('c', 'l', 'c', 'l'),
    col.names = c('No.', 'Name', 'Type', 'Description'),
    caption = 'Dictionary of Files'
  ) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'))
```
Dictionary of Files

| No. | Name | Type | Description |
|---|---|---|---|
| 1 | cbc_questions | CSV | Conjoint questions for the pilot survey. |
| 2 | choice_data | CSV | Cleaned results from the pilot survey. |
| 3 | pevConjoint1 | CSV | Part 1 of pilot survey result from formr. |
| 4 | pevConjoint2 | CSV | Part 2 of pilot survey result from formr. |
| 5 | pevConjoint3 | CSV | Part 3 of pilot survey result from formr. |
Dictionary of Data Frames:
```r
dict_df <- data.frame(
  name = c(
    'choice_data', 'choice_data_charging', 'choice_data_demo',
    'choice_data_demo_2', 'choice_data_vehicle', 'choice_data_wider',
    'choice_db_eff', 'choice_rand', 'coeff_figure_1', 'coeff_figure_2',
    'coeff_table', 'design_db_eff', 'design_pilot', 'design_rand',
    'model', 'p1', 'p2', 'p3', 'power_db_eff', 'power_rand',
    'profiles', 'survey'
  ),
  type = c(
    'A tibble: 375 × 41', 'A tibble: 70 × 36', 'A tibble: 100 × 34',
    'A tibble: 50 × 36', 'A tibble: 125 × 33', 'A tibble: 25 × 36',
    'A tibble: 8000 × 12', 'A tibble: 8000 × 11', 'A data frame: 3 × 5',
    'A data frame: 3 × 5', 'A data frame: 6 × 5', 'A tibble: 8000 × 11',
    'A tibble: 7500 × 12', 'A tibble: 8000 × 10', 'A list: 32',
    'A tibble: 29 × 11', 'A tibble: 28 × 11', 'A tibble: 28 × 11',
    'A data frame: 50 × 4', 'A data frame: 50 × 4', 'A data frame: 216 × 6',
    'A tibble: 1,500 × 12'
  ),
  description = c(
    'Conjoint questions for the pilot survey.',
    'Longer version of choice data about charging.',
    'Longer version of choice data about main demography.',
    'Longer version of choice data about education and income.',
    'Longer version of choice data about vehicle.',
    'Original wider version of choice data.',
    'Simulated random choices for Bayesian D-efficient design.',
    'Simulated random choices for randomized design.',
    'Coefficients used for model summary figure 1.',
    'Coefficients used for model summary figure 2.',
    'Coefficients used for model summary table.',
    'Survey design in Bayesian D-efficient fractional factorial.',
    'Survey design for pilot survey.',
    'Survey design in randomized full factorial.',
    'Utility model generated by logitr.',
    'Part 1 of pilot survey result from formr.',
    'Part 2 of pilot survey result from formr.',
    'Part 3 of pilot survey result from formr.',
    'Power analysis for Bayesian D-efficient design.',
    'Power analysis for randomized design.',
    'Profiles with attributes and levels (same for all designs).',
    'Conjoint questions for the pilot survey (online version).'
  )
) %>%
  mutate(count = row_number()) %>%
  select(count, everything())

dict_df %>%
  kable(
    format = 'html',
    escape = FALSE,
    align = c('c', 'l', 'l', 'l'),
    col.names = c('No.', 'Name', 'Type', 'Description'),
    caption = 'Dictionary of Data Frames'
  ) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'))
```
Dictionary of Data Frames

| No. | Name | Type | Description |
|---|---|---|---|
| 1 | choice_data | A tibble: 375 × 41 | Conjoint questions for the pilot survey. |
| 2 | choice_data_charging | A tibble: 70 × 36 | Longer version of choice data about charging. |
| 3 | choice_data_demo | A tibble: 100 × 34 | Longer version of choice data about main demography. |
| 4 | choice_data_demo_2 | A tibble: 50 × 36 | Longer version of choice data about education and income. |
| 5 | choice_data_vehicle | A tibble: 125 × 33 | Longer version of choice data about vehicle. |
| 6 | choice_data_wider | A tibble: 25 × 36 | Original wider version of choice data. |
| 7 | choice_db_eff | A tibble: 8000 × 12 | Simulated random choices for Bayesian D-efficient design. |
| 8 | choice_rand | A tibble: 8000 × 11 | Simulated random choices for randomized design. |
| 9 | coeff_figure_1 | A data frame: 3 × 5 | Coefficients used for model summary figure 1. |
| 10 | coeff_figure_2 | A data frame: 3 × 5 | Coefficients used for model summary figure 2. |
| 11 | coeff_table | A data frame: 6 × 5 | Coefficients used for model summary table. |
| 12 | design_db_eff | A tibble: 8000 × 11 | Survey design in Bayesian D-efficient fractional factorial. |
| 13 | design_pilot | A tibble: 7500 × 12 | Survey design for pilot survey. |
| 14 | design_rand | A tibble: 8000 × 10 | Survey design in randomized full factorial. |
| 15 | model | A list: 32 | Utility model generated by logitr. |
| 16 | p1 | A tibble: 29 × 11 | Part 1 of pilot survey result from formr. |
| 17 | p2 | A tibble: 28 × 11 | Part 2 of pilot survey result from formr. |
| 18 | p3 | A tibble: 28 × 11 | Part 3 of pilot survey result from formr. |
| 19 | power_db_eff | A data frame: 50 × 4 | Power analysis for Bayesian D-efficient design. |
| 20 | power_rand | A data frame: 50 × 4 | Power analysis for randomized design. |
| 21 | profiles | A data frame: 216 × 6 | Profiles with attributes and levels (same for all designs). |
| 22 | survey | A tibble: 1,500 × 12 | Conjoint questions for the pilot survey (online version). |
---
title: "Test Analysis - Smart Charging for PEVs"
subtitle: "Quantifying the Benefits and Constraints of Plug-In Electric Vehicle Smart Charging Adoption"
date: "Nov 05, 2023"
bibliography: ../references.bib
---

```{r}
#| label: setup
#| include: false

library(tidyverse)
library(here)
library(cbcTools)
library(logitr)
library(janitor)
library(lubridate)
library(fastDummies)
library(kableExtra)
library(tibble)
```

## Abstract

The product being studied in this project is "Smart Charging Solutions for Plug-in Electric Vehicles (PEVs)." The initiative investigates the reception and potential adoption of these technologies among users who own, or might consider owning, a PEV. The focus is on understanding consumer attitudes, preferences, and potential barriers regarding the adoption of smart charging solutions for electric vehicles. The project delves into aspects such as awareness, perceived benefits, and potential hindrances associated with smart charging technologies for PEVs, exploring attributes like compensation, electricity pricing, charging speed, and more.

The investigation of consumer inclinations and considerations centered on a strategic analysis of distinct product attributes and their corresponding design decisions. The key attributes were Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and a Guaranteed Range if smart charged for 8 hours. The design decisions for these attributes were structured with deliberate variations: the Upfront Incentive ranged across \$100, \$300, and \$500; the Free Level 2 Charger was offered as 'Yes' or 'No'; and the Electricity Price Discount was presented at three rates: 10%, 25%, and 50%. Additionally, the Guaranteed Range if smart charged for 8 hours offered choices of 25%, 50%, and 75% of the vehicle's maximum range. These design decisions were aimed at assessing consumer responsiveness to diverse incentive structures, charging infrastructure availability, pricing models, and the potential impact of a guaranteed range, thereby offering comprehensive insight into the dynamics of smart charging adoption among consumers.

## Introduction

The global drive for sustainable energy has propelled Plug-in Electric Vehicles (PEVs) to the forefront of the fight against climate change. Smart charging technologies, integrating advanced grid systems and digital connectivity, offer a transformative opportunity to enhance PEVs' role in a sustainable future by optimizing charging processes, improving grid stability, and reducing costs.

This project aims to gauge user acceptance of smart charging solutions for PEVs. It delves into the factors influencing consumer choices, uncovering awareness levels, benefits, and barriers associated with smart charging among PEV owners. Using surveys, interviews, and data analysis, the project aims to reveal user attitudes, preferences, and concerns about smart charging.

This comprehensive approach considers attributes like compensation, electricity price, charging speed, override window, and guaranteed range percentage, exploring user attitudes toward smart charging. It also evaluates the impact of compensation on adoption rates, offering valuable insights for stakeholders in the electric vehicle industry.

In an era focused on environmental awareness and sustainable transportation, this project accelerates the shift toward cleaner and more efficient mobility solutions. It aligns with global energy sustainability goals, emphasizing the need to reduce carbon emissions and create a smarter, more connected, and environmentally responsible landscape for electric vehicle charging.

## Survey Design

The survey is divided into three sections. Section 1 covers PEV ownership and charging habits, Section 2 addresses the smart charging choice questions, and Section 3 gathers demographic information.

Before starting the actual survey, participants see a consent page and a 30-second educational video introducing smart charging. The video explains how smart charging allows the utility to control the PEV charging process, including the time window and the charging amount.

Respondents who neither have nor intend to have a PEV will be screened out at the end of Section 1, since they are unlikely to provide much valuable information. This screening is not implemented in the pilot version but will be included in the final version.

**Section 1** begins by asking participants about their interest in PEVs and whether they own one, lease one, plan to have one in the future, or do not have one. It also inquires about their preferred PEV model, preferred mileage, and charging habits at home and at work. Respondents who have a PEV, or intend to get one, are eligible.

**Section 2**, focusing on the smart charging choice questions, consists of five questions per respondent. This section addresses the five main attributes of a smart charging program: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hours. Upfront Incentive refers to the instant payment for joining the smart charging program, ranging from \$100 to \$500. Free Level 2 Charger is a faster charging option, with respondents choosing "Yes" or "No." Electricity Price Discount offers a discount of 10% to 50% on electricity for those who join the program. Override Window is the longest daily period during which users can override smart charging and charge normally, ranging from 0.5 to 4 hours. Guaranteed Range if charged for 8 hours specifies the range achieved if the car is charged for 8 hours during the smart charging period; the value shown ranges from 25% to 75% of the vehicle's maximum range in miles, based on the user's input.

This part of the survey involves a small twist: the respondent's answer in Section 1 determines the guaranteed range attribute. In the guaranteed range question, we calculate the mileage shown from the maximum range the user reported in Section 1, with the guaranteed range spanning 25% to 75% of that value. Each question has three alternatives, with the third option being "I am not interested." Since the attribute levels are randomly generated, some options may seem illogical or overly obvious. To ensure respondent understanding, we include a sentence asking them to choose between the two options as if those were the only choices available; the third option, "I am not interested," is provided for those who find neither option appealing. For further clarity, Section 2 starts with a description of the attributes, their values, and a sample question to guide respondents.
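The guaranteed-range calculation described above can be sketched as follows. This is a hypothetical illustration only: the helper name `guaranteed_miles` and the 300-mile input are assumptions, not the survey's actual formr implementation.

```r
# Hypothetical sketch: convert a guaranteed-range percentage level into miles,
# based on the maximum range the respondent reported in Section 1.
guaranteed_miles <- function(max_range_miles, guaranteed_pct) {
  round(max_range_miles * guaranteed_pct / 100)
}

# A respondent reporting a 300-mile maximum range would see these values
# for the 25%, 50%, and 75% levels:
guaranteed_miles(300, c(25, 50, 75))
# 75 150 225
```
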
Below is an example of our conjoint question.

```{r}
alts <- tibble(
  `Options:` = c('Option 1', 'Option 2'),
  `Upfront Incentive:` = scales::dollar(c(100, 500)),
  `Free Level 2 Charger:` = c('No', 'Yes'),
  `Electricity Price Discount:` = c('25%', '50%'),
  `Override Window:` = c('0.5 hrs', '4 hrs'),
  `Guaranteed Range if charged for 8 hrs:` = c('100 miles', '300 miles')
)

alts_t <- as.data.frame(t(alts)) %>%
  rownames_to_column(var = 'Attribute')
colnames(alts_t) <- NULL

kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'),
                full_width = FALSE, position = 'center') %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
```

**Section 3** collects demographic information, including birth year, gender identity, race, educational background, household income, interest in the environment and climate change, political inclination, current state of residence, commute method, number of cars in the household, and the presence of backup cars, if any, along with their quantity.

## Pilot Data Analysis

This section describes the analysis that the team carried out using our pilot data. It contains five subsections: sample description, data cleaning, summary statistics, modeling, and power analysis.

The team performed the data analysis in **2 steps**:

1. **Step 1**: We generated a randomized full-factorial design of CBC choice questions using `cbc_design()` and fed it into the survey. The survey results were collected, cleaned, stored as a `csv` file, and then evaluated using the `logitr` package. This process aims to establish a reusable workflow for future surveys, data collection, and analysis; the survey results themselves are less important at this stage.
2. **Step 2**: We generated both a randomized full-factorial design and a Bayesian D-efficient fractional factorial design using `cbc_design()`, each simulated using `cbc_choices()` to produce a set of random choice results. The results are then fed into `cbc_power()` for power analysis.

**Step 1** generates the following results:

1. Summary statistics of vehicle info, charging preferences, demographic info, etc.
2. The utility model equation containing the 6 beta values.

**Step 2** is intended to generate a better survey design, including the number of respondents, the number of questions per person, the number of alternatives, **and most importantly, how the different levels of attributes are distributed across the survey questions.**

The team should continue to iterate on Step 2 to generate a feasible survey design, and then re-perform Step 1 to analyze and visualize the survey data. In this analysis, however, the team has only made several attempts at power analysis; we will continue refining the power analysis to arrive at the most feasible survey design.

The analysis breakdown is described in the following subsections, and the data dictionary is shown in the appendix (see [2. Data Dictionary]).

### 1. Sample Description

As described previously, we generated a randomized full-factorial design of CBC choice questions using `cbc_design()` and fed it into the survey. This **initial design** is generated using the following code:

```{r}
# Define profiles with attributes and levels
profiles <- cbc_profiles(
  upfront = c(100, 300, 500),      # USD
  lv_2_charger = c('No', 'Yes'),
  discount = c(10, 25, 50),        # percentage
  window = c(0.5, 1, 2, 4),        # hrs
  guaranteed = c(25, 50, 75)       # percentage
)

# Make a basic survey using the full factorial of all profiles
design_pilot <- cbc_design(
  profiles = profiles,
  n_resp = 500,     # Number of respondents
  n_alts = 2,       # Number of alternatives per question
  n_q = 5,          # Number of questions per respondent
  no_choice = TRUE
)

# Save design
write_csv(design_pilot, here('pilot', 'data', 'cbc_questions_pilot.csv'))
```

In the above code, the `cbc_profiles()` function defines the attributes, matching the attribute table described in our [pilot survey](https://pev.pingfanhu.com/project/03_pilot_survey#start-of-section-2).

The `cbc_design()` function defines the conjoint questions: 500 respondents, 2 alternatives per question, and 5 questions per respondent, for a total of 2,500 question responses. A "No Choice" option is also included. These values were chosen arbitrarily.

The **summary statistics** of our sample survey results will be shown **after the Data Cleaning process**.

### 2. Data Cleaning

In practice, the team did not have 500 participants. The team collected 30 sample surveys in total, of which 25 respondents were successful, yielding 125 question responses.
The data cleaning process is written in the code chunks below.

Firstly, we download the survey data from formr and import it as data frames `p1`, `p2`, and `p3`.

```{r}
p1 <- read_csv(here('pilot', 'data', 'pevConjoint1.csv'))
p2 <- read_csv(here('pilot', 'data', 'pevConjoint2.csv'))
p3 <- read_csv(here('pilot', 'data', 'pevConjoint3.csv'))
```

Then, we format and combine the 3 data frames:

```{r}
p1 <- p1 %>%
  mutate(
    created = ymd_hms(created, tz = 'EST'),
    ended = ymd_hms(ended, tz = 'EST'),
    time_sec_p1 = as.numeric(ended - created, units = 'secs')
  ) %>%
  # Select important columns
  select(session, time_sec_p1, starts_with('mc'))

p2 <- p2 %>%
  mutate(
    created = ymd_hms(created),
    ended = ymd_hms(ended),
    time_sec_p2 = as.numeric(ended - created, units = 'secs')
  ) %>%
  # Select important columns
  select(session, time_sec_p2, respondentID, cbc_all_same, starts_with('mc'))

p3 <- p3 %>%
  mutate(
    created = ymd_hms(created),
    ended = ymd_hms(ended),
    time_sec_p3 = as.numeric(ended - created, units = 'secs')
  ) %>%
  # Select important columns
  select(session, time_sec_p3, completion_code, starts_with('mc'))

# Join all parts together using the session variable
choice_data_wider <- p1 %>%
  left_join(p2, by = 'session') %>%
  left_join(p3, by = 'session') %>%
  # No longer need session variable
  select(-session)
```

Here the 3 parts of the survey data are stored as `choice_data_wider`. It is in a wide format and will later be pivoted into different long formats according to our purposes.

Then, we drop the unnecessary parts:

```{r}
# Drop anyone who didn't complete all choice questions
choice_data_wider <- choice_data_wider %>%
  filter(!is.na(mc_cbc_1)) %>%
  filter(!is.na(mc_cbc_2)) %>%
  filter(!is.na(mc_cbc_3)) %>%
  filter(!is.na(mc_cbc_4)) %>%
  filter(!is.na(mc_cbc_5))

# Drop respondents who went too fast
choice_data_wider <- choice_data_wider %>%
  mutate(
    # First replace NA values with 0 seconds
    time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
    time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
    time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
    # Now compute the total time
    time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
  )

# Drop anyone who finished in under the 10th percentile of completion times
time_10 <- quantile(choice_data_wider$time_min_total, 0.1)
choice_data_wider <- choice_data_wider %>%
  filter(time_min_total >= time_10)

# Drop respondents that got the attention check question wrong
choice_data_wider <- choice_data_wider %>%
  filter(mc_cbc_practice == 'option_2')
```

The team followed Prof. Helveston's instructions on **dropping** criteria, removing respondents who were:

- Incomplete
- Too fast (under the 10th percentile of completion time)
- Wrong on the attention check question

After we have the wide-formatted data, it's time to generate the choice data. We convert the wide data into a **long** format according to the 5 conjoint questions, wrangle the data into the desired format, and then save it as `csv`:

```{r}
# First convert the data to long format
choice_data <- choice_data_wider %>%
  pivot_longer(
    cols = mc_cbc_1:mc_cbc_5,
    names_to = 'qID',
    values_to = 'choice'
  ) %>%
  # Convert the qID variable to a number
  mutate(qID = parse_number(qID))

# Read in choice questions and join them to choice_data
survey <- read_csv('https://raw.githubusercontent.com/pingfan-hu/My-Resources/main/data/cbc_test.csv')
choice_data <- choice_data %>%
  rename(respID = respondentID) %>%
  left_join(survey, by = c('respID', 'qID'), relationship = 'many-to-many')

# Convert choice column to 1 or 0 based on if the alternative was chosen
choice_data <- choice_data %>%
  mutate(
    choice = case_when(
      str_detect(choice, '^option_') ~ as.numeric(str_replace(choice, 'option_', '')),
      choice == 'not_interested' ~ 3,
      TRUE ~ as.numeric(choice)
    ),
    choice = ifelse(choice == altID, 1, 0)
  ) %>%
  select(-mc_cbc_practice, -cbc_all_same)

# Create new values for respID & obsID
nRespondents <- nrow(choice_data_wider)
nAlts <- max(survey$altID)
nQuestions <- max(survey$qID)
choice_data$respID <- rep(seq(nRespondents), each = nAlts * nQuestions)
choice_data$obsID <- rep(seq(nRespondents * nQuestions), each = nAlts)

# Reorder columns - it's nice to have the 'ID' variables first
choice_data <- choice_data %>%
  select(
    ends_with('ID'), 'choice', 'upfront',
    'lv_2_charger_Yes', 'lv_2_charger_No',
    'discount', 'window', 'guaranteed',
    'no_choice', everything()
  )

# Clean up names for choice_data
choice_data <- clean_names(choice_data)

# Save cleaned data for modeling
write_csv(choice_data, here('pilot', 'data', 'choice_data_pilot.csv'))
```

### 3. Summary Statistics

These statistics belong conceptually in the **Sample Description** subsection, but in practice they can only be computed after **Data Cleaning**. The summary statistics are shown in 3 parts:

1. Vehicle Info Summary
2. Charging Preference Summary
3. Demographic Info Summary

We utilize the `choice_data_wider` data frame for all 3 parts.

#### 3.1 Vehicle Info Summary

For the vehicle info summary, we firstly **pivot** the data:

```{r}
choice_data_vehicle <- choice_data_wider %>%
  pivot_longer(
    cols = c(mc_own_ev, `mc_max_range.x`, mc_commute, mc_car_number, mc_backup),
    names_to = 'category',
    values_to = 'value'
  )
```

Then, we **reorder** and **relabel** the values based on category:

```{r}
choice_data_vehicle <- choice_data_vehicle %>%
  mutate(
    category = case_when(
      category == 'mc_own_ev' ~ 'Own EV',
      category == 'mc_max_range.x' ~ 'Max Range',
      category == 'mc_car_number' ~ 'Car Number',
      category == 'mc_backup' ~ 'Backup Car',
      category == 'mc_commute' ~ 'Commute',
      TRUE ~ category
    ),
    value = case_when(
      category == 'Own EV' ~ factor(value,
        levels = c('own', 'lease', 'plan'),
        labels = c('Own', 'Lease', 'Plan')),
      category == 'Max Range' ~ factor(value,
        levels = c('below_100', 'range_100_to_149', 'range_150_to_199',
                   'range_200_to_249', 'range_250_to_299', 'over_300'),
        labels = c('100', '150', '200', '250', '300', '>300')),
      category == 'Car Number' ~ factor(value,
        levels = c('zero', 'one', 'two', 'three', 'four'),
        labels = c('0', '1', '2', '3', '4')),
      category == 'Backup Car' ~ factor(value,
        levels = c('yes_ev', 'yes_gas', 'yes_both', 'no'),
        labels = c('EV', 'Gas', 'Both', 'No')),
      category == 'Commute' ~ factor(value,
        levels = c('on_foot', 'by_bus', 'by_train_subway', 'self_driving'),
        labels = c('On Foot', 'By Bus', 'By Train', 'Self-driving')),
      TRUE ~ as.factor(value)
    )
  )
```

Finally, here is the **plot** for the **vehicle info** summary:

```{r}
#| fig-asp: 1

choice_data_vehicle %>%
  ggplot(aes(x = value)) +
  geom_bar(width = 0.6) +
  scale_y_continuous(breaks = seq(0, 12, by = 2)) +
  facet_wrap(
    ~ fct_relevel(category, 'Own EV', 'Max Range',
                  'Car Number', 'Backup Car', 'Commute'),
    scales = 'free_x', nrow = 3
  ) +
  labs(
    x = NULL,
    y = 'Count',
    title = 'Vehicle Info Summary Plots',
    subtitle = 'Summary Statistics Part 1'
  ) +
  theme_bw(base_family = 'Ubuntu')
```

In the above plot, we can see the **distributions** of:

1. EV Ownership
2. EV Max Range
3. Household Car Number
4. Household Backup Car Type
5. Commute Method

#### 3.2 Charging Preference Summary

For the charging preference summary, we firstly **pivot** the data:

```{r}
choice_data_charging <- choice_data_wider %>%
  pivot_longer(
    cols = c(mc_at_home, mc_at_work),
    names_to = 'category',
    values_to = 'value'
  ) %>%
  separate_rows(value, sep = ',\\s*')
```

Then, we **reorder** and **relabel** the values based on category:

```{r}
choice_data_charging <- choice_data_charging %>%
  mutate(
    category = case_when(
      category == 'mc_at_home' ~ 'At Home',
      category == 'mc_at_work' ~ 'At Work',
      TRUE ~ category
    ),
    value = case_when(
      category == 'At Home' ~ factor(value,
        levels = c('home_morning', 'home_afternoon', 'home_evening',
                   'home_night', 'no_home'),
        labels = c('Morning', 'Afternoon', 'Evening', 'Night', 'No')),
      category == 'At Work' ~ factor(value,
        levels = c('work_morning', 'work_noon', 'work_afternoon',
                   'work_evening', 'work_night', 'no_work'),
        labels = c('Morning', 'Noon', 'Afternoon', 'Evening', 'Night', 'No')),
      TRUE ~ as.factor(value)
    )
  )
```

Finally, here is the **plot** for the **charging preference** summary:

```{r}
#| fig-asp: 0.8

choice_data_charging %>%
  ggplot(aes(x = value)) +
  geom_bar(width = 0.6) +
  scale_y_continuous(breaks = seq(0, 16, by = 2)) +
  facet_wrap(~ fct_relevel(category, 'At Home', 'At Work'),
             scales = 'free_x', nrow = 2) +
  scale_x_discrete(limits = c('Morning', 'Noon', 'Afternoon',
                              'Evening', 'Night', 'No')) +
  labs(
    x = NULL,
    y = 'Count',
    title = 'Charging Preference Summary Plots',
    subtitle = 'Summary Statistics Part 2'
  ) +
  theme_bw(base_family = 'Ubuntu')
```

There is no "Noon" option for "At Home," but it is shown here for ease of comparison between home and work charging.

#### 3.3 Demographic Info Summary

The demographic info summary is a little more involved, so the team split it into **2** sub-parts.

The **1st** part contains gender, race, climate caring, and party.
We firstly **pivot** the data:

```{r}
choice_data_demo <- choice_data_wider %>%
  mutate(across(c(mc_gender, mc_race, mc_climate, mc_party), as.character)) %>%
  pivot_longer(
    cols = c(mc_gender, mc_race, mc_climate, mc_party),
    names_to = 'category',
    values_to = 'value'
  ) %>%
  separate_rows(value, sep = ',\\s*')
```

Then, we **reorder** and **relabel** the values based on category:

```{r}
choice_data_demo <- choice_data_demo %>%
  mutate(
    category = case_when(
      category == 'mc_gender' ~ 'Gender',
      category == 'mc_race' ~ 'Race',
      category == 'mc_climate' ~ 'Climate Caring',
      category == 'mc_party' ~ 'Party',
      TRUE ~ category
    ),
    value = case_when(
      category == 'Gender' ~ factor(value,
        levels = c('male', 'female', 'transMale', 'transFemale',
                   'genderNonconform', 'others', 'prefer_not_say'),
        labels = c('Male', 'Female', 'Trans Male', 'Trans Female',
                   'Genderqueer', 'Others', 'Not Say')),
      category == 'Race' ~ factor(value,
        levels = c('asian', 'black', 'white', 'hispanic', 'native',
                   'pacific', 'others', 'prefer_not_say'),
        labels = c('Asian', 'Black', 'White', 'Hispanic', 'Native',
                   'Pacific', 'Others', 'Not Say')),
      category == 'Party' ~ factor(value,
        levels = c('democratic', 'republican', 'no_vote', 'prefer_not_say'),
        labels = c('Democratic', 'Republican', 'No', 'Not Say')),
      TRUE ~ as.factor(value)
    )
  )
```

Then, we **plot** the **1st** part of the demographic info summary:

```{r}
#| fig-asp: 0.8

choice_data_demo %>%
  ggplot(aes(x = value)) +
  geom_bar(width = 0.6) +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  facet_wrap(~ fct_relevel(category, 'Gender', 'Race', 'Climate Caring', 'Party'),
             scales = 'free_x', nrow = 2) +
  labs(
    x = NULL,
    y = 'Count',
    title = 'Demographic Summary Plots Part 1',
    subtitle = 'Summary Statistics Part 3-1'
  ) +
  theme_bw(base_family = 'Ubuntu')
```

The **2nd** part contains education and income.
We firstly **pivot** the data:

```{r}
choice_data_demo_2 <- choice_data_wider %>%
  mutate(across(c(mc_education, mc_income), as.character)) %>%
  pivot_longer(
    cols = c(mc_education, mc_income),
    names_to = 'category',
    values_to = 'value'
  ) %>%
  separate_rows(value, sep = ',\\s*')
```

Then, we **reorder** and **relabel** the values based on category:

```{r}
choice_data_demo_2 <- choice_data_demo_2 %>%
  mutate(
    category = case_when(
      category == 'mc_education' ~ 'Education',
      category == 'mc_income' ~ 'Income',
      TRUE ~ category
    ),
    value = case_when(
      category == 'Education' ~ factor(value,
        levels = c('no_hs', 'hs', 'college_some', 'vocational', 'degree_associate',
                   'degree_bs', 'degree_grad', 'others', 'prefer_not_say'),
        labels = c('< High School', 'High School', 'Some College',
                   'Vocational', 'AA/AS', 'BA/BS', 'Grad', 'Others', 'Not Say')),
      category == 'Income' ~ factor(value,
        levels = c('under60', 'inc_60to80', 'inc_80to100', 'inc_100to150',
                   'inc_150to200', 'inc_200to250', 'inc_250to300',
                   'inc_300to350', 'inc_350to400', 'inc_over400',
                   'prefer_not_say'),
        labels = c('< $60K', '$60K to $80K', '$80K to $100K',
                   '$100K to $150K', '$150K to $200K',
                   '$200K to $250K', '$250K to $300K',
                   '$300K to $350K', '$350K to $400K',
                   '> $400K', 'Not Say')),
      TRUE ~ as.factor(value)
    )
  )
```

Then, we **plot** the **education** distribution:

```{r}
choice_data_demo_2 %>%
  filter(category == 'Education') %>%
  ggplot(aes(x = value)) +
  geom_bar(width = 0.6) +
  scale_y_continuous(breaks = seq(0, 12, by = 2)) +
  labs(
    x = NULL,
    y = 'Count',
    title = 'Demographic Summary Plots Part 2 - Education',
    subtitle = 'Summary Statistics Part 3-2'
  ) +
  theme_bw(base_family = 'Ubuntu')
```

Finally, the **income** distribution:

```{r}
choice_data_demo_2 %>%
  filter(category == 'Income') %>%
  ggplot(aes(x = value)) +
  geom_bar(width = 0.6) +
  scale_y_continuous(breaks = seq(0, 12, by = 1)) +
  labs(
    x = NULL,
    y = 'Count',
    title = 'Demographic Summary Plots Part 3 - Income',
    subtitle = 'Summary Statistics Part 3-3'
  ) +
  coord_flip() +
  theme_bw(base_family = 'Ubuntu')
```

### 4. Modeling

#### 4.1 Model Summary

To reveal the utility model, we use the `logitr` package:

```{r}
# Load the data set:
choice_data <- read_csv(here('pilot', 'data', 'choice_data_pilot.csv'))

# Estimate the model
model <- logitr(
  data = choice_data,
  outcome = 'choice',
  obsID = 'obs_id',
  pars = c('upfront', 'lv_2_charger_yes', 'lv_2_charger_no',
           'discount', 'window', 'guaranteed')
)
```

View the summary of results:

- Please see [3. Miscellaneous] in the **Appendix**.

#### 4.2 Model Evaluation

Before we move on to the equation, we evaluate the model. Firstly, we check the **1st order** condition: is the gradient at the solution zero?

```{r}
model$gradient
```

The gradient values are close to zero, which is within our expectation.

Secondly, we check the **2nd order** condition: is the hessian negative definite?

```{r}
eigen(model$hessian)$values
```

The eigenvalues are all negative, meaning the hessian is negative definite. This is also within our expectation.

The evaluation process ends here. We can continue.

#### 4.3 Model Equation

From the summary, we can derive the **estimated** values and standard **errors** of the 6 coefficients:

```{r}
coeff_table <- data.frame(
  Coefficient = c('β<sub>1</sub>', 'β<sub>2</sub>', 'β<sub>3</sub>',
                  'β<sub>4</sub>', 'β<sub>5</sub>', 'β<sub>6</sub>'),
  Estimate = c(0.0163208, -7.5792292, -9.8322095,
               0.0451776, 0.2911476, 0.0827197),
  StdError = c(0.0027778, 1.5417186, 1.8565463,
               0.0127580, 0.1642019, 0.0170261),
  Levels = c('100, 300, 500', '(Nil)', '(Nil)',
             '10, 25, 50', '0.5, 1, 2, 4', '25, 50, 75'),
  Meaning = c('Upfront', 'Lv 2 Charger', 'No Lv 2 Charger',
              'Discount', 'Window', 'Guaranteed')
)

coeff_table$Estimate <- round(coeff_table$Estimate, 4)
coeff_table$StdError <- round(coeff_table$StdError, 4)

coeff_table %>%
  kable(format = 'html', escape = FALSE,
        col.names = c('Coefficient', 'Estimate', 'Std Error',
                      'Levels', 'Meaning'),
        align = c('c', 'c', 'c', 'l', 'l'),
        caption = 'Summary of Model Coefficients') %>%
  kable_styling(bootstrap_options = c('striped', 'hover'))
```

The equation can be written as:

$$v_j = \beta_1 x_j^{upfront} + \beta_2 \delta_j^{charger} + \beta_3 \delta_j^{nocharger} + \beta_4 x_j^{discount} + \beta_5 x_j^{window} + \beta_6 x_j^{guaranteed}$$

Plugging in the $\boldsymbol{\beta}$ values, we have:

$$v_j = 0.0163 x_j^{upfront} - 7.5792 \delta_j^{charger} - 9.8322 \delta_j^{nocharger} + 0.0452 x_j^{discount} + 0.2911 x_j^{window} + 0.0827 x_j^{guaranteed}$$

The above equation only considers the observed utility. There should also be an error term, so that:

$$\tilde u_j = v_j + \tilde \epsilon_j$$

where $v_j$ is the observed utility containing the $\beta$ and $x$ values, and $\tilde \epsilon_j$ is the error term.

#### 4.4 Model Figures

Then we generate the **summary figures**, in 2 parts:

- Part 1: Upfront, Discount, Guaranteed
- Part 2: Lv 2 Charger, No Lv 2 Charger, Window

Let's do **Part 1** first.
We start with the **data frame** construction:

```{r}
coeff_figure_1 <- data.frame(
  Attribute = factor(
    c('Upfront', 'Discount', 'Guaranteed'),
    levels = c('Guaranteed', 'Discount', 'Upfront')
  ),
  Estimate = c(0.0163208, 0.0451776, 0.0827197),
  StdError = c(0.0027778, 0.0127580, 0.0170261)
)

coeff_figure_1$LowerBound <- coeff_figure_1$Estimate - 1.96 * coeff_figure_1$StdError
coeff_figure_1$UpperBound <- coeff_figure_1$Estimate + 1.96 * coeff_figure_1$StdError
```

Then, we create the **plot** for **Part 1**:

```{r}
ggplot(coeff_figure_1, aes(x = Attribute, y = Estimate)) +
  geom_point() +
  geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
  labs(
    title = 'Coefficient Estimates Part 1',
    subtitle = 'With Uncertainty Bounds',
    x = 'Attribute',
    y = 'Estimate'
  ) +
  coord_flip() +
  theme_bw(base_family = 'Ubuntu')
```

Then, the **data frame** construction for **Part 2**:

```{r}
coeff_figure_2 <- data.frame(
  Attribute = factor(
    c('Charger', 'No Charger', 'Window'),
    levels = c('Window', 'No Charger', 'Charger')
  ),
  Estimate = c(-7.5792292, -9.8322095, 0.2911476),
  StdError = c(1.5417186, 1.8565463, 0.1642019)
)

coeff_figure_2$LowerBound <- coeff_figure_2$Estimate - 1.96 * coeff_figure_2$StdError
coeff_figure_2$UpperBound <- coeff_figure_2$Estimate + 1.96 * coeff_figure_2$StdError
```

The **plot** for **Part 2**:

```{r}
ggplot(coeff_figure_2, aes(x = Attribute, y = Estimate)) +
  geom_point() +
  geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
  labs(
    title = 'Coefficient Estimates Part 2',
    subtitle = 'With Uncertainty Bounds',
    x = 'Attribute',
    y = 'Estimate'
  ) +
  coord_flip() +
  theme_bw(base_family = 'Ubuntu')
```

The **interpretations** of these plots are:

1. "Upfront", "Discount", and "Guaranteed" take large level values, so their coefficients are small and of similar magnitude; they are plotted together in **Part 1**.
2. "Charger", "No Charger", and "Window" take small level values, so their coefficients are of larger magnitude; they are plotted together in **Part 2**.
3. The larger an attribute's level values are, the smaller its coefficient tends to be, so mixing the two groups in a single plot would make the Part 1 estimates unreadable.
4. With only 25 successful questionnaires, the uncertainty is considerably high. More samples are required to reduce this uncertainty.

**Important notice: the analysis of the 25 successful questionnaires ends here. In the Power Analysis section, we will simulate results and figure out a proper way to redesign the survey.**

### 5. Power Analysis

In this section, we attempt to redesign the survey by simulating survey results. Below is the step-by-step digest:

1. Use the `cbc_profiles()` function to generate **profiles** for our 5 attributes.
2. Use `cbc_design()` to create a **design** for the survey. The team proposes two designs: a randomized full factorial and a Bayesian D-efficient fractional factorial.
3. Use `cbc_choices()` to **simulate** random choices. This is a critical point: we use only random answers for the surveys, without collecting real data from human users.
4. Use `cbc_power()` to perform the **power analysis**.
5. Generate plots of the power analysis and fine-tune the `cbc_design()` arguments.

The **purpose** of the power analysis is to figure out:

1. The most feasible survey design, including the number of questions, respondents, and alternatives.
2. Where diminishing returns set in, i.e., the sample size beyond which additional effort yields little further reduction in standard errors.

#### 5.1 Randomized Full Factorial Design

To start, we define the **profiles** with attributes and levels:

```{r}
profiles <- cbc_profiles(
  upfront = c(100, 300, 500),      # USD
  lv_2_charger = c('No', 'Yes'),
  discount = c(10, 25, 50),        # percentage
  window = c(0.5, 1, 2, 4),        # hrs
  guaranteed = c(25, 50, 75)       # percentage
)
```

Then, we generate a **randomized** full-factorial design:

```{r}
design_rand <- cbc_design(
  profiles = profiles,
  n_resp = 500,  # Number of respondents
  n_alts = 2,    # Number of alternatives per question
  n_q = 8        # Number of questions per respondent
)
```

Use `cbc_balance()` to show the **counts** of each level of each attribute:

- Please see [3. Miscellaneous] in the **Appendix**.

Use `cbc_overlap()` to show the **overlap** across attributes:

- Please see [3. Miscellaneous] in the **Appendix**.

**Simulate** the random choices:

```{r}
choice_rand <- cbc_choices(
  design = design_rand,
  obsID = 'obsID'
)
```

Run the **power analysis**:

```{r}
power_rand <- cbc_power(
  data = choice_rand,
  pars = c('upfront', 'lv_2_charger', 'discount',
           'window', 'guaranteed'),
  obsID = 'obsID',
  outcome = 'choice',
  nbreaks = 10,
  n_q = 8
)
```

Show the **head** and **tail**:

```{r}
head(power_rand)
tail(power_rand)
```

Show the **plot**:

```{r}
ggplot(power_rand) +
  geom_hline(yintercept = 0.05, color = 'red', linetype = 2) +
  geom_point(aes(x = sampleSize, y = se, color = coef)) +
  expand_limits(y = 0) +
  scale_y_continuous(limits = c(0, 0.125)) +
  theme_bw(base_family = 'Ubuntu') +
  labs(
    x = 'Sample size', y = 'Standard error', color = 'Coefficient',
    title = 'Randomized Design Plot'
  )
```

#### 5.2 Bayesian D-efficient Fractional Factorial Design

To start, we define the **profiles** with attributes and levels (the **same** as in the randomized design):

```{r}
profiles <- cbc_profiles(
  upfront = c(100, 300, 500),      # USD
  lv_2_charger = c('No', 'Yes'),
  discount = c(10, 25, 50),        # percentage
  window = c(0.5, 1, 2, 4),        # hrs
  guaranteed = c(25, 50, 75)       # percentage
)
```

Then, we generate a **Bayesian D-efficient** fractional factorial design and save it as `RData`:

```{r}
#| eval: false

design_db_eff <- cbc_design(
  profiles = profiles,
  n_resp = 500,   # Number of respondents
  n_alts = 2,     # Number of alternatives per question
  n_q = 8,        # Number of questions per respondent
  n_start = 5,    # Number of random starts for D-efficient design
  n_blocks = 3,   # More blocks increases overall efficiency
  priors = list(
    upfront = 0.02,
    lv_2_charger = 1,
    discount = 0.05,
    window = 0.3,
    guaranteed = 0.08
  ),
  method = 'Modfed' # The Modified Fedorov algorithm
)

save(design_db_eff,
     file = here('pilot', 'data', 'design_db_eff_pilot.RData'))
```

The Modfed algorithm is computationally expensive, so the above chunk is set to `#| eval: false` to prevent repeated processing.
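This save-then-reload caching idiom can be sketched generically. The file name and the stand-in computation below are illustrative only, not the project's actual code:

```r
# Generic cache pattern: run an expensive step once, persist it with save(),
# and reload it with load() on subsequent renders.
cache_file <- 'expensive_design.RData'  # illustrative file name

if (!file.exists(cache_file)) {
  result <- 1:10                  # stand-in for the expensive computation
  save(result, file = cache_file) # persist the object to disk
} else {
  load(cache_file)                # restores `result` into the environment
}
```

In a Quarto document, the same effect is achieved by marking the expensive chunk `#| eval: false` after its first run, as done above.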
We read the Bayesian D-efficient design back from the `RData` file:

```{r}
load(here('pilot', 'data', 'design_db_eff_pilot.RData'))
```

Use `cbc_balance()` to show the **counts** of each level of each attribute:

- Please see [3. Miscellaneous] in the **Appendix**.

Use `cbc_overlap()` to show the **overlap** across attributes:

- Please see [3. Miscellaneous] in the **Appendix**.

**Simulate** the random choices:

```{r}
choice_db_eff <- cbc_choices(
  design = design_db_eff,
  obsID = 'obsID'
)
```

Run the **power analysis**:

```{r}
power_db_eff <- cbc_power(
  data = choice_db_eff,
  pars = c('upfront', 'lv_2_charger', 'discount',
           'window', 'guaranteed'),
  obsID = 'obsID',
  outcome = 'choice',
  nbreaks = 10,
  n_q = 8
)
```

Show the **head** and **tail**:

```{r}
head(power_db_eff)
tail(power_db_eff)
```

Show the **plot**:

```{r}
ggplot(power_db_eff) +
  geom_hline(yintercept = 0.05, color = 'red', linetype = 2) +
  geom_point(aes(x = sampleSize, y = se, color = coef)) +
  expand_limits(y = 0) +
  scale_y_continuous(limits = c(0, 0.125)) +
  theme_bw(base_family = 'Ubuntu') +
  labs(
    x = 'Sample size', y = 'Standard error', color = 'Coefficient',
    title = 'Bayesian D-efficient Design Plot'
  )
```

#### 5.3 Interpretation of Power Analysis

Since both analyses contain a certain level of randomization, the output results and plots are not unique. The standard errors of each attribute gradually decrease and level off at a sample size of about **400**, which indicates the expected survey sample size.

Here are the modifications the team will apply to the final design:

1. Increase the number of **questions** per person to **8**.
2. **Remove** the "Not Interested" option.
3. Set the **sample** size to **400**; this may require further discussion and judgment.
4. Both `cbc_balance()` and `cbc_overlap()` show the detailed distribution of attributes and options; this will be investigated further later.

The above covers only modifications to the **conjoint** question design.
For a more inclusive improvement plan, please see the **Changes to Final Survey** section below.

## Changes to Final Survey

Data and feedback from the pilot survey have played a pivotal role in refining our survey and analysis. In this section, we highlight the key adjustments made to our final survey based on insights and observations gathered during the pilot phase. These adjustments not only enhance the survey's comprehensiveness and precision but also guide us in designing and planning our analysis.

The pilot analysis gave us valuable insight into how users engage with our conjoint questions. The choice data obtained from the pilot survey is instrumental in fine-tuning the final survey, ensuring that the data collected there will be more meaningful and precise for our analysis.

### 1. Current Version

Our survey is structured into three distinct sections:

**Section 1**: This section determines whether the participant is a PEV owner or a prospective PEV buyer. It collects essential data about the vehicle's range, the preferred charging window, and the charging location.

**Section 2**: Here, we provide a comprehensive explanation of the attributes crucial for making informed choices. Additionally, we've incorporated a set of five conjoint questions, refined based on insights from our pilot survey.

**Section 3**: This section contains demographic questions that offer valuable insights into the participant's demographic profile.

### 2. Improvements

Since this survey has already evolved beyond a pure "pilot" version, this section covers both the improvements the team has already made and those it intends to make.

Below are the improvements that we have **already** made:

1. For Section 1: We've relocated the question about the maximum range of the PEV from Section 2 to Section 1.
2. For Section 1: An educational video has been included to explain smart charging.
3. For Section 2: A helpful link to a page explaining the attributes has been included for all questions, allowing users to clarify any doubts they may have about the survey's attributes.
4. For Section 3: We've enhanced the robustness of demographic questions to collect more detailed information about the participants.

Below are the improvements that we **intend** to make:

1. (**Important**) For Section 2: We plan to improve the conjoint survey design. First, we will increase the number of conjoint questions from 5 to **8**; second, we are considering the **removal** of the "Not Interested" response option. We will then perform a more thorough power analysis to verify a reasonable distribution of `overlap` and `balance` in the revised design.
2. For Section 2: We plan to simplify the contents of the smart charging programs [attributes table](https://pevteam.quarto.pub/attribute-table/) and add images to enhance understanding.
3. For Section 3: Instead of asking which party participants voted for, a more general political question will ask which party they intend to vote for or support.
4. For Section 3: Change the state question to a ZIP code question.
5. For Section 3: Put all of the **travel**-related questions on a single page, with the **demographic** questions on another page.

## Attribution

This pilot analysis report is done by all members of the PEV Team:

- **Vedanth** worked on [Abstract] and [Introduction].
- **Sampada** worked on [Survey Design].
- **Pingfan** worked on [Pilot Data Analysis] and [Appendix], and prepared the files for submission.
- **Bharath** worked on [Changes to Final Survey].

## Appendix

### 1. Conjoint Survey Documentation

#### 1.1 **Section 2** - Smart Charging Programs

```{r, include=FALSE}
# Read in the choice questions
library(tidyverse)
survey <- read_csv('https://raw.githubusercontent.com/pingfan-hu/My-Resources/main/data/cbc_test.csv')

# Define the respondent ID
respondentID <- sample(survey$respID, 1)

# Create the subset of rows for that respondent ID
df <- survey %>% filter(respID == respondentID)

# Convert df to json
df_json <- jsonlite::toJSON(df)
# Here the df_json is NOT serialized since we are using kables instead of buttons.
```

#### 1.2 Start of Section 2

Now that we know the basics of your EV ownership and your charging habits, let's move on to Smart Charging program selections!

A Smart Charging program consists of 5 attributes: **Upfront Incentive**, **Free Level 2 Charger**, **Electricity Price Discount**, **Override Window**, and **Guaranteed Range if charged for 8 hrs**.

Here is a brief explanation of each attribute:

```{r, echo=FALSE}
library(dplyr)
library(kableExtra)
brief <- tibble(
  `Attribute` = c('Range', 'Explanation'),
  `Upfront Incentive:` = c('$100, $300, $500', 'It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD.'),
  `Free Level 2 Charger:` = c('No or Yes', 'A \"Yes\" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger.'),
  `Electricity Price Discount:` = c('10%, 25%, 50%', 'If you choose to join this smart charging program, this will be your electricity price discount.'),
  `Override Window:` = c('0.5 ~ 4 hrs', 'This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program.'),
  `Guaranteed Range if charged for 8 hrs:` = c('25%, 50%, 75%', 'It means the guaranteed range if your car is charged for 8 hrs during smart charging. It is shown as a percentage here, but will be converted to mileage.')
)
row.names(brief) <- NULL
kable(t(brief), escape = FALSE) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'),
                full_width = FALSE, position = 'center') %>%
  row_spec(row = 1, bold = TRUE)
```

The above table can be accessed via this <a href="https://pevteam.quarto.pub/attribute-table/" target="_blank"><strong>link</strong></a>. It's available on top of each question so that you can refer to it at any time.

The next page provides a sample question to enhance your understanding.

#### 1.3 Section 2 - Sample Question

You are provided with **2 options** of smart charging programs, plus an **"I am not interested"** option. You need to treat them **as if they were your only options** and choose the one you most prefer. You can choose the **"I am not interested"** option if you are not interested in either.

You can access the attribute table using this <a href="https://pevteam.quarto.pub/attribute-table/" target="_blank"><strong>link</strong></a>.

```{r, echo=FALSE}
library(dplyr)
library(kableExtra)
library(tibble)
alts <- tibble(
  `Options:` = c('Option 1', 'Option 2'),
  `Upfront Incentive:` = scales::dollar(c(100, 500)),
  `Free Level 2 Charger:` = c('No', 'Yes'),
  `Electricity Price Discount:` = c('25%', '50%'),
  `Override Window:` = c('0.5 hrs', '4 hrs'),
  `Guaranteed Range if charged for 8 hrs:` = c('100 miles', '300 miles')
)
alts_t <- as.data.frame(t(alts)) %>% rownames_to_column(var = 'Attribute')
colnames(alts_t) <- NULL
kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'),
                full_width = FALSE, position = 'center') %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
```

- Option 1
- Option 2
- I am not interested

#### 1.4 **Section 2** - Ready to Start the Questions

The sample question is quite straightforward.
Since **Option 2** has every attribute better than Option 1, I bet you have chosen it.

However, this simplicity is not always the case. There will be trade-offs between the 2 options, and you need to make your own decision. For example, one option might have a large upfront incentive but a low discount, while the other might be the inverse. You then need to decide based on your own preference.

There are **5** choice questions in total. For each question, you have 2 valid options to choose from. Again, make your choice **as if they were your ONLY options**. If you are really not interested in either of them, you can choose **"I am not interested"**.

Click on **"Next Page"** to proceed.

#### 1.5 Question 1 of 5

(**1 of 5**) If your utility offers you these 2 smart charging programs, which one do you prefer?

You can access the attribute table using this <a href="https://pevteam.quarto.pub/attribute-table/" target="_blank"><strong>link</strong></a>.

```{r, echo=FALSE}
library(dplyr)
library(kableExtra)
library(tibble)
alts <- jsonlite::fromJSON(df_json) %>%
  filter(qID == 1 & altID != 3) %>%
  mutate(
    altID = paste('Option', altID),
    upfront = scales::dollar(upfront),
    discount = paste0(discount, '%'),
    window = paste(window, 'hrs'),
    guaranteed = paste0(guaranteed, '%'),
    lv_2_charger_Yes = case_when(lv_2_charger_Yes == 1 ~ 'Yes',
                                 lv_2_charger_Yes == 0 ~ 'No')
  ) %>%
  select(
    `Options:` = altID,
    `Upfront Incentive:` = upfront,
    `Free Level 2 Charger:` = lv_2_charger_Yes,
    `Electricity Price Discount:` = discount,
    `Override Window:` = window,
    `Guaranteed Range if charged for 8 hrs:` = guaranteed
  )
alts_t <- as.data.frame(t(alts)) %>% rownames_to_column(var = 'Attribute')
colnames(alts_t) <- NULL
kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'),
                full_width = FALSE, position = 'center') %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
```

- Option 1
- Option 2
- I am not interested

The other 4 questions follow the same format and are thus **omitted**.

### 2. Data Dictionary

Dictionary of **Files**:

```{r}
dict_file <- data.frame(
  name = c('cbc_questions', 'choice_data', 'pevConjoint1',
           'pevConjoint2', 'pevConjoint3'),
  type = c('CSV', 'CSV', 'CSV', 'CSV', 'CSV'),
  description = c(
    'Conjoint questions for the pilot survey.',
    'Cleaned results from the pilot survey.',
    'Part 1 of pilot survey result from formr.',
    'Part 2 of pilot survey result from formr.',
    'Part 3 of pilot survey result from formr.'
  )
) %>%
  mutate(count = row_number()) %>%
  select(count, everything())

dict_file %>%
  kable(
    format = 'html',
    escape = FALSE,
    align = c('c', 'l', 'c', 'l'),
    col.names = c('No.', 'Name', 'Type', 'Description'),
    caption = 'Dictionary of Files'
  ) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'))
```

Dictionary of **Data Frames**:

```{r}
dict_df <- data.frame(
  name = c(
    'choice_data', 'choice_data_charging', 'choice_data_demo',
    'choice_data_demo_2', 'choice_data_vehicle', 'choice_data_wider',
    'choice_db_eff', 'choice_rand', 'coeff_figure_1', 'coeff_figure_2',
    'coeff_table', 'design_db_eff', 'design_pilot', 'design_rand',
    'model', 'p1', 'p2', 'p3', 'power_db_eff', 'power_rand',
    'profiles', 'survey'
  ),
  type = c(
    'A tibble: 375 × 41', 'A tibble: 70 × 36', 'A tibble: 100 × 34',
    'A tibble: 50 × 36', 'A tibble: 125 × 33', 'A tibble: 25 × 36',
    'A tibble: 8000 × 12', 'A tibble: 8000 × 11', 'A data frame: 3 × 5',
    'A data frame: 3 × 5', 'A data frame: 6 × 5', 'A tibble: 8000 × 11',
    'A tibble: 7500 × 12', 'A tibble: 8000 × 10', 'A list: 32',
    'A tibble: 29 × 11', 'A tibble: 28 × 11', 'A tibble: 28 × 11',
    'A data frame: 50 × 4', 'A data frame: 50 × 4', 'A data frame: 216 × 6',
    'A tibble: 1,500 × 12'
  ),
  description = c(
    'Conjoint questions for the pilot survey.',
    'Longer version of choice data about charging.',
    'Longer version of choice data about main demography.',
    'Longer version of choice data about education and income.',
    'Longer version of choice data about vehicle.',
    'Original wider version of choice data.',
    'Simulated random choices for Bayesian D-efficient design.',
    'Simulated random choices for randomized design.',
    'Coefficients used for model summary figure 1.',
    'Coefficients used for model summary figure 2.',
    'Coefficients used for model summary table.',
    'Survey design in Bayesian D-efficient fractional factorial.',
    'Survey design for pilot survey.',
    'Survey design in randomized factorial.',
    'Utility model generated by logitr.',
    'Part 1 of pilot survey result from formr.',
    'Part 2 of pilot survey result from formr.',
    'Part 3 of pilot survey result from formr.',
    'Power analysis for Bayesian D-efficient design.',
    'Power analysis for randomized design.',
    'Profiles with attributes and levels (same for all designs).',
    'Conjoint questions for the pilot survey (online version).'
  )
) %>%
  mutate(count = row_number()) %>%
  select(count, everything())

dict_df %>%
  kable(
    format = 'html',
    escape = FALSE,
    align = c('c', 'l', 'l', 'l'),
    col.names = c('No.', 'Name', 'Type', 'Description'),
    caption = 'Dictionary of Data Frames'
  ) %>%
  kable_styling(bootstrap_options = c('striped', 'hover'))
```

### 3. Miscellaneous

#### 3.1 Summary of Model

```{r}
summary(model)
```

#### 3.2 Randomized Design

**Counts** of each level of each attribute:

```{r}
cbc_balance(design_rand)
```

**Overlap** across attributes:

```{r}
cbc_overlap(design_rand)
```

#### 3.3 Bayesian D-efficient Design

**Counts** of each level of each attribute:

```{r}
cbc_balance(design_db_eff)
```

**Overlap** across attributes:

```{r}
cbc_overlap(design_db_eff)
```
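For readers unfamiliar with these diagnostics: `cbc_balance()` essentially counts how often each attribute level appears in the design, while `cbc_overlap()` counts how often a question's alternatives share the same level of an attribute. As a minimal added illustration (assuming, as in the chunks above, that the design data frame keeps one column per attribute):

```{r}
# Illustration only (added): a balance check amounts to counting how often
# each level of an attribute appears in the design.
table(design_rand$upfront)  # counts of the $100 / $300 / $500 levels
# A well-balanced design shows roughly equal counts within each attribute.
```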