1. Project & Data Context
This case study focuses on automating Likert-scale analysis for a Public Health PhD dissertation that surveyed 413 healthcare workers across urban and rural hospitals to assess their perceptions of standard operating protocols via a 10-item, 5-point Likert scale. The goals were to:
- Collapse and aggregate responses into meaningful categories
- Generate reproducible dashboards of item and composite scores
- Execute inferential tests on the derived categorical outcomes
This case study details the end-to-end software stack and workflow that automated each step—maximizing consistency, auditability, and efficiency.
2. Software Ecosystem
- R (Base + tidyverse)
  - Data import (read.csv) and reshaping (pivot_longer)
  - Custom functions for scale collapsing and composite scoring
  - Inferential tests: chisq.test(), t.test(), glm()
- Microsoft Excel + VBA
  - “Masterchart” creation macro: merges pre-survey demographic files with cleaned Likert scores
  - Automated dashboard macro: iterates over item groups to produce summary tables and charts
- R Markdown & Git
  - Parameterized .Rmd templates for generating HTML/PDF reports with embedded code, tables, and narrative
  - Git version control for script integrity and reproducibility
3. Data Ingestion & Masterchart Standardization
- Raw Data Import

```r
raw_df  <- read.csv("survey_responses.csv")
demo_df <- read.csv("demographics.csv")
```
- Schema Validation

```r
validate_schema(raw_df, expected_cols = paste0("Q", 1:10))
```
- Excel Masterchart Macro
  - VBA script CreateMasterchart() reads both CSVs, standardizes column names, and outputs masterchart.csv with one row per respondent: demographic fields plus raw item scores.
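The validate_schema() helper is project-specific and not shown in the source; a minimal Python sketch of the same idea (the raise-on-missing behavior and the record shape are assumptions for illustration) could look like:

```python
def validate_schema(rows, expected_cols):
    """Raise ValueError if any expected column is missing from the first record."""
    present = set(rows[0].keys()) if rows else set()
    missing = [c for c in expected_cols if c not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return True

# Example: one respondent record with items Q1..Q10 plus a demographic field
record = {f"Q{i}": 3 for i in range(1, 11)}
record["hospital_type"] = "urban"
validate_schema([record], [f"Q{i}" for i in range(1, 11)])  # passes silently
```

Failing fast on schema drift here keeps every downstream transformation working from a known column layout.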
4. Likert-Scale Transformation Pipeline
All transformations occur in likert_transform.R:
- Collapsing 5-Point → 3-Point Scale

```r
collapse_likert <- function(x) {
  case_when(
    x >= 4 ~ 3,  # Agree
    x == 3 ~ 2,  # Neutral
    x <= 2 ~ 1   # Disagree
  )
}
df3 <- raw_df %>% mutate(across(Q1:Q10, collapse_likert))
```
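The collapsing rule itself is language-agnostic; the same mapping expressed as a small Python sketch (mirroring the R case_when above) makes the category boundaries explicit:

```python
def collapse_likert(x):
    """Map a 5-point response to 3 categories: 1 = Disagree, 2 = Neutral, 3 = Agree."""
    if x >= 4:
        return 3
    if x == 3:
        return 2
    return 1

# The full 5-point range collapses as:
print([collapse_likert(v) for v in [1, 2, 3, 4, 5]])  # [1, 1, 2, 3, 3]
```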
- Composite Scoring of Related Items

```r
df3 <- df3 %>%
  rowwise() %>%
  mutate(
    composite_training = mean(c(Q2, Q5), na.rm = TRUE),
    composite_protocol = mean(c(Q1, Q4, Q7), na.rm = TRUE)
  ) %>%
  ungroup()  # drop rowwise grouping before further column-wise work

# Convert composites back into 3-point categories
df3 <- df3 %>%
  mutate(across(starts_with("composite_"), ~ case_when(
    . > 2.33  ~ 3,
    . >= 1.67 ~ 2,
    TRUE      ~ 1
  )))
```
- Export Cleaned Masterchart

```r
write.csv(df3, "likert_masterchart.csv", row.names = FALSE)
```
5. Automated Dashboard Generation
- Excel VBA Macro:
  - Loops over each Q1:Q10 and composite_* column
  - Builds a pivot table of frequency counts (1, 2, 3) by hospital_type
  - Creates a clustered bar chart for each pivot and labels axes and titles
- Output:
- 13 dashboards (10 individual items + 3 composites), each on its own worksheet, ready for inclusion in R Markdown reports.
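What the dashboard macro computes for each column is simply a category-by-group frequency table; a stdlib Python sketch of that count (the sample records are invented) shows the shape of each pivot:

```python
from collections import Counter

# Invented collapsed responses for one item, keyed by hospital type
records = [
    ("urban", 3), ("urban", 3), ("urban", 1),
    ("rural", 2), ("rural", 1), ("rural", 1),
]

# Frequency of categories 1-3 within each hospital type
pivot = Counter((htype, score) for htype, score in records)
for htype in ("urban", "rural"):
    row = [pivot[(htype, cat)] for cat in (1, 2, 3)]
    print(htype, row)
# urban [1, 0, 2]
# rural [2, 1, 0]
```

Each worksheet's clustered bar chart is just this table rendered with one bar cluster per category.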
6. Inferential Testing Pipeline
In inferential_tests.R, the cleaned masterchart is used to test whether high-agreement rates differ by hospital type:
```r
library(dplyr)

df3 <- read.csv("likert_masterchart.csv")

# Define "high agreement" as category 3
df3 <- df3 %>% mutate(high_composite_protocol = if_else(composite_protocol == 3, 1, 0))

# Chi-square test
cs_result <- chisq.test(table(df3$hospital_type, df3$high_composite_protocol))
write.csv(
  data.frame(statistic = cs_result$statistic, p.value = cs_result$p.value),
  "chi_square_protocol.csv",
  row.names = FALSE
)
```
Diagnostic plots (mosaic plots, residual heatmaps) are generated automatically and saved to /results/.
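For a 2x2 table such as hospital_type by high_composite_protocol, the Pearson chi-square statistic has a closed form; a stdlib Python sketch with invented cell counts (note that R's chisq.test applies Yates' continuity correction to 2x2 tables by default, which this uncorrected formula omits):

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Invented counts: rows = urban/rural, cols = not-high/high agreement (n = 413)
stat = chi_square_2x2(120, 90, 130, 73)
print(round(stat, 3))
```

A perfectly balanced table gives a statistic of 0, since the cross-product difference a*d - b*c vanishes.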
7. Reproducible Reporting & Audit
- R Markdown Template:
  - Parameters: data_path, output_format
  - Sections: Data summary, Dashboard gallery, Inferential results, Executive summary
- Git Workflow:
  - The main branch holds the stable release; feature branches (feature/likert-transform, feature/dashboard-macro) undergo pull-request review.
- Audit Report:
  - A final Data Audit PDF includes schema checks, transformation logs, the macro version stamp, and test assumptions (e.g., expected cell counts for chi-square).
8. Lessons & Best Practices
- Centralize Data Prep: Use a “masterchart” as the single data source to avoid divergence.
- Automate Repetitive Tasks: VBA macros for dashboards and R functions for transformations ensure uniform output.
- Parameterize Reports: R Markdown templates make it trivial to re-run analyses on updated data.
- Version Control Everything: Preserves a complete audit trail of data changes and script edits.
This Software & Workflow Walkthrough illustrates how combining Excel macros with R scripting creates a robust, transparent pipeline for Likert-scale data in a Public Health dissertation—delivering audit-ready dashboards and statistical outputs with minimal manual effort.
Want to explore more PhD-level case studies? Check out our Comprehensive Case Studies on PhD Statistical Analysis guide page.
