Case Study: Automating Likert-Scale Data Transformation & Analysis in a Public Health Dissertation

Home » Case Study: Automating Likert-Scale Data Transformation & Analysis in a Public Health Dissertation

1. Project & Data Context

This case study focuses on automating likert‑scale analysis for a Public Health PhD dissertation surveyed 413 healthcare workers across urban and rural hospitals to assess their perceptions of standard operating protocols via a 10-item, 5-point Likert scale. The goals were to:

  • Collapse and aggregate responses into meaningful categories
  • Generate reproducible dashboards of item and composite scores
  • Execute inferential tests on the derived categorical outcomes

This case study details the end-to-end software stack and workflow that automated each step—maximizing consistency, auditability, and efficiency.


2. Software Ecosystem for focusing on automating likert‑scale analysis

  • R (Base + tidyverse)
    • Data import (read.csv) and reshaping (pivot_longer)
    • Custom functions for scale collapsing and composite scoring
    • Inferential tests: chisq.test(), t.test(), glm()
  • Microsoft Excel + VBA
    • “Masterchart” creation macro: merges pre-survey demographic files with cleaned Likert scores
    • Automated dashboard macro: iterates over item groups to produce summary tables and charts
  • R Markdown & Git
    • Parameterized .Rmd templates for generating HTML/PDF reports with embedded code, tables, and narrative
    • Git version control for script integrity and reproducibility

3. Data Ingestion & Masterchart Standardization for automating likert‑scale analysis

  • Raw Data Import
raw_df <- read.csv(\"survey_responses.csv\") demo_df <- read.csv(\"demographics.csv\")
  • Schema Validation
validate_schema(raw_df, expected_cols = paste0(\"Q\", 1:10))
  • Excel Masterchart Macro
    • VBA script CreateMasterchart() reads both CSVs, standardizes column names, and outputs masterchart.csv with one row per respondent, demographic fields + raw item scores.

4. Likert-Scale Transformation Pipeline for automating likert‑scale analysis

All transformations occur in likert_transform.R:

  • Collapsing 5-Point → 3-Point Scale
collapse_likert <- function(x) 
   { 
    case_when
      ( 
       x >= 4 ~ 3, # Agree 
       x == 3 ~ 2, # Neutral 
       x <= 2 ~ 1 # Disagree 
     )
   } 
df3 <- raw_df %>% mutate(across(Q1:Q10, collapse_likert))
  • Composite Scoring of Related Items
df3 <- df3 %>% 
rowwise() %>% 
mutate
( 
 composite_training = mean(c(Q2, Q5), na.rm=TRUE), 
 composite_protocol = mean(c(Q1, Q4, Q7), na.rm=TRUE) 
) # Convert composites back into 3-point categories 
df3 <- df3 %>% 
mutate(across(starts_with(\"composite_\"), ~ case_when
 ( 
   . > 2.33 ~ 3, 
   . >= 1.67 ~ 2, 
   TRUE ~ 1 
)))
  • Export Cleaned Masterchart
write.csv(df3, \"likert_masterchart.csv\", row.names=FALSE)

5. Automated Dashboard Generation

  • Excel VBA Macro:
    • Loops over each Q1:Q10 and composite_* column
    • Builds a pivot table of frequency counts (1, 2, 3) by hospital_type
    • Creates a clustered bar chart for each pivot and labels axes/text
  • Output:
    • 13 dashboards (10 individual items + 3 composites), each on its own worksheet, ready for inclusion in R Markdown reports.

6. Inferential Testing Pipeline

In inferential_tests.R, the cleaned masterchart is used to test whether high-response rates differ by hospital type:

df3 <- read.csv(\"likert_masterchart.csv\")

# Define “high agreement” as category 3
df3 <- df3 %>% mutate(high_composite_protocol = if_else(composite_protocol == 3, 1, 0))

# Chi-square test
cs_result <- chisq.test(table(df3$hospital_type, df3$high_composite_protocol))
write.csv(as.data.frame(cs_result[c(\"statistic\",\"p.value\")]), \"chi_square_protocol.csv\")

Diagnostic plots (mosaic plots, residual heatmaps) are generated automatically and saved to /results/.


7. Reproducible Reporting & Audit

  • R Markdown Template:
    • Parameters: data_path, output_format
    • Sections: Data summary, Dashboard gallery, Inferential results, Executive summary
  • Git Workflow:
    • main branch holds stable release; feature branches (feature/likert-transform, feature/dashboard-macro) undergo pull-request reviews.
  • Audit Report:
    • A final Data Audit PDF includes schema checks, transformation logs, macro version stamp, and test assumptions (e.g., expected cell counts for chi-square) .

8. Lessons & Best Practices

  • Centralize Data Prep: Use a “masterchart” as the single data source to avoid divergence.
  • Automate Repetitive Tasks: VBA for dashboards, R functions for transformations ensure uniform output.
  • Parameterize Reports: R Markdown templates make it trivial to re-run analyses on updated data.
  • Version Control Everything: Preserves a complete audit trail of data changes and script edits.

This Software & Workflow Walkthrough illustrates how combining Excel macros with R scripting creates a robust, transparent pipeline for Likert-scale data in a Public Health dissertation—delivering audit-ready dashboards and statistical outputs with minimal manual effort.


Want to explore more PhD-level case studies? Check out our Comprehensive Case Studies on PhD Statistical Analysis guide page


Discover more from Ankit Gupta

Subscribe to get the latest posts sent to your email.

Leave a ReplyCancel reply

Discover more from Ankit Gupta

Subscribe now to keep reading and get access to the full archive.

Continue reading