What a Semester Project Taught Me About Statistics for Business Students
Reflections from a semester of student-led projects and passion-driven questions
Well, that went by fast. The semester is over, and Elon commencement ceremonies take place today, Friday, May 23. I teach an asynchronous Principles of Economics course during the first summer session, which starts in June.
I am jotting down some lessons learned from teaching the business statistics course while they are fresh in my mind.
My biggest lesson was that a semester-long project can be a powerful way to organize lower-level courses. I used a project that stayed front and center all semester, built into the lessons and the grading. The Passion Driven Statistics approach provides a reasonable starting point, but I found some things worked better (💡) than others (🔄).
Since Elon designates the course as data-intensive, I'll organize my thoughts using the five learning outcomes.
Students think critically about when and what data are needed, assess data sources appropriate to the information needed, and identify the context in which data are produced and used.
💡Allowing students to choose their own questions kept them motivated and gave them practice thinking critically about those questions all semester. Some students who were not doing as well on traditional exams could demonstrate critical thinking through their project and share it with their peers during a one-minute elevator speech finale.
Students access, or collect when appropriate, data and explain the ethical considerations of how this data is collected and used.
💡I gave students access to a subset of the Add Health data and its codebook. At this level, providing access rather than asking students to collect data was the right choice. If you don't have time to curate and develop a similar data set and codebook, the Passion Driven Statistics suggestion works well.
🔄 I needed to allow for more time on the codebook and how the questionnaire measures variables. Students often made assumptions about what the question asked and how respondents answered.
🔄 I also recommend an in-class discussion about the ethical considerations of how the data was collected and used. Since I provided access to the data instead of having students collect it, I overlooked the opportunity for a small group and classroom discussion.
Students identify appropriate data-analysis methods for working with a given data set and explain the associated limitations and interpretations.
💡Students selected two variables from the Add Health data set before identifying or learning the appropriate data analysis method. They discovered that their choice (Categorical-Categorical, Categorical-Quantitative, or Quantitative-Quantitative) plays a big role in choosing an analysis method and how it answers a question about a relationship.
🔄I took a risk by introducing Python, and on the whole, it worked. However, coding is changing so quickly with the availability of AI-assisted platforms that I will want to reassess whether and how I use Python in a future course. For example, free online platforms like Google Colab let you write and run Python code in your browser, with no setup required.
🔄I underestimated the time it would take for students to understand whether their selected variables were categorical or quantitative. The scaffolding of the assignments helped me identify the issue because it allowed students to think and practice this skill while allowing for errors. Some students switched variables after learning how the questionnaire measured the variable, which afforded additional practice in the skill.
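To make the link between variable types and analysis method concrete, here is a minimal sketch in Python. The function name `suggest_method` and the lookup table are hypothetical, written only for illustration. The sketch assumes the three pairings map onto the three inference paths used in the poster project (test for independence, ANOVA, and simple linear regression).

```python
# Hypothetical lookup: each pair of variable types points to one analysis path.
METHODS = {
    ("categorical", "categorical"): "test for independence",
    ("categorical", "quantitative"): "ANOVA",
    ("quantitative", "quantitative"): "simple linear regression",
}


def suggest_method(type_a: str, type_b: str) -> str:
    """Return the analysis path for a pair of variable types (order does not matter)."""
    key = tuple(sorted((type_a.strip().lower(), type_b.strip().lower())))
    if key not in METHODS:
        raise ValueError(f"Unknown variable types: {type_a!r}, {type_b!r}")
    return METHODS[key]


print(suggest_method("quantitative", "categorical"))
```

The lookup is the easy part. The harder step, and the one students needed the most practice with, is deciding from the codebook whether each variable is categorical or quantitative in the first place.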
🔄Students are disappointed when they can't reject their null hypothesis. It is worth a long walk with a cup of tea to think about the causes and consequences of this perception.
Students communicate their data-intensive work to a variety of audiences through multiple modalities (written, visual, verbal).
💡I highly recommend the Passion Driven Statistics suggestion to use a poster-style deliverable. It's amenable to scaffolded writing that follows the pace of the class. In other words, students can complete the poster in stages as they learn the material. At the end of the semester, I also added a one-minute elevator speech requirement (without the poster visual) that they could use to explain their analysis during a job interview or to their family.
🔄Even though a traditional written paper was not required, generative AI use emerged in poster development and elevator speeches in ways I did not fully anticipate when I prepared AI guidelines.
Students identify and proficiently use appropriate software tools to perform at least 2 of the following:
Proof (or “clean”) data to make it amenable to further analysis,
💡Students understood why you might use Python instead of a spreadsheet to clean and analyze the data.
Transform data (e.g. from text to numbers), through coding or another process.
🔄 Even though I provided access to a clean data set, I underestimated how difficult students would find excluding observations or converting text to numbers for some categorical variables.
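As a rough illustration of those two operations, here is a minimal sketch in plain Python. The rows, the variable name `exercise`, and the response labels are all invented for this example and are not taken from the Add Health codebook.

```python
# Invented survey rows; the variable name and labels are hypothetical.
rows = [
    {"id": 1, "exercise": "not at all"},
    {"id": 2, "exercise": "1 or 2 times"},
    {"id": 3, "exercise": "refused"},
    {"id": 4, "exercise": "3 or more times"},
    {"id": 5, "exercise": "don't know"},
]

# Responses to exclude from the analysis (assumed labels).
NON_RESPONSES = {"refused", "don't know"}

# Text-to-number recoding for the remaining categories (assumed coding).
TEXT_TO_CODE = {"not at all": 0, "1 or 2 times": 1, "3 or more times": 2}


def clean(survey_rows):
    """Drop non-responses and add a numeric code for each remaining answer."""
    cleaned = []
    for row in survey_rows:
        answer = row["exercise"].strip().lower()
        if answer in NON_RESPONSES:
            continue  # exclude this observation
        cleaned.append({**row, "exercise_code": TEXT_TO_CODE[answer]})
    return cleaned


cleaned_rows = clean(rows)
print(len(rows), "rows before cleaning,", len(cleaned_rows), "after")
```

With a library such as pandas, the same idea is usually expressed by filtering rows and then mapping labels to codes. The logic students have to reason through is the same either way: which answers count as missing, and what number each remaining label should get.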
Visualize and compare attributes of a data set,
💡The poster project requires students to create bar charts (for categorical variables) and histograms (for quantitative variables) to describe the distribution of each variable.
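The frequency counts behind a bar chart can be sketched in a few lines of Python. The responses below are invented, and the text "chart" is only a stand-in for a real plot made with a plotting library.

```python
from collections import Counter

# Invented responses to a hypothetical yes/no/unsure question.
responses = ["yes", "no", "yes", "yes", "no", "unsure"]

# Count how often each category appears.
counts = Counter(responses)

# Print a simple text bar for each category, most frequent first.
for category, count in counts.most_common():
    print(f"{category:<8}{'#' * count} ({count})")
```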
Analyze data for pertinent information, and
Develop models that estimate relationships among variables in the data.
🔄In the poster project, students followed one of three paths to test for a relationship: a test for independence, ANOVA (equality of means for three or more groups), or simple linear regression. Given the inference tools we covered and the Add Health data subset, I am concerned that the students who chose two categorical variables (most students) could only test whether there is evidence of an association between the two variables, not how strong or important it is. Maybe that is OK for exploratory analysis like this, but the risk is that they confuse statistical and economic significance when communicating results for decision analysis. I would want to consider this more if I teach the class again. Have I unwittingly made p-values sticky?
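One way to see the gap between statistical and practical significance is to run a test for independence on two tables with identical proportions but very different sample sizes. The sketch below is an illustration under stated assumptions, not what the class did: the counts are invented, the function `chi_square_2x2` is hypothetical, it handles only 2x2 tables, and it omits the continuity correction that `scipy.stats.chi2_contingency` applies by default for 2x2 tables. Cramér's V is swapped in here as one common effect-size measure; it was not one of the tools covered in the course.

```python
import math


def chi_square_2x2(table):
    """Return (chi2, p_value, cramers_v) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Cramér's V; for a 2x2 table the divisor min(rows-1, cols-1) is 1.
    cramers_v = math.sqrt(chi2 / n)
    return chi2, p_value, cramers_v


# Invented counts: same proportions, sample sizes of 200 and 20,000.
small = [[52, 48], [48, 52]]
large = [[5200, 4800], [4800, 5200]]

for label, table in (("small", small), ("large", large)):
    chi2, p_value, v = chi_square_2x2(table)
    print(f"{label}: chi2={chi2:.2f}, p={p_value:.2g}, Cramér's V={v:.3f}")
```

In this made-up example, the larger sample rejects the null hypothesis while the smaller one does not, even though the strength of the association is the same in both tables. That is the distinction a p-value alone does not show.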
I plan to post during June to share what I learn from my asynchronous Principles of Economics class. I am experimenting with social annotation activities for the Monday Morning Economist and some classical progymnasmata writing activities related to inventing arguments. See you in June!

