We are excited to introduce our speakers for the 2026 Oceania Stata Conference!
Nicholas Cox
Some Graphical Tips for Stata Users
The talk will cover a miscellany of graphical tips, some old, some new. Nick will discuss using both official commands and community-contributed commands. He will range from small stuff (often it is a matter of detail to change a good graph into a much better one), through various techniques and tricks, to broad strategy, both in learning and using Stata easily and effectively and in working with graphics for research, teaching, and service.
About the speaker
Nicholas Cox is a statistically minded geographer at Durham University. He works mostly with environmental data, secondarily with social science data. His interests include statistical graphics, exploratory data analysis, distributions, transformations, generalised linear models, directional data analysis, and the history of statistics. He contributes talks, postings, FAQs, and programs to the Stata user community. He has co-authored 16 commands in official Stata. He was an author of several inserts in the Stata Technical Bulletin and is an editor of the Stata Journal. His “Speaking Stata” articles on graphics from 2004 to 2013 have been collected as Speaking Stata Graphics (2014). He also edits the Tips in the Stata Journal, intermittently collected in book form (most recently in 2024).
Aramayis Dallakyan
Introduction to Explainable Machine Learning Using Stata
Machine learning (ML) has become a powerful tool for modeling complex data and providing accurate predictions. However, the "black-box" nature of many ML models often raises concerns about their explainability and trustworthiness. Explainable machine learning (XML) seeks to address these concerns by enhancing the transparency and understanding of ML predictions. This talk aims to provide a practical guide to XML techniques. It begins with an overview of ensemble decision tree models such as random forests and gradient boosting, which are widely used but often difficult to interpret. Aramayis then introduces methods for explaining predictions using both global and local XML techniques. These include state-of-the-art approaches such as SHAP values, individual conditional expectation (ICE) plots, variable importance measures, partial dependence plots, and global surrogate models.
About the speaker
Aramayis Dallakyan is a Senior Statistician and Software Developer at StataCorp LLC. His research interests lie at the intersection of high-dimensional time series, causal discovery, and statistical/machine learning. His work has appeared in leading venues in statistics, data science, and machine learning. Aramayis earned his PhD in Statistics from Texas A&M University.
Zixuan Cong
Preliminary Findings on Advancing Women's Health in Singapore through AI Acceptance
As artificial intelligence rapidly advances, it can integrate electronic health records, genetic profiles, and clinical information to enable individualized, female-specific prevention strategies. This presentation uses structural equation modelling in Stata to analyse survey data and examine factors influencing Singapore women’s adoption of AI-enabled healthcare.
About the speaker
Zixuan Cong is a Master's student in Behavioral and Implementation Science Interventions at the National University of Singapore, with hands-on experience in data analysis and user research with Stata. She is committed to addressing real-world challenges in health and behavioral sciences through rigorous, data-driven insights.
Pablo Gluzmann
SAMREGC: Stata Module to Perform Sensitivity Analysis of Main Regression Coefficients
This presentation introduces samregc, a fast, flexible, and simple Stata command that systematizes specification-based sensitivity analysis. It evaluates the robustness of target coefficients by analyzing all-subsets (or user-defined subsets) regression results over alternative combinations of control variables.
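To make the all-subsets idea concrete, here is a minimal Python sketch. It is not samregc's implementation or syntax; the toy, noise-free data, the variable names (x, z1, z2, z3), and the small `ols` helper are all assumptions made for illustration. It re-estimates the coefficient on a target regressor under every combination of control variables and reports how far that coefficient moves.

```python
from itertools import combinations


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (normal equations)."""
    k, n = len(columns), len(y)
    # Augmented matrix [X'X | X'y].
    a = [[sum(columns[i][r] * columns[j][r] for r in range(n)) for j in range(k)]
         + [sum(columns[i][r] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical data: x is the target regressor, z1..z3 are candidate controls.
x = [1, 2, 3, 4, 5, 6, 7, 8]
controls = {
    "z1": [v * v for v in x],          # strongly related to x
    "z2": [0, 1, 0, 1, 0, 1, 0, 1],
    "z3": [1, 1, 0, 0, 1, 1, 0, 0],
}
# The outcome depends on x and z1 only, with no noise (toy example).
y = [1 + 2 * xv + 0.5 * z for xv, z in zip(x, controls["z1"])]
const = [1.0] * len(x)

# Coefficient on x for every subset of controls (2^3 = 8 specifications).
results = {}
for size in range(len(controls) + 1):
    for subset in combinations(sorted(controls), size):
        cols = [const, x] + [controls[name] for name in subset]
        results[subset] = ols(cols, y)[1]

for subset, coef in results.items():
    print(subset, round(coef, 3))
print("range:", round(min(results.values()), 3), "to", round(max(results.values()), 3))
```

In this toy setup the coefficient on x equals 2 whenever z1 is included and shifts when it is omitted, which is the kind of specification sensitivity such an analysis is meant to surface.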
About the speaker
Pablo Gluzmann is a PhD in Economics, Senior Researcher at CEDLAS-UNLP and CONICET, and a specialist in applied econometrics. He has been using Stata for more than 20 years for economic research and the development of user-written commands.
Andrew Gray
"Can I just use n=30 in each group?": Using Stata for Sample Size Determination in an Increasingly Complex World
In this talk, Andrew will outline some of the practical and technical challenges involving sample size that he faces as a biostatistician in the health sciences. He will describe how Stata helps to support a workflow including simulations when needed and use examples from recent research projects.
About the speaker
Andrew Gray is a biostatistician in the Biostatistics Centre, University of Otago, where he collaborates on a wide range of health-related research projects as well as pursuing his own research. Prior to this, Andrew worked in a knowledge engineering research group in the Department of Information Science, University of Otago.
Yuke Li
How Cooking and Eating at Home Shape Emotional Well-Being: Insights from the Food & You Survey
Using data from the Food & You Survey, this study aimed to evaluate the behavioural pathways linking emotional well-being, cooking behaviour, and eating-at-home practices, and to identify leverage points for public-health and behavioural interventions. Partial Least Squares Structural Equation Modelling was performed with Stata to assess these directional relationships.
About the speaker
Yuke Li is a Master’s student in Health Behavior and Implementation Science at the National University of Singapore, with a cross-disciplinary background in Food Science and Engineering. She specializes in leveraging Stata for regression analysis and statistical modeling, having analyzed over 500 datasets to explore the correlations between dietary behaviors, food safety, and mental health in her leading research.
Dean McKenzie
Healthcare Quality Control and Improvement Using Stata
Stata is widely used in healthcare to compare events such as falls, infections, and episodes of delirium over time, across hospital wards or across hospitals, using control charts. This presentation demonstrates methods including control charts, funnel plots, ANOM and contrast, and user-written CHAID, with the goal of developing healthcare quality improvement techniques.
About the speaker
Dean is a biostatistician with Epworth HealthCare, formerly with Monash University. He has been using Stata for more than 25 years in a wide variety of medical, epidemiological, and psychological applications.
Irma Mooi-Reci
XTVFREG: Stata Module for Estimating Variance Function Panel Regression
This presentation introduces xtvfreg, a new Stata module that implements an iterative mean-variance panel regression estimator in which both the conditional mean and the conditional variance of the dependent variable are modeled as functions of covariates. The estimator is designed for researchers working with panel data in which heteroskedasticity is substantively meaningful.
About the speaker
Irma studies labor market and career mobility dynamics and has been exploring these patterns with Stata since 2005. From tracking career jumps to decoding job flows, Irma turns complex labor data into clear insights and occasionally into graphs that make labour scholars smile.
Marianna Nitti
rdlasso: Regression Discontinuity with High-Dimensional Data
This presentation discusses a command, rdlasso, that allows high-dimensional covariates to be included in Regression Discontinuity Design (RDD) settings, for both sharp and fuzzy cases. It makes the methodology accessible to Stata users and automates the covariate selection procedure.
About the speaker
Marianna Nitti is a PhD student at Sapienza University of Rome. Her research interests include labor, family and gender economics. She is currently working on Stata and Python tools for data analysis.
Mathew Piercy
Fluid Balance in Postoperative Patients and Relationship to Acute Kidney Injury
This paper presents an audit of 330 postoperative patients examining the relationship between postoperative cumulative fluid balance over 7 days and the incidence and rate of recovery of acute kidney injury. The data were analysed with Stata using a zero-inflated Poisson regression model, and the results were compared with a GEE model and two models using H2O machine learning.
About the speaker
Matt is a newcomer to Stata, having joined the Stata community at the beginning of 2025. For the past 6 years he has worked with the Tasmanian Health Service at Northwest Regional Hospital Burnie as a staff specialist in ICU and anaesthetics.
Alannah Rudkin
The Use of Stata putdocx for Automating Data Safety Monitoring Committee Reports
Data safety monitoring committee meetings for clinical trials necessitate the creation of a statistical report from which the trial's safety, progress, and data integrity can be assessed. The presentation shows how much of the repetitive process can be streamlined and automated using Stata’s putdocx commands via a do-file.
About the speaker
Alannah is a biostatistician and project officer at the Murdoch Children's Research Institute in Melbourne, Australia.
Thomas Soseco
Household Net Wealth Inequality in Indonesia: Evidence from a Dagum Type III Model
Investigating household net wealth inequality in Indonesia is important as it can worsen for low-class individuals or households who are unable to inherit sufficient capital for the next generations and maintain financial stability during a period of low or no income. This paper applies the Dagum Type III model to measure household net wealth inequality in Indonesia.
About the speaker
Thomas Soseco has over 10 years of experience using Stata, with a focus on its application in Economics and Development Economics.
Xuelu Sun
gofbinreg: Goodness-of-fit Statistics in Binary Regression Models
When reporting the results of binary regression, it’s crucial to evaluate overall model adequacy using goodness-of-fit statistics. This presentation introduces a command, gofbinreg, which assesses the performance of the Hosmer–Lemeshow, normalised unweighted sum of squares, and Hjort–Hosmer statistics to evaluate overall model adequacy.
About the speaker
Xuelu is a PhD candidate at Swinburne University of Technology. Her research focuses on goodness-of-fit statistics in regression models with categorical outcomes and further Stata implementation.
Luyang Xiao
Young Hearts at Risk: Preliminary Insights into Detection and Personalized Management of Acute Myocardial Infarction
This study investigates acute myocardial infarction among younger adults in Singapore and characterizes their clinical and risk profiles to guide early detection and targeted management. Using data from the National Registry of Diseases Office, a retrospective cohort of patients was analysed for 1-year all-cause mortality using Bayesian proportional hazards models in Stata.
About the speaker
Luyang Xiao is a Master's student in Behavioural and Implementation Science in Healthcare at the National University of Singapore (NUS). With four years of Stata experience, he focuses on leveraging real-world evidence to analyze public health issues and design data-driven behavioural interventions for individuals and organizations.
Ricardo Rodolfo Retamoza Yocupicio
Limitations and Comparison of the DFA PP and KPSS Unit Root Test: Evidence for Labour Variables of Mexico
Unit root tests have made a major contribution to time series analysis by detecting whether a variable is stationary. However, they have also drawn criticism. This presentation reviews some of those criticisms by running the three best-known unit root tests in Stata on the main macroeconomic variables of Mexico, with the aim of analyzing, both graphically and technically, whether the series are stationary.
About the speaker
Ricardo Retamoza is a PhD candidate in Economics at the National Autonomous University of Mexico (UNAM). He has worked with Stata on time series analysis, unit root tests, cointegration models, and vector autoregression applied to problems in the Mexican labor market such as informal employment, underemployment, unemployment, etc. He has previously participated in Stata Conferences held in Oslo, Norway, and São Paulo, Brazil.
Hengni Yuan
Artificial Intelligence in Suicide Prevention: Comparative Evidence from a Network Meta Analysis
Artificial intelligence is emerging as a powerful tool in suicide prevention. Often outperforming traditional assessments, machine-learning models can analyse electronic health records and social-media language to identify subtle behavioural cues that precede suicidal thoughts or actions. This study applies network meta-analysis to the systematic review by Lejeune et al. (2022), which highlights the potential of AI in improving suicide-risk detection, screening, and monitoring.
About the speaker
Hengni is a Master's student in Behavioural and Implementation Sciences in Health at the NUS Yong Loo Lin School of Medicine. She is developing her expertise in applying Stata for data analysis within implementation and evaluation research in healthcare.
Shufan Zhao
Multistate Survival Modelling of Cardiovascular Admission and Mortality in a Heart Failure Cohort in Singapore
Heart failure patients often experience complex clinical trajectories involving hospitalisation and death. Conventional survival models that focus on a single endpoint may fail to capture these sequential outcomes adequately. This study applies a multistate survival framework to characterise transitions to cardiovascular admission and death.
About the speaker
Having graduated with a master's degree in Behavioural and Implementation Science from the National University of Singapore, Shufan is a Stata user with developing proficiency and a strong interest in conducting advanced statistical modelling in health research.
Mark Chatfield
Finding Incorrect References to Variable Names or Event Names in a REDCap Database
Manually checking a REDCap database before it goes into production mode is a laborious task. By importing the data dictionary into Stata and then extracting references to [variable-names] and [event-names] in calculations, branching logic etc., I show how some basic checks can be done quickly.
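As a rough illustration of the kind of check described above (this is not Mark's Stata code), the following Python sketch scans logic strings from a data dictionary for bracketed references and reports any that match no known variable or event name. The dictionary rows, field names, and event names are invented for the example. A real REDCap data dictionary has more columns and richer syntax, such as checkbox references and smart variables, which this sketch does not handle.

```python
import re

# Hypothetical extract of a data dictionary: field name -> logic string
# (branching logic or calculation). All names here are made up.
data_dictionary = {
    "age": "",
    "weight_kg": "",
    "height_cm": "",
    "bmi": "[weight_kg] / (([height_cm] / 100) ^ 2)",
    "pregnant": "[sex] = '2' and [baseline_arm_1][age] >= 18",
    "followup_note": "[screening_arm_1][agee] > 0",
}

# Hypothetical event names defined for the project.
event_names = {"baseline_arm_1"}

# Matches [name] where name consists of lowercase letters, digits, or underscores.
REFERENCE = re.compile(r"\[([a-z0-9_]+)\]")


def find_unknown_references(dictionary, events):
    """Return (field, reference) pairs whose reference is neither a field nor an event."""
    known = set(dictionary) | set(events)
    problems = []
    for field, logic in dictionary.items():
        for ref in REFERENCE.findall(logic):
            if ref not in known:
                problems.append((field, ref))
    return problems


unknown = find_unknown_references(data_dictionary, event_names)
for field, ref in unknown:
    print(f"{field}: unknown reference [{ref}]")
```

Here the misspelled `[agee]`, the undefined `[sex]`, and the undefined event `[screening_arm_1]` are flagged, while valid references pass silently.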
About the speaker
Mark is a biostatistician at The University of Queensland. He collaborates with researchers in the Faculty of Health, Medicine and Behavioural Sciences and the UQ Clinical Trials Centre.
Cindy Han
Uncovering Mathematical Values: A Stata-Powered Bilingual Text Analysis Tool
This story presents the development and application of the Values Automatic Sorting Algorithm, a custom text-analysis tool built entirely within Stata to efficiently process large-scale, open-ended bilingual survey data.
About the speaker
Dr Cindy Han is a Lecturer in the Department of Education at Swinburne University of Technology and a member of the Advisory Board for the Melbourne Girls’ Grammar Institute (MGGI). As a quantitative researcher, she specialises in using Stata to model the factors influencing educational outcomes, including student performance in mathematics and science.
Gabor Mihala
A Minimalist Approach to Version Control and Reproducibility in Stata
This story introduces a deliberately simple, almost too simple, approach to version control and reproducibility for in-house Stata workflows: a short paragraph of Stata code that can be pasted into any do-file. This code automatically records the development history of the file and supports reproducible results.
About the speaker
Gabor is a biostatistician at the Australasian Kidney Trials Network. He has 15 years of experience using Stata to analyse sensitive data from clinical trials and health research studies.