GUILD: guidance for information about linking data sets

Journal article


Gilbert, Ruth, Lafferty, Rosemary, Hagger-Johnson, Gareth, Harron, Katie, Zhang, Li-Chun, Smith, Peter W.F., Dibben, Chris and Goldstein, Harvey. (2018). GUILD: guidance for information about linking data sets. Journal of Public Health. 40(1), pp. 191 - 198. https://doi.org/10.1093/pubmed/fdx037
Authors: Gilbert, Ruth, Lafferty, Rosemary, Hagger-Johnson, Gareth, Harron, Katie, Zhang, Li-Chun, Smith, Peter W.F., Dibben, Chris and Goldstein, Harvey
Abstract

Record linkage of administrative and survey data is increasingly used to generate evidence to inform policy and services. Although a powerful and efficient way of generating new information from existing data sets, errors related to data processing before, during and after linkage can bias results. However, researchers and users of linked data rarely have access to information that can be used to assess these biases or take them into account in analyses. As linked administrative data are increasingly used to provide evidence to guide policy and services, linkage error, which disproportionately affects disadvantaged groups, can undermine evidence for public health. We convened a group of researchers and experts from government data providers to develop guidance about the information that needs to be made available about the data linkage process, by data providers, data linkers, analysts and the researchers who write reports. The guidance goes beyond recommendations for information to be included in research reports. Our aim is to raise awareness of information that may be required at each step of the linkage pathway to improve the transparency, reproducibility, and accuracy of linkage processes, and the validity of analyses and interpretation of results.

Year: 2018
Journal: Journal of Public Health
Journal citation: 40 (1), pp. 191 - 198
Publisher: Oxford University Press
ISSN: 1741-3842
Digital Object Identifier (DOI): https://doi.org/10.1093/pubmed/fdx037
Scopus EID: 2-s2.0-85044355447
Page range: 191 - 198
Research Group: Institute for Learning Sciences and Teacher Education (ILSTE)
Publisher's version (file access level): Controlled
Place of publication: United Kingdom
Permalink: https://acuresearchbank.acu.edu.au/item/87y13/guild-guidance-for-information-about-linking-data-sets

Related outputs

Enhanced use of educational accountability data to monitor educational progress of Australian students with focus on Indigenous students
Cumming, Joy, Goldstein, Harvey and Hand, Kirstine. (2020). Enhanced use of educational accountability data to monitor educational progress of Australian students with focus on Indigenous students. Educational Assessment, Evaluation and Accountability. 32, pp. 29-51. https://doi.org/10.1007/s11092-019-09310-x
Estimating reliability statistics and measurement error variances using instrumental variables with longitudinal data
Goldstein, Harvey, Haynes, Michele, Leckie, George and Tran, Phuong. (2020). Estimating reliability statistics and measurement error variances using instrumental variables with longitudinal data. Longitudinal and Life Course Studies. 11(3), pp. 289 - 306. https://doi.org/10.1332/175795920X15844303873216
Mindfulness-based intervention for educators: Effects of a school-based cluster randomized controlled study
Hwang, Yoon-Suk, Goldstein, Harvey, Medvedev, Oleg N., Singh, Nirbhay N., Noh, Jae-Eun and Hand, Kirstine Alicia. (2019). Mindfulness-based intervention for educators: Effects of a school-based cluster randomized controlled study. Mindfulness. 10(7), pp. 1417 - 1436. https://doi.org/10.1007/s12671-019-01147-1
A software package for the application of probabilistic anonymisation to sensitive individual-level data: A proof of principle with an example from the ALSPAC birth cohort study
Avraam, Demetris, Boyd, Andy, Goldstein, Harvey and Burton, Paul. (2018). A software package for the application of probabilistic anonymisation to sensitive individual-level data: A proof of principle with an example from the ALSPAC birth cohort study. Longitudinal and Life Course Studies. 9(4), pp. 433-446. https://doi.org/10.14301/llcs.v9i4.478
Multilevel growth curve models that incorporate a random coefficient model for the level 1 variance function
Goldstein, Harvey, Leckie, George, Charlton, Christopher, Tilling, Kate and Browne, William J.. (2018). Multilevel growth curve models that incorporate a random coefficient model for the level 1 variance function. Statistical Methods in Medical Research. 27(11), pp. 3478 - 3491. https://doi.org/10.1177/0962280217706728
Bayesian models for weighted data with missing values: a bootstrap approach
Goldstein, Harvey, Carpenter, James and Kenward, Michael G.. (2018). Bayesian models for weighted data with missing values: a bootstrap approach. Journal of the Royal Statistical Society Series C: Applied Statistics. 67(4), pp. 1071 - 1081. https://doi.org/10.1111/rssc.12259
A guide to evaluating linkage quality for the analysis of linked data
Harron, Katie, Doidge, James C., Knight, Hannah E., Gilbert, Ruth, Goldstein, Harvey, Cromwell, David A. and van der Meulen, Jan H.. (2017). A guide to evaluating linkage quality for the analysis of linked data. International Journal of Epidemiology. 46(5), pp. 1699 - 1710. https://doi.org/10.1093/ije/dyx177
A Bayesian model for measurement and misclassification errors alongside missing data, with an application to higher education participation in Australia
Goldstein, Harvey, Browne, William J. and Charlton, Christopher. (2017). A Bayesian model for measurement and misclassification errors alongside missing data, with an application to higher education participation in Australia. Journal of Applied Statistics. 45(5), pp. 918 - 931. https://doi.org/10.1080/02664763.2017.1322558
Probabilistic linkage to enhance deterministic algorithms and reduce data linkage errors in hospital administrative data
Hagger-Johnson, Gareth, Harron, Katie, Goldstein, Harvey, Aldridge, Rob and Gilbert, Ruth. (2017). Probabilistic linkage to enhance deterministic algorithms and reduce data linkage errors in hospital administrative data. Journal of Innovation in Health Informatics. 24(2), pp. 234 - 246. https://doi.org/10.14236/jhi.v24i2.891
A scaling approach to record linkage
Goldstein, Harvey, Harron, Katie and Cortina-Borja, Mario. (2017). A scaling approach to record linkage. Statistics in Medicine. 36(16), pp. 2514 - 2521. https://doi.org/10.1002/sim.7287
Utilising identifier error variation in linkage of large administrative data sources
Harron, Katie, Hagger-Johnson, Gareth, Gilbert, Ruth and Goldstein, Harvey. (2017). Utilising identifier error variation in linkage of large administrative data sources. BMC Medical Research Methodology. 17(1), pp. 1 - 9. https://doi.org/10.1186/s12874-017-0306-8
Challenges in administrative data linkage for research
Harron, Katie, Dibben, Chris, Boyd, James, Hjern, Anders, Azimaee, Mahmoud, Barreto, Mauricio L. and Goldstein, Harvey. (2017). Challenges in administrative data linkage for research. Big Data and Society. 4(2), pp. 1 - 12. https://doi.org/10.1177/2053951717745678
Integrating area-based and national samples in birth cohort studies: the case of life study
Goldstein, Harvey, Sera, Francesco, Elias, Peter and Dezateux, Carol. (2017). Integrating area-based and national samples in birth cohort studies: the case of life study. Longitudinal and Life Course Studies. 8(3), pp. 281 - 289. https://doi.org/10.14301/llcs.v8i3.439
The evolution of school league tables in England 1992-2016: ‘contextual value-added’, ‘expected progress’ and ‘progress 8’
Leckie, George and Goldstein, Harvey. (2017). The evolution of school league tables in England 1992-2016: ‘contextual value-added’, ‘expected progress’ and ‘progress 8’. British Educational Research Journal. 43(2), pp. 193 - 212. https://doi.org/10.1002/berj.3264
Handling attrition and non-response in longitudinal data with an application to a study of Australian youth
Cumming, Jacqueline Joy and Goldstein, Harvey. (2016). Handling attrition and non-response in longitudinal data with an application to a study of Australian youth. Longitudinal and Life Course Studies. 7(1), pp. 53 - 63. https://doi.org/10.14301/llcs.v7i1.342
Record linkage
Goldstein, Harvey and Harron, Katie. (2016). Record linkage. In K. Harron, H. Goldstein and C. Dibben (Eds.). Methodological developments in data linkage. John Wiley & Sons.
Trends in examination performance and exposure to standardised tests in England and Wales
Goldstein, Harvey and Leckie, George. (2016). Trends in examination performance and exposure to standardised tests in England and Wales. British Educational Research Journal. 42(3), pp. 367 - 375. https://doi.org/10.1002/berj.3220
Interviewer effects on non-response propensity in longitudinal surveys : A multilevel modelling approach
Vassallo, Rebecca, Durrant, Gabriele, Smith, Peter and Goldstein, Harvey. (2015). Interviewer effects on non-response propensity in longitudinal surveys : A multilevel modelling approach. Royal Statistical Society. Journal. Series A: Statistics in Society. 178(1), pp. 83 - 99. https://doi.org/10.1111/rssa.12049
A multilevel modelling approach to measuring changing patterns of ethnic composition and segregation among London secondary schools, 2001-2010
Leckie, George and Goldstein, Harvey. (2015). A multilevel modelling approach to measuring changing patterns of ethnic composition and segregation among London secondary schools, 2001-2010. Royal Statistical Society. Journal. Series A: Statistics in Society. 178(2), pp. 405 - 424. https://doi.org/10.1111/rssa.12066
Validity, science and educational measurement
Goldstein, Harvey. (2015). Validity, science and educational measurement. Assessment in Education: Principles, Policy & Practice. 22(2), pp. 193 - 201. https://doi.org/10.1080/0969594X.2015.1015402
Population sampling in longitudinal surveys
Goldstein, Harvey, Lynn, Peter, Muniz-Terrera, Graciela, Hardy, Rebecca, O'Muircheartaigh, Colm, Skinner, Chris and Lehtonen, Risto. (2015). Population sampling in longitudinal surveys. Longitudinal and Life Course Studies (online). 6(4), pp. 447 - 452. https://doi.org/10.14301/llcs.v6i4.345
After the RCT : Who comes to a family-based intervention for childhood overweight or obesity when it is implemented at scale in the community?
Fagg, James, Cole, Tim, Cummins, Steven, Goldstein, Harvey, Morris, Stephen, Radley, Duncan, Sacher, Paul and Law, Catherine. (2015). After the RCT : Who comes to a family-based intervention for childhood overweight or obesity when it is implemented at scale in the community? Journal of Epidemiology and Community Health. 69(2), pp. 142 - 148. https://doi.org/10.1136/jech-2014-204155
Data linkage errors in hospital administrative data when applying a pseudonymisation algorithm to paediatric intensive care records
Hagger-Johnson, Gareth, Harron, Katie, Fleming, Tom, Gilbert, Ruth, Goldstein, Harvey, Landy, Rebecca and Parslow, Roger. (2015). Data linkage errors in hospital administrative data when applying a pseudonymisation algorithm to paediatric intensive care records. BMJ Open. 5(8), pp. 1 - 8. https://doi.org/10.1136/bmjopen-2015-008118
Identifying possible false matches in anonymized hospital administrative data without patient identifiers
Hagger-Johnson, Gareth, Harron, Katie, Gonzalez-Izquierdo, Arturo, Cortina-Borja, Mario, Dattani, Nirupa, Muller-Pebody, Berit, Parslow, Roger, Gilbert, Ruth and Goldstein, Harvey. (2015). Identifying possible false matches in anonymized hospital administrative data without patient identifiers. Health Services Research. 50(4), pp. 1162 - 1178. https://doi.org/10.1111/1475-6773.12272
Evaluating bias due to data linkage error in electronic healthcare records
Harron, Katie, Wade, Angie, Gilbert, Ruth, Muller-Pebody, Berit and Goldstein, Harvey. (2014). Evaluating bias due to data linkage error in electronic healthcare records. BMC Medical Research Methodology. 14(1), pp. 1 - 10. https://doi.org/10.1186/1471-2288-14-36
Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and non-linear terms
Goldstein, Harvey, Carpenter, James and Browne, William. (2014). Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and non-linear terms. Journal of the Royal Statistical Society Series A: Statistics in Society. 177(2), pp. 553 - 564. https://doi.org/10.1111/rssa.12022
Panel attrition : How important is interviewer continuity?
Lynn, Peter, Kaminska, Olena and Goldstein, Harvey. (2014). Panel attrition : How important is interviewer continuity? Journal of Official Statistics. 30(3), pp. 443 - 457. https://doi.org/10.2478/JOS-2014-0028
Using league table rankings in public policy formation : Statistical issues
Goldstein, Harvey. (2014). Using league table rankings in public policy formation : Statistical issues. Annual Review of Statistics and Its Application. 1, pp. 385 - 399.
Modelling survival and mortality risk to 15 years of age for a national cohort of children with serious congenital heart defects diagnosed in infancy
Knowles, Rachel L., Bull, Catherine, Wren, Christopher, Wade, Angela, Goldstein, Harvey and Dezateux, Carol. (2014). Modelling survival and mortality risk to 15 years of age for a national cohort of children with serious congenital heart defects diagnosed in infancy. PLoS ONE. 9(8), pp. 1 - 15. https://doi.org/10.1371/journal.pone.0106806
From trial to population: A study of a family-based community intervention for childhood overweight implemented at scale
Fagg, James, Chadwick, P., Cole, Tim, Cummins, Steven, Goldstein, Harvey, Lewis, H., Morris, Sue, Radley, Duncan, Sacher, Paul and Law, Catherine. (2014). From trial to population: A study of a family-based community intervention for childhood overweight implemented at scale. International Journal of Obesity. 38(10), pp. 1343 - 1349. https://doi.org/10.1038/ijo.2014.103
Using league table rankings in public policy formation: Statistical issues
Goldstein, Harvey. (2014). Using league table rankings in public policy formation: Statistical issues. Annual Review of Statistics and Its Application. 1(1), pp. 385 - 399. https://doi.org/10.1146/annurev-Statistics-022513-115615
Knowledge and numbers in education
Goldstein, Harvey and Moss, Gemma. (2014). Knowledge and numbers in education. Comparative Education. 50(3), pp. 259 - 265. https://doi.org/10.1080/14681366.2014.926138
Adjusting for differential misclassification in multilevel models: The relationship between child exposure to smoke and cognitive development
Ferrao, Maria and Goldstein, Harvey. (2014). Adjusting for differential misclassification in multilevel models: The relationship between child exposure to smoke and cognitive development. Quality and Quantity (Print). 48(1), pp. 251 - 258. https://doi.org/10.1007/s11135-012-9765-5
University mission creep? Comparing EU and US faculty views of university involvement in regional economic development and commercialization
Goldstein, Harvey, Bergman, Edward M. and Maier, Gunther. (2013). University mission creep? Comparing EU and US faculty views of university involvement in regional economic development and commercialization. The Annals of Regional Science. 50(2), pp. 453 - 477. https://doi.org/10.1007/s00168-012-0513-5
Evaluating educational changes: A statistical perspective
Goldstein, Harvey. (2013). Evaluating educational changes: A statistical perspective. Ensaio: Avaliacao e Politicas Publicas em Educacao. 21(78), pp. 101 - 114. https://doi.org/10.1590/S0104-40362013005000002
Linkage, Evaluation and Analysis of National Electronic Healthcare Data : Application to Providing Enhanced Blood-Stream Infection Surveillance in Paediatric Intensive Care
Harron, Katie, Goldstein, Harvey, Wade, Angie, Muller-Pebody, Berit, Parslow, Roger and Gilbert, Ruth. (2013). Linkage, Evaluation and Analysis of National Electronic Healthcare Data : Application to Providing Enhanced Blood-Stream Infection Surveillance in Paediatric Intensive Care. PLoS ONE. 8(12), pp. 1 - 11. https://doi.org/10.1371/journal.pone.0085278
Linkage, evaluation and analysis of National Electronic Healthcare Data : Application to providing enhanced blood-stream infection surveillance in paediatric intensive care
Harron, Katie, Goldstein, Harvey, Wade, Angie, Muller-Pebody, Berit, Parslow, Roger and Gilbert, Ruth. (2013). Linkage, evaluation and analysis of National Electronic Healthcare Data : Application to providing enhanced blood-stream infection surveillance in paediatric intensive care. PLoS One (online). 8(12), pp. 1 - 11. https://doi.org/10.1371/journal.pone.0085278
Risk-adjusted monitoring of blood-stream infection in paediatric intensive care : A data linkage study
Harron, Katie, Wade, Angie, Muller-Pebody, Berit, Goldstein, Harvey, Parslow, Roger, Gray, Jim, Hartley, John, Mok, Quen and Gilbert, Ruth. (2013). Risk-adjusted monitoring of blood-stream infection in paediatric intensive care : A data linkage study. Intensive Care Medicine. 39(6), pp. 1080 - 1087. https://doi.org/10.1007/s00134-013-2841-z
Transitioning to the new economy: Individual, regional and intermediation influences on workforce retraining outcomes
Goldstein, H. A., Lowe, N. and Donegan, M.. (2012). Transitioning to the new economy: Individual, regional and intermediation influences on workforce retraining outcomes. Regional Studies. 46(1), pp. 105 - 118. https://doi.org/10.1080/00343404.2010.486786
Francis Galton, measurement, psychometrics and social progress
Goldstein, Harvey. (2012). Francis Galton, measurement, psychometrics and social progress. Assessment in Education: Principles, Policy & Practice. 19(2), pp. 147 - 158. https://doi.org/10.1080/0969594X.2011.614220
The quality of planning scholarship and doctoral education
Goldstein, Harvey A.. (2012). The quality of planning scholarship and doctoral education. Journal of Planning Education and Research. 32(4), pp. 493 - 496. https://doi.org/10.1177/0739456X12449484
Multilevel Modeling of Social Segregation
Leckie, George, Pillinger, Rebecca, Jones, Kelvyn and Goldstein, Harvey. (2012). Multilevel Modeling of Social Segregation. Journal of Educational and Behavioral Statistics. 37(1), pp. 3 - 30. https://doi.org/10.3102/1076998610394367
The analysis of record-linked data using multiple imputation with data value priors
Goldstein, Harvey, Harron, Katie and Wade, Angie. (2012). The analysis of record-linked data using multiple imputation with data value priors. Statistics in Medicine. 31(28), pp. 3481 - 3493. https://doi.org/10.1002/sim.5508
Measuring success: League tables in the public sector
Foley, Beth and Goldstein, Harvey. (2012). Measuring success: League tables in the public sector. London, United Kingdom: The British Academy.
REALCOM-IMPUTE Software for Multilevel Multiple Imputation with Mixed Response Types
Carpenter, James, Goldstein, Harvey and Kenward, Michael. (2011). REALCOM-IMPUTE Software for Multilevel Multiple Imputation with Mixed Response Types. Journal of Statistical Software. 45(5), pp. 1 - 14.
A note on 'The limitations of school league tables to inform school choice'
Leckie, George and Goldstein, Harvey. (2011). A note on 'The limitations of school league tables to inform school choice'. Journal of the Royal Statistical Society Series A: Statistics in Society. 174(3), pp. 833 - 836. https://doi.org/10.1111/j.1467-985X.2010.00688.x
Estimating research performance by using research grant award gradings
Goldstein, Harvey. (2011). Estimating research performance by using research grant award gradings. Journal of the Royal Statistical Society Series A: Statistics in Society. 174(1), pp. 83 - 93. https://doi.org/10.1111/j.1467-985X.2010.00657.x
Understanding uncertainty in school league tables
Leckie, George and Goldstein, Harvey. (2011). Understanding uncertainty in school league tables. Fiscal Studies. 32(2), pp. 207 - 224. https://doi.org/10.1111/j.1475-5890.2011.00133.x
Patchwork intermediation: Challenges and opportunities for regionally coordinated workforce development
Lowe, Nichola, Goldstein, Harvey and Donegan, Mary. (2011). Patchwork intermediation: Challenges and opportunities for regionally coordinated workforce development. Economic Development Quarterly: the journal of American economic revitalization. 25(2), pp. 158 - 171. https://doi.org/10.1177/0891242410383413
Pupil composition and accountability: An analysis in English primary schools
Kounali, Daphne, Robinson, Anthony, Lauder, Hugh and Goldstein, Harvey. (2010). Pupil composition and accountability: An analysis in English primary schools. International Journal of Educational Research. 49(2-3), pp. 49 - 68. https://doi.org/10.1016/j.ijer.2010.08.001
MCMC sampling for a multilevel model with nonindependent residuals within and between cluster units
Browne, William and Goldstein, Harvey. (2010). MCMC sampling for a multilevel model with nonindependent residuals within and between cluster units. Journal of Educational and Behavioral Statistics. 35(4), pp. 453 - 473. https://doi.org/10.3102/1076998609359788
Pupil composition and accountability: An analysis in English primary schools
Lauder, Hugh, Kounali, Daphne, Robinson, Anthony and Goldstein, Harvey. (2010). Pupil composition and accountability: An analysis in English primary schools. International Journal of Educational Research. 45(2-3), pp. 49 - 68. https://doi.org/10.1016/j.ijer.2010.08.001
Statistical modelling of repeated measurement data
Goldstein, Harvey. (2010). Statistical modelling of repeated measurement data. Longitudinal and Life Course Studies. 1(2), pp. 170 - 185. https://doi.org/10.14301/llcs.v1i2.67
Handling attrition and non-response in longitudinal data
Goldstein, Harvey. (2009). Handling attrition and non-response in longitudinal data. Longitudinal and Life Course Studies.
Multilevel multivariate modelling of childhood growth, numbers of growth measurements and adult characteristics
Goldstein, Harvey and Kounali, Daphne. (2009). Multilevel multivariate modelling of childhood growth, numbers of growth measurements and adult characteristics. Royal Statistical Society. Journal. Series A: Statistics in Society.
Multilevel models with multivariate mixed response types
Goldstein, Harvey, Carpenter, James R., Kenward, Michael G. and Levin, Kate A.. (2009). Multilevel models with multivariate mixed response types. Statistical Modelling.
Comment peut-on utiliser les etudes comparatives internationales pour doter les politiques educatives d'information fiables?
Goldstein, Harvey. (2009). Comment peut-on utiliser les etudes comparatives internationales pour doter les politiques educatives d'information fiables? Revue Francaise de Pedagogie.
Comment: Citation Statistics
Goldstein, Harvey and Spiegelhalter, David. (2009). Comment: Citation Statistics. Statistical Science.
The limitations of using school league tables to inform school choice
Leckie, George and Goldstein, Harvey. (2009). The limitations of using school league tables to inform school choice. Royal Statistical Society. Journal. Series A: Statistics in Society.
Evidence and education policy - some reflections and allegations
Goldstein, Harvey. (2008). Evidence and education policy - some reflections and allegations. Cambridge Journal of Education.
Adjusting for measurement error in the value added model: evidence from Portugal
Ferrao, Maria Eugenia and Goldstein, Harvey. (2008). Adjusting for measurement error in the value added model: evidence from Portugal. Quality and Quantity. 43, pp. 951 - 963. https://doi.org/10.1007/s11135-008-9171-1
Review of 'Monitoring Educational Achievement'
Goldstein, Harvey. (2008). Review of 'Monitoring Educational Achievement'. International Journal of Educational Development.
Modelling measurement errors and category misclassifications in multilevel models
Goldstein, Harvey, Kounali, Daphne and Robinson, Anthony. (2008). Modelling measurement errors and category misclassifications in multilevel models. Statistical Modelling.
School league tables: what can they really tell us?
Goldstein, Harvey. (2008). School league tables: what can they really tell us? Significance.
The effects of year repetition (redoublement) on the progress of pupils in the first three years of French schooling
Goldstein, Harvey. (2008). The effects of year repetition (redoublement) on the progress of pupils in the first three years of French schooling.
Techniques for Monitoring the Comparability of Examination Standards
Goldstein, Harvey. (2007). Techniques for Monitoring the Comparability of Examination Standards. Qualifications and Curriculum Authority.