Latest del news


Incorporating conditional dependence in latent class models for probabilistic record linkage: Does it matter?

By projecteuclid.org
Published On :: Wed, 16 Oct 2019 22:03 EDT

Huiping Xu, Xiaochun Li, Changyu Shen, Siu L. Hui, Shaun Grannis.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1753--1790.

Abstract:
The conditional independence assumption of the Fellegi and Sunter (FS) model in probabilistic record linkage is often violated when matching real-world data. Ignoring conditional dependence has been shown to seriously bias parameter estimates. However, in record linkage, the ultimate goal is to inform the match status of record pairs, and therefore record linkage algorithms should be evaluated in terms of matching accuracy. In the literature, more flexible models have been proposed to relax the conditional independence assumption, but few studies have assessed whether such accommodations improve matching accuracy. In this paper, we show, using three real-world data linkage examples, that incorporating the conditional dependence appropriately yields matching accuracy comparable to or better than that of the FS model. Through a simulation study, we further investigate when conditional dependence models provide improved matching accuracy. Our study shows that the FS model is generally robust to violations of the conditional independence assumption and provides matching accuracy comparable to that of the more complex conditional dependence models. However, when the match prevalence approaches 0% or 100% and conditional dependence exists in the dominating class, it is necessary to address conditional dependence, as the FS model produces suboptimal matching accuracy. The need to address conditional dependence becomes less important when highly discriminating fields are used. Our simulation study also shows that conditional dependence models with a misspecified dependence structure could produce less accurate record matching than the FS model, and therefore we caution against the blind use of conditional dependence models.
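
To make the conditional independence assumption in the abstract concrete, here is a minimal sketch of how an FS-style match score is usually formed: given match status, field agreements are treated as independent, so the likelihood ratio factorizes into a product over fields. The function names, the m- and u-probabilities, and the prevalence value are illustrative assumptions, not taken from the paper.

```python
import math

def fs_match_weight(gamma, m, u):
    """Log2 likelihood ratio for an agreement vector under conditional independence.

    gamma: 0/1 agreement indicators, one per field (hypothetical input).
    m[k]:  assumed P(field k agrees | true match).
    u[k]:  assumed P(field k agrees | true non-match).
    """
    weight = 0.0
    for g, m_k, u_k in zip(gamma, m, u):
        if g:
            weight += math.log2(m_k / u_k)              # agreement weight
        else:
            weight += math.log2((1 - m_k) / (1 - u_k))  # disagreement weight
    return weight

def posterior_match_probability(gamma, m, u, prevalence):
    """Posterior probability of a match, given an assumed match prevalence."""
    likelihood_ratio = 2 ** fs_match_weight(gamma, m, u)
    odds = likelihood_ratio * prevalence / (1 - prevalence)
    return odds / (1 + odds)

# Illustrative values only: two fields, both agreeing, 50% prevalence.
p = posterior_match_probability([1, 1], m=[0.9, 0.8], u=[0.1, 0.2], prevalence=0.5)
```

When agreements on different fields are correlated within the match or non-match class, this product form no longer holds. That is the situation the conditional dependence models discussed in the abstract are meant to address.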

A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data

A Bayesian mark interaction model for analysis of tumor pathology images

Sequential decision model for inference and prediction on nonuniform hypergraphs with application to knot matching from computational forestry

RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data

Modeling seasonality and serial dependence of electricity price curves with warping functional autoregressive dynamics

Network modelling of topological domains using Hi-C data

A hidden Markov model approach to characterizing the photo-switching behavior of fluorophores

Imputation and post-selection inference in models with missing data: An application to colorectal cancer surveillance guidelines

Introduction to papers on the modeling and analysis of network data—II

Local law and Tracy–Widom limit for sparse stochastic block models

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics

Efficient estimation in single index models through smoothing splines

Reliable clustering of Bernoulli mixture models

A new McKean–Vlasov stochastic interpretation of the parabolic–parabolic Keller–Segel model: The one-dimensional case

Strictly weak consensus in the uniform compass model on $\mathbb{Z}$

Consistent structure estimation of exponential-family random graph models with block structure

The maximal degree in a Poisson–Delaunay graph

Distances and large deviations in the spatial preferential attachment model

Robust estimation of mixing measures in finite mixture models

Stochastic differential equations with a fractionally filtered delay: A semimartingale model for long-range dependent processes

Consistent semiparametric estimators for recurrent event times models with application to virtual age models

From the coalfields of Somerset to the Adelaide Hills and beyond : the story of the Hewish Family : three centuries of one family's journey through time / Maureen Brown.

Boeing says it's about to start building the 737 Max plane again in the middle of the coronavirus pandemic, even though it already has more planes than it can deliver

Delta, citing health concerns, drops service to 10 US airports. Is yours on the list?

Joint Modeling of Longitudinal Relational Data and Exogenous Variables

Bayesian Inference in Nonparanormal Graphical Models

Additive Multivariate Gaussian Processes for Joint Species Distribution Modeling with Heterogeneous Data

Dynamic Quantile Linear Models: A Bayesian Approach

Learning Semiparametric Regression with Missing Covariates Using Gaussian Process Models

Adaptive Bayesian Nonparametric Regression Using a Kernel Mixture of Polynomials with Application to Partial Linear Models

Bayesian Design of Experiments for Intractable Likelihood Models Using Coupled Auxiliary Models and Multivariate Emulation

Scalable Bayesian Inference for the Inverse Temperature of a Hidden Potts Model

Hierarchical Normalized Completely Random Measures for Robust Graphical Modeling

Spatial Disease Mapping Using Directed Acyclic Graph Auto-Regressive (DAGAR) Models

Estimating the Use of Public Lands: Integrated Modeling of Open Populations with Convolution Likelihood Ecological Abundance Regression

Bayes Factors for Partially Observed Stochastic Epidemic Models

Probability Based Independence Sampler for Bayesian Quantitative Learning in Graphical Log-Linear Marginal Models

Semiparametric Multivariate and Multiple Change-Point Modeling

Model Criticism in Latent Space

Low Information Omnibus (LIO) Priors for Dirichlet Process Mixture Models

Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation

Fast Model-Fitting of Bayesian Variable Selection Regression Using the Iterative Complex Factorization Algorithm

A Bayesian Nonparametric Spiked Process Prior for Dynamic Model Selection

Analysis of the Maximal a Posteriori Partition in the Gaussian Dirichlet Process Mixture Model

Efficient Bayesian Regularization for Graphical Model Selection

Variational Message Passing for Elaborate Response Regression Models

Modeling Population Structure Under Hierarchical Dirichlet Processes

A Tale of Two Parasites: Statistical Modelling to Support Disease Control Programmes in Africa

Risk Models for Breast Cancer and Their Validation

CBC Flooring's Indelval is environmentally friendly

EWR-DEL-BLR in UA J and AI (UK) C -- and a warning about crew rest on United 789s

Lean Hypotheses and Effectual Commitments: An Integrative Framework Delineating the Methods of Science and Entrepreneurship

Image modeling for biomedical organs

Minecraft's business model is 'leave users alone' — will it be Microsoft's?

NFL Commissioner Roger Goodell says he never considered resigning following abuse scandals

Indonesia's Indosat, GoTo launch local language AI model

Mondelez names EVP and president for North America

Mondelez becomes Official Snacks of MLS

Gluten-Free Products: Delicious and Nutritious

Super Micro Stock Could Get Delisted. What to Consider If You Own the Shares.

AI Guidelines for Businesses: Using AI in Your Own Company

Delegate Information for IWMW 2008 now available

Delegates to get preferential rates when using University of Aberdeen Sports facilities

Bill Clinton addresses AIDS 2014 delegates in Melbourne
