Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Record linkage - R fastLink package: inserting m and u probabilities into emlinkMARmov

I am trying to use the R fastLink package and supply starting values for the m and u probabilities to the emlinkMARmov function.

Running emlinkMARmov with these starting values produces an error. Specifically, if I run the function with:

  1. 1-4 probability values, the result is

p.gamma.k.m[[i]] : subscript out of bounds

  2. 5 or more values, the result is

Error in p.gamma.k.m[[i]] <- *vtmp* : more elements supplied than there are to replace

According to the documentation at https://github.com/tedenamorado/fastLink, I need to pass a vector whose length equals the number of linkage fields, which is 5 here. So I am not sure why the function fails with 5 values in pgammakm.

Code below:

library('RecordLinkage')
RLdata10000$trueid <- identity.RLdata10000
RLdata10000$id <- 1:nrow(RLdata10000)

library('fastLink')
library(tidyverse)

## Create Agreement Vectors
g1 <- gammaCKpar(RLdata10000$fname_c1, RLdata10000$fname_c1, cut.a = 0.94, cut.p = 0.88)
g2 <- gammaCKpar(RLdata10000$lname_c1, RLdata10000$lname_c1, cut.a = 0.94, cut.p = 0.88)
g3 <- gammaKpar(RLdata10000$by, RLdata10000$by)
g4 <- gammaKpar(RLdata10000$bm, RLdata10000$bm)
g5 <- gammaKpar(RLdata10000$bd, RLdata10000$bd)
nr <- nrow(RLdata10000)

## Count Patterns + EM
counts <- tableCounts(list(g1, g2, g3, g4, g5), nobs.a = nr, nobs.b = nr)
# Starting values for the m and u probabilities, one number per linkage field,
# supplied so that the features influence the outcome.
# m: probability that, conditional on being in the matched set, we observe a
#    specific agreement value for field k
pgammakm <- c(1.331910971833257e-14, 4.683960094033817e-03, 4.112060625003156e-02,
              1.432229262705961e-02, 4.692862863782486e-02)
# u: probability that, conditional on being in the unmatched set, we observe a
#    specific agreement value for field k
pgammaku <- c(0.986035579798584805, 0.973281071443571011, 0.987948005162691989,
              0.916094176950708938, 0.966858439069919084)

# This is the call that raises the errors described above
resEM <- emlinkMARmov(counts, nobs.a = nr, nobs.b = nr, p.gamma.k.m = pgammakm, p.gamma.k.u = pgammaku)

## Matches
matches <- matchesLink(list(g1, g2, g3, g4, g5), nobs.a = nr, nobs.b = nr, em = resEM, thresh = 0.98)
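
A note on the two error messages: both are what base R reports when code indexes a plain numeric vector with `[[i]]` as if it were a list. The sketch below reproduces them without calling fastLink at all. It rests on an assumption taken only from the error text, namely that emlinkMARmov reads and overwrites `p.gamma.k.m[[i]]` with a whole vector of probabilities per field. The names `atomic_start`, `short_start` and `list_start` are hypothetical and the numbers are placeholders, not recommended starting values.

```r
# Base R only; no fastLink function is called here.

# One number per field, shaped like pgammakm in the question (placeholder values).
atomic_start <- c(0.01, 0.02, 0.03, 0.04, 0.05)

# Replacing one element of an atomic vector with a longer vector fails.
err_replace <- tryCatch({
  atomic_start[[1]] <- c(0.1, 0.9)
  NA_character_
}, error = function(e) conditionMessage(e))

# Reading element 5 of a 4-element atomic vector fails in a different way.
short_start <- c(0.01, 0.02, 0.03, 0.04)
err_bounds <- tryCatch({
  short_start[[5]]
  NA_character_
}, error = function(e) conditionMessage(e))

# A list can hold one probability vector per field, so the same assignment works.
list_start <- list(c(0.1, 0.9), c(0.1, 0.9))
list_start[[1]] <- c(0.2, 0.8)

print(err_replace)
print(err_bounds)
print(list_start[[1]])
```

If that assumption holds, it would explain why 1-4 values fail on a read and 5 values fail on an assignment, and it suggests the starting values may need to be a list with one probability vector per field rather than a single numeric vector. The expected shape should be checked against the emlinkMARmov help page before relying on this.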
Question from: https://stackoverflow.com/questions/66054319/r-fastlink-package-inserting-m-and-u-prob-into-emlinkmarmov


1 Reply

0 votes
by (71.8m points)
Waiting for answers
