Title: | Structural Modeling for Multiple Latent Class Variables |
---|---|
Description: | Provides comprehensive tools for the implementation of Structural Latent Class Models (SLCM), including Latent Transition Analysis (LTA; Linda M. Collins and Stephanie T. Lanza, 2009) <doi:10.1002/9780470567333>, Latent Class Profile Analysis (LCPA; Hwan Chung et al., 2010) <doi:10.1111/j.1467-985x.2010.00674.x>, and Joint Latent Class Analysis (JLCA; Saebom Jeon et al., 2017) <doi:10.1080/10705511.2017.1340844>, and any other extended models involving multiple latent class variables. |
Authors: | Youngsun Kim [aut, cre] |
Maintainer: | Youngsun Kim <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.3.1 |
Built: | 2025-02-05 05:57:46 UTC |
Source: | https://github.com/kim0sun/slca |
This dataset contains responses from the National Longitudinal Study of Adolescent Health (Add Health), focusing on adolescents' experiences with depression. The subjects, who were in Grades 10 and 11 during the 1994–1995 academic year, provided data on at least one measure of adolescent delinquency in Wave I.
These data can be used to replicate the latent class analysis conducted by Collins and Lanza (2009).
The dataset includes two covariates, grade level and sex of respondents, along with variables capturing depressive emotions: sadness (S1-S4), feeling disliked (D1-D2), and feelings of failure (F1-F2).
Responses for these variables were initially categorized as "Never," "Sometimes," "Often," or "Most or All of the Time." In this dataset, responses have been recoded as "No" for "Never" and "Yes" for all other responses, providing a longitudinal perspective on adolescent depression across Waves I and II. Variables with the suffix "w1" are from Wave I, while those with the suffix "w2" are from Wave II.
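The recoding described above can be sketched in base R. The vector `raw` is made-up illustration data and `recode_binary` is a hypothetical helper; neither is part of the package, and the preprocessing actually used to build the dataset is not shown in this manual.

```r
# Hypothetical raw responses on the original four-point scale (illustration only)
raw <- c("Never", "Sometimes", "Often", "Most or All of the Time", NA)

# Hypothetical helper: "Never" becomes "No", any other answer becomes "Yes";
# missing responses stay missing
recode_binary <- function(x) {
  factor(ifelse(x == "Never", "No", "Yes"), levels = c("No", "Yes"))
}

recoded <- recode_binary(raw)
table(recoded, useNA = "ifany")
```

Collapsing to two levels keeps the number of item-response parameters per class small, which matters when eight items are modeled at two waves.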
addhealth
A data frame with 2061 rows and 18 variables:
GRADE: Respondent's grade level at Wave I.
SEX: Respondent's sex. Levels: (1) Male, (2) Female.
S1w1, S1w2: I felt that I could not shake off the blues even with help from my family and friends.
S2w1, S2w2: I felt depressed.
S3w1, S3w2: I felt lonely.
S4w1, S4w2: I felt sad.
D1w1, D1w2: People were unfriendly to me.
D2w1, D2w2: I felt that people disliked me.
F1w1, F1w2: I thought my life had been a failure.
F2w1, F2w2: I felt life was not worth living.
https://addhealth.cpc.unc.edu/data/#public-use
Collins, L.M., & Lanza, S.T. (2009). Latent Class and Latent Transition Analysis: With Applications in the Social, Behavioral, and Health Sciences.
J.R. Udry. The National Longitudinal Study of Adolescent Health (Add Health), Waves I & II, 1994-1996. Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 2003.
library(magrittr)
data <- addhealth[1:300,]
lta5 <- slca(
  DEP1(5) ~ S1w1 + S2w1 + S3w1 + S4w1 + D1w1 + D2w1 + F1w1 + F2w1,
  DEP2(5) ~ S1w2 + S2w2 + S3w2 + S4w2 + D1w2 + D2w2 + F1w2 + F2w2,
  DEP1 ~ DEP2
) %>% estimate(data, control = list(em.tol = 1e-6))
lta5inv <- slca(
  DEP1(5) ~ S1w1 + S2w1 + S3w1 + S4w1 + D1w1 + D2w1 + F1w1 + F2w1,
  DEP2(5) ~ S1w2 + S2w2 + S3w2 + S4w2 + D1w2 + D2w2 + F1w2 + F2w2,
  DEP1 ~ DEP2,
  constraints = c("DEP1", "DEP2")
) %>% estimate(data, control = list(em.tol = 1e-6))
compare(lta5inv, lta5, test = "chisq")
lta5inv %>% param()
Conducts a relative model fit test between two fitted SLCM models using the deviance statistic.
compare( model1, model2, test = c("none", "chisq", "boot"), nboot = 50, method = c("hybrid", "em", "nlm"), plot = FALSE, maxiter = 1000, tol = 1e-08, verbose = FALSE )
model1 |
an object of class |
model2 |
another object of class |
test |
a character string specifying the type of test to be conducted. If |
nboot |
an integer specifying the number of bootstrap iterations to perform (used only when |
method |
a character string specifying the estimation method for bootstrapping. |
plot |
a logical value indicating whether to display a histogram of G-squared statistics for the bootstrap samples (applicable only for |
maxiter |
an integer specifying the maximum number of iterations allowed during each bootstrap estimation round. The default is 1000. |
tol |
numeric value setting the convergence tolerance for each bootstrap iteration. The default is |
verbose |
a logical value indicating whether to print progress updates on completed bootstrap iterations. The default is |
A data.frame containing the number of parameters (Df), loglikelihood, AIC, BIC, G-squared statistics, and the residual degrees of freedom for each object.
If a statistical test is conducted (via test), the resulting p-value for the comparison is also included.
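As a rough illustration of what a deviance-based comparison of two nested models involves, the base R sketch below computes the difference in G-squared and refers it to a chi-square distribution. The log-likelihoods and parameter counts are made-up numbers, and the sketch assumes the usual asymptotic chi-square reference; it is not taken from the package's internals.

```r
# Hypothetical log-likelihoods and free-parameter counts for two nested models
ll_restricted <- -1520.4
npar_restricted <- 20
ll_full <- -1512.1
npar_full <- 28

# The difference in deviance equals twice the difference in log-likelihood
g2_diff <- 2 * (ll_full - ll_restricted)
df_diff <- npar_full - npar_restricted

# Upper-tail probability under the assumed chi-square reference distribution
p_value <- pchisq(g2_diff, df = df_diff, lower.tail = FALSE)
c(G2 = g2_diff, df = df_diff, p = p_value)
```

A small p-value would suggest that the restricted model fits noticeably worse than the fuller one.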
library(magrittr)
data <- gss7677[gss7677$COHORT == "YOUNG", ]
stat2 <- slca(status(2) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = data, control = list(verbose = FALSE))
stat3 <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = data, control = list(verbose = FALSE))
stat4 <- slca(status(4) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = data, control = list(verbose = FALSE))

gof(stat2, stat3, stat4)
gof(stat2, stat3, stat4, test = "chisq")
gof(stat2, stat3, stat4, test = "boot")

compare(stat3, stat4)
compare(stat3, stat4, test = "chisq")
compare(stat3, stat4, test = "boot")
Computes confidence intervals for one or more parameters of a fitted model.
## S3 method for class 'slcafit'
confint(object, parm, level = 0.95, type = c("param", "logit"), ...)
object |
an object of class |
parm |
an integer or string specifying the parameters for which confidence intervals are to be computed. |
level |
a numeric value representing the confidence level for the intervals. The default is |
type |
a character string specifying the format in which the results should be returned. Options include |
... |
additional arguments. |
A matrix with two columns representing the confidence intervals for the selected parameters. The column names correspond to the specified confidence level:
100 * ((1 - level) / 2)%: The lower bound of the confidence interval.
100 * (1 - (1 - level) / 2)%: The upper bound of the confidence interval.
The level argument determines the confidence level, with common values being 0.95 for a 95% confidence interval and 0.99 for a 99% confidence interval.
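The relationship between level and the two interval bounds can be sketched in base R. This assumes Wald-type intervals built on the logit scale and then mapped back to probabilities, which is an assumption for illustration rather than a statement of how the package computes them; the estimates and standard errors are made up.

```r
# Hypothetical logit-scale estimates and standard errors (illustration only)
est_logit <- c(par1 = -0.40, par2 = 1.25)
se_logit <- c(0.15, 0.30)
level <- 0.95

# Tail probabilities implied by the confidence level
probs <- c((1 - level) / 2, 1 - (1 - level) / 2)
z <- qnorm(probs)  # about -1.96 and 1.96 when level = 0.95

# Wald-type bounds on the logit scale, one row per parameter
ci_logit <- cbind(est_logit + z[1] * se_logit, est_logit + z[2] * se_logit)
colnames(ci_logit) <- paste0(round(100 * probs, 1), "%")

# Map the bounds back to the probability scale, keeping shape and labels
ci_prob <- ci_logit
ci_prob[] <- plogis(ci_logit)
ci_prob
```

Working on the logit scale keeps the back-transformed bounds inside the unit interval.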
param(nlsy_jlcpa, index = TRUE)
confint(nlsy_jlcpa)
confint(nlsy_jlcpa, 1:4)
Estimates the parameters of a model created using the slca function.
estimate(x, ...)

## S3 method for class 'slca'
estimate(x, data, method = c("em", "hybrid", "nlm"), fix2zero = NULL, control = slcaControl(), ...)
x |
an |
... |
additional arguments passed to the estimation process. |
data |
a |
method |
a character string specifying the estimation method for SLCM parameters. The default is |
fix2zero |
a |
control |
a |
The fix2zero argument allows you to constrain specific parameters to zero. Each parameter is associated with a unique index, which can be identified using the param function with the argument index = TRUE. To apply constraints, pass the relevant parameter indices to fix2zero as a vector.
An object of class slcafit containing the following components:
model |
a |
method |
the estimation method used. |
arg |
a brief description of the model used during estimation. |
mf |
the |
par |
the log of the estimated parameters. |
logit |
the log-odds of the estimated parameters. |
score |
the score function for the estimated parameters. |
posterior |
a |
convergence |
a logical indicator of whether convergence was achieved. |
loglikelihood |
the loglikelihood value of the estimated model. |
control |
the control settings used during the estimation process. |
The returned object can be further processed using the param function to extract the estimated parameters or their standard errors. The regress function allows for logistic regression analysis using a three-step approach to evaluate the effects of external variables on latent class variables. Additionally, several other methods are available, including predict.slcafit, reorder.slcafit, gof, and others.
m <- slca(lc[3] ~ y1 + y2 + y3 + y4)
pi <- rep(1 / 3, 3)
rho <- c(.9, .1, .9, .1, .9, .1, .9, .1, # class 1
         .9, .1, .9, .1, .1, .9, .1, .9, # class 2
         .1, .9, .1, .9, .1, .9, .1, .9) # class 3
dt <- simulate(m, 200, parm = c(pi, rho))
estimate(m, dt$response)

# Several estimation methods
estimate(m, dt$response, method = "em",
         control = slcaControl(verbose = TRUE)) # default
estimate(m, dt$response, method = "nlm",
         control = slcaControl(verbose = TRUE))
estimate(m, dt$response, method = "hybrid",
         control = slcaControl(verbose = TRUE))

# Parameter restriction
mf <- estimate(m, dt$response)
param(mf, index = TRUE)
mf0 <- estimate(mf, fix2zero = c(4, 6, 8, 10))
param(mf0)

# Estimation control
estimate(m, dt$response, control = slcaControl(nrep = 3, verbose = TRUE))
estimate(m, dt$response, control = slcaControl(init.param = c(pi, rho)))
Computes the AIC, BIC, and deviance statistic (G-squared) for assessing the goodness-of-fit of a fitted slca model. If the test argument is specified, absolute model fit can be evaluated using deviance statistics.
gof(object, ...)

## S3 method for class 'slcafit'
gof(
  object, ..., test = c("none", "chisq", "boot"), nboot = 100,
  plot = FALSE, maxiter = 100, tol = 1e-6, verbose = FALSE
)
object |
an object of class |
... |
additional objects of class |
test |
a character string specifying the type of test to be conducted. If |
nboot |
an integer specifying the number of bootstrap rounds to be performed. |
plot |
a logical value indicating whether to print a histogram of G-squared statistics for bootstrap samples, only for |
maxiter |
an integer specifying the maximum number of iterations allowed for the estimation process during each bootstrap iteration. The default is 100. |
tol |
a numeric value specifying the convergence tolerance for each bootstrap iteration. The default is |
verbose |
a logical value indicating whether to print progress updates on the number of bootstrapping rounds completed. |
A data.frame containing the number of parameters (Df), loglikelihood, AIC, BIC, G-squared statistics, and the residual degrees of freedom for each object.
If a statistical test is performed (using test), the result includes the corresponding p-value.
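For reference, the information criteria in the returned table follow the standard definitions, which can be reproduced by hand. The values below are made up for illustration and are not output from the package.

```r
# Hypothetical fit summary (illustration only)
loglik <- -1512.1  # maximized log-likelihood
npar <- 28         # number of free parameters
nobs <- 500        # number of observations

# Standard definitions of the two information criteria
aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + log(nobs) * npar
c(AIC = aic, BIC = bic)
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes additional parameters more heavily than AIC once the sample exceeds a handful of observations.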
library(magrittr)
data <- gss7677[gss7677$COHORT == "YOUNG", ]
stat2 <- slca(status(2) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = data, control = list(verbose = FALSE))
stat3 <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = data, control = list(verbose = FALSE))
stat4 <- slca(status(4) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = data, control = list(verbose = FALSE))

gof(stat2, stat3, stat4)
gof(stat2, stat3, stat4, test = "chisq")
gof(stat2, stat3, stat4, test = "boot")

compare(stat3, stat4)
compare(stat3, stat4, test = "chisq")
compare(stat3, stat4, test = "boot")
This dataset contains responses from the General Social Survey (GSS) for the years 1976 and 1977, focusing on social status and tolerance towards minorities.
The dataset can be used to replicate the analyses conducted in McCutcheon (1985) and Bakk et al. (2014).
It includes covariates such as interview year, age, sex, race, education level, and income. Social status-related variables include father's occupation and education level, as well as mother's education level. Tolerance towards minorities is measured by agreement with three questions: (1) allowing public speaking, (2) allowing teaching, and (3) allowing literature publication.
gss7677
A data frame with 2942 rows and 14 variables:
YEAR: Interview year (1976, 1977).
COHORT: Respondent's age cohort. Levels: (1) YOUNG, (2) YOUNG-MIDDLE, (4) MIDDLE, (5) OLD.
SEX: Respondent's sex. Levels: (1) MALE, (2) FEMALE.
RACE: Respondent's race. Levels: (1) WHITE, (2) BLACK, (3) OTHER.
DEGREE: Respondent's education level. Levels: (1) LT HS, (2) HIGH-SCH, (3) HIGHER.
REALRINC: Respondent's income.
PAPRES: Father's occupational prestige. Levels: (1) LOW, (2) MEDIUM, (3) HIGH.
PADEG: Father's education level. Levels: (1) LT HS, (2) HIGH-SCH, (3) COLLEGE, (4) BACHELOR, (5) GRADUATE.
MADEG: Mother's education level. Levels: (1) LT HS, (2) HIGH-SCH, (3) COLLEGE, (4) BACHELOR, (5) GRADUATE.
TOLRAC: Tolerance towards racists.
TOLCOM: Tolerance towards communists.
TOLHOMO: Tolerance towards homosexuals.
TOLATH: Tolerance towards atheists.
TOLMIL: Tolerance towards militarists.
General Social Survey (GSS) 1976, 1977
Bakk Z, Kuha J. (2021) Relating latent class membership to external variables: An overview. Br J Math Stat Psychol. 74(2):340-362.
McCutcheon, A. L. (1985). A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly, 49, 474–488.
library(magrittr)
gss500 <- gss7677[1:500,] %>% na.omit
model_stat <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
  estimate(data = gss500, control = list(em.tol = 1e-6))
summary(model_stat)
param(model_stat)

model_tol <- slca(tol(4) ~ TOLRAC + TOLCOM + TOLHOMO + TOLATH + TOLMIL) %>%
  estimate(data = gss500, control = list(em.tol = 1e-6))
summary(model_tol)
param(model_tol)

model_lta <- slca(
  status(3) ~ PAPRES + PADEG + MADEG,
  tol(4) ~ TOLRAC + TOLCOM + TOLHOMO + TOLATH + TOLMIL,
  status ~ tol
) %>% estimate(data = gss500, control = list(em.tol = 1e-6))
summary(model_lta)
param(model_lta)

regress(model_lta, status ~ SEX, gss500)
regress(model_lta, status ~ SEX, gss500, method = "BCH")
regress(model_lta, status ~ SEX, gss500, method = "ML")
An slca model estimated using the NLSY97 dataset.
nlsy_jlcpa
An slcafit object estimated for the JLCPA model using the nlsy97 dataset.
Bureau of Labor Statistics, U.S. Department of Labor. National Longitudinal Survey of Youth 1997 cohort, 1997-2017 (rounds 1-18). Produced and distributed by the Center for Human Resource Research (CHRR), The Ohio State University. Columbus, OH: 2019.
Jeon, S., Seo, T. S., Anthony, J. C., & Chung, H. (2022). Latent Class Analysis for Repeatedly Measured Multiple Latent Class Variables. Multivariate Behavioral Research, 57(2–3), 341–355.
This dataset contains substance use behavior data from the National Longitudinal Survey of Youth 1997 (NLSY97) for three years: 1998, 2003, and 2008. The dataset focuses on youth born in 1984 and tracks three types of substance use behaviors: tobacco/cigarette smoking, alcohol drinking, and marijuana use.
nlsy97
A data frame with 1004 rows and 38 columns:
SEX: Respondent's sex.
RACE: Respondent's race.
ESMK_98, ESMK_03, ESMK_08: (Ever smoked) Ever smoked in 1998, 2003, and 2008 (0: No, 1: Yes)
FSMK_98, FSMK_03, FSMK_08: (Frequent smoke) Monthly smoking in 1998, 2003, and 2008 (0: No, 1: Yes)
DSMK_98, DSMK_03, DSMK_08: (Daily smoke) Daily smoking in 1998, 2003, and 2008 (0: No, 1: Yes)
HSMK_98, HSMK_03, HSMK_08: (Heavy smoke) 10+ cigarettes per day in 1998, 2003, and 2008 (0: No, 1: Yes)
EDRK_98, EDRK_03, EDRK_08: (Ever drunk) Ever drunk in 1998, 2003, and 2008 (0: No, 1: Yes)
CDRK_98, CDRK_03, CDRK_08: (Current drinker) Monthly drinking in 1998, 2003, and 2008 (0: No, 1: Yes)
WDRK_98, WDRK_03, WDRK_08: (Weekly drinker) 5+ days drinking in a month in 1998, 2003, and 2008 (0: No, 1: Yes)
BDRK_98, BDRK_03, BDRK_08: (Binge drinker) 5+ drinks on the same day at least one time in the last 30 days (0: No, 1: Yes)
EMRJ_98, EMRJ_03, EMRJ_08: (Ever marijuana used) Have you ever used marijuana in 1998, 2003, and 2008? (0: No, 1: Yes)
CMRJ_98, CMRJ_03, CMRJ_08: (Current marijuana user) Monthly marijuana use in 1998, 2003, and 2008 (0: No, 1: Yes)
OMRJ_98, OMRJ_03, OMRJ_08: (Occasional marijuana user) 10+ days marijuana use in a month in 1998, 2003, and 2008 (0: No, 1: Yes)
SMRJ_98, SMRJ_03, SMRJ_08: (School/work marijuana user) Marijuana use before/during school or work in 1998, 2003, and 2008 (0: No, 1: Yes)
National Longitudinal Survey of Youth 1997 (NLSY97)
Bureau of Labor Statistics, U.S. Department of Labor. National Longitudinal Survey of Youth 1997 cohort, 1997-2017 (rounds 1-18). Produced and distributed by the Center for Human Resource Research (CHRR), The Ohio State University. Columbus, OH: 2019.
library(magrittr)
nlsy_smoke <- slca(SMK_98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98) %>%
  estimate(data = nlsy97, control = list(verbose = FALSE))
summary(nlsy_smoke)

# JLCA
model_jlca <- slca(
  SMK_98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98,
  DRK_98(3) ~ EDRK_98 + CDRK_98 + WDRK_98 + BDRK_98,
  MRJ_98(3) ~ EMRJ_98 + CMRJ_98 + OMRJ_98 + SMRJ_98,
  SUB_98(4) ~ SMK_98 + DRK_98 + MRJ_98
) %>% estimate(data = nlsy97, control = list(verbose = FALSE))
summary(model_jlca)
param(model_jlca)

# JLCPA
nlsy_jlcpa <- slca(
  SMK_98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98,
  DRK_98(3) ~ EDRK_98 + CDRK_98 + WDRK_98 + BDRK_98,
  MRJ_98(3) ~ EMRJ_98 + CMRJ_98 + OMRJ_98 + SMRJ_98,
  SUB_98(5) ~ SMK_98 + DRK_98 + MRJ_98,
  SMK_03(3) ~ ESMK_03 + FSMK_03 + DSMK_03 + HSMK_03,
  DRK_03(3) ~ EDRK_03 + CDRK_03 + WDRK_03 + BDRK_03,
  MRJ_03(3) ~ EMRJ_03 + CMRJ_03 + OMRJ_03 + SMRJ_03,
  SUB_03(5) ~ SMK_03 + DRK_03 + MRJ_03,
  SMK_08(3) ~ ESMK_08 + FSMK_08 + DSMK_08 + HSMK_08,
  DRK_08(3) ~ EDRK_08 + CDRK_08 + WDRK_08 + BDRK_08,
  MRJ_08(3) ~ EMRJ_08 + CMRJ_08 + OMRJ_08 + SMRJ_08,
  SUB_08(5) ~ SMK_08 + DRK_08 + MRJ_08,
  PROF(4) ~ SUB_98 + SUB_03 + SUB_08,
  constraints = list(
    c("SMK_98", "SMK_03", "SMK_08"),
    c("DRK_98", "DRK_03", "DRK_08"),
    c("MRJ_98", "MRJ_03", "MRJ_08"),
    c("SUB_98 ~ SMK_98", "SUB_03 ~ SMK_03", "SUB_08 ~ SMK_08"),
    c("SUB_98 ~ DRK_98", "SUB_03 ~ DRK_03", "SUB_08 ~ DRK_08"),
    c("SUB_98 ~ MRJ_98", "SUB_03 ~ MRJ_03", "SUB_08 ~ MRJ_08")
  )
) %>% estimate(nlsy97, control = list(verbose = FALSE))
Prints the estimated parameters of an slca model using an slcafit object.
param(object, ...)

## S3 method for class 'slcafit'
param(object, type = c("probs", "logit"), se = FALSE, index = FALSE, ...)
object |
an object of class |
... |
additional arguments passed to other methods. |
type |
a character string specifying the format in which the estimated parameters should be displayed. The options are |
se |
a logical value indicating whether to display standard errors ( |
index |
a logical value indicating whether to include ( |
A list containing the requested estimated parameters or their standard errors (if se = TRUE). The components of the list include:
pi |
Membership probabilities for the root latent variable. |
tau |
Conditional probabilities between latent class variables, represented with uppercase letters to account for measurement invariance. |
rho |
Item response probabilities for each measurement model, represented with lowercase letters to account for measurement invariance. |
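To make the roles of pi and tau concrete, the base R sketch below combines a root-class distribution with a conditional probability matrix to obtain the marginal distribution of a child latent class variable. All numbers are made up, and the row/column layout of `tau` is an assumption for illustration; the orientation printed by param may differ.

```r
# Hypothetical parameters for a root variable C1 and a child variable C2
pi_root <- c(0.5, 0.3, 0.2)  # P(C1 = a)

# Assumed layout: tau[a, b] = P(C2 = b | C1 = a), so each row sums to 1
tau <- matrix(c(0.8, 0.1, 0.1,
                0.2, 0.6, 0.2,
                0.1, 0.2, 0.7),
              nrow = 3, byrow = TRUE)

# Marginal class prevalences of the child variable
pi_child <- as.vector(pi_root %*% tau)
pi_child
```

Because pi is reported only for the root variable, a calculation of this kind is one way to recover prevalences for the remaining latent class variables.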
Provides predicted class memberships or posterior probabilities for new data based on a fitted slca model.
## S3 method for class 'slcafit'
predict(object, newdata, type = c("class", "posterior"), ...)
object |
An object of class |
newdata |
A |
type |
A character string indicating the type of prediction. Use |
... |
Additional arguments passed to other methods. |
A data.frame or list, depending on type:
For type = "class", a data.frame is returned where rows represent observations and columns correspond to latent class variables.
For type = "posterior", a list is returned containing data.frames with posterior probabilities for each latent class variable.
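The two return types are related: a class prediction can be derived from posterior probabilities by modal assignment. The sketch below shows that step in base R on a made-up posterior matrix; it assumes modal assignment is what the "class" type reports, which this section does not state explicitly.

```r
# Hypothetical posterior probabilities: four observations, three classes
posterior <- matrix(c(0.70, 0.20, 0.10,
                      0.05, 0.15, 0.80,
                      0.30, 0.60, 0.10,
                      0.40, 0.35, 0.25),
                    ncol = 3, byrow = TRUE,
                    dimnames = list(NULL, paste0("class", 1:3)))

# Modal assignment: choose the class with the largest posterior probability
modal_class <- apply(posterior, 1, which.max)
modal_class
```

The last row shows why posteriors can be more informative than hard assignments: the winning class has only a narrow lead over the runner-up.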
Performs regression analysis to examine the influence of exogenous (external) variables on latent class variables in an estimated slca model. The function uses logistic regression with a three-step approach to account for measurement error.
regress(object, ...)

## S3 method for class 'slcafit'
regress(
  object, formula, data = parent.frame(),
  imputation = c("modal", "prob"),
  method = c("naive", "BCH", "ML"), ...
)
object |
an object of class |
... |
additional arguments. |
formula |
a formula specifying the regression model, including both latent class variables (from the estimated model) and exogenous variables. |
data |
an optional |
imputation |
a character string specifying the imputation method for latent class assignment. Options include:
|
method |
a character string specifying the method to adjust for bias in the three-step approach. Options include:
|
A list of class reg.slca with the following components:
coefficients |
A matrix of regression coefficients representing the odds ratios for each latent class against the baseline class (the last class). |
std.err |
A matrix of standard errors corresponding to the regression coefficients. |
vcov |
The variance-covariance matrix of the regression coefficients. |
dim |
The dimensions of the coefficients matrix. |
ll |
The log-likelihood of the regression model. |
The summary function can be used to display the regression coefficients, standard errors, Wald statistics, and p-values.
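The Wald statistics and p-values mentioned above can be reproduced from coefficients and standard errors in base R. The sketch uses made-up estimates, assumes they are on the log-odds scale, and assumes a two-sided normal reference; the exact computation inside summary is not documented in this section.

```r
# Hypothetical estimates (assumed log-odds scale) and standard errors
est <- c("(Intercept)" = -0.85, SEXFemale = 0.42)
se <- c(0.20, 0.16)

# Wald statistic and two-sided p-value under a standard normal reference
wald_z <- est / se
p_value <- 2 * pnorm(abs(wald_z), lower.tail = FALSE)

data.frame(Estimate = est, Std.Error = se, Wald = wald_z, p.value = p_value)
```

Exponentiating an estimate on this scale gives the corresponding odds ratio relative to the baseline class.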
Vermunt, J. K. (2010). Latent Class Modeling with Covariates: Two Improved Three-Step Approaches. Political Analysis, 18(4), 450–469. http://www.jstor.org/stable/25792024
library(magrittr)
names(nlsy97)
nlsy_jlcpa %>% regress(SMK_98 ~ SEX, nlsy97)
nlsy_jlcpa %>% regress(PROF ~ SEX, nlsy97)
Reorders the latent class membership for specified latent class variables in an slcafit object.
## S3 method for class 'slcafit'
reorder(x, ...)
x |
an object of class |
... |
additional arguments specifying the new order for the latent class variables. |
A modified slcafit object with the latent classes reordered according to the specified order.
library(magrittr)
nlsy_jlcpa %>% param

# Reorder the RHO parameters in ascending order
reordered1 <- nlsy_jlcpa %>%
  reorder(SMK_98 = c(1, 3, 2),
          DRK_98 = c(3, 2, 1),
          MRJ_98 = c(3, 1, 2))
reordered1 %>% param
# Label class1: nonuse
#       class2: lifetime use
#       class3: current use

# Reorder the TAU parameters for joint classes in ascending order
reordered2 <- reordered1 %>%
  reorder(SUB_98 = c(3, 4, 5, 1, 2))
reordered2 %>% param
# Label class1: nonuse
#       class2: heavy drinking only
#       class3: not heavy use
#       class4: heavy drinking & smoking
#       class5: heavy use

# Reorder the TAU parameters for profiles in ascending order
reordered3 <- reordered2 %>%
  reorder(PROF = c(4, 1, 3, 2))
reordered3 %>% param
# Label class1: nonuse stayer
#       class2: heavy drinking advancer
#       class3: heavy drk & smk advancer
#       class4: heavy use advancer
Simulates data based on a specified slca model. If the model parameters are not already estimated, they can either be provided by the user or generated randomly.
## S3 method for class 'slca'
simulate(object, nsim = 500, seed = NULL, parm, nlevel, ...)
object |
an |
nsim |
an integer specifying the number of response observations to simulate. The default is 500. |
seed |
an integer specifying the random seed for reproducibility. If not provided, results will vary across runs. |
parm |
a user-specified set of parameters to guide the simulation. This is required if the model has not been previously estimated. |
nlevel |
an integer or integer vector specifying the number of levels for each manifest item in the model. If a single integer is provided, all manifest items will have the same number of levels. The default is 2. |
... |
Additional arguments passed to other methods. |
A list with the following components:
class |
A |
response |
A |
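The generative process behind a single latent class variable can be sketched in base R: draw a class for each observation, then draw each item conditional on that class. This is an illustration of the idea under made-up parameters, not the package's simulate implementation; the names `pi_class`, `rho`, and `cls` are hypothetical.

```r
set.seed(1)
n <- 200

# Hypothetical class prevalences for three classes
pi_class <- rep(1 / 3, 3)

# rho[c, j] = P(item j takes level 1 | class c), hypothetical values
rho <- rbind(c(.9, .9, .9, .9),
             c(.9, .9, .1, .1),
             c(.1, .1, .1, .1))

# Step 1: draw latent class memberships
cls <- sample(1:3, n, replace = TRUE, prob = pi_class)

# Step 2: draw each binary item conditional on the class of each observation
response <- sapply(1:4, function(j) ifelse(runif(n) < rho[cls, j], 1L, 2L))
colnames(response) <- paste0("y", 1:4)

# Bundle the result as a list with class and response components
sim <- list(class = data.frame(lc = cls),
            response = as.data.frame(response))
sapply(sim$response, table)
```

Items are drawn independently within a class, which is the local independence assumption underlying latent class models.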
m1 <- slca(lc1[3] ~ x1 + x2 + x3 + x4 + x5,
           lc2[4] ~ y1 + y2 + y3 + y4 + y5)
sim <- simulate(m1, 1000)
sapply(sim$class, table)

# simulate data with defined number of levels of manifest items
m2 <- slca(lc1[3] ~ x1 + x2 + x3 + x4)
sim <- simulate(m2, nlevel = c(3, 3, 3, 3))
d <- sim$response
sapply(d, table)

sim <- simulate(m2, nlevel = c(x1 = 2, x3 = 3, x4 = 4, x5 = 5))
d <- sim$response
sapply(d, table)

# simulate data with user-defined parameters
pi <- rep(1 / 3, 3)
rho <- c(.9, .1, .9, .1, .9, .1, .9, .1,
         .9, .1, .9, .1, .1, .9, .1, .9,
         .1, .9, .1, .9, .1, .9, .1, .9)
par <- c(pi, rho)
m3 <- slca(lc[3] ~ y1 + y2 + y3 + y4)
sim <- simulate(m3, parm = par)
mf <- estimate(m3, sim$response)
param(mf)
Constructs a latent structure with multiple latent class variables.
slca(formula = NULL, ..., constraints = NULL)
formula |
a formula specifying the latent structure. Detailed model specifications are provided under 'Details'. |
... |
additional formulae for defining the model structure. |
constraints |
a list of constraints to enforce measurement invariance. Detailed explanations of applying constraints are available under 'Details'. |
The formula can be categorized into three types, each serving a distinct purpose:
Defining Latent Class Variables with Manifest Indicators: Specify the relationship between a latent class variable and its manifest indicators. The latent class variable is on the left-hand side (lhs), followed by the number of classes in square brackets [] or parentheses (), and manifest indicators are listed on the right-hand side (rhs). For example:
LC1[k] ~ x1 + x2 + x3
LC2[k] ~ y1 + y2 + y3
LC3(k) ~ z1 + z2 + z3
Here, k denotes the number of latent classes for the variable.
Relating Latent Class Variables to Each Other: Define relationships where one latent class variable influences another. For example:
LC1 ~ LC2
This formula implies that LC2 is conditionally dependent on LC1.
Defining Higher-Level Latent Class Variables: Specify relationships where a latent class variable is measured by other latent class variables instead of manifest indicators. For example:
P[k] ~ LC1 + LC2 + LC3
This indicates that the latent variable P is measured by the latent class variables LC1, LC2, and LC3.
In all formulas, variables on the lhs influence those on the rhs.
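Both bracket styles are ordinary R syntax, so a measurement formula can be inspected with base R alone. The sketch below uses a hypothetical helper, `parse_measure()`, which is not part of the package and is not meant to mirror its parser. It only shows how the latent variable name, the number of classes, and the indicators sit inside the formula object.

```r
# Hypothetical helper (not part of slca): decompose a measurement formula.
# Handles only formulas whose lhs is LC[k] or LC(k).
parse_measure <- function(f) {
  lhs <- f[[2]]                                  # a call such as LC1[3] or LC3(2)
  rhs <- f[[3]]
  if (identical(lhs[[1]], as.name("["))) {       # square-bracket form
    latent <- as.character(lhs[[2]])
    nclass <- lhs[[3]]
  } else {                                       # parenthesis form
    latent <- as.character(lhs[[1]])
    nclass <- lhs[[2]]
  }
  list(latent = latent, nclass = nclass, indicators = all.vars(rhs))
}

res1 <- parse_measure(LC1[3] ~ x1 + x2 + x3)
res2 <- parse_measure(LC3(2) ~ z1 + z2 + z3)
str(res1)
str(res2)
```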
The constraints argument enforces specific conditions to ensure precise inference, such as measurement invariance. This is particularly useful for longitudinal analysis (e.g., LTA or LCPA), where consistent meanings of latent classes across time are essential.
Measurement Invariance for the Measurement Model: Ensures probabilities associated with latent class variables remain consistent. For example:
c("LC1", "LC2", "LC3")
This ensures that LC1, LC2, and LC3 have semantically consistent measurement probabilities.
Measurement Invariance for the Structural Model: Applies constraints to ensure consistent interpretations of transition probabilities between latent class variables. For example:
c("P ~ LC1", "P -> LC2")
This ensures that the transitions from P to LC1 and P to LC2 are consistent.
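One practical effect of measurement invariance is a smaller number of free item-response probabilities. The arithmetic below is a sketch under two assumptions: invariance equates the full set of item-response probabilities across the constrained latent variables, and every latent variable has the same number of classes and the same indicator structure. The helper `n_rho()` is illustrative and is not a package function.

```r
# Free item-response probabilities for one latent class variable:
# each class has (number of levels - 1) free probabilities per indicator.
n_rho <- function(nclass, nlevel) nclass * sum(nlevel - 1)

nclass <- 3
nlevel <- rep(2, 4)        # four binary indicators per time point
waves  <- 3                # e.g. LC1, LC2, LC3

free_unconstrained <- waves * n_rho(nclass, nlevel)  # separate probabilities per wave
free_invariant     <- n_rho(nclass, nlevel)          # one shared set across waves

c(unconstrained = free_unconstrained, invariant = free_invariant)
```

Here the constrained model estimates 12 item-response probabilities instead of 36, which is why the classes keep the same interpretation at every time point.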
An object of class slca with the following components:
tree |
A |
latent |
A |
measure |
A |
struct |
A |
The printed model description is divided into four parts:
Latent variables: Lists the latent class variables and the number of classes for each variable. The root variable is marked with an asterisk (*).
Measurement model: Displays manifest indicators for each latent class variable and any applied measurement constraints (lowercase letters indicate consistency).
Structural model: Describes the conditional relationships between latent class variables.
Dependency constraints: Outlines constraints applied to conditional dependencies, where uppercase letters represent consistent dependency structures.
# Standard LCA
slca(lc[3] ~ y1 + y2 + y3 + y4)

# Latent transition analysis (LTA)
slca(lx[3] ~ x1 + x2 + x3 + x4,
     ly[2] ~ y1 + y2 + y3 + y4,
     lx ~ ly)

# LTA with measurement invariance
slca(l1[3] ~ y11 + y21 + y31 + y41,
     l2[3] ~ y12 + y22 + y32 + y42,
     l1 ~ l2, constraints = c("l1", "l2"))

# Joint latent class analysis
slca(lx[2] ~ x1 + x2 + x3 + x4,
     ly[3] ~ y1 + y2 + y3 + y4,
     lz[2] ~ z1 + z2 + z3 + z4,
     jc[3] ~ lx + ly + lz)

# Latent class profile analysis (with measurement invariance)
slca(l1[3] ~ x1 + x2 + x3 + x4,
     l2[3] ~ y1 + y2 + y3 + y4,
     l3[3] ~ z1 + z2 + z3 + z4,
     pf[4] ~ l1 + l2 + l3,
     constraints = c("l1", "l2", "l3"))
slca Estimation
Specifies control parameters for estimating an slca model.
slcaControl(
  em.iterlim = 5000,
  em.tol = 1e-08,
  nlm.iterlim = 1000,
  nlm.tol = 1e-10,
  init.param = NULL,
  nrep = 1,
  test.iter = 500,
  na.rm = FALSE,
  verbose = FALSE
)
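To illustrate what an iteration limit and a convergence tolerance such as em.iterlim and em.tol control, here is a toy EM loop for a two-class model with binary items, written in base R. It is a sketch, not the package's estimator: the function `em_lca()` is hypothetical, and the stopping rule (increase in log-likelihood below the tolerance) is an assumption about how such a tolerance is typically applied.

```r
set.seed(42)
# Toy data: two classes, four binary items coded 0/1.
n <- 500
z <- rbinom(n, 1, 0.4)
y <- sapply(1:4, function(j) rbinom(n, 1, ifelse(z == 1, 0.8, 0.2)))

# Hypothetical EM routine; argument names borrowed from slcaControl() for illustration.
em_lca <- function(y, em.iterlim = 5000, em.tol = 1e-8) {
  p   <- 0.5                                          # P(class 2)
  rho <- rbind(rep(0.3, ncol(y)), rep(0.7, ncol(y)))  # P(y = 1 | class)
  ll_old    <- -Inf
  converged <- FALSE
  for (iter in seq_len(em.iterlim)) {
    # E-step: joint probability of each response pattern with each class.
    d1 <- (1 - p) * apply(y, 1, function(r) prod(rho[1, ]^r * (1 - rho[1, ])^(1 - r)))
    d2 <- p       * apply(y, 1, function(r) prod(rho[2, ]^r * (1 - rho[2, ])^(1 - r)))
    ll <- sum(log(d1 + d2))
    if (ll - ll_old < em.tol) {                       # tolerance reached
      converged <- TRUE
      break
    }
    ll_old <- ll
    post   <- d2 / (d1 + d2)                          # posterior P(class 2 | y)
    # M-step: update prevalence and item-response probabilities.
    p        <- mean(post)
    rho[1, ] <- colSums((1 - post) * y) / sum(1 - post)
    rho[2, ] <- colSums(post * y) / sum(post)
  }
  list(p = p, rho = rho, loglik = ll, iterations = iter, converged = converged)
}

fit   <- em_lca(y)
short <- em_lca(y, em.iterlim = 2)   # stops at the iteration limit instead
fit[c("p", "iterations", "converged")]
```

A tighter tolerance trades more iterations for a more precise maximum, while the iteration limit guards against runs that never meet the tolerance.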
em.iterlim |
an integer specifying the maximum number of iterations allowed for the EM algorithm. The default is 5000. |
em.tol |
a numeric value setting the tolerance for convergence of the EM algorithm. The default is 1e-08. |
nlm.iterlim |
an integer specifying the maximum number of iterations allowed when using the nlm function. The default is 1000. |
nlm.tol |
a numeric value setting the tolerance for convergence of the nlm function. The default is 1e-10. |
init.param |
a numeric vector specifying the initial parameter values for estimation. |
nrep |
an integer specifying the number of estimation trials. The default is 1. |
test.iter |
an integer specifying the maximum number of iterations allowed for parameter testing. The default is 500. |
na.rm |
a logical value indicating whether to remove observations containing missing values (NA). The default is FALSE. |
verbose |
a logical value indicating whether to display progress updates during the estimation process. The default is FALSE. |
A