Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Problems with lasso regression: lambda and confusion matrix

I am trying to fit a lasso regression to predict employee turnover, following the code I found at this link: https://www.kaggle.com/acasalan/predict-bank-turnover-lasso-regression.

Two things in my results seem strange to me:

  1. lambda.min and lambda.1se are equal;
  2. the confusion matrix does not show the positive class.

Here is the code:

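library(magrittr) # provides %>%
library(caret)    # provides createDataPartition()
library(glmnet)   # provides glmnet() and cv.glmnet()

# Split the data into training and test sets
set.seed(123) # note to self: look up the meaning of this value
training.samples <- Dati3$Dimissioni %>%
  createDataPartition(p = 0.7, list = FALSE) # random split: 70% of rows for building the model, 30% for evaluating it
train.data <- Dati3[training.samples, ]
test.data <- Dati3[-training.samples, ] # held-out rows, used for prediction further down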

x <- model.matrix(Dimissioni~.,train.data)[,-1]
# Outcome vector: for family = "binomial", glmnet accepts a two-level factor
y <- train.data$Dimissioni
# glmnet() from the glmnet package fits a penalized logistic regression (alpha = 1 is the lasso penalty)

glmnet(x, y, family = "binomial", alpha = 1, lambda = NULL)

# Find the best lambda using cross-validation
set.seed(123) 
cv.lasso <- cv.glmnet(x, y, alpha = 1, family = "binomial")
# In the plot, the left dashed vertical line marks log(lambda.min), here approximately -5:
# the value of lambda that minimizes the cross-validated prediction error.
plot(cv.lasso)

cv.lasso$lambda.min # lambda that minimizes the cross-validated error
cv.lasso$lambda.1se # largest lambda (simplest model) whose error is within one standard error of the minimum
# both return the same value: 0.008018156

# Using lambda.min as the best lambda, gives the following regression coefficients
coef(cv.lasso, cv.lasso$lambda.min)

# Final model with lambda.min (lambda.1se gives the same model here, since the two values coincide)
lasso.model2 <- glmnet(x, y, alpha = 1, family = "binomial",
                      lambda = cv.lasso$lambda.min)

# Make prediction on test data
x.test <- model.matrix(Dimissioni ~., test.data)[,-1]
probabilities2 <- lasso.model2 %>% predict(newx = x.test)
predicted.classes2 <- ifelse(probabilities2 > 0.5, "pos", "neg")

# Model accuracy
observed.classes2 <- test.data$Dimissioni
mean(predicted.classes2 == observed.classes2)

# Confusion matrix (rows: predicted classes, columns: observed classes)
second <- table(predicted.classes2, observed.classes2)
second

# Precision: share of predicted positives (row 2) that are observed positives
round(second[2, 2] / (second[2, 2] + second[2, 1]), 4)
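
For reference, the one-standard-error rule behind lambda.1se can be sketched in base R. The numbers below are hypothetical and are not taken from the data in the question; they only illustrate one situation in which the two values coincide, namely when the minimum cross-validated error sits at the largest lambda on the grid. Whether that is what happens here cannot be determined from the post alone.

```r
# Hypothetical cross-validation summary (illustrative values, not from Dati3)
lambda <- c(0.0080, 0.0073, 0.0067, 0.0061, 0.0055) # decreasing, like a glmnet path
cvm    <- c(0.600, 0.605, 0.611, 0.618, 0.624)      # mean CV error per lambda
cvsd   <- c(0.010, 0.010, 0.011, 0.011, 0.012)      # standard error of each mean

# lambda.min: the lambda with the smallest mean CV error
i.min      <- which.min(cvm)
lambda.min <- lambda[i.min]

# One-standard-error rule: the largest lambda whose mean error
# is no more than one standard error above the minimum
threshold  <- cvm[i.min] + cvsd[i.min]
lambda.1se <- max(lambda[cvm <= threshold])

lambda.min == lambda.1se # TRUE: no larger lambda exists on this grid
```

On a real fit, comparing cv.lasso$lambda.min with cv.lasso$lambda[1] (the largest value tried) and looking at cv.lasso$nzero would show whether the selected model sits at that end of the path.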

These are the results of the confusion matrix:

[Image: confusion matrix output]
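
One possible cause of the missing positive class, offered as an assumption since the confusion matrix itself is not reproduced here: predict() on a glmnet fit returns values on the link (log-odds) scale unless type = "response" is requested, so a 0.5 cut-off applied to the default output is stricter than a 0.5 cut-off on probabilities. The sketch below uses made-up log-odds and labels in base R to show the difference, and also shows that fixing the factor levels keeps the "pos" row in the table even when nothing is predicted positive.

```r
# Hypothetical linear predictors (log-odds), standing in for the default predict() output
log.odds <- c(-2.0, -0.5, 0.2, 0.4, 1.5)
probs    <- plogis(log.odds) # inverse logit: the scale that type = "response" returns

# The same 0.5 cut-off gives different classes on the two scales
pred.from.link <- ifelse(log.odds > 0.5, "pos", "neg")
pred.from.prob <- ifelse(probs > 0.5, "pos", "neg")

# Hypothetical observed labels with both levels declared
observed <- factor(c("neg", "neg", "pos", "neg", "pos"), levels = c("neg", "pos"))

# Declaring the levels on the predictions keeps both rows in the table
table(predicted = factor(pred.from.link, levels = c("neg", "pos")), observed)
table(predicted = factor(pred.from.prob, levels = c("neg", "pos")), observed)

# Even with no predicted positives the table stays 2 x 2, so [2, 2] can be indexed
all.neg <- factor(rep("neg", 5), levels = c("neg", "pos"))
dim(table(all.neg, observed))
```

In the code from the question, the corresponding change would be predict(newx = x.test, type = "response"). The "pos" and "neg" labels would also need to match the actual levels of Dimissioni for the accuracy comparison to be meaningful.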

Thanks for the help.

Question from: https://stackoverflow.com/questions/65830076/problems-with-lasso-regression-lambda-and-confusion-matrix
