As advised, I tried some code samples from another forum, but they did not fix the problem. My question is about the error “Error in eval(family$initialize) : y values must be 0 <= y <= 1” in R: how do I solve it? My script is:
library(ISLR)
dataCancer <- read.csv("~/Desktop/Isep/Machine Leaning/TD/Project_Cancer/dataR2.csv")
attach(dataCancer)
names(dataCancer)
summary(dataCancer)
cor(dataCancer[,-11])
pairs(dataCancer[,-11])
#Step : Split data into training and testing data
training = (BMI>25)
testing = !training
training_data = dataCancer[training,]
testing_data = dataCancer[testing,]
Classification_testing = Classification[testing]
#Step : Fit a logistic regression model using training data
as.factor(dataCancer$Classification)
classification_model = glm(Classification ~ ., data =
training_data,family = binomial )
summary(classification_model)
And the result:
> classification_model = glm(Classification ~ ., data = training_data,family = binomial )
Error in eval(family$initialize) : y values must be 0 <= y <= 1
> summary(classification_model)
Error in summary(classification_model) : object 'classification_model' not found
What does the message mean, and how can I fix it? If you have a better answer, please leave it in the answer box below.
The cause:
You added the line
as.factor(dataCancer$Classification)
to the script, but even though the dataset dataCancer is attached, this call does not transform the variable Classification into a factor, because its result is never assigned back. It only prints a factor to the console.
Solution:
As you want to fit the model on the training dataset, you can either convert the Classification column of training_data to a factor before calling glm, or call as.factor directly on the response inside the glm formula.
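Both options can be sketched as follows. This is only an illustration: the small data frame is made up so the snippet runs without dataR2.csv, and it assumes Classification is stored as the numbers 1 and 2, which would explain the error message. The names `fixed`, `model_a`, `model_b` and `err` are hypothetical.

```r
# Made-up stand-in for training_data (NOT the real dataR2.csv).
# Assumption: Classification is coded as the numbers 1 and 2.
training_data <- data.frame(
  Age            = c(30, 70, 30, 70, 50, 40, 60, 45, 55, 50),
  Glucose        = c(80, 120, 120, 80, 100, 90, 110, 105, 95, 85),
  Classification = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 1)
)

# This only prints a factor; the column itself is left unchanged.
as.factor(training_data$Classification)
class(training_data$Classification)  # still "numeric"

# Fitting on the numeric 1/2 response reproduces the error.
err <- tryCatch(
  glm(Classification ~ ., data = training_data, family = binomial),
  error = function(e) conditionMessage(e)
)
print(err)

# Option 1: assign the factor back to the column, then fit.
fixed <- training_data
fixed$Classification <- as.factor(fixed$Classification)
model_a <- glm(Classification ~ ., data = fixed, family = binomial)

# Option 2: convert the response inside the formula.
model_b <- glm(as.factor(Classification) ~ ., data = training_data,
               family = binomial)

summary(model_a)
```

Either way, glm receives a factor response and treats its first level as "failure" and the other level as "success", so the two fits should give the same coefficients.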
The error asks for y values between 0 and 1 because categorical variables such as Direction in your data are of type ‘character’. You need to convert them to type ‘factor’ with
as.factor(data$Direction)
and store the result back in the data frame. After that,
glm(Direction ~ lag2, data=...)
does not require stock.direction to be declared separately. The command
class(variable)
can be used to check the class of a variable. If it is character, you can convert it to a factor and store it as a new column in the same data frame. The glm command should then work.
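A minimal sketch of that check-and-convert step, assuming a data frame with a character column Direction and a numeric column lag2. The data frame `stock`, its values, and the new column name `Direction_f` are invented for illustration only.

```r
# Hypothetical data: Direction is stored as character, lag2 is numeric.
stock <- data.frame(
  lag2      = c(-1.2, 0.8, 0.3, -0.5, 1.1, -0.9, 0.4, 1.5, -0.2, 0.6),
  Direction = c("Down", "Up", "Down", "Up", "Up",
                "Down", "Up", "Down", "Down", "Up"),
  stringsAsFactors = FALSE
)

# Check the class of the response first.
class(stock$Direction)    # "character"

# Convert it and keep the result as a new column in the same data frame.
stock$Direction_f <- as.factor(stock$Direction)
class(stock$Direction_f)  # "factor"
levels(stock$Direction_f) # "Down" "Up"

# With a factor response, the binomial glm can be fitted.
fit <- glm(Direction_f ~ lag2, data = stock, family = binomial)
summary(fit)
```

Keeping the converted values in a separate column leaves the original character column available for comparison. Overwriting Direction in place would work just as well.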