Function to find the Maximum Likelihood Estimates of regression parameters using TensorFlow.
mlereg_tf(
  ydist = y ~ Normal,
  formulas,
  data,
  available_distribution = TRUE,
  fixparam = NULL,
  initparam = NULL,
  link_function = NULL,
  optimizer = "AdamOptimizer",
  hyperparameters = NULL,
  maxiter = 10000,
  tolerance = .Machine$double.eps
)
ydist | an object of class "formula" that specifies the distribution of the response variable. The default value is y ~ Normal. |
---|---|
formulas | a list containing objects of class "formula". Each element of the list represents the linear predictor for one of the parameters of the regression model. The linear predictor is specified with the name of the parameter and it must contain an |
data | a data frame containing the response variable and the covariates. |
available_distribution | logical. If |
fixparam | a list containing the fixed parameters of the model, if any. Both the names and the values of the parameters must be specified in the list. |
initparam | a list with the initial values of the regression coefficients to be estimated. The list must contain the names and values of the regression coefficients. To use the same initial value for all regression coefficients associated with a specific parameter, specify the name of the parameter and the value. If NULL, the default initial value is zero. |
link_function | a list with names of parameters to be linked and the corresponding link function name. The available link functions are:
|
optimizer | a character string with the name of the TensorFlow optimizer to be used in the optimization process. The default value is "AdamOptimizer". |
hyperparameters | a list with the hyperparameter values of the selected TensorFlow optimizer. If the hyperparameters are not specified, their default values are used in the optimization process. For more details on the hyperparameters, see https://www.tensorflow.org/api_docs/python/tf/compat/v1/train |
maxiter | a positive integer indicating the maximum number of iterations for the optimization algorithm. The default value is 10000. |
tolerance | a small positive number. When the difference between the loss value or the parameter values from one iteration to the next is lower than this value, the optimization process stops. The default value is .Machine$double.eps. |
This function returns the estimates, standard errors, Z-scores, and p-values of the significance tests for the regression model coefficients, as well as information about the optimization process, such as the number of iterations needed for convergence.
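The relationship between the returned columns can be sketched as follows. This is a minimal illustration that assumes the usual Wald test (Z-score equals estimate divided by standard error, with a two-sided normal p-value); the page does not state the formula, but the values in the Poisson example output are consistent with it. The object names below are illustrative and are not part of the package.

```r
# Wald test sketch: z = estimate / standard error, two-sided normal p-value.
# Inputs are copied from the Poisson example output on this page.
estimate  <- c(`(Intercept)` = 3.0446310, outcome2 = -0.4541281, outcome3 = -0.2930921)
std_error <- c(0.1708925, 0.2021639, 0.1927491)

z_value <- estimate / std_error
p_value <- 2 * pnorm(-abs(z_value))

# Collect the results in a table similar in shape to the summary output.
wald_table <- data.frame(estimate, std_error, z_value, p_value)
print(round(wald_table, 4))
```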
mlereg_tf computes the log-likelihood function based on the distribution specified in ydist and the linear predictors specified in formulas. Then, it finds the values of the regression coefficients that maximize this function using the TensorFlow optimizer specified in optimizer.
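The idea of building a log-likelihood from a distribution and a linear predictor, then optimizing it, can be sketched in base R. This is not how mlereg_tf is implemented: base R's optim() stands in for the TensorFlow optimizer, and the names X, neg_loglik, start and fit are illustrative. The data and the log link are taken from the Poisson example on this page.

```r
# Illustrative only: optim() replaces the TensorFlow optimizer here.
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
data <- data.frame(treatment, outcome, counts)

# Design matrix for the linear predictor of lambda (~ outcome + treatment).
X <- model.matrix(~ outcome + treatment, data = data)

# Negative log-likelihood of a Poisson model with a log link:
# lambda_i = exp(x_i' beta).
neg_loglik <- function(beta) {
  lambda <- exp(drop(X %*% beta))
  -sum(dpois(data$counts, lambda = lambda, log = TRUE))
}

# Starting values (an assumption of this sketch): intercept at log(mean), rest zero.
start <- c(log(mean(data$counts)), rep(0, ncol(X) - 1))
fit <- optim(start, neg_loglik, method = "BFGS", hessian = TRUE)

# Estimates and standard errors from the inverse of the numerical Hessian.
estimates <- setNames(fit$par, colnames(X))
std_errors <- sqrt(diag(solve(fit$hessian)))
print(round(cbind(estimates, std_errors), 4))
```

Minimizing the negative log-likelihood is equivalent to maximizing the log-likelihood, which is why the sketch passes neg_loglik to a minimizer.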
The R function that contains the probability mass/density function must not contain curly brackets other than those that enclose the function, that is, those that define the beginning and end of the R function.
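As a hedged illustration of this restriction, the sketch below defines a density whose only curly brackets are the enclosing pair. The function name pdf_exp and its argument names x and rate are hypothetical, and this page does not show how such a function is passed to mlereg_tf, so the sketch covers only the shape of the function itself.

```r
# Hypothetical user-defined density: exponential with rate `rate`.
# The only curly brackets are the ones enclosing the function body.
pdf_exp <- function(x, rate) {
  rate * exp(-rate * x)
}

# A version like the following would break the rule because of the inner braces:
# pdf_exp_bad <- function(x, rate) {
#   if (rate > 0) { rate * exp(-rate * x) } else { NaN }
# }

# Sanity check: the density should integrate to approximately 1 over its support.
area <- integrate(pdf_exp, lower = 0, upper = Inf, rate = 2)$value
print(area)
```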
The summary, print, and plot_loss functions can be used with an mlereg_tf object.
Sara Garcés Céspedes sgarcesc@unal.edu.co
#----------------------------------------------------------------------------------
# Estimation of coefficients of a Poisson regression model

# Data frame with response variable and covariates
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
data <- data.frame(treatment, outcome, counts)

# Use the mlereg_tf function
estimation_1 <- mlereg_tf(ydist = counts ~ Poisson,
                          formulas = list(lambda = ~ outcome + treatment),
                          data = data,
                          initparam = list(lambda = 1.0),
                          optimizer = "AdamOptimizer",
                          link_function = list(lambda = "log"),
                          hyperparameters = list(learning_rate = 0.1))

# Get the summary of the estimates
summary(estimation_1)
#> Distribution: Poisson
#> Number of observations: 9
#> TensorFlow optimizer: AdamOptimizer
#> Negative log-likelihood: -274.7378
#> Loss function convergence, 110 iterations needed.
#> ----------------------------------------------------------------
#> Distributional parameter: lambda
#> ----------------------------------------------------------------
#>              Estimate. Std..Error Z.value Pr...z..
#> (Intercept)  3.0446310  0.1708925  17.816   <2e-16 ***
#> outcome2    -0.4541281  0.2021639  -2.246   0.0247 *
#> outcome3    -0.2930921  0.1927491  -1.521   0.1284
#> treatment2  -0.0001777  0.1999979  -0.001   0.9993
#> treatment3  -0.0001776  0.1999979  -0.001   0.9993
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> ----------------------------------------------------------------

#----------------------------------------------------------------------------------
# Estimation of coefficients of a linear regression model with one fixed parameter

# Data frame with response variable and covariates
x <- runif(n = 1000, -3, 3)
y <- rnorm(n = 1000, mean = 5 - 2 * x, sd = 3)
data <- data.frame(y = y, x = x)

# Use the mlereg_tf function
estimation_2 <- mlereg_tf(ydist = y ~ Normal,
                          formulas = list(mean = ~ x),
                          data = data,
                          fixparam = list(sd = 3),
                          initparam = list(mean = list(Intercept = 1.0, x = 0.0)),
                          optimizer = "AdamOptimizer",
                          hyperparameters = list(learning_rate = 0.1))

# Get the summary of the estimates
summary(estimation_2)
#> Distribution: Normal
#> Number of observations: 1000
#> TensorFlow optimizer: AdamOptimizer
#> Negative log-likelihood: 652.0222
#> Loss function convergence, 115 iterations needed.
#> ----------------------------------------------------------------
#> Distributional parameter: mean
#> ----------------------------------------------------------------
#>             Estimate. Std..Error Z.value Pr...z..
#> (Intercept)   4.86203    0.09489   51.24   <2e-16 ***
#> x            -1.95539    0.05372  -36.40   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> ----------------------------------------------------------------