Title: | Multiscale Geographically Weighted Negative Binomial Regression |
---|---|
Description: | Fits a geographically weighted regression model with different scales for each covariate. Uses the negative binomial distribution as default, but also accepts the normal, Poisson, or logistic distributions. Can fit the global versions of each regression and also the geographically weighted alternatives with only one scale, since they are all particular cases of the multiscale approach. Hanchen Yu (2024). "Exploring Multiscale Geographically Weighted Negative Binomial Regression", Annals of the American Association of Geographers <doi:10.1080/24694452.2023.2289986>. Fotheringham AS, Yang W, Kang W (2017). "Multiscale Geographically Weighted Regression (MGWR)", Annals of the American Association of Geographers <doi:10.1080/24694452.2017.1352480>. Da Silva AR, Rodrigues TCV (2014). "Geographically Weighted Negative Binomial Regression - incorporating overdispersion", Statistics and Computing <doi:10.1007/s11222-013-9401-9>. |
Authors: | Juliana Rosa [aut, cre], Jéssica Vasconcelos [aut], Alan da Silva [aut] |
Maintainer: | Juliana Rosa <[email protected]> |
License: | GPL-3 |
Version: | 0.2.0 |
Built: | 2025-02-24 05:18:17 UTC |
Source: | https://github.com/julianamrosa/mgwnbr |
The Georgia census data set from Fotheringham et al. (2002) in data frame format.
data(georgia)
A data frame with 159 observations on the following 13 variables:
AreaKey
- an identification number for each county
Latitude
- the latitude of the county centroid
Longitud
- the longitude of the county centroid
TotPop90
- population of the county in 1990
PctRural
- percentage of the county population defined as rural
PctBach
- percentage of the county population with a bachelor's degree
PctEld
- percentage of the county population aged 65 or over
PctFB
- percentage of the county population born outside the US
PctPov
- percentage of the county population living below the poverty line
PctBlack
- percentage of the county population who are black
ID
- a numeric vector of IDs
X
- a numeric vector of x coordinates
Y
- a numeric vector of y coordinates
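A quick way to check the layout described above is to load the data and inspect it. The following is a minimal sketch that assumes the mgwnbr package is installed; the guard with `requireNamespace()` is only there so the snippet does nothing when the package is missing.

```r
# Minimal sketch: inspect the georgia data set (assumes mgwnbr is installed).
if (requireNamespace("mgwnbr", quietly = TRUE)) {
  data(georgia, package = "mgwnbr")

  # Dimensions should match the documented 159 observations and 13 variables.
  print(dim(georgia))

  # Column names and storage types.
  str(georgia)

  # Range of the x and y coordinates, which the package example passes
  # as long = "X" and lat = "Y".
  print(summary(georgia[, c("X", "Y")]))
}
```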
Fits a geographically weighted regression model with different scales for each covariate. Uses the negative binomial distribution as default, but also accepts the normal, Poisson, or logistic distributions. Can fit the global versions of each regression and also the geographically weighted alternatives with only one scale, since they are all particular cases of the multiscale approach.
mgwnbr( data, formula, weight = NULL, lat, long, globalmin = TRUE, method, model = "negbin", mgwr = TRUE, bandwidth = "cv", offset = NULL, distancekm = FALSE, int = 50, h = NULL )
data |
name of the dataset. |
formula |
regression model formula as in |
weight |
name of the variable containing the sample weights, default value is NULL. |
lat |
name of the variable containing the latitudes in the dataset. |
long |
name of the variable containing the longitudes in the dataset. |
globalmin |
logical value indicating whether to find a global minimum in the optimization process, default value is TRUE. |
method |
indicates the method to be used for the bandwidth calculation ( |
model |
indicates the model to be used for the regression ( |
mgwr |
logical value indicating if multiscale should be used ( |
bandwidth |
indicates the criterion to be used for the bandwidth calculation ( |
offset |
name of the variable containing the offset values; if NULL, the offset is set to a vector of zeros. Default value is NULL. |
distancekm |
logical value indicating whether to calculate the distances in km, default value is FALSE. |
int |
integer indicating the number of iterations, default value is 50. |
h |
integer indicating a predetermined bandwidth value, default value is NULL. |
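To give a feel for what a bandwidth does in a geographically weighted model, the sketch below computes textbook adaptive bisquare weights for a single regression point. The value "adaptive_bsq" used with the method argument suggests a kernel of this family, but that reading is an assumption: the function `adaptive_bisquare()` and its arguments are hypothetical, written in base R for illustration, and are not the package's internal code.

```r
# Textbook adaptive bisquare weights for one regression point.
# Illustration only; mgwnbr's internal computation may differ.
adaptive_bisquare <- function(d, k) {
  # d: distances from the regression point to every observation
  # k: the distance to the k-th nearest observation is used as the
  #    local bandwidth h
  stopifnot(k >= 1, k <= length(d))
  h <- sort(d)[k]
  if (h == 0) {
    # Degenerate case: only observations at distance zero get weight.
    return(as.numeric(d == 0))
  }
  w <- (1 - (d / h)^2)^2   # bisquare kernel
  w[d >= h] <- 0           # observations at or beyond h get zero weight
  w
}

# Toy distances from a regression point to six observations.
d <- c(0, 1, 2, 3, 4, 5)
w <- adaptive_bisquare(d, k = 4)
print(round(w, 3))
```

Under this definition a larger bandwidth spreads weight over more neighbours and pushes the local fit toward the global model, while a smaller one makes the coefficients vary more across space.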
A list that contains:
general_bandwidth
- General bandwidth value.
band
- Bandwidth values for each covariate.
measures
- Goodness of fit statistics.
ENP
- Effective number of parameters.
mgwr_param_estimates
- MGWR parameter estimates.
qntls_mgwr_param_estimates
- Quantiles of MGWR parameter estimates.
descript_stats_mgwr_param_estimates
- Descriptive statistics of MGWR parameter estimates.
p_values
- P-values for the t tests on parameter significance.
t_critical
- Critical values for the t tests on parameter significance.
mgwr_se
- MGWR standard errors.
qntls_mgwr_se
- Quantiles of MGWR standard errors.
descript_stats_se
- Descriptive statistics of MGWR standard errors.
global_param_estimates
- Parameter estimates for the global model.
t_test_dfs
- Denominator degrees of freedom for the t tests.
global_measures
- Goodness of fit statistics for the global model.
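The components listed above can be explored with ordinary list tools. The sketch below assumes the mgwnbr package is installed and reuses the Gaussian, single-bandwidth settings from the package example; which components are populated for a given combination of arguments is not stated here, so the snippet lists the names first rather than assuming any particular element is present.

```r
# Minimal sketch: fit a model and look at what the returned list contains.
if (requireNamespace("mgwnbr", quietly = TRUE)) {
  library(mgwnbr)
  data(georgia)

  # Standardise the covariates, as in the package example.
  for (var in c("PctFB", "PctBlack")) {
    georgia[, var] <- as.data.frame(scale(georgia[, var]))
  }

  mod <- mgwnbr(data = georgia, formula = PctBach ~ PctBlack + PctFB,
                lat = "Y", long = "X", globalmin = FALSE,
                method = "adaptive_bsq", bandwidth = "cv",
                model = "gaussian", mgwr = FALSE, h = 136)

  # Names of the returned components.
  print(names(mod))

  # One-level overview of each component's type and size.
  str(mod, max.level = 1)

  # Individual components are accessed with `$`.
  print(mod$measures)
  print(mod$global_param_estimates)
}
```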
## Data
data(georgia)
for (var in c("PctFB", "PctBlack")){
  georgia[, var] <- as.data.frame(scale(georgia[, var]))
}

## Model
mod <- mgwnbr(data=georgia, formula=PctBach~PctBlack+PctFB, lat="Y", long="X",
              globalmin=FALSE, method="adaptive_bsq", bandwidth="cv",
              model="gaussian", mgwr=FALSE, h=136)

## Bandwidths
mod$general_bandwidth

## Goodness of fit measures
mod$measures