diff --git a/.nojekyll b/.nojekyll
index 4cc64dc..0f69e35 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-b2163873
\ No newline at end of file
+66235ba6
\ No newline at end of file
diff --git a/aGHQ.html b/aGHQ.html
index 38353f2..15bb886 100644
--- a/aGHQ.html
+++ b/aGHQ.html
@@ -507,12 +507,12 @@
As stated above, the meanunitdev function can be applied to the vectors, \({\mathbf{y}}\) and \({{\boldsymbol{\eta}}}\), via dot-vectorization to produce a Vector{NamedTuple}, which is the typical form of a row-table.
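The definition of meanunitdev is not part of this excerpt, so the following is only a minimal sketch of the pattern, assuming a Bernoulli response with a logit link and a return value with fields μ and dev (the dev field is the one summed over the row-table). The data vectors are made up for illustration.

```julia
# Hypothetical stand-in for meanunitdev (assumption: Bernoulli response, logit link).
# Returns the mean and the unit deviance for one observation as a NamedTuple.
function meanunitdev(y::Real, η::Real)
    μ = inv(one(η) + exp(-η))                      # inverse logit
    dev = -2 * (y * log(μ) + (1 - y) * log1p(-μ))  # Bernoulli unit deviance
    return (; μ, dev)
end

y = [0, 1, 1, 0]              # illustrative responses
η = [-1.0, 0.5, 2.0, 0.0]     # illustrative linear predictor values

# Dot-vectorization applies the scalar function elementwise,
# producing a Vector of NamedTuples, i.e. a row-table.
rowtbl = meanunitdev.(y, η)

total = sum(r.dev for r in rowtbl)   # deviance is the sum of the unit deviances
```

Because each element is a NamedTuple, a generator such as `r.dev for r in rowtbl` can pick out one field without allocating an intermediate column.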
@@ -530,23 +530,23 @@
sum(r.dev for r in rowtbl)
-
2411.194470806229
+
2411.1932567854824
@@ -624,12 +624,12 @@
β₀ = copy(com05fe.β)  # keep a copy of the initial values
These initial values of \({\boldsymbol{\beta}}\) are from a least squares fit of \({\mathbf{y}}\), converted from {0,1} coding to {-1,1} coding, on the model matrix, \({\mathbf{X}}\).
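A minimal sketch of that kind of starting estimate, assuming a small made-up model matrix X and response y (neither is from the book's data), could look like this:

```julia
using LinearAlgebra

# Illustrative data only: an intercept column and one covariate
X = [1.0 -1.0;
     1.0  0.0;
     1.0  1.0;
     1.0  2.0]
y = [0, 0, 1, 1]

ysigned = 2 .* y .- 1    # recode {0,1} as {-1,1}
β₀ = X \ ysigned         # least squares solution for a tall, full-rank X
```

The backslash operator solves the least squares problem directly, so the normal equations never need to be formed explicitly.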
@@ -637,14 +637,14 @@
deviance(setβ!(com05fe, βm05))
-
2411.194470806229
+
2411.1932567854824
For fairness in later comparisons we restore the initial values β₀ to the model. These are rough starting estimates with a deviance that is considerably greater than that at βm05.
The optimizer has determined a coefficient vector that reduces the deviance to 2409.38, at which point convergence was declared because changes in the objective are limited by round-off. This required about 500 evaluations of the deviance at candidate values of \({\boldsymbol{\beta}}\).
As with IRLS, PIRLS is a fast and stable algorithm for determining the mode of the conditional distribution \(({\mathcal{U}}|{\mathcal{Y}}={\mathbf{y}})\) with \({\boldsymbol{\theta}}\) and \({\boldsymbol{\beta}}\) held fixed.
BenchmarkTools.Trial: 4847 samples with 1 evaluation.
+ Range (min … max): 197.708 μs … 325.792 μs ┊ GC (min … max): 0.00% … 0.00%
+ Time (median): 202.834 μs ┊ GC (median): 0.00%
+ Time (mean ± σ): 204.467 μs ± 10.549 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
- ▁ ▇█
- ▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▁▁▂▂▂▁▂▁▂▂▁▁▂▂▂▂▁▂▁▁▁▁▄█▄▄▃▂▃▃▃▃▃██▇▆▃▃▅▄▃▂▂▂▂ ▃
- 200 μs Histogram: frequency by time 234 μs <
+ ▆▃ ▄ ▃█
+ ██▄▆█▄██▃▆▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▁▁▂▁▂▂▂▂▁▂▂▂▂▂▂▂▂▂▁▂▂ ▃
+ 198 μs Histogram: frequency by time 261 μs <
Memory estimate: 112 bytes, allocs estimate: 1.
@@ -1158,7 +1158,7 @@
laplaceapprox(pirls!(m))
-
2373.518052752164
+
2373.5180527521634
The remaining step is to optimize Laplace’s approximation to the GLMM deviance with respect to \(\theta\) and \({\boldsymbol{\beta}}\), which we do using the BOBYQA optimizer from NLopt.jl.
Converged to θ = 0.5683043594028967 and β =[-0.3409777149845993, 0.3933796201906975, 0.6064857599227369, -0.012926172564277872, 0.03323478854784157, -0.005626184982660486]
+
Converged to θ = 0.5683043987329829 and β =[-0.3409777185006387, 0.39337972485653033, 0.606485842218137, -0.012926166672796164, 0.03323478753359014, -0.0056261864970479246]
These estimates differ somewhat from those for model com05.
@@ -1219,7 +1219,7 @@
)
-
Estimates for com05: θ = 0.5761507901895634, fmin = 2353.8241980539815, and β =[-0.3414913998306781, 0.3936080536502067, 0.6064861079468472, -0.012911714642169572, 0.03321662487439253, -0.005625046845040066]
+
Estimates for com05: θ = 0.5761302648250215, fmin = 2353.8241975646893, and β =[-0.3414688923098128, 0.3935941398175495, 0.6064453377878877, -0.012909611190401555, 0.03321034920788019, -0.0056245886330953615]
The discrepancy in the results is because the com05 results are based on a more accurate approximation to the integral called adaptive Gauss-Hermite Quadrature, which is discussed in Section C.6.
As we see, these “correction terms” to Laplace’s approximation are small compared to the contributions to the objective from each component of \({\mathbf{u}}\). In this case the corrections are also all negative. Close examination of the individual curves in Figure C.5 shows that these curves, which are \(-2\log(f_j(z))\), are more-or-less odd functions, in the sense that the value at \(-z\) is approximately the negative of the value at \(z\). If we were integrating \(\log(f_j(z_j))\phi(z_j)\) with a normalized Gauss-Hermite rule, the negative and positive values would mostly cancel, and some of the integrals would be positive while others would be negative.
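The cancellation for odd integrands can be checked numerically. The sketch below builds a normalized Gauss-Hermite rule (weight function equal to the standard normal density) from the eigendecomposition of a symmetric tridiagonal matrix, which is the standard Golub-Welsch construction. The function name ghnorm and the test integrand sin are assumptions chosen for illustration, not necessarily what the text uses.

```julia
using LinearAlgebra

# Hypothetical helper: k-point normalized Gauss-Hermite rule.
# Nodes are the eigenvalues of the tridiagonal matrix with zero diagonal and
# off-diagonal sqrt(1), ..., sqrt(k-1); weights are the squared first
# components of the normalized eigenvectors.
function ghnorm(k::Integer)
    ev = eigen(SymTridiagonal(zeros(k), sqrt.(1:(k - 1))))
    return (; z = ev.values, w = abs2.(ev.vectors[1, :]))
end

rule = ghnorm(9)

total   = sum(rule.w)                     # weights sum to 1 (a probability rule)
secmom  = sum(rule.w .* abs2.(rule.z))    # second moment of N(0,1) is 1
oddpart = sum(rule.w .* sin.(rule.z))     # odd integrand: contributions cancel
```

The nodes are symmetric about zero with equal weights at \(\pm z\), so any exactly odd integrand sums to zero up to round-off; an approximately odd integrand leaves only a small remainder of either sign.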
AbstractVector{T} where T<:NamedTuple (alias for AbstractArray{T, 1} where T<:NamedTuple)
+
+
AbstractVector{T} where T<:NamedTuple (alias for AbstractArray{T, 1} where T<:NamedTuple)
+
The actual implementation of a row-table or column-table type may be different from these prototypes but it must provide access methods as if it were one of these types. Tables.jl provides the “glue” to treat a particular data table type as if it were row-oriented, by calling Tables.rows or Tables.rowtable on it, or column-oriented, by calling Tables.columntable on it.
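As a rough illustration of the two prototypes, the sketch below converts between them using only Base Julia and made-up data. It does not use Tables.jl, whose Tables.rowtable and Tables.columntable perform these conversions generically for any table type.

```julia
# Column-table prototype: a NamedTuple of equal-length vectors (illustrative data)
coltbl = (; dist = [1, 1, 2], urban = ["Y", "N", "N"])
colnames = keys(coltbl)

# Row-table prototype: a Vector of NamedTuples, one per row
rowtbl = [NamedTuple{colnames}(vals) for vals in zip(values(coltbl)...)]

# Back to column orientation: collect each field across the rows
roundtrip = NamedTuple{colnames}(map(k -> [r[k] for r in rowtbl], colnames))
```

Both forms carry the same information; the choice affects only whether iteration naturally proceeds by row or by column.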
@@ -343,7 +345,7 @@
TypedTables.jl
is a lightweight package (about 1500 lines of source code) that provides a concrete implementation of column-tables, called simply Table, as a NamedTuple of vectors.
A Table that is constructed from another type of column-table, such as an Arrow.Table or a DataFrame or an explicit NamedTuple of vectors, is simply a wrapper around the original table’s contents. On the other hand, constructing a Table from a row table first creates a ColumnTable, then wraps it.
-
contratbl = Table(contra)
+
contratbl = Table(contra)
Table with 5 columns and 1934 rows:
dist urban livch age use
@@ -369,7 +371,7 @@
Symbols, not strings, usually typed as a : followed by the name, as shown in
-
columnnames(contratbl)
+
columnnames(contratbl)
(:dist, :urban, :livch, :age, :use)
The : form for creating the Symbol requires that the column name be a valid variable name in Julia. If, for example, a column name contains a blank, the : form must be replaced by an expression like var"<name>", which invokes what is called a “string macro”.
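A small sketch of the difference, using a NamedTuple with made-up field names rather than a column of contratbl:

```julia
# A NamedTuple whose first field name contains a blank (illustrative names only)
row = NamedTuple{(Symbol("birth weight"), :age)}((3.2, 5))

a = row.age                        # ordinary name: plain field access works
b = row.var"birth weight"          # var"..." supplies a name that is not a valid identifier
c = row[Symbol("birth weight")]    # equivalent access, indexing by an explicit Symbol
```

The Symbol("...") constructor is the general way to build such a name at run time, while var"..." is convenient when the name is known when the code is written.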
Table with 5 columns and 1934 rows:
dist urban livch age use
@@ -716,8 +718,8 @@
The JuliaData organization manages the development of several packages related to data science and data management, including DataFrames.jl, a comprehensive system for working with column-oriented data tables in Julia. Kamiński (2023), written by the primary author of that package, provides an in-depth introduction to data science facilities, in particular the DataFrames package, in Julia.
This package is particularly well-suited to more advanced data manipulation such as the split-apply-combine strategy (Wickham, 2011) and “joins” of data tables.
Bouchet-Valat & Kamiński (2023) compares the performance of DataFrames.jl to other data frame implementations in R and Python.
-
This page was rendered from git revision 31e9b61
+
Notice that, although there are 60 distinct districts, there are only 102 distinct combinations of dist and urban represented in the data. In 15 of the 60 districts there are no rural women in the sample and in 3 districts there are no urban women in the sample, as shown in a frequency table
@@ -1307,23 +1307,23 @@
Table with 4 columns and 44 rows:
age ch urban η
┌─────────────────────────────
- 1 │ -10 false N -1.44276
- 2 │ -7 false N -1.29428
- 3 │ -4 false N -1.24704
- 4 │ -1 false N -1.30104
- 5 │ 2 false N -1.45629
- 6 │ 5 false N -1.71278
- 7 │ 8 false N -2.07051
- 8 │ 11 false N -2.52948
- 9 │ 14 false N -3.0897
- 10 │ 17 false N -3.75115
- 11 │ 20 false N -4.51385
- 12 │ -10 true N -0.894068
- 13 │ -7 true N -0.546316
- 14 │ -4 true N -0.299807
- 15 │ -1 true N -0.15454
+ 1 │ -10 false N -1.44277
+ 2 │ -7 false N -1.29427
+ 3 │ -4 false N -1.24702
+ 4 │ -1 false N -1.30101
+ 5 │ 2 false N -1.45624
+ 6 │ 5 false N -1.71271
+ 7 │ 8 false N -2.07043
+ 8 │ 11 false N -2.5294
+ 9 │ 14 false N -3.0896
+ 10 │ 17 false N -3.75105
+ 11 │ 20 false N -4.51374
+ 12 │ -10 true N -0.89408
+ 13 │ -7 true N -0.546324
+ 14 │ -4 true N -0.299812
+ 15 │ -1 true N -0.154542
16 │ 2 true N -0.110516
- 17 │ 5 true N -0.167734
+ 17 │ 5 true N -0.167733
⋮ │ ⋮ ⋮ ⋮ ⋮
@@ -1346,7 +1346,7 @@
-
+
Figure 6.3: Linear predictor versus centered age from model com05
@@ -1411,23 +1411,23 @@
Table with 5 columns and 44 rows:
age ch urban η μ
┌────────────────────────────────────────
- 1 │ -10 false N -1.44276 0.191118
- 2 │ -7 false N -1.29428 0.215129
- 3 │ -4 false N -1.24704 0.223213
- 4 │ -1 false N -1.30104 0.213989
- 5 │ 2 false N -1.45629 0.189035
- 6 │ 5 false N -1.71278 0.152804
- 7 │ 8 false N -2.07051 0.111996
- 8 │ 11 false N -2.52948 0.0738171
- 9 │ 14 false N -3.0897 0.0435342
- 10 │ 17 false N -3.75115 0.0229515
- 11 │ 20 false N -4.51385 0.0108374
- 12 │ -10 true N -0.894068 0.290271
- 13 │ -7 true N -0.546316 0.366719
- 14 │ -4 true N -0.299807 0.425605
- 15 │ -1 true N -0.15454 0.461442
+ 1 │ -10 false N -1.44277 0.191117
+ 2 │ -7 false N -1.29427 0.215131
+ 3 │ -4 false N -1.24702 0.223217
+ 4 │ -1 false N -1.30101 0.213996
+ 5 │ 2 false N -1.45624 0.189043
+ 6 │ 5 false N -1.71271 0.152812
+ 7 │ 8 false N -2.07043 0.112004
+ 8 │ 11 false N -2.5294 0.073823
+ 9 │ 14 false N -3.0896 0.0435383
+ 10 │ 17 false N -3.75105 0.0229538
+ 11 │ 20 false N -4.51374 0.0108386
+ 12 │ -10 true N -0.89408 0.290269
+ 13 │ -7 true N -0.546324 0.366718
+ 14 │ -4 true N -0.299812 0.425604
+ 15 │ -1 true N -0.154542 0.461441
16 │ 2 true N -0.110516 0.472399
- 17 │ 5 true N -0.167734 0.458164
+ 17 │ 5 true N -0.167733 0.458165
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮
@@ -1450,7 +1450,7 @@
-
+
Figure 6.5: Predicted probability of contraception use versus centered age from model com05.
@@ -1466,10 +1466,10 @@
Table with 5 columns and 4 rows:
age ch urban η μ
┌───────────────────────────────────────
- 1 │ 2 false N -1.45629 0.189035
+ 1 │ 2 false N -1.45624 0.189043
2 │ 2 true N -0.110516 0.472399
- 3 │ 2 false Y -0.669101 0.338698
- 4 │ 2 true Y 0.676673 0.662996
+ 3 │ 2 false Y -0.669052 0.338709
+ 4 │ 2 true Y 0.676672 0.662995
The predicted probability of a woman with a centered age of 2, with children, living in an urban environment using artificial contraception is about 2/3, which is reasonably close to the smoothed frequency for that combination of covariates in Figure 6.2.
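The μ column in the table above is consistent with applying the inverse logit to the η column. A minimal check, assuming a logit link and using the updated η values as printed (so agreement is limited to the printed precision), is:

```julia
# Inverse of the logit link: maps a linear predictor η to a probability μ
logistic(η) = inv(one(η) + exp(-η))

η = [-1.45624, -0.110516, -0.669052, 0.676672]   # η column, as printed in the table
μ = logistic.(η)                                 # compare with the printed μ column
```

The last element, for an urban woman with children, is the value of about 2/3 quoted in the text.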
@@ -1487,7 +1487,7 @@
-
+
Figure 6.6: Caterpillar plot of the conditional modes of the random-effects for model com05
@@ -1506,12 +1506,12 @@