jssp01.Rmd

---
title: "乱数による分布の近似"
---

# パッケージの装備 

```{r libraries}
# データ整形汎用パッケージ
library(tidyverse)
# MCMC乱数発生器stanをRからつかうパッケージ
library(rstan)
# rstanを並列で使うオプション
options(mc.cores = parallel::detectCores())
# 変更なしの実行ファイルは保存しておくオプション
rstan_options(auto_write = TRUE)
# データ要約・可視化パッケージ
library(summarytools)
# 複数のグラフを並べて表示するパッケージ
library(gridExtra)
library(GGally)
# ベイズモデル比較指標計算パッケージ
library(loo)
# ベイズモデルの結果を可視化するパッケージ
library(bayesplot)
# 描画の際に文字化けするMacユーザは次の行のコメントアウトをとって実行する
old = theme_set(theme_gray(base_family = "HiraKakuProN-W3"))
```

# 乱数による近似

```{r rnorm}
# 数値例を発生
set.seed(12345)
# 標準正規分布から発生する100個の乱数をつくってみる
x100 <- rnorm(100,0,1)
# 一部表示
head(x100)
```
```{r mean the random}
mean(x100)        # 平均値
```

```{r var the random}
var(x100)         # 分散
```

```{r sd the random}
sd(x100)          # 標準偏差
```

```{r max the random}
max(x100)         # 最大値
```

```{r min the random}
min(x100)         # 最小値
```

```{r median the random}
median(x100)      # 中央値
```

```{r parcentile the random}
# パーセンタイル
# 0%, 25%, 50%, 75%, 100%
quantile(x100,probs=c(0,0.25,0.5,0.75,1))
```

## 毎回答えが違う

```{r each randoms}
# 標準正規乱数1
x100.1 <- rnorm(100,0,1)
# 標準正規乱数2
x100.2 <- rnorm(100,0,1)
# 標準正規乱数3
x100.3 <- rnorm(100,0,1)

# それぞれの平均値
mean(x100.1)
mean(x100.2)
mean(x100.3)
```

## サンプルサイズを増やすと理論値に近づく

```{r}
# 1000サンプル
x1000 <-  rnorm(1000,0,1)
mean(x1000)
```

```{r}
# 10000サンプル
x10000 <- rnorm(10000,0,1)
mean(x10000)
```

```{r}
# 100000サンプル
x100000 <-  rnorm(100000,0,1)
mean(x100000)
```


## 確率点

```{r}
# quantile関数でサンプルのパーセンタイル点を算出
quantile(x100000,probs=c(0,0.25,0.33,0.75,1))
# 標準正規分布の理論的q点
qnorm(0.25,0,1)
qnorm(0.33,0,1)
qnorm(0.75,0,1)
```


## ある数字よりも大きく(小さく)なる確率

```{r}
length(x100000[x100000<1.96])/length(x100000)
pnorm(1.96,0,1)
```

## 可視化

```{r visualize}
# データをデータフレームにまとめる
data.frame(class=c(rep(1,NROW(x100)),
                   rep(2,NROW(x1000)),
                   rep(3,NROW(x10000)),
                   rep(4,NROW(x100000))),
           value=c(x100,x1000,x10000,x100000)) %>%
  # グループ名を作る変数を作成
  mutate(class=as.factor(class)) %>% 
  # 作図。x軸は値。グループごとに分けたヒストグラム
  ggplot(aes(x=value))+geom_histogram(binwidth = 0.1)+xlim(-4,4)+
  facet_wrap(~class,scales = "free")

```

## 積分も簡単

```{r integrals}
# 該当する列数/総列数
NROW(x100000[x100000>0 & x100000 <1])/NROW(x100000)

data.frame(val=x100000) %>% mutate(itg=ifelse(val>0&val<1,1,2)) %>% 
  mutate(itg=as.factor(itg)) %>% 
  ggplot(aes(x=val,fill=itg))+geom_histogram(binwidth = 0.01) + 
  xlim(-4,4) +theme(legend.position = "none")
```