If, after working through this chapter, the numpyro install is unsuccessful on your computer, use causact on Posit Cloud (enable 4GB of RAM during the install) to continue working through this book. The install instructions are well-tested on Posit Cloud and known to work.
The previous two chapters covered Bayes rule and generative DAGs. In this chapter, you install the software that gets the computer to apply Bayes rule to the generative DAGs you make, sparing you from those awful computations. Specifically, you will use:
- The causact R package (Fleischhacker and Nguyen 2022) to visually depict generative DAGs, define your mathematical/computational models, and transparently interface with
- the numpyro Python package (Phan, Pradhan, and Jankowiak 2019) that computationally automates Bayes rule and gives representative samples of posterior distributions, and
- the arviz and xarray Python packages to manipulate the numpyro output for easy and seamless export back to your R ecosystem.
Fleischhacker, Adam J., and Thi Hong Nhung Nguyen. 2022. “Generative DAGs as an Interface into Probabilistic Programming with the r Package Causact.” The Journal of Open Source Software 7 (76): 4415. https://doi.org/10.21105/joss.04415.
Phan, Du, Neeraj Pradhan, and Martin Jankowiak. 2019. “Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro.” arXiv Preprint arXiv:1912.11554.
The instructions below will help you set up your computer environment to get causact and its dependencies running smoothly. While Python is required, these instructions do not assume any previous installation of Python; they assume only that you have a recent version of R/RStudio (R version 4.1.X or higher) and a machine capable of running numpyro. If your machine does not meet the system requirements in the margin, use Posit Cloud to continue your learning in this and subsequent chapters.
numpyro System Requirements:
- Ubuntu 20.04 or later (64-bit)
- macOS 10.11 or later (64-bit)
- Windows 7 or later (64-bit) (Python 3 only)
Use Posit Cloud to run RStudio if your system does not meet the above requirements.
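Two of these assumptions can be checked from the R console before you start. The following is a minimal sketch using only base R; the variable names are illustrative, and the check does not cover your operating system version, which you will need to confirm separately.

```r
## Sketch: check the requirements that R itself can report.
r_is_recent <- getRversion() >= "4.1.0"       # TRUE if R is version 4.1.0 or newer
is_64_bit   <- .Machine$sizeof.pointer == 8   # TRUE on a 64-bit build of R

if (r_is_recent && is_64_bit) {
  message("R version and architecture look fine for a local install.")
} else {
  message("Consider updating R or using Posit Cloud for this book.")
}
```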
The script below installs causact and numpyro on your local machine or in Posit Cloud. It allows you to complete the installation process without ever leaving RStudio. Run each line one at a time, awaiting the system’s response before continuing; you only need to do this once.
Note: If installing on Posit Cloud, temporarily raise your project’s RAM to 4GB for the installation (remember to APPLY CHANGES). This preemptive measure helps avoid the error Error creating conda environment [exit code 137]. After installation, feel free to revert the setting to 1GB of RAM.
## INSTALLATION SCRIPT TO GET CAUSACT,
## and NUMPYRO WORKING TOGETHER HAPPILY
## NOTE: Run each line one at a time using CTRL+ENTER.
## Await completion of one line
## before running the next.
## If prompted to "Restart R", say YES.
#### STEP 0: Restart R in a Clean Session
#### use RStudio menu: SESSION -> RESTART R
#### STEP 1: INSTALL latest R PACKAGE
install.packages("causact")
#### STEP 2: INSTALL PYTHON DEPENDENCIES IN FINDABLE SPOT
causact::install_causact_deps()
## if asked to install miniconda, please type "Y"
## and hit <ENTER> in the Console
## this can take up to 10 minutes
#### STEP 3: TEST THE INSTALLATION
library(causact)
graph = dag_create() %>%
  dag_node("Normal RV",
           rhs = normal(0,10))
graph %>% dag_render() ## see oval
drawsDF = graph %>% dag_numpyro() ## see "sample:..."
drawsDF %>% dagp_plot(densityPlot = TRUE) ## see plot
#### CONGRATS IF IT WORKS.
If the above script produced a plot in the last line - CONGRATS - move on to the next chapter!! If not, see the next section for troubleshooting tips.
Note: The September 11, 2023 release of reticulate (v1.32) caused an issue that gives a TypeError: the first argument must be callable error when using dag_numpyro() on Windows. If you experience this, install the dev version of reticulate by following the steps below:
- Install Rtools using the installer at: https://cran.r-project.org/bin/windows/Rtools/
- Install the pak package.
- Install the dev version of reticulate.
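The last two steps can be run from the R console. The commands below are a sketch, assuming that pak is installed from CRAN and that the development version of reticulate is taken from the rstudio/reticulate repository on GitHub; run them one line at a time and restart R afterwards.

```r
## Sketch of the two install steps (assumed commands; run one line at a time).
install.packages("pak")          ## installs the pak package from CRAN
pak::pak("rstudio/reticulate")   ## installs the dev version of reticulate from GitHub
## then use RStudio menu: SESSION -> RESTART R
```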
There are three pieces that need to be in place for the installation to be successful:
- A conda installation that is accessible from R.
- A conda environment called r-causact set up with numpyro, arviz, jaxlib, and xarray.
- An R session that can connect to the r-causact Python environment.
Posit Cloud (https://posit.cloud/) is useful for those with Chromebooks or computers that seem underpowered for modern analytics. The install script requires 4GB of RAM on Posit Cloud. You will need either a paid plan or access to your instructor’s supplied workspace. If you have a laptop that can handle it, then I recommend sticking to using your locally-installed RStudio.
To ensure that conda is available from R, use the reticulate package from R:
If Needed, Install Reticulate: reticulate is likely already on your system. Run library(reticulate) to access its functions. If the reticulate package is unavailable, install it by running install.packages("reticulate") in the R console.
Test That R Can Find Conda: Use the reticulate::conda_binary() function to find the path to the conda executable. If no executable is found, the function returns NULL (some versions of reticulate may stop with an error instead). In either case, use the reticulate::install_miniconda() function to install Miniconda directly from R. This function installs a private copy of Miniconda for your R session. Pay attention to the messages to ensure the install is successful. If it is not, search the web for your error message to find possible solutions.
After each attempt at fixing your installation, restart R and then run reticulate::conda_binary() to verify whether conda can be found by R. If you still have issues, you can seek further help by filing an issue at https://github.com/flyaflya/causact/issues.
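The check described above can be written as a short snippet. This is a sketch rather than part of the official install script: the variable name conda_path is illustrative, and the tryCatch() wrapper is an assumption made so that a failed search is handled the same way whether it surfaces as NULL or as an error.

```r
## Sketch: look for conda and report the next step if it is missing.
## tryCatch() turns an error from a failed search into NULL.
conda_path <- tryCatch(reticulate::conda_binary(),
                       error = function(e) NULL)

if (is.null(conda_path)) {
  message("conda not found; run reticulate::install_miniconda(), then restart R.")
} else {
  message("conda found at: ", conda_path)
}
```

Rerunning this snippet after each restart gives a quick yes/no answer before you move on to creating the environment.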
If R is able to find conda, the next step is to create the appropriate R-accessible Python environment. By convention for causact, this environment must be named r-causact. It can be created with the following code in R:
reticulate::py_install(
  packages = c("numpyro[cpu]==0.16.1",
               "arviz==0.20.0",
               "pandas==2.2.2"),
  envname = "r-causact",
  method = "conda",
  python_version = "3.11",
  pip = TRUE
)
Pay attention to messages during the install to aid your debugging. You can seek further help by filing an issue at https://github.com/flyaflya/causact/issues.
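To confirm that the environment now exists, you can list the conda environments that R can see. The snippet below is a minimal sketch; env_names and has_env are illustrative names, and the tryCatch() wrapper is an assumption that treats a failed conda lookup as "no environments found".

```r
## Sketch: check whether a conda environment named r-causact is visible from R.
env_names <- tryCatch(reticulate::conda_list()$name,
                      error = function(e) character(0))
has_env <- "r-causact" %in% env_names

if (has_env) {
  message("Found the r-causact environment.")
} else {
  message("r-causact not found; rerun the py_install() step and check its messages.")
}
```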
When R connects to Python, the connection ties R to a specific Python environment. We need to ensure that environment is r-causact. To clear any current connection to Python, please restart your R session. If in RStudio, use the menu: SESSION -> RESTART R.
After restarting, run:
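One way to make and check this connection is sketched below, assuming the environment was created under the name r-causact as described earlier. It uses standard reticulate functions; the exact commands the author intended may differ.

```r
## Sketch: bind this R session to r-causact and confirm numpyro is importable.
reticulate::use_condaenv("r-causact", required = TRUE)  ## errors if the env is missing
reticulate::py_config()                                 ## shows the Python that R is using
reticulate::py_module_available("numpyro")              ## TRUE if numpyro can be imported
```

The path printed by py_config() should point inside the r-causact environment; if it points elsewhere, restart R and run these lines before loading any other package that uses Python.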
If this runs without error, then you should be good to use causact after running library(causact). If you get an error that you cannot debug, feel free to file an issue at https://github.com/flyaflya/causact/issues.