[Developer Issue]: Why not RTMB? #554

Andrea-Havron-NOAA · 2024-03-11T23:34:43Z

Issue details

RTMB is a wrapper interface that allows models written in R to be directly passed to TMB. Users no longer need to write models in C++ in order to access the TMB statistical inference engine, but can write them directly in R. Both RTMB models and original TMB models are translated into the same C++ computational graphs for statistical inferece.

The question, why not write FIMS in R and use RTMB? has come up several times. It is important to document this decision point and get feedback from the implementation team. A final version of the decision points should be made available to the community similar to issue NOAA-FIMS/noaa-fims.github.io#2.

I've come up with the following reasons:

FIMS started development before RTMB and would involve a cost to switch over from C++ to R.
FIMS aims to be portable and writing in C++ allows greater portability with platforms outside R (python - primary machine learning platform)
It is easier to introduce unintended bugs in R given it is a dynamic language and implements implicit conversion (R changes things on you without you realizing it).

There is a tradeoff between portability and community accessibility, both of which are stated goals of the FIMS project.

nathanvaughan-NOAA · 2024-03-11T23:42:10Z

I don't have an answer to the question but this made me think is it possible that we could use RTMB to allow user-specified R functions to be incorporated into a FIMS model such as a selective function? That could be a really useful use case for avoiding adding multitudes of function variations to base FIMS while allowing users flexibility?

JaneSullivan-NOAA · 2024-03-12T00:11:43Z

Pro RTMB:

NO compilation = fast, fun, interactive development
NO C++ or templated C++ = a concise code base
NO Rcpp
Only one line of code to gain TMB's simulation and OSA functionality (though I recognize there is still some development that needs to happen in order for this to be fully operational)
never leave the R environment
we can still have modularity
we can all contribute and use our existing expertise as R programmers and stock assessment scientists

JaneSullivan-NOAA · 2024-03-12T00:19:02Z

I don't have an answer to the question but this made me think is it possible that we could use RTMB to allow user-specified R functions to be incorporated into a FIMS model such as a selective function? That could be a really useful use case for avoiding adding multitudes of function variations to base FIMS while allowing users flexibility?

@nathanvaughan-NOAA to me this is the only "easy" thing to actually do in FIMS, so I don't see the benefit of creating a way to do this in R (PS it seems Matthew has already done something similar). The hard part of FIMS is the core architecture, the added dimensions, the Rcpp interface, connecting it to TMB, etc.

JaneSullivan-NOAA · 2024-03-12T00:21:47Z

It is easier to introduce unintended bugs in R given it is a dynamic language and implements implicit conversion (R changes things on you without you realizing it).
@Andrea-Havron-NOAA I think a lot of work has been done to prevent this. Can you provide an example?

nathanvaughan-NOAA · 2024-03-12T00:32:57Z

Great points @JaneSullivan-NOAA !! and thanks for the link, that is exactly what I was thinking.

msupernaw · 2024-03-12T13:53:38Z

R is a untyped programming language. In a untyped language, a type for a variable can change at any time. For example, a variable defined as a list in R can easily be redefined as any other type, such as a vector, matrix, S4 object, etc and no warning will be given. I believe this is the kind of unintended bug that Andrea may have been talking about. For example: library(FIMS) a<-new(Parameter) a<-3.1459 This will work in R, but would fail in any typed language because "a" is already defined. In the above example, if "a" was initially an independent variable (Parameter) in a model, it was just redefined as a real value without any warning from R and any use of "a" after the redefinition would result in a derivative chain missing important ad joint derivatives. This would in turn wrongly inform any quasi-newton minimizer and likely result in a wrong solution. This is the kind of unintended bug that can not be prevented in a untyped programming language.

kellijohnson-NOAA · 2024-03-12T14:13:42Z

Thanks @Andrea-Havron-NOAA for starting this. I agree that bullet (1) is true but I do not know if I see that as a limitation to not changing. I would like to think that if there was any kind of software that suited FIMS better than what we are currently using that we would put in the effort to make the switch. I am not saying that RTMB is better just that if it was I would think that especially at this stage of development that we would put in the effort to switch.

JaneSullivan-NOAA · 2024-03-12T16:04:36Z

R is a untyped programming language. In a untyped language, a type for a variable can change at any time. For example, a variable defined as a list in R can easily be redefined as any other type, such as a vector, matrix, S4 object, etc and no warning will be given. I believe this is the kind of unintended bug

@msupernaw my point is that this is an R issue and not an RTMB issue, at least not to my knowledge. RTMB does the typing for the user ("typing" - is that how you say it?). Specifically, before we identify this as a "problem" in the "why not RTMB" column, I think we should have at least one example of it actually being a problem in RTMB.

The closest thing I know of is that RTMB does not currently accept sparse matrix classes, so you have to as.vector() them inside a likelihood (example from Sean Anderson). This is a different issue though.

ChristineStawitz-NOAA · 2024-03-12T17:58:05Z

One other consideration is if using RTMB reduces portability, increasing our dependence on TMB which is externally developed and has no designated successor to maintain the code base. I know the user base is large enough there is hope the community could generate support, but it's a consideration IMO.

Andrea-Havron-NOAA · 2024-03-12T22:05:21Z

@msupernaw my point is that this is an R issue and not an RTMB issue, at least not to my knowledge. RTMB does the typing for the user ("typing" - is that how you say it?). Specifically, before we identify this as a "problem" in the "why not RTMB" column, I think we should have at least one example of it actually being a problem in RTMB.

The closest thing I know of is that RTMB does not currently accept sparse matrix classes, so you have to as.vector() them inside a likelihood (example from Sean Anderson). This is a different issue though.

Thanks for raising this point Jane. RTMB, though, is just an R interface for TMB. RTMB specific functions are written in C++ and exposed to R via Rcpp. The objective function gets coded up in R. Any non-RTMB specific function (for example in FIMS, the population loop and code written to calculate expected/derived values) would be written in R.

Here is an example from RTMB's mvrw.R where I introduced an error by dropping the indices on the sd vector when calculating the covariance matrix.

Correct code:

cov <- outer(1:stateDim,
                 1:stateDim,
                 function(i,j) rho^(abs(i-j)) * sds[i] * sds[j])
>  cov 
       [,1]   [,2] [,3]
[1,] 0.2500 0.5625 0.81
[2,] 0.5625 1.5625 2.25
[3,] 0.8100 2.2500 4.00

Code with missing indices:

cov <- outer(1:stateDim,
                 1:stateDim,
                 function(i,j) rho^(abs(i-j)) * sds * sds)
> cov
        [,1]   [,2]    [,3]
[1,] 0.25000 0.2250 0.20250
[2,] 1.40625 1.5625 1.40625
[3,] 3.24000 3.6000 4.00000

I compiled the TMB equivalent example, mvrw.cpp and it failed with 'invalid cast from type' error due to trying to multiply a scalar times a vector,

nathanvaughan-NOAA · 2024-03-12T22:42:16Z

@Andrea-Havron-NOAA and @JaneSullivan-NOAA. That is a good example of how R can do wild things when it makes incorrect assumptions. Are there any important cases where this could happen with RTMB due to a user error that wouldn't happen if we use C++ base code? I assume that we will just need to ensure this kind of error doesn't occur in developer written FIMS code through tests and case studies. I also understand @ChristineStawitz-NOAA's desire for portability due to unknown future support for TMB but we should also consider the potential for future NOAA support of FIMS if we limit accessibility to only/largely C++ which has already resulted in significant issues within our development team. Coding FIMS in all R would substantially increase current and future participation in development within both our development team and the rest of NOAA. Using SS3 as an example and almost 100% reliance on @Rick-Methot-NOAA for updates/support it seems that future FIMS development support may be much more precarious/limited than TMB support given the more limited user base? I also agree with @kellijohnson-NOAA that we should not consider porting over to RTMB as a negative at this point given the significant refactoring and ongoing future development that is already planned. That may have been a little all over the place so in summary.

Would moving to RTMB add new user-induced bugs that cannot be caught in our code, but would be with C++? If yes then that is a significant concern as silent errors that give plausible but wrong results are a huge liability in production assessments particularly during on the fly revisions that are often requested in the SE and I assume elsewhere.
If no to 1 then this just comes down to a tradeoff between community/developer accessibility vs future portability with the additional issue that FIMS would only remain portable as long as there was developer support to do the porting. We don't want to end up in an SS type situation where only a small group of people are golden handcuffed to a lifetime of development support.

JaneSullivan-NOAA · 2024-03-13T03:14:54Z

@Andrea-Havron-NOAA great example! Thank you :) I'll just go shut up now.

Andrea-Havron-NOAA · 2024-03-13T18:34:31Z

Thanks everyone for all the great feedback. I think any platform we develop needs to be somewhat independent of TMB. I see two main potential future use cases that would not depend on TMB:

FIMS as an operating model for MSEs/ecological/economic models
Machine Learning if/once ML outperforms statistical inference

JonBrodziak · 2024-03-13T19:18:59Z

In future assessment modeling situations, Brieman’s two cultures of data modeling and algorithmic modeling may both need to be considered to produce the best predictive accuracy (Breiman 2001). This contrasts with the long-term fisheries assessment modeling practice of deploying data models with parameter sets to be estimated in contrast to machine learning algorithms to make probabilistic classifications of stock status for management action. Optimization methods for assessments formulated as data models typically use gradient descent and assume differentiable objective functions. Other applicable but less common optimization approaches include stochastic search algorithms based on Monte Carlo simulation or generic hill-climbing algorithm based on simulated annealing or particle swarm optimizers for frequentist or Markov Chain Monte Carlo or Hamiltonian Markov Chain algorithms for Bayesian sampling from the target posterior distribution. In comparison, there are few instances of algorithmic or machine learning models that have been applied in marine science to date (Malde et al. 2020). However, more frequent future applications of machine learning for assessment purposes can be expected (Beyan and Brouwer 2020, Sartar et al. 2021) based on predictive accuracies being obtained for species identification (e.g., Smolinski et al. 2019, Allken et al. 2021), forecasting environmental conditions (e.g., Ye et al. 2021) forecasting recruitment or spawning biomass (Smolinksi 2018, Lüdtke and Pierce 2023) or setting prior distributions for probable stock size (Froese et al. 2023). While there is no certainty about the future needs for stock assessment modeling, an important future challenge is to ensure that any next-generation assessment platform be adaptive to potential changes from a data model-based to a more algorithmic modeling culture.

Cole-Monnahan-NOAA · 2024-03-13T23:07:56Z

@JonBrodziak can you give a little more context here? Are you saying that we should build FIMS to do traditional stats and ML equally well? How does this interact with RTMB vs C++?

FYI the key paper cited is here.

ChristineStawitz-NOAA · 2024-03-13T23:27:49Z

Happy to have this thread continue our discussion, but if you do have a specific pro or con or suggested action I would like to capture them in this doc.

JonBrodziak · 2024-03-14T22:10:18Z

Thanks for posting the link to Brieman (2001). Here is the context. FIMS is a general modeling system to provide BSIA and decision support for fisheries management. FIMS has been envisioned and set up a priori to build integrated assessment models. My point is that FIMS software infrastructure should be designed to be extensible enough to readily integrate ML-based decision support tools because they are inevitable.

ChristineStawitz-NOAA · 2024-04-16T17:47:58Z

@JaneSullivan-NOAA is going to start prototyping an age-length structured approach in RTMB which will help inform this decision

JaneSullivan-NOAA · 2024-04-16T18:43:14Z

Nathan and I have some meeting times set May 13-17 on the FIMS calendar if anyone else wants to join.

Andrea-Havron-NOAA · 2024-07-23T19:03:10Z

It's possible we may be able to pass a FIMS MakeADFun object to RTMB. FIMS would need to be compiled with TMBad to get this to work, which is scheduled for the quality milestone after M2. Benefits of passing to RTMB would be accessing visualization tools for the tape, although the usefulness of this is questionable given how complicated the tape will likely be.

Andrea-Havron-NOAA added the status: triage_needed This is not approved for this milestone, do not work on it yet label Mar 11, 2024

Andrea-Havron-NOAA assigned ChristineStawitz-NOAA Mar 11, 2024

ChristineStawitz-NOAA added theme: more features and removed status: triage_needed This is not approved for this milestone, do not work on it yet labels Apr 16, 2024

ChristineStawitz-NOAA assigned JaneSullivan-NOAA and unassigned ChristineStawitz-NOAA Apr 16, 2024

ChristineStawitz-NOAA added this to the Parking lot milestone May 20, 2024

kellijohnson-NOAA modified the milestones: Parking lot, 2 Jun 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Developer Issue]: Why not RTMB? #554

[Developer Issue]: Why not RTMB? #554

Andrea-Havron-NOAA commented Mar 11, 2024

nathanvaughan-NOAA commented Mar 11, 2024

JaneSullivan-NOAA commented Mar 12, 2024

JaneSullivan-NOAA commented Mar 12, 2024

JaneSullivan-NOAA commented Mar 12, 2024

nathanvaughan-NOAA commented Mar 12, 2024

msupernaw commented Mar 12, 2024 via email •

edited by kellijohnson-NOAA

Loading

kellijohnson-NOAA commented Mar 12, 2024

JaneSullivan-NOAA commented Mar 12, 2024 •

edited by kellijohnson-NOAA

Loading

ChristineStawitz-NOAA commented Mar 12, 2024

Andrea-Havron-NOAA commented Mar 12, 2024 •

edited by kellijohnson-NOAA

Loading

nathanvaughan-NOAA commented Mar 12, 2024 •

edited by kellijohnson-NOAA

Loading

JaneSullivan-NOAA commented Mar 13, 2024

Andrea-Havron-NOAA commented Mar 13, 2024

JonBrodziak commented Mar 13, 2024

Cole-Monnahan-NOAA commented Mar 13, 2024

ChristineStawitz-NOAA commented Mar 13, 2024

JonBrodziak commented Mar 14, 2024 via email •

edited by kellijohnson-NOAA

Loading

ChristineStawitz-NOAA commented Apr 16, 2024

JaneSullivan-NOAA commented Apr 16, 2024 •

edited by kellijohnson-NOAA

Loading

Andrea-Havron-NOAA commented Jul 23, 2024

[Developer Issue]: Why not RTMB? #554

[Developer Issue]: Why not RTMB? #554

Comments

Andrea-Havron-NOAA commented Mar 11, 2024

Issue details

nathanvaughan-NOAA commented Mar 11, 2024

JaneSullivan-NOAA commented Mar 12, 2024

JaneSullivan-NOAA commented Mar 12, 2024

JaneSullivan-NOAA commented Mar 12, 2024

nathanvaughan-NOAA commented Mar 12, 2024

msupernaw commented Mar 12, 2024 via email • edited by kellijohnson-NOAA Loading

kellijohnson-NOAA commented Mar 12, 2024

JaneSullivan-NOAA commented Mar 12, 2024 • edited by kellijohnson-NOAA Loading

ChristineStawitz-NOAA commented Mar 12, 2024

Andrea-Havron-NOAA commented Mar 12, 2024 • edited by kellijohnson-NOAA Loading

nathanvaughan-NOAA commented Mar 12, 2024 • edited by kellijohnson-NOAA Loading

JaneSullivan-NOAA commented Mar 13, 2024

Andrea-Havron-NOAA commented Mar 13, 2024

JonBrodziak commented Mar 13, 2024

Cole-Monnahan-NOAA commented Mar 13, 2024

ChristineStawitz-NOAA commented Mar 13, 2024

JonBrodziak commented Mar 14, 2024 via email • edited by kellijohnson-NOAA Loading

ChristineStawitz-NOAA commented Apr 16, 2024

JaneSullivan-NOAA commented Apr 16, 2024 • edited by kellijohnson-NOAA Loading

Andrea-Havron-NOAA commented Jul 23, 2024

msupernaw commented Mar 12, 2024 via email •

edited by kellijohnson-NOAA

Loading

JaneSullivan-NOAA commented Mar 12, 2024 •

edited by kellijohnson-NOAA

Loading

Andrea-Havron-NOAA commented Mar 12, 2024 •

edited by kellijohnson-NOAA

Loading

nathanvaughan-NOAA commented Mar 12, 2024 •

edited by kellijohnson-NOAA

Loading

JonBrodziak commented Mar 14, 2024 via email •

edited by kellijohnson-NOAA

Loading

JaneSullivan-NOAA commented Apr 16, 2024 •

edited by kellijohnson-NOAA

Loading