-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Developer Issue]: Why not RTMB? #554
Comments
I don't have an answer to the question but this made me think is it possible that we could use RTMB to allow user-specified R functions to be incorporated into a FIMS model such as a selective function? That could be a really useful use case for avoiding adding multitudes of function variations to base FIMS while allowing users flexibility? |
Pro RTMB:
|
@nathanvaughan-NOAA to me this is the only "easy" thing to actually do in FIMS, so I don't see the benefit of creating a way to do this in R (PS it seems Matthew has already done something similar). The hard part of FIMS is the core architecture, the added dimensions, the Rcpp interface, connecting it to TMB, etc. |
|
Great points @JaneSullivan-NOAA !! and thanks for the link, that is exactly what I was thinking. |
R is a untyped programming language. In a untyped language, a type for a variable can change at any time. For example, a variable defined as a list in R can easily be redefined as any other type, such as a vector, matrix, S4 object, etc and no warning will be given. I believe this is the kind of unintended bug that Andrea may have been talking about.
For example:
library(FIMS)
a<-new(Parameter)
a<-3.1459
This will work in R, but would fail in any typed language because "a" is already defined. In the above example, if "a" was initially an independent variable (Parameter) in a model, it was just redefined as a real value without any warning from R and any use of "a" after the redefinition would result in a derivative chain missing important ad joint derivatives. This would in turn wrongly inform any quasi-newton minimizer and likely result in a wrong solution. This is the kind of unintended bug
that can not be prevented in a untyped programming language.
|
Thanks @Andrea-Havron-NOAA for starting this. I agree that bullet (1) is true but I do not know if I see that as a limitation to not changing. I would like to think that if there was any kind of software that suited FIMS better than what we are currently using that we would put in the effort to make the switch. I am not saying that RTMB is better just that if it was I would think that especially at this stage of development that we would put in the effort to switch. |
@msupernaw my point is that this is an R issue and not an RTMB issue, at least not to my knowledge. RTMB does the typing for the user ("typing" - is that how you say it?). Specifically, before we identify this as a "problem" in the "why not RTMB" column, I think we should have at least one example of it actually being a problem in RTMB. The closest thing I know of is that RTMB does not currently accept sparse matrix classes, so you have to |
One other consideration is if using RTMB reduces portability, increasing our dependence on TMB which is externally developed and has no designated successor to maintain the code base. I know the user base is large enough there is hope the community could generate support, but it's a consideration IMO. |
Thanks for raising this point Jane. RTMB, though, is just an R interface for TMB. RTMB specific functions are written in C++ and exposed to R via Rcpp. The objective function gets coded up in R. Any non-RTMB specific function (for example in FIMS, the population loop and code written to calculate expected/derived values) would be written in R. Here is an example from RTMB's mvrw.R where I introduced an error by dropping the indices on the sd vector when calculating the covariance matrix. Correct code:
Code with missing indices:
I compiled the TMB equivalent example, mvrw.cpp and it failed with 'invalid cast from type' error due to trying to multiply a scalar times a vector, |
@Andrea-Havron-NOAA and @JaneSullivan-NOAA. That is a good example of how R can do wild things when it makes incorrect assumptions. Are there any important cases where this could happen with RTMB due to a user error that wouldn't happen if we use C++ base code? I assume that we will just need to ensure this kind of error doesn't occur in developer written FIMS code through tests and case studies. I also understand @ChristineStawitz-NOAA's desire for portability due to unknown future support for TMB but we should also consider the potential for future NOAA support of FIMS if we limit accessibility to only/largely C++ which has already resulted in significant issues within our development team. Coding FIMS in all R would substantially increase current and future participation in development within both our development team and the rest of NOAA. Using SS3 as an example and almost 100% reliance on @Rick-Methot-NOAA for updates/support it seems that future FIMS development support may be much more precarious/limited than TMB support given the more limited user base? I also agree with @kellijohnson-NOAA that we should not consider porting over to RTMB as a negative at this point given the significant refactoring and ongoing future development that is already planned. That may have been a little all over the place so in summary.
|
@Andrea-Havron-NOAA great example! Thank you :) I'll just go shut up now. |
Thanks everyone for all the great feedback. I think any platform we develop needs to be somewhat independent of TMB. I see two main potential future use cases that would not depend on TMB:
|
In future assessment modeling situations, Brieman’s two cultures of data modeling and algorithmic modeling may both need to be considered to produce the best predictive accuracy (Breiman 2001). This contrasts with the long-term fisheries assessment modeling practice of deploying data models with parameter sets to be estimated in contrast to machine learning algorithms to make probabilistic classifications of stock status for management action. Optimization methods for assessments formulated as data models typically use gradient descent and assume differentiable objective functions. Other applicable but less common optimization approaches include stochastic search algorithms based on Monte Carlo simulation or generic hill-climbing algorithm based on simulated annealing or particle swarm optimizers for frequentist or Markov Chain Monte Carlo or Hamiltonian Markov Chain algorithms for Bayesian sampling from the target posterior distribution. In comparison, there are few instances of algorithmic or machine learning models that have been applied in marine science to date (Malde et al. 2020). However, more frequent future applications of machine learning for assessment purposes can be expected (Beyan and Brouwer 2020, Sartar et al. 2021) based on predictive accuracies being obtained for species identification (e.g., Smolinski et al. 2019, Allken et al. 2021), forecasting environmental conditions (e.g., Ye et al. 2021) forecasting recruitment or spawning biomass (Smolinksi 2018, Lüdtke and Pierce 2023) or setting prior distributions for probable stock size (Froese et al. 2023). While there is no certainty about the future needs for stock assessment modeling, an important future challenge is to ensure that any next-generation assessment platform be adaptive to potential changes from a data model-based to a more algorithmic modeling culture. |
@JonBrodziak can you give a little more context here? Are you saying that we should build FIMS to do traditional stats and ML equally well? How does this interact with RTMB vs C++? FYI the key paper cited is here. |
Happy to have this thread continue our discussion, but if you do have a specific pro or con or suggested action I would like to capture them in this doc. |
Thanks for posting the link to Brieman (2001). Here is the context.
FIMS is a general modeling system to provide BSIA and decision support for fisheries management. FIMS has been envisioned and set up a priori to build integrated assessment models. My point is that FIMS software infrastructure should be designed to be extensible enough to readily integrate ML-based decision support tools because they are inevitable.
|
@JaneSullivan-NOAA is going to start prototyping an age-length structured approach in RTMB which will help inform this decision |
Nathan and I have some meeting times set May 13-17 on the FIMS calendar if anyone else wants to join. |
It's possible we may be able to pass a FIMS MakeADFun object to RTMB. FIMS would need to be compiled with TMBad to get this to work, which is scheduled for the quality milestone after M2. Benefits of passing to RTMB would be accessing visualization tools for the tape, although the usefulness of this is questionable given how complicated the tape will likely be. |
Issue details
RTMB is a wrapper interface that allows models written in R to be directly passed to TMB. Users no longer need to write models in C++ in order to access the TMB statistical inference engine, but can write them directly in R. Both RTMB models and original TMB models are translated into the same C++ computational graphs for statistical inferece.
The question, why not write FIMS in R and use RTMB? has come up several times. It is important to document this decision point and get feedback from the implementation team. A final version of the decision points should be made available to the community similar to issue NOAA-FIMS/noaa-fims.github.io#2.
I've come up with the following reasons:
There is a tradeoff between portability and community accessibility, both of which are stated goals of the FIMS project.
The text was updated successfully, but these errors were encountered: