MuonFitter RNNFit model scripts #290

Merged
merged 5 commits into from
Nov 19, 2024

Conversation

jminock
Contributor

@jminock jminock commented Oct 2, 2024

Collection of scripts to create and use models necessary for MuonFitter. Should these be located in ToolAnalysis or stored on the gpvms? Also, some of these scripts require "torch" as a python package which is not available in the current containers. Julie and I have currently used these scripts on our personal machines. I would like to integrate them into the central ANNIE framework if possible.

@marc1uk
Collaborator

marc1uk commented Nov 15, 2024

OK.... so.

Step 1. is to run Data_prepare.py, which reads /home/jhe/annie/analysis/Muon_vertex/X.txt and /home/jhe/annie/analysis/Muon_vertex/Y.txt .... funnily enough I do not have a /home/jhe/... where do those come from?

Step 2. is to run RNN_train.py, which amongst others opens fitbyeye_wcsim_RNN.txt ....? where does this come from? This is in a test function, so perhaps it's not essential, but it does appear to be an embedded call, suggesting the script will fall over without it. Regardless of whether that test can be commented out, it would be good to know what it is, where it came from, and how to generate it.

Step 3. is Fit_data.py, but that seems to process ev_ai_eta_R{RUN}.txt files, suggesting there is a Step 2.5 which generates these files (from the MuonFitter toolchain run in "not reco" mode, according to your comment in #294). Can we update the readme with this missing step?

There also appears to be a model.pth file generated by RNN_train.py and used by Fit_data.py. I'm not sure if this is dataset specific (would seem odd), but perhaps we can make that model path configurable and put a standard model somewhere on /pnfs, assuming it is not dataset specific. If for some reason it is, I guess it should be treated similarly to the ev_ai_eta_R{RUN}.txt and tanktrackfitfile_r{RUN}_RNN.txt files....?
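A minimal sketch of how the model path could be made configurable. The flag `--model-path`, the environment variable `MUONFITTER_MODEL`, and the function name `resolve_model_path` are hypothetical and not taken from Fit_data.py. The default reuses the /pnfs models directory named later in this thread.

```python
import argparse
import os

# Default location; the directory is the one quoted later in this thread.
DEFAULT_MODEL = "/pnfs/annie/persistent/simulations/models/MuonFitter/model.pth"


def resolve_model_path(argv=None, env=None):
    """Pick the model path: CLI flag first, then environment, then default.

    `--model-path` and `MUONFITTER_MODEL` are hypothetical names used only
    for illustration.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Resolve the RNN model file")
    parser.add_argument("--model-path", default=None,
                        help="path to the trained model.pth file")
    # parse_known_args ignores any other options the script may define.
    args, _unknown = parser.parse_known_args(argv)
    return args.model_path or env.get("MUONFITTER_MODEL") or DEFAULT_MODEL


if __name__ == "__main__":
    print(resolve_model_path())
```

The resolved string could then be handed to whatever call currently loads the hard-coded file, so a dataset-specific model would only need a different flag value.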

We also have extract_partnumber.py which is looking in /Users/juhe/annie/analysis/playground/ and writing to the rather specific /pnfs/annie/persistent/users/mnieslon/data/processed_hits_improved/R2630ProcessedRawData_TankAndMRDAndCTC_R2630S0 location.

Finally, rnn_fit.sh hard-codes filelist="/home/jhe/annie/analysis/flist.txt". While it's short enough to be pretty self-explanatory, this could be mentioned in the readme as a helper for step 2, I guess.

Sorry to give you a bunch of extra work, but this is kind of a mess. 😑

I'm not going to ask you to do it (i'm asking enough already), but these scripts could be converted to python Tools without too much work (mostly just copy-paste into the python tool template and update getting the configuration variables), then each of the steps could be chained together into a single ToolChain. Paths of inputs/outputs could even be passed via the datamodel, or hell, they could even pass data via the datamodel and do away with the intermediate files.... 🤯
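The idea of passing data between steps in memory, rather than through intermediate files, can be sketched in plain Python. This is not the ToolAnalysis Tool API or its datamodel. The shared dictionary, the step functions, and the arithmetic inside them are hypothetical stand-ins for the real scripts.

```python
from typing import Callable, Dict, List

Store = Dict[str, object]  # stand-in for a shared datamodel


def prepare_data(store: Store) -> None:
    """Placeholder for a data-preparation step: parse raw text into floats."""
    store["inputs"] = [float(x) for x in store["raw_lines"]]


def fit_tracks(store: Store) -> None:
    """Placeholder for a fitting step: a dummy transform, not the RNN."""
    store["track_lengths"] = [2.0 * x for x in store["inputs"]]


def run_chain(steps: List[Callable[[Store], None]], store: Store) -> Store:
    """Run each step in order; every step reads and writes the same store."""
    for step in steps:
        step(store)
    return store


if __name__ == "__main__":
    result = run_chain([prepare_data, fit_tracks], {"raw_lines": ["1.5", "2"]})
    print(result["track_lengths"])
```

Because each step only touches keys in the shared store, the text files written by one script and read by the next would no longer be needed between steps.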

@jminock
Contributor Author

jminock commented Nov 15, 2024

Steps 1 and 2 are used to generate model files, which already exist in /pnfs/annie/persistent/simulations/models/MuonFitter/ as per #295 . These files were initially included in this PR but removed as per your request. There are very minimal instructions for how to use these scripts and other files associated with them because instructions for their usage were never created. I believe these scripts were run outside of the gpvms and the ToolAnalysis container during their development. I have never completed steps 1 and 2, and my understanding of how to do so is limited entirely to what is included in the READMEs. Files like X.txt and Y.txt must be on UC Davis servers. I can try to reach out to Julie and people at UC Davis to check if these files still exist.
Still, we have models that work, allowing the Tool to work.

For step 3, that follows instructions starting at line 44 from MuonFitterREADME.md from #295 .

To my understanding, model.pth is not dataset specific and is included in /pnfs/annie/persistent/simulations/models/MuonFitter/

For extract_partnumber.py and rnn_fit.sh, I've never explicitly used them. I wanted to include them because Julie provided them for inclusion. I don't think extract_partnumber.py is necessary at all, and it can be removed. rnn_fit.sh will get a specific mention in the README for Step 3 as a helper script, thank you.

I'll update all the README files accordingly and make sure they are as current as possible. I'll also double check the necessity of extract_partnumber.py.

James Minock and others added 2 commits November 15, 2024 14:38
…. Also changed location of MuonFitter README from MuonFitter/RNNFit/ to MuonFitter/
@marc1uk
Collaborator

marc1uk commented Nov 19, 2024

OK, sounds good. Please do chase up Julie to find out more details on the inputs to training the model. While we have models that run, without knowing details of the training process the generated results may have caveats that we aren't aware of, or generally may just give poor performance (e.g., if they've been trained on WCSim simulations and QE / reflectivity values have changed, the model may need re-training to account for that).

@marc1uk marc1uk merged commit 88afcec into ANNIEsoft:Application Nov 19, 2024
1 check passed
@jminock jminock deleted the MFRNNFit branch December 11, 2024 16:52