Merge branch 'development' into test_utils_dataloaders

prio-data · Nov 1, 2024 · 8bd6bf8
2 parents 8cfa061 + f888262
commit 8bd6bf8

Showing 11 changed files with 357 additions and 25 deletions.
diff --git a/documentation/ADRs/008_no_jupyter_notebooks_in_production.md b/documentation/ADRs/008_no_jupyter_notebooks_in_production.md
@@ -3,7 +3,7 @@
 | ADR Info            | Details                                       |
 |---------------------|-----------------------------------------------|
 | Subject             | No Use of Jupyter Notebooks in Production     |
-| ADR Number          | 001                                           |
+| ADR Number          | 008                                           |
 | Status              | Accepted                                      |
 | Author              | Jim, Mihai, Xiaolong, Simon, Sara             |
 | Date                | 30.07.2024                                    |

diff --git a/documentation/ADRs/011_Common_Querysets_for_Model_Pipelines.md b/documentation/ADRs/011_Common_Querysets_for_Model_Pipelines.md
@@ -1,4 +1,4 @@
-# ADR 011 - Common Querysets for Model Pipelines
+# Common Querysets for Model Pipelines
 
 
 | ADR Info            | Details                           |

diff --git a/documentation/ADRs/014_input_drift_detection.md b/documentation/ADRs/014_input_drift_detection.md
@@ -4,7 +4,7 @@
 |-------------------|-----------------------|
 | Subject           | Input drift detection |
 | ADR Number        | 014                   |
-| Status            | proposed              |
+| Status            | Accepted              |
 | Author            | Jim Dale              |
 | Date              | 02/10/2024            |
 

diff --git a/documentation/ADRs/016_input_drift_detection_logging.md b/documentation/ADRs/016_input_drift_detection_logging.md
@@ -5,7 +5,7 @@
 |---------------------|-------------------------------|
 | Subject             | Input Drift Detection Logging |
 | ADR Number          | 016                           |
-| Status              | proposed                      |
+| Status              | Accepted                      |
 | Author              | Jim Dale                      |
 | Date                | 28/10/2014                    |
 
@@ -14,19 +14,19 @@ An input drift detection system has been implemented as part of the viewser data
 
 For related ADRs on the generation of different log files and other general logging standards/routines, please see the ADRs below:  [NOTE: new relevant ADRs links should be added]
 
-[009_log_file_for_generated_data](/documentation/ADRs/009_log_file_for_generated_data.md)
+- [009_log_file_for_generated_data](/documentation/ADRs/009_log_file_for_generated_data.md)
 
-[017_log_files_for_offline_evaluation](/documentation/ADRs/017_log_files_for_offline_evaluation.md)
+- [017_log_files_for_offline_evaluation](/documentation/ADRs/017_log_files_for_offline_evaluation.md)
 
-[018_log_files_for_online_evaluation](/documentation/ADRs/018_log_files_for_online_evaluation.md)
+- [018_log_files_for_online_evaluation](/documentation/ADRs/018_log_files_for_online_evaluation.md)
 
-[019_log_files_for_model_training](/documentation/ADRs/019_log_files_for_model_training.md)
+- [019_log_files_for_model_training](/documentation/ADRs/019_log_files_for_model_training.md)
 
-[020_log_files_and_realtime_alerts](/documentation/ADRs/020_log_files_and_realtime_alerts.md)
+- [020_log_files_and_realtime_alerts](/documentation/ADRs/020_log_files_and_realtime_alerts.md)
 
-[025_log_level_standards](/documentation/ADRs/025_log_level_standards.md)
+- [025_log_level_standards](/documentation/ADRs/025_log_level_standards.md)
 
-[026_log_files_for_input_data](/documentation/ADRs/026_log_files_for_input_data.md)
+- [026_log_files_for_input_data](/documentation/ADRs/026_log_files_for_input_data.md)
 
 
 ## Decision

diff --git a/documentation/ADRs/022_model_catalogs.md b/documentation/ADRs/022_model_catalogs.md
@@ -1,6 +1,5 @@
 
-
-## Create Model Catalogs
+# Create Model Catalogs
 
 
 | ADR Info            | Details           |

diff --git a/documentation/ADRs/023_production_development.md b/documentation/ADRs/023_production_development.md
@@ -1,11 +1,10 @@
-## Production and Development Branches
-*Using production and development branches instead of main*
+# Production and Development Branches
 
 | ADR Info            | Details           |
 |---------------------|-------------------|
 | Subject             | Production and Development Branches  |
 | ADR Number          | 023   |
-| Status              | proposed   |
+| Status              | Accepted   |
 | Author              | Borbála   |
 | Date                | 29.10.2024.     |
 

diff --git a/documentation/ADRs/024_development_and_production_sync.md b/documentation/ADRs/024_development_and_production_sync.md
@@ -1 +1,77 @@
-TODO
+
+## Development and Production Sync
+
+
+| ADR Info            | Details           |
+|---------------------|-------------------|
+| Subject             | Production and Development Branch Synchronization  |
+| ADR Number          | 024   |
+| Status              | Accepted   |
+| Author              | Simon  |
+| Date                | 31.10.2024.    |
+
+## Context
+
+We aim to establish a new benchmark in MLOps for early warning systems (EWS), specifically for conflict forecasting, which demands high standards of reliability, transparency, and seamless update processes. Given the high stakes of forecasting in EWS, the branching strategy must support robust, transparent, and consistent updates, with a focus on ensuring production stability while accommodating active, iterative development.
+
+To support continuous quality assurance, real-time monitoring, and rapid model updates, the synchronization between development and production branches must be structured to maintain reliability and performance while addressing the following critical needs:
+- Irregular Deployment Frequency: The project requires deployments ranging from weekly to monthly, demanding a workflow that can handle periodic updates without disrupting production stability.
+- Critical Model Monitoring: Ensuring real-time model monitoring is essential to maintain the accuracy and reliability of predictions, with a strong focus on data drift detection, model performance assessment, and feature validation across deployment cycles.
+- Coupled ML and Non-ML Components: Some non-ML components are tightly integrated with ML workflows, requiring synchronized updates to avoid dependency issues in production.
+- Versioning and Traceability: Maintaining version control and artifact management is crucial for reproducibility, rollback, and historical comparison, particularly in a pipeline that supports high-stakes decision-making and early action.
+
+This ADR defines the branching and synchronization structure necessary to support these requirements while adhering to MLOps best practices, ensuring the production branch remains stable and reliable for operational forecasting while allowing iterative improvements in development.
+
+## Decision
+
+To achieve the requirements described in the Context section, we will implement the following branching and synchronization strategy, optimized for the EWS pipeline:
+
+### Overview
+
+**Branch Structure and Sync Strategy**
+
+1. **Primary Branches**
+- **Production:** Serves as the stable branch for all production-ready code and models. Only validated updates are merged here, ensuring production stability for high-stakes decision-making.
+- **Development:** Acts as the main integration branch for feature development, model updates, and experiment integration. All new features are developed in dedicated feature branches based on this branch and merged via Pull Requests (PRs) to ensure controlled updates and testing.
+
+2. **Feature Branch Workflow**
+- Feature branches are created off development for isolated testing of new features, models, or configurations.
+- Each feature branch undergoes rigorous PR reviews and automated testing to ensure compatibility, stability, and performance before merging into development. This approach maintains the stability of development, reducing errors upon merging to production.
+
+3. **Syncing Development to Production**
+- **Periodic Pull Requests:** At regular intervals (between weekly and monthly), development will be merged into production via a Pull Request once a full validation cycle is completed.
+- **Staging Environment Validation:** A staging environment replicates production settings to validate the integrity of development before merging into production. This includes running inference tests, drift detection, performance checks, and monitoring to detect issues pre-deployment, ensuring production stability.
+
+4. **Hotfix Branches**
+- For urgent issues in production, hotfix branches are created directly from production, fixed, tested, and merged back into production. These hotfixes are then backported to development to maintain consistency between branches.
+
+5. **Versioning**
+- **Semantic Versioning:** Each production release is tagged with semantic versioning (e.g., v1.0.0, v1.1.0) to facilitate traceability and rollback.
+
+## Consequences
+
+**Positive Effects:**
+- **Production Stability:** Clear separation between development and production minimizes the risk of untested code or model updates affecting production stability.
+- **Enhanced Monitoring and Quality Assurance:** The use of a staging environment and comprehensive validation checks before each merge ensures consistent quality and reliability in production.
+- **Rapid Issue Resolution:** Hotfix branches allow urgent fixes to be deployed directly to production, reducing downtime and maintaining model performance for critical decision-making.
+
+**Negative Effects:**
+- **Increased Complexity in Workflow:** Multiple branches and regular sync requirements add to the complexity of the branching strategy, necessitating disciplined version control and coordination across teams.
+- **Resource Overhead for Staging and Testing:** Maintaining a staging environment and conducting extensive validation tests for each update demands additional resources but is justified by the critical need for model reliability in production.
+
+
+## Rationale
+This branching and sync structure balances flexibility in development with reliability in production. By keeping the development and production branches separate and introducing a staging validation step, we ensure that production remains stable and capable of handling high-stakes forecasts while enabling iterative work on the development branch. The addition of hotfix branches further reduces the risk of downtime due to critical issues in production.
+
+### Considerations
+- **Sync Delays:** Frequent updates in development may slow down synchronization with production if not carefully managed. Scheduled periodic merges and staging validation cycles mitigate this risk.
+- **Resource Allocation:** The staging environment and enhanced testing for each PR demand additional computational resources and time but align with the need for stability and reliability in conflict forecasting.
+
+## Additional Notes
+
+
+## Feedback and Suggestions
+Feedback is welcome on any additional sync requirements, monitoring tools, or branching conventions. Input on optimizing the staging environment and hotfix management process is also appreciated to ensure alignment with best practices.
+
+---
+
diff --git a/documentation/ADRs/025_log _level_standards.md b/documentation/ADRs/025_log _level_standards.md
@@ -5,7 +5,7 @@
 |---------------------|-------------------|
 | Subject             | Logging Levels Configuration  |
 | ADR Number          | 025   |
-| Status              | Proposed  |
+| Status              | Accepted  |
 | Author              | Simon   |
 | Date                | 30.10.2024     |
 
@@ -17,19 +17,19 @@ The following log levels—DEBUG, INFO, WARNING, ERROR, and CRITICAL—are confi
 
 For related ADRs on the generation of different log files and other general logging standards/routines, please see the ADRs below:  [NOTE: new relevant ADRs links should be added]
 
-[009_log_file_for_generated_data](/documentation/ADRs/009_log_file_for_generated_data.md)
+- [009_log_file_for_generated_data](/documentation/ADRs/009_log_file_for_generated_data.md)
 
-[016_input_drift_detection_logging](/documentation/ADRs/016_input_drift_detection_logging.md)
+- [016_input_drift_detection_logging](/documentation/ADRs/016_input_drift_detection_logging.md)
 
-[017_log_files_for_offline_evaluation](/documentation/ADRs/017_log_files_for_offline_evaluation.md)
+- [017_log_files_for_offline_evaluation](/documentation/ADRs/017_log_files_for_offline_evaluation.md)
 
-[018_log_files_for_online_evaluation](/documentation/ADRs/018_log_files_for_online_evaluation.md)
+- [018_log_files_for_online_evaluation](/documentation/ADRs/018_log_files_for_online_evaluation.md)
 
-[019_log_files_for_model_training](/documentation/ADRs/019_log_files_for_model_training.md)
+- [019_log_files_for_model_training](/documentation/ADRs/019_log_files_for_model_training.md)
 
-[020_log_files_and_realtime_alerts](/documentation/ADRs/020_log_files_and_realtime_alerts.md)
+- [020_log_files_and_realtime_alerts](/documentation/ADRs/020_log_files_and_realtime_alerts.md)
 
-[026_log_files_for_input_data](/documentation/ADRs/026_log_files_for_input_data.md)
+- [026_log_files_for_input_data](/documentation/ADRs/026_log_files_for_input_data.md)
 
 
 ## Decision

diff --git a/documentation/ADRs/027_ensmeble_reconciliation.md b/documentation/ADRs/027_ensmeble_reconciliation.md
@@ -0,0 +1,51 @@
+Ensemble reconciliation
+
+| ADR Info            | Details                 |
+|---------------------|-------------------------|
+| Subject             | Ensemble reconciliation |
+| ADR Number          | 027                     |
+| Status              | Accepted                |
+| Author              | Jim                     |
+| Date                | 01/11/2024              |
+
+## Context
+The notebook-based views3/fatalities002 pipeline generates a cm and a pgm ensemble. It was found that the pgm ensemble suffered from what might be termed normalisation issues, in that the peak and total numbers of fatalities forecast at pgm level are clearly too low. In particular, summing forecast fatalities over the pg cells belonging to a given country and comparing the result to the fatalities forecast for the same country at cm level almost always shows that the summed pgm values are significantly - often an order of magnitude - lower.
+As a quick fix, therefore, a reconciliation function was created which accepts a pgm forecast dataframe and a cm forecast dataframe, fetches via viewser a pgm->cm mapping, computes for every country for every month the sum over its constituent pg cells, and renormalises the pgm forecasts for those cells so that the sum matches the cm-level forecast. A check is performed which ensures that the set of months in the two input dfs is the same.
+This is then equivalent to an up-biasing of all the pgm models, which plainly is not a satisfying solution.
+The reconciliation will be applied to every pgm-level constituent model from which the pgm ensemble is built.
+This is a known issue with legacy models that were adapted in various forms from the old pipeline. Although some hyper-parameters might help mitigate these issues, the challenges are inherent to the models' architecture and loss functions.
+Going forward, the explicit goal for all model development efforts is to design architectures, loss functions, optimization routines, sampling strategies, and other methods that address these issues.
+
+## Decision
+This reconciliation is to be implemented in the pipeline as a temporary fix in lieu of improvements to the pgm models. The reconciliation function itself needs to be globally available, so should live in common utils.
+For each ensemble, a new item of metadata will be created, 'reconcile_with', whose value will either be None, or the name of another ensemble. No checks need be done on whether a valid choice has been made, since the function already checks to see that the two ensembles it is presented with have correctly-formatted indexes, and the identical month-sets. This change needs to be present in the ensemble-creation meta-tool, with the default value of None.
+In an ensemble's generate_forecast.py, a code fragment needs to be added where if reconcile_with is not None, the ensemble named by reconcile_with is fetched from storage and presented to the reconciliation function along with each pgm constituent model in turn. 
+Warnings are to be issued and logged if negative-valued forecasts are encountered (before setting them to zero) and if large normalisations are necessary.
+
+
+### Overview
+Reconciliation is being deployed partly to allow the aligning of forecasts from the new pipeline with those of the old. Warnings are issued to inform the user if large normalisations are being performed, which indicates poorly-performing pgm-level models.
+This feature is very simple to disable via the ensemble metadata dict.
+The reconciliation machinery will be maintained as a stable approach to maintain strict consistency between CM-level and aggregated PGM data. In future, it should NOT be viewed as a tool to systematically up-bias PGM models that underestimate conflict fatalities. This underestimation is fundamentally a modeling issue, not a reconciliation problem. Future work will be directed at finding genuine solutions to these issues, as opposed to sticking-plasters.
+
+## Consequences
+
+**Positive Effects:**
+- Allows replication of a necessary but frowned-upon feature of the old pipeline
+- Keeps the user informed about the relative performance of the pgm and cm models. Serious inconsistency between the two sets of models is a useful indicator of poor (probably pgm-level) model performance.
+
+**Negative Effects:**
+- This solution is little more than a hack which we arguably do not want in the codebase
+- This does require changes to the ensemble template and all extant ensembles to ensure that their metadata contains the new key.
+
+## Rationale
+This is the least intrusive means of implementing this feature, and it allows the feature to be easily turned on and off.
+
+### Considerations
+None
+
+## Additional Notes
+None
+
+## Feedback and Suggestions
+Feedback welcomed
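
The reconciliation step described in ADR 027 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the function that lives in common utils: plain dictionaries stand in for the pgm and cm forecast dataframes, the pgm->cm mapping is passed in rather than fetched via viewser, and the names `reconcile`, `pg_to_country` and `large_factor` (including its default threshold) are hypothetical.

```python
import logging
from collections import defaultdict

logger = logging.getLogger(__name__)


def reconcile(pgm, cm, pg_to_country, large_factor=10.0):
    """Rescale pgm forecasts so each country-month sum matches the cm forecast.

    Hypothetical layouts (stand-ins for the real dataframes):
      pgm:           {(month, pg_id): forecast}
      cm:            {(month, country_id): forecast}
      pg_to_country: {pg_id: country_id}
    """
    # Both inputs must cover exactly the same set of months.
    if {month for month, _ in pgm} != {month for month, _ in cm}:
        raise ValueError("pgm and cm forecasts cover different months")

    # Warn about negative forecasts, then set them to zero.
    clipped = {}
    for (month, pg_id), value in pgm.items():
        if value < 0:
            logger.warning("negative forecast %s (month %s, cell %s) set to zero",
                           value, month, pg_id)
            value = 0.0
        clipped[(month, pg_id)] = value

    # Sum the clipped pgm forecasts over each country's cells, per month.
    totals = defaultdict(float)
    for (month, pg_id), value in clipped.items():
        totals[(month, pg_to_country[pg_id])] += value

    # One scaling factor per country-month.
    factors = {}
    for key, total in totals.items():
        if total == 0:
            # An all-zero country-month cannot be rescaled; leave it unchanged.
            factors[key] = 1.0
            continue
        factors[key] = cm[key] / total
        if factors[key] > large_factor or factors[key] < 1 / large_factor:
            logger.warning("large normalisation %.3g for month %s, country %s",
                           factors[key], key[0], key[1])

    # Apply each country-month factor to its constituent cells.
    return {
        (month, pg_id): value * factors[(month, pg_to_country[pg_id])]
        for (month, pg_id), value in clipped.items()
    }
```

In a `generate_forecast.py`, a call like this would only run when the ensemble's `reconcile_with` metadata is not None, once per pgm constituent model. How all-zero country-months should be handled is not specified in the ADR; leaving them unchanged is a choice made for this sketch.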