IWF-473: Upgrade Cadence dependencies #548

Merged · 2 commits · Feb 24, 2025
4 changes: 2 additions & 2 deletions docker-compose/.env
@@ -1,5 +1,5 @@
-CASSANDRA_VERSION=3.11.9
-ELASTICSEARCH_VERSION=7.16.2
+CASSANDRA_VERSION=4.1.1
+ELASTICSEARCH_VERSION=7.17.27
 MYSQL_VERSION=8
 POSTGRESQL_VERSION=13
 TEMPORAL_VERSION=1.25
49 changes: 26 additions & 23 deletions docker-compose/ci-cadence-dependencies.yml
@@ -18,43 +18,45 @@ services:
expose:
- 9200
cassandra:
image: cassandra:3.11
image: cassandra:${CASSANDRA_VERSION}
ports:
- "9042:9042"
healthcheck:
test: [ "CMD", "cqlsh", "-u cassandra", "-p cassandra", "-e describe keyspaces" ]
interval: 15s
timeout: 30s
retries: 10
networks:
- testing-network
zookeeper:
Contributor:
Looks like zookeeper is no longer a dependency with the latest version of Cadence, is that right?

Member Author:
Yes, this is the Cadence PR that removed Zookeeper, for reference: cadence-workflow/cadence@4cb4443
And this is the file I used to see how Cadence sets up their own Docker environment: https://github.com/cadence-workflow/cadence/blob/master/docker/docker-compose-es-v7.yml

image: wurstmeister/zookeeper:latest
ports:
- "2181:2181"
networks:
- testing-network
healthcheck:
test: [ "CMD-SHELL", "echo ruok | nc -w 2 zookeeper 2181" ]
interval: 5s
timeout: 10s
retries: 3
kafka:
image: wurstmeister/kafka:2.12-2.1.1
depends_on:
zookeeper:
condition: service_healthy
image: docker.io/bitnami/kafka:3.7
hostname: kafka
container_name: kafka
ports:
- "9092:9092"
environment:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
# KRaft settings
- "KAFKA_CFG_NODE_ID=0"
- "KAFKA_CFG_PROCESS_ROLES=controller,broker"
- "KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093"
# Listeners
- "KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093"
- "KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092"
- "KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT"
- "KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER"
- "KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT"
# Topic settings
- "KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=true"
networks:
- testing-network
healthcheck:
test:
[ "CMD", "kafka-topics.sh", "--list", "--zookeeper", "zookeeper:2181" ]
[ "CMD", "kafka-topics.sh", "--list", '--bootstrap-server', 'kafka:9092' ]
interval: 1s
timeout: 60s
retries: 60
cadence:
image: ubercadence/server:0.24.0-auto-setup
image: ubercadence/server:v1.2.16-auto-setup
ports:
- "8000:8000"
- "8001:8001"
@@ -65,6 +67,7 @@ services:
- "7935:7935"
- "7939:7939"
- "7833:7833"
- "7936:7936"
environment:
- "CASSANDRA_SEEDS=cassandra"
- "DYNAMIC_CONFIG_FILE_PATH=config/dynamicconfig/development_es.yaml"
@@ -74,7 +77,7 @@
- "KAFKA_SEEDS=kafka"
depends_on:
cassandra:
condition: service_started
condition: service_healthy
kafka:
condition: service_healthy
elasticsearch:
@@ -87,7 +90,7 @@
- cadence
environment:
- CADENCE_CLI_ADDRESS=cadence:7933
image: ubercadence/cli:0.24.0
image: ubercadence/cli:v1.2.16
networks:
- testing-network
stdin_open: true
49 changes: 26 additions & 23 deletions docker-compose/ci-cadence-temporal-dependencies.yml
@@ -67,43 +67,45 @@ services:
- ./init-ci-temporal.sh:/etc/temporal/init-ci-temporal.sh
entrypoint: sh -c "/etc/temporal/init-ci-temporal.sh"
cassandra:
image: cassandra:3.11
image: cassandra:${CASSANDRA_VERSION}
ports:
- "9042:9042"
healthcheck:
test: [ "CMD", "cqlsh", "-u cassandra", "-p cassandra", "-e describe keyspaces" ]
interval: 15s
timeout: 30s
retries: 10
networks:
- testing-network
zookeeper:
image: wurstmeister/zookeeper:latest
ports:
- "2181:2181"
networks:
- testing-network
healthcheck:
test: [ "CMD-SHELL", "echo ruok | nc -w 2 zookeeper 2181" ]
interval: 5s
timeout: 10s
retries: 3
kafka:
image: wurstmeister/kafka:2.12-2.1.1
depends_on:
zookeeper:
condition: service_healthy
image: docker.io/bitnami/kafka:3.7
hostname: kafka
container_name: kafka
ports:
- "9092:9092"
environment:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
# KRaft settings
Contributor:
Same here. Not sure what KRaft is.

Member Author:
https://developer.confluent.io/learn/kraft/
"Apache Kafka Raft (KRaft) is the consensus protocol that was introduced in KIP-500 to remove Apache Kafka's dependency on ZooKeeper for metadata management. This greatly simplifies Kafka's architecture by consolidating responsibility for metadata into Kafka itself, rather than splitting it between two different systems: ZooKeeper and Kafka. KRaft mode makes use of a new quorum controller service in Kafka which replaces the previous controller and makes use of an event-based variant of the Raft consensus protocol."

Contributor:
Got it, thx!

Member Author:
Basically it seems to be the way forward so that we no longer need Zookeeper. Mainly, I copied most of the configs from the Cadence repo, so we're using the same configuration they currently recommend.

- "KAFKA_CFG_NODE_ID=0"
- "KAFKA_CFG_PROCESS_ROLES=controller,broker"
- "KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093"
# Listeners
- "KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093"
- "KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092"
- "KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT"
- "KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER"
- "KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT"
# Topic settings
- "KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=true"
networks:
- testing-network
healthcheck:
test:
[ "CMD", "kafka-topics.sh", "--list", "--zookeeper", "zookeeper:2181" ]
[ "CMD", "kafka-topics.sh", "--list", '--bootstrap-server', 'kafka:9092' ]
interval: 1s
timeout: 60s
retries: 60
cadence:
image: ubercadence/server:0.24.0-auto-setup
image: ubercadence/server:v1.2.16-auto-setup
ports:
- "8000:8000"
- "8001:8001"
@@ -114,6 +116,7 @@ services:
- "7935:7935"
- "7939:7939"
- "7833:7833"
- "7936:7936"
environment:
- "CASSANDRA_SEEDS=cassandra"
- "DYNAMIC_CONFIG_FILE_PATH=config/dynamicconfig/development_es.yaml"
@@ -123,7 +126,7 @@
- "KAFKA_SEEDS=kafka"
depends_on:
cassandra:
condition: service_started
condition: service_healthy
kafka:
condition: service_healthy
elasticsearch:
@@ -136,7 +139,7 @@
- cadence
environment:
- CADENCE_CLI_ADDRESS=cadence:7933
image: ubercadence/cli:0.24.0
image: ubercadence/cli:v1.2.16
networks:
- testing-network
stdin_open: true
4 changes: 3 additions & 1 deletion docker-compose/dynamicconfig/docker.yaml
@@ -1,2 +1,4 @@
frontend.enableUpdateWorkflowExecution:
- value: true
- value: true
frontend.enableQueryAttributeValidation:
- value: false
54 changes: 29 additions & 25 deletions docker-compose/integ-dependencies.yml
Original file line number Diff line number Diff line change
@@ -67,43 +67,45 @@ services:
- ./init-ci-temporal.sh:/etc/temporal/init-ci-temporal.sh
entrypoint: sh -c "/etc/temporal/init-ci-temporal.sh"
cassandra:
image: cassandra:3.11
image: cassandra:${CASSANDRA_VERSION}
ports:
- "9042:9042"
healthcheck:
test: [ "CMD", "cqlsh", "-u cassandra", "-p cassandra" ,"-e describe keyspaces" ]
interval: 15s
timeout: 30s
retries: 10
networks:
- testing-network
zookeeper:
image: wurstmeister/zookeeper:latest
ports:
- "2181:2181"
networks:
- testing-network
healthcheck:
test: [ "CMD-SHELL", "echo ruok | nc -w 2 zookeeper 2181" ]
interval: 5s
timeout: 10s
retries: 3
kafka:
image: wurstmeister/kafka:2.12-2.1.1
depends_on:
zookeeper:
condition: service_healthy
image: docker.io/bitnami/kafka:3.7
hostname: kafka
container_name: kafka
ports:
- "9092:9092"
environment:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
# KRaft settings
Contributor:
Kafka? 👀

Member Author:
Answered below: #548 (comment)

- "KAFKA_CFG_NODE_ID=0"
- "KAFKA_CFG_PROCESS_ROLES=controller,broker"
- "KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093"
# Listeners
- "KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093"
- "KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092"
- "KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT"
- "KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER"
- "KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT"
# Topic settings
- "KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=true"
networks:
- testing-network
healthcheck:
test:
[ "CMD", "kafka-topics.sh", "--list", "--zookeeper", "zookeeper:2181" ]
[ "CMD", "kafka-topics.sh", "--list", '--bootstrap-server', 'kafka:9092']
interval: 1s
timeout: 60s
retries: 60
cadence:
image: ubercadence/server:0.24.0-auto-setup
image: ubercadence/server:v1.2.16-auto-setup
ports:
- "8000:8000"
- "8001:8001"
@@ -114,6 +116,7 @@ services:
- "7935:7935"
- "7939:7939"
- "7833:7833"
- "7936:7936"
environment:
- "CASSANDRA_SEEDS=cassandra"
- "DYNAMIC_CONFIG_FILE_PATH=config/dynamicconfig/development_es.yaml"
@@ -123,7 +126,7 @@
- "KAFKA_SEEDS=kafka"
depends_on:
cassandra:
condition: service_started
condition: service_healthy
kafka:
condition: service_healthy
elasticsearch:
@@ -136,7 +139,7 @@
- cadence
environment:
- CADENCE_CLI_ADDRESS=cadence:7933
image: ubercadence/cli:0.24.0
image: ubercadence/cli:v1.2.16
networks:
- testing-network
stdin_open: true
@@ -145,9 +148,10 @@ services:
- ./init-ci-cadence.sh:/etc/cadence/init-ci-cadence.sh
entrypoint: sh -c "/etc/cadence/init-ci-cadence.sh"
cadence-web:
image: ubercadence/web:v3.29.6
image: ubercadence/web:v4.0.0
environment:
- "CADENCE_TCHANNEL_PEERS=cadence:7933"
- "CADENCE_GRPC_PEERS=cadence:7833"
ports:
- "8088:8088"
depends_on:
@@ -169,4 +173,4 @@
networks:
testing-network:
driver: bridge
name: testing-network
name: testing-network
29 changes: 4 additions & 25 deletions integ/workflow/wf_ignore_already_started/routers.go
@@ -1,18 +1,14 @@
package wf_ignore_already_started

import (
"github.com/indeedeng/iwf/integ/helpers"
"github.com/gin-gonic/gin"
"github.com/indeedeng/iwf/gen/iwfidl"
"github.com/indeedeng/iwf/service"
"github.com/indeedeng/iwf/service/common/ptr"
"log"
"net/http"
"strconv"
"sync"
"testing"
"time"

"github.com/gin-gonic/gin"
"github.com/indeedeng/iwf/gen/iwfidl"
"github.com/indeedeng/iwf/service"
)

/**
@@ -29,7 +25,6 @@ const (

type handler struct {
invokeHistory sync.Map
invokeData sync.Map
}

func NewHandler() *handler {
@@ -55,12 +50,6 @@ func (h *handler) ApiV1WorkflowStateStart(c *gin.Context, t *testing.T) {
}

if req.GetWorkflowStateId() == State1 {
nowInt, err := strconv.Atoi(req.StateInput.GetData())
Member Author:
This was not being used in the test assertions, and it was causing this test to be flaky for some reason, so I removed it.

Contributor:
I think that's for the best; if the value isn't being used but is causing issues, it should be removed 👍

Contributor:
Do you have a link to a failed pipeline where this test was flaky? What was the error exactly?

Member Author:
Yes, you can look at the actions on the previous commit in this PR: 8194a71
I believe this was flaky because while the Cadence tests failed (https://github.com/indeedeng/iwf/actions/runs/13464338908/job/37627766325), the All Tests job passed (https://github.com/indeedeng/iwf/actions/runs/13464338906/job/37626732846), and the latter runs the same Cadence tests.
The error was this: https://github.com/indeedeng/iwf/actions/runs/13464338908/job/37627766325#step:4:5143
errors.go:6: TestIgnoreAlreadyStartedWorkflowCadence - Test failed with error: strconv.Atoi: parsing "": invalid syntax

Member Author:
When I looked closer, it seemed the value wasn't being passed to the test router, and the property it wrote to in the router wasn't used in any test assertion, so I removed this bit. The test assertions haven't changed, i.e. it's doing the same checks as before.

if err != nil {
helpers.FailTestWithError(err, t)
}
now := int64(nowInt)
h.invokeData.Store("scheduled_at", now)
c.JSON(http.StatusOK, iwfidl.WorkflowStateStartResponse{
CommandRequest: &iwfidl.CommandRequest{
TimerCommands: []iwfidl.TimerCommand{
@@ -95,11 +84,6 @@ func (h *handler) ApiV1WorkflowStateDecide(c *gin.Context, t *testing.T) {
}

if req.GetWorkflowStateId() == State1 {
now := time.Now().Unix()
h.invokeData.Store("fired_at", now)
timerResults := req.GetCommandResults()
timerId := timerResults.GetTimerResults()[0].GetCommandId()
h.invokeData.Store("timer_id", timerId)
c.JSON(http.StatusOK, iwfidl.WorkflowStateDecideResponse{
StateDecision: &iwfidl.StateDecision{
NextStates: []iwfidl.StateMovement{
Expand All @@ -122,10 +106,5 @@ func (h *handler) GetTestResult() (map[string]int64, map[string]interface{}) {
invokeHistory[key.(string)] = value.(int64)
return true
})
invokeData := make(map[string]interface{})
h.invokeData.Range(func(key, value interface{}) bool {
invokeData[key.(string)] = value
return true
})
return invokeHistory, invokeData
return invokeHistory, nil
}