test: garbage collection controller removes orphaned nics #686

Merged
merged 31 commits into from
Feb 21, 2025
d87e69f
test: garbage collection controller removes orphaned nics
Bryce-Soghigian Jan 29, 2025
c12c0c6
refactor: breaking logic into modular steps to reduce cyclomatic comp…
Bryce-Soghigian Feb 9, 2025
231578e
ci: golang-ci lint
Bryce-Soghigian Feb 9, 2025
c8d8c32
fix: going back to utilization as the default
Bryce-Soghigian Feb 9, 2025
b467514
refactor: moving env vars to azureEnv struct
Bryce-Soghigian Feb 10, 2025
13b16dd
refactor: using azure clients defined inside of env
Bryce-Soghigian Feb 10, 2025
3f43e74
refactor: using more generic list approach that can be generalized an…
Bryce-Soghigian Feb 10, 2025
80b6b15
refactor: using AZURE_RESOURCE_GROUP_MC as a name rather than AZURE_R…
Bryce-Soghigian Feb 10, 2025
f4f6cd6
ci: lint
Bryce-Soghigian Feb 10, 2025
610dc0b
test: checkin azure garbage collection into our e2e matrix
Bryce-Soghigian Feb 10, 2025
8edfb96
fix: propagating values to makefile
Bryce-Soghigian Feb 10, 2025
3bdf24f
fix: constructing mc rg
Bryce-Soghigian Feb 10, 2025
2ba4c99
fix: use CLUSTER_NAME instead of AZURE_CLUSTER_NAME
Bryce-Soghigian Feb 10, 2025
cff6567
test: refactoring to use environment
Bryce-Soghigian Feb 11, 2025
e0d56ca
fix: propagating location
Bryce-Soghigian Feb 11, 2025
d44c652
refactor: have acr e2e consume from environment vars stored in azureEnv
Bryce-Soghigian Feb 11, 2025
e2f565c
fix: removing import
Bryce-Soghigian Feb 11, 2025
6cf9f6f
refactor: renaming CLUSTER_NAME to match all other variables
Bryce-Soghigian Feb 11, 2025
8e7abd4
refactor: moving env vars outside of az-e2etest since e2etest now hol…
Bryce-Soghigian Feb 11, 2025
f566bef
ci: make presubmit
Bryce-Soghigian Feb 11, 2025
0476623
Merge branch 'main' into bsoghigian/e2e/nic-gc
Bryce-Soghigian Feb 13, 2025
e40d50e
refactor: removing readme
Bryce-Soghigian Feb 13, 2025
517e6a3
refactor: using lo.Must() + os.LookupEnv
Bryce-Soghigian Feb 13, 2025
513a708
Merge branch 'main' into bsoghigian/e2e/nic-gc
Bryce-Soghigian Feb 15, 2025
770b2bf
revert: go.mod go version change
Bryce-Soghigian Feb 18, 2025
8b63d12
refactor: moving azuregc suite to the nodeclaim suite
Bryce-Soghigian Feb 18, 2025
8f84f0c
Merge branch 'main' into bsoghigian/e2e/nic-gc
Bryce-Soghigian Feb 18, 2025
7bb2017
Update Makefile-az.mk
Bryce-Soghigian Feb 19, 2025
c3ef3db
refactor: using env block instead
Bryce-Soghigian Feb 19, 2025
920b172
fix: comment
Bryce-Soghigian Feb 19, 2025
602f72e
Merge branch 'main' into bsoghigian/e2e/nic-gc
tallaxes Feb 21, 2025
12 changes: 10 additions & 2 deletions .github/workflows/e2e.yaml
@@ -156,9 +156,17 @@ jobs:
location: ${{ inputs.location }}
- name: run the ${{ inputs.suite }} test suite
if: inputs.suite != 'Nonbehavioral'
+ env:
+ AZURE_CLUSTER_NAME: ${{ env.CLUSTER_NAME }}
+ AZURE_RESOURCE_GROUP: ${{ env.RG_NAME }}
+ AZURE_LOCATION: ${{ inputs.location }}
+ AZURE_SUBSCRIPTION_ID: ${{ secrets.E2E_SUBSCRIPTION_ID }}
+ AZURE_ACR_NAME: ${{ env.ACR_NAME }}
+ TEST_SUITE: ${{ inputs.suite }}
+ GIT_REF: ${{ github.sha }}
run: |
- AZURE_CLUSTER_NAME=${{ env.CLUSTER_NAME }} AZURE_RESOURCE_GROUP=${{ env.RG_NAME }} make az-creds
- CLUSTER_NAME=${{ env.CLUSTER_NAME }} AZURE_ACR_NAME=${{ env.ACR_NAME}} TEST_SUITE="${{ inputs.suite }}" GIT_REF="$(git rev-parse HEAD)" make e2etests
+ make az-creds
+ make e2etests
- name: dump logs on failure
uses: ./.github/actions/e2e/dump-logs
if: failure() || cancelled()
2 changes: 1 addition & 1 deletion Makefile
@@ -49,7 +49,7 @@ e2etests: ## Run the e2e suite against your local cluster
# -count 1: prevents caching
# -timeout: If a test binary runs longer than TEST_TIMEOUT, panic
# -v: verbose output
- cd test && CLUSTER_NAME=${CLUSTER_NAME} AZURE_ACR_NAME=${AZURE_ACR_NAME} go test \
+ cd test && AZURE_CLUSTER_NAME=${AZURE_CLUSTER_NAME} AZURE_ACR_NAME=${AZURE_ACR_NAME} AZURE_RESOURCE_GROUP=${AZURE_RESOURCE_GROUP} AZURE_SUBSCRIPTION_ID=${AZURE_SUBSCRIPTION_ID} AZURE_LOCATION=${AZURE_LOCATION} go test \
-p 1 \
-count 1 \
-timeout ${TEST_TIMEOUT} \
37 changes: 31 additions & 6 deletions test/pkg/environment/azure/environment.go
@@ -17,12 +17,16 @@ limitations under the License.
package azure

import (
"fmt"
"os"
"testing"

"github.com/samber/lo"
v1 "k8s.io/api/core/v1"
karpv1 "sigs.k8s.io/karpenter/pkg/apis/v1"

"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork"
"github.com/Azure/karpenter-provider-azure/pkg/apis/v1alpha2"
"github.com/Azure/karpenter-provider-azure/pkg/test"
"github.com/Azure/karpenter-provider-azure/test/pkg/environment/common"
@@ -40,16 +44,37 @@ const (

type Environment struct {
*common.Environment
- Region string
+
+ NodeResourceGroup string
+ Region string
+ SubscriptionID string
+ VNETResourceGroup string
+ ACRName string
+ ClusterName string
+ ClusterResourceGroup string
+
+ VNETClient *armnetwork.VirtualNetworksClient
+ InterfacesClient *armnetwork.InterfacesClient
}

func NewEnvironment(t *testing.T) *Environment {
- env := common.NewEnvironment(t)
-
- return &Environment{
- Region: "westus2",
- Environment: env,
+ azureEnv := &Environment{
+ Environment: common.NewEnvironment(t),
+ SubscriptionID: lo.Must(os.LookupEnv("AZURE_SUBSCRIPTION_ID")),
+ ClusterName: lo.Must(os.LookupEnv("AZURE_CLUSTER_NAME")),
+ ClusterResourceGroup: lo.Must(os.LookupEnv("AZURE_RESOURCE_GROUP")),
+ ACRName: lo.Must(os.LookupEnv("AZURE_ACR_NAME")),
+ Region: lo.Ternary(os.Getenv("AZURE_LOCATION") == "", "westus2", os.Getenv("AZURE_LOCATION")),
}
+
+ defaultNodeRG := fmt.Sprintf("MC_%s_%s_%s", azureEnv.ClusterResourceGroup, azureEnv.ClusterName, azureEnv.Region)
+ azureEnv.VNETResourceGroup = lo.Ternary(os.Getenv("VNET_RESOURCE_GROUP") == "", defaultNodeRG, os.Getenv("VNET_RESOURCE_GROUP"))
+ azureEnv.NodeResourceGroup = defaultNodeRG
+
+ cred := lo.Must(azidentity.NewDefaultAzureCredential(nil))
+ azureEnv.VNETClient = lo.Must(armnetwork.NewVirtualNetworksClient(azureEnv.SubscriptionID, cred, nil))
+ azureEnv.InterfacesClient = lo.Must(armnetwork.NewInterfacesClient(azureEnv.SubscriptionID, cred, nil))
+ return azureEnv
}

func (env *Environment) DefaultAKSNodeClass() *v1alpha2.AKSNodeClass {
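
Note on the NewEnvironment change above: required variables go through lo.Must(os.LookupEnv(...)), which panics when a variable is not set, while AZURE_LOCATION and VNET_RESOURCE_GROUP fall back to defaults when unset or empty. The following is a minimal standard-library sketch of that resolution and of the MC_ node resource group naming. The helper names mustEnv, envOr and nodeResourceGroup and the example values are hypothetical; they are not part of this PR.

```go
package main

import (
	"fmt"
	"os"
)

// mustEnv is a hypothetical stand-in for lo.Must(os.LookupEnv(key)).
// It panics when the variable is not set. Like os.LookupEnv, it accepts a
// variable that is set to the empty string.
func mustEnv(key string) string {
	val, ok := os.LookupEnv(key)
	if !ok {
		panic(fmt.Sprintf("required environment variable %s is not set", key))
	}
	return val
}

// envOr is a hypothetical stand-in for the lo.Ternary(os.Getenv(key) == "", ...)
// pattern: it returns fallback when the variable is unset or empty.
func envOr(key, fallback string) string {
	if val := os.Getenv(key); val != "" {
		return val
	}
	return fallback
}

// nodeResourceGroup builds the default node resource group name using the
// same format string as the diff: MC_<resource group>_<cluster>_<region>.
func nodeResourceGroup(clusterRG, clusterName, region string) string {
	return fmt.Sprintf("MC_%s_%s_%s", clusterRG, clusterName, region)
}

func main() {
	// Example values only; a real run would take these from the workflow env block.
	os.Setenv("AZURE_RESOURCE_GROUP", "example-rg")
	os.Setenv("AZURE_CLUSTER_NAME", "example-cluster")
	os.Unsetenv("AZURE_LOCATION")

	region := envOr("AZURE_LOCATION", "westus2")
	rg := nodeResourceGroup(mustEnv("AZURE_RESOURCE_GROUP"), mustEnv("AZURE_CLUSTER_NAME"), region)
	fmt.Println(rg) // prints "MC_example-rg_example-cluster_westus2"
}
```

Failing fast on a missing required variable makes a misconfigured run stop during suite setup rather than partway through a test.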
84 changes: 84 additions & 0 deletions test/pkg/environment/azure/expectations.go
@@ -0,0 +1,84 @@
/*
Portions Copyright (c) Microsoft Corporation.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package azure

import (
"context"
"fmt"
"strings"
"time"

"github.com/samber/lo"

. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
karpv1 "sigs.k8s.io/karpenter/pkg/apis/v1"

"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork"
)

func (env *Environment) EventuallyExpectKarpenterNicsToBeDeleted() {
GinkgoHelper()
Eventually(func() bool {
pager := env.InterfacesClient.NewListPager(env.NodeResourceGroup, nil)
for pager.More() {
resp, err := pager.NextPage(env.Context)
if err != nil {
return false
}

for _, nic := range resp.Value {
if nic.Tags != nil {
if _, exists := nic.Tags[strings.ReplaceAll(karpv1.NodePoolLabelKey, "/", "_")]; exists {
return false
}
}
}
}
return true
}).WithTimeout(10*time.Minute).WithPolling(10*time.Second).Should(BeTrue(), "Expected all orphan NICs to be deleted")
}

func (env *Environment) ExpectCreatedInterface(networkInterface armnetwork.Interface) {
GinkgoHelper()
poller, err := env.InterfacesClient.BeginCreateOrUpdate(env.Context, env.NodeResourceGroup, lo.FromPtr(networkInterface.Name), networkInterface, nil)
Expect(err).ToNot(HaveOccurred())
_, err = poller.PollUntilDone(env.Context, nil)
Expect(err).ToNot(HaveOccurred())
}

func (env *Environment) GetClusterSubnet() *armnetwork.Subnet {
GinkgoHelper()
vnet, err := firstVNETInRG(env.Context, env.VNETClient, env.VNETResourceGroup)
Expect(err).ToNot(HaveOccurred())
return vnet.Properties.Subnets[0]
}

// firstVNETInRG returns the first VNet found in the resource group. It works for a managed VNet but has not been tested with a custom VNet.
func firstVNETInRG(ctx context.Context, client *armnetwork.VirtualNetworksClient, vnetRG string) (*armnetwork.VirtualNetwork, error) {
pager := client.NewListPager(vnetRG, nil)
for pager.More() {
resp, err := pager.NextPage(ctx)
if err != nil {
return nil, fmt.Errorf("failed to list virtual networks: %w", err)
}
if len(resp.VirtualNetworkListResult.Value) > 0 {
return resp.VirtualNetworkListResult.Value[0], nil
}
}
return nil, fmt.Errorf("no virtual networks found in resource group: %s", vnetRG)
}
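
Note on EventuallyExpectKarpenterNicsToBeDeleted above: a NIC counts as Karpenter-owned when its tags contain the nodepool label key with every "/" replaced by "_". A small sketch of that predicate follows. The constant value "karpenter.sh/nodepool" is an assumption standing in for karpv1.NodePoolLabelKey, and the function names nodePoolTagKey and isKarpenterNIC are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// nodePoolLabelKey is an assumed value standing in for karpv1.NodePoolLabelKey.
const nodePoolLabelKey = "karpenter.sh/nodepool"

// nodePoolTagKey converts the label key into the tag key looked up on each NIC
// by replacing every "/" with "_".
func nodePoolTagKey() string {
	return strings.ReplaceAll(nodePoolLabelKey, "/", "_")
}

// isKarpenterNIC reports whether a NIC's tags contain the nodepool tag key.
// Reading from a nil map is safe in Go, so no separate nil check is needed here.
func isKarpenterNIC(tags map[string]*string) bool {
	_, exists := tags[nodePoolTagKey()]
	return exists
}

func main() {
	pool := "default"
	fmt.Println(nodePoolTagKey())                                            // prints "karpenter.sh_nodepool"
	fmt.Println(isKarpenterNIC(map[string]*string{nodePoolTagKey(): &pool})) // prints "true"
	fmt.Println(isKarpenterNIC(nil))                                         // prints "false"
}
```

Because the lookup only tests for the key, the helper in the diff waits for every tagged NIC in the node resource group to disappear, not only the one the test created.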
5 changes: 1 addition & 4 deletions test/suites/acr/suite_test.go
@@ -18,7 +18,6 @@ package acr

import (
"fmt"
- "os"
"testing"
"time"

@@ -43,9 +42,7 @@ func TestAcr(t *testing.T) {
RegisterFailHandler(Fail)
BeforeSuite(func() {
env = azure.NewEnvironment(t)
- acrName := os.Getenv("AZURE_ACR_NAME")
- Expect(acrName).NotTo(BeEmpty(), "AZURE_ACR_NAME must be set for the acr test suite")
- pauseImage = fmt.Sprintf("%s.azurecr.io/pause:3.6", acrName)
+ pauseImage = fmt.Sprintf("%s.azurecr.io/pause:3.6", env.ACRName)
})
RunSpecs(t, "Acr")
}
48 changes: 48 additions & 0 deletions test/suites/nodeclaim/azuregarbagecollection_test.go
@@ -0,0 +1,48 @@
/*
Portions Copyright (c) Microsoft Corporation.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package nodeclaim_test

import (
. "github.com/onsi/ginkgo/v2"
"github.com/samber/lo"

"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork"
azkarptest "github.com/Azure/karpenter-provider-azure/pkg/test"
)

var _ = Describe("gc", func() {
It("should garbage collect network interfaces created by karpenter", func() {
env.ExpectCreatedInterface(armnetwork.Interface{
Name: lo.ToPtr("orphan-nic"),
Location: lo.ToPtr(env.Region),
Tags: azkarptest.ManagedTags("default"),
Properties: &armnetwork.InterfacePropertiesFormat{
IPConfigurations: []*armnetwork.InterfaceIPConfiguration{
{
Name: lo.ToPtr("ip-config"),
Properties: &armnetwork.InterfaceIPConfigurationPropertiesFormat{
Primary: lo.ToPtr(true),
Subnet: env.GetClusterSubnet(),
PrivateIPAllocationMethod: lo.ToPtr(armnetwork.IPAllocationMethodDynamic),
},
},
},
},
})
env.EventuallyExpectKarpenterNicsToBeDeleted()
})
})
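
Note on the test above: it creates a tagged NIC with no owning node and then polls until the garbage collection controller has removed it. The polling idea can be sketched with the standard library alone. This is not how Gomega implements Eventually; pollUntil and simulateCleanup are hypothetical names, and the cleanup is simulated with a counter instead of an Azure call.

```go
package main

import (
	"fmt"
	"time"
)

// pollUntil calls cond immediately and then once per interval. It returns true
// as soon as cond does, or false when the next attempt would fall past timeout.
func pollUntil(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().Add(interval).After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

// simulateCleanup pretends the orphan NIC disappears on the goneAfter-th check
// and reports whether polling succeeded and how many checks it took.
func simulateCleanup(goneAfter int) (ok bool, checks int) {
	ok = pollUntil(time.Second, time.Millisecond, func() bool {
		checks++
		return checks >= goneAfter
	})
	return ok, checks
}

func main() {
	ok, checks := simulateCleanup(3)
	fmt.Println(ok, checks) // prints "true 3"
}
```

In the real helper the condition lists NICs in the node resource group, and the timeout and polling interval are the 10 minute and 10 second values shown in expectations.go.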