Task/raise alarm for failed canary run/cdd 2109 #1069

Open
wants to merge 61 commits into
base: main
61 commits
a1557f8
Bump nodejs puppeteer `runtime_version`
A-Ashiq Jul 30, 2024
d285678
Provide env vars to canary runtime from module variables
A-Ashiq Jul 30, 2024
aa38b4a
Build zip file containing single source code file for canary script
A-Ashiq Jul 30, 2024
462b9ee
Add `timeout_in_seconds` variable to `cloud-watch-canary` module
A-Ashiq Jul 30, 2024
252eb68
Implement canary to traverse frontend pages c/w screenshots
A-Ashiq Jul 30, 2024
ad227bc
Set retention periods for canary runs
A-Ashiq Jul 30, 2024
6aa07c6
Bump `aws` provider from `v5.58.0` to `v5.60.0`
A-Ashiq Aug 5, 2024
aaefdd2
Add `slack_token` and `slack_channel_id` to slack secret
A-Ashiq Aug 5, 2024
a0cc09b
Add dedicated SNS topic to `cloud-watch-canary`
A-Ashiq Aug 5, 2024
51e9f64
Bump `archive` provider
A-Ashiq Aug 5, 2024
ffd5b64
Implement required outputs from `cloud-watch-canary` module
A-Ashiq Aug 5, 2024
9dffaf7
Add requirement for `lambda_function_notification_arn` variable to tr…
A-Ashiq Aug 5, 2024
ad0fd5c
Point `script_path` at broken links canary script
A-Ashiq Aug 5, 2024
5e930d3
Formatting
A-Ashiq Aug 5, 2024
49779f2
Send lambda function ARN to `cloud-watch-canary` module so notificati…
A-Ashiq Aug 5, 2024
be6f121
Add canary script used to crawl site and take snapshots of broken pages
A-Ashiq Aug 5, 2024
f0b024e
Add corresponding Cloudwatch alarm to be raised when canary run fails
A-Ashiq Aug 5, 2024
648ef3e
Delete redundant `canary-front-end-screenshots` canary script source …
A-Ashiq Aug 5, 2024
213944c
Add source code for Lambda function used to send notifications after …
A-Ashiq Aug 5, 2024
50ae722
Add packaging for `lambda-canary-notification`
A-Ashiq Aug 5, 2024
4880db5
Remove successful canary run reports after 1 day
A-Ashiq Aug 5, 2024
1aaa607
Keep failed canary reports for 2 weeks
A-Ashiq Aug 5, 2024
36c70a2
Fix issues around downloading snapshots
A-Ashiq Aug 5, 2024
5809baa
Rename function
A-Ashiq Aug 5, 2024
5cc4671
Filter for failed snapshots only
A-Ashiq Aug 5, 2024
13ead0c
Extract and return broken links as bullet point list in post to Slack
A-Ashiq Aug 8, 2024
2985011
Pass `brokenLinks` array to `buildSlackPostPayload()`
A-Ashiq Aug 8, 2024
f2d5219
Extract synthetics report and broken links report from s3
A-Ashiq Aug 8, 2024
23354f3
Add `keyToSearchFor` param to docstring
A-Ashiq Aug 8, 2024
e42245b
Remove unused import
A-Ashiq Aug 8, 2024
17236a5
Pass `keyToSearchFor` down into `extractReportKey()`
A-Ashiq Aug 8, 2024
95ec428
Extract synthetics report and broken links report to build slack post…
A-Ashiq Aug 8, 2024
612eb89
Formatting
A-Ashiq Aug 8, 2024
077e670
Relaunch browser after 50 pages
A-Ashiq Aug 8, 2024
f8af73b
Remove unused `slack/webhook` dependency
A-Ashiq Aug 9, 2024
43508f4
Add `aws-sdk/client-secrets-manager` to dependencies
A-Ashiq Aug 9, 2024
74d9574
Rename arg
A-Ashiq Aug 9, 2024
66534dd
Inject dependencies into all functions which depend on side-effect-he…
A-Ashiq Aug 9, 2024
a07e3eb
Add unit tests
A-Ashiq Aug 9, 2024
3c9c36d
Remove unused functions from being exported
A-Ashiq Aug 9, 2024
cf9a3f3
Revoke security group rules for cloudwatch canary module security gro…
A-Ashiq Aug 9, 2024
4b38150
Provide lambda context to `handler()`
A-Ashiq Aug 12, 2024
c71580c
Pass `S3_CANARY_LOGS_BUCKET_NAME` downstream into callee functions
A-Ashiq Aug 12, 2024
e09d7eb
Update tests to reflect capturing of env var for `S3_CANARY_LOGS_BUCK…
A-Ashiq Aug 12, 2024
2bed52e
Extract script from `src/` directory, output to ignored `builds/` dir…
A-Ashiq Aug 12, 2024
e1b571c
Set analysed period of alarm to be `timeout_in_seconds` variable
A-Ashiq Aug 14, 2024
0817d3f
Set alarm to trigger when `SuccessPercent` dips below 100%
A-Ashiq Aug 16, 2024
f8dae0b
Set notification lambda to consume event from eventbridge
A-Ashiq Aug 19, 2024
ce2457d
Set notification lambda to consume event from eventbridge
A-Ashiq Aug 19, 2024
37cfee4
Trigger for failed canary runs only
A-Ashiq Aug 19, 2024
2bbde92
Add `aws-sdk/client-synthetics` package
A-Ashiq Aug 19, 2024
f52b087
Allow notification lambda to get the status of previous canary runs
A-Ashiq Aug 19, 2024
fc44528
Send notification if canary run state transitioned from PASSED -> FAI…
A-Ashiq Aug 19, 2024
7824bdc
Replace with HCL2 expression
A-Ashiq Aug 19, 2024
b7d31f2
Remove `"cloudwatch:PutMetricData"` policy for canary
A-Ashiq Aug 19, 2024
a726469
Delay next page for 500ms
A-Ashiq Aug 19, 2024
cafa6c5
Remove block to recursively find other links on page
A-Ashiq Aug 21, 2024
6db2ff8
Remove redundant `grabLinks()` function
A-Ashiq Aug 21, 2024
f49fcfd
Merge branch 'main' into task/raise-alarm-for-failed-canary-run/CDD-2109
A-Ashiq Oct 24, 2024
f50aa0a
Refresh lock file
A-Ashiq Oct 24, 2024
7d01ff7
Merge branch 'main' into task/raise-alarm-for-failed-canary-run/CDD-2109
A-Ashiq Oct 28, 2024
204 changes: 204 additions & 0 deletions src/canary-front-end-broken-links/index.js
@@ -0,0 +1,204 @@
const synthetics = require('Synthetics');
const log = require('SyntheticsLogger');
const BrokenLinkCheckerReport = require('BrokenLinkCheckerReport');
const SyntheticsLink = require('SyntheticsLink');
const syntheticsLogHelper = require('SyntheticsLogHelper');
const syntheticsConfiguration = synthetics.getConfiguration();


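// Resolves after `ms` milliseconds; used to pace requests between pages.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

// Extracts the contents of every <loc> element from the sitemap XML.
function extractUrlsFromSitemap(xml) {
    const urlRegex = /<loc>(.*?)<\/loc>/g;
    const urls = [];
    let match;

    while ((match = urlRegex.exec(xml)) !== null) {
        urls.push(match[1].trim());
    }
    return urls;
}

async function parseSitemap(url) {
    const response = await fetch(url);
    if (!response.ok) {
        throw new Error("Unexpected status " + response.status + " when fetching sitemap: " + url);
    }
    const xmlString = await response.text();
    return extractUrlsFromSitemap(xmlString);
}

async function fetchAndParseSitemap(url) {
    try {
        return await parseSitemap(url);
    } catch (error) {
        log.error("Error fetching or parsing XML:", error);
        // Rethrow so the canary fails with this error rather than continuing with `undefined`.
        throw error;
    }
}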

// Maximum number of links to follow. With `null`, no limit is applied.
const limit = null;

// Captures an annotated screenshot of the source page for each link followed on a page.
const captureSourcePageScreenshot = true;

// Captures a screenshot of the destination page after a link loads successfully.
const captureDestinationPageScreenshotOnSuccess = false;

// Captures a screenshot of the destination page for broken links only. Links which do not return a response have no destination screenshot.
const captureDestinationPageScreenshotOnFailure = true;

// Close and relaunch the browser after checking this many links. This clears the /tmp disk storage used by Chromium and launches a new browser for the next set of links.
// Increase or decrease based on the complexity of your website.
const numOfLinksToReLaunchBrowser = 50;

// Take synthetics screenshot
const takeScreenshot = async function (fileName, suffix) {
    try {
        return await synthetics.takeScreenshot(fileName, suffix);
    } catch (e) {
        synthetics.addExecutionError('Unable to capture screenshot.', e);
    }
}

// Get the fileName for the screenshot based on the URI
const getFileName = function (url, defaultName = 'loaded') {
    if (!url) return defaultName;

    const uri = new URL(url);
    const pathname = uri.pathname.replace(/\/$/, ''); // remove trailing '/'
    const fileName = pathname ? pathname.split('/').pop() : 'index';

    // Remove characters which can't be used in S3
    return fileName.replace(/[^a-zA-Z0-9-_.!*'()]+/g, '');
}

// Broken link checker blueprint just uses one page to test availability of several urls
// Reset the page in-between to force a network event in case of a single page app
const resetPage = async function (page) {
    try {
        await page.goto('about:blank', {waitUntil: ['load', 'networkidle0'], timeout: 30000});
    } catch (e) {
        synthetics.addExecutionError('Unable to open a blank page ', e);
    }
}

const webCrawlerBlueprint = async function () {
    const urls = await fetchAndParseSitemap(process.env.SITEMAP_URL);
    // Fail fast: with no URLs there is nothing to check, and the run should not pass silently.
    if (!Array.isArray(urls) || urls.length === 0) {
        throw new Error("No URLs could be read from the sitemap at: " + process.env.SITEMAP_URL);
    }
    const exploredUrls = urls.slice();
    let synLinks = [];
    let count = 0;

    let canaryError = null;
    let brokenLinkError = null;

    let brokenLinkCheckerReport = new BrokenLinkCheckerReport();

    syntheticsConfiguration.setConfig({
        includeRequestHeaders: true, // Enable if headers should be displayed in HAR
        includeResponseHeaders: true, // Enable if headers should be displayed in HAR
        restrictedHeaders: [], // Values of these headers will be redacted from logs and reports
        restrictedUrlParameters: [] // Values of these url parameters will be redacted from logs and reports
    });

    // Synthetics Puppeteer page instance
    let page = await synthetics.getPage();

    exploredUrls.forEach(url => {
        synLinks.push(new SyntheticsLink(url));
    });

    while (synLinks.length > 0) {
        // Pause between pages to avoid overloading the site.
        await delay(500);
        let link = synLinks.shift();
        let nav_url = link.getUrl();
        let sanitized_url = syntheticsLogHelper.getSanitizedUrl(nav_url);
        link.withUrl(sanitized_url);
        let fileName = getFileName(sanitized_url);
        let response = null;

        count++;

        log.info("Current count: " + count + " Checking URL: " + sanitized_url);

        if (count % numOfLinksToReLaunchBrowser === 0 && count !== limit) {
            log.info("Closing current browser and launching new");

            // Closes the browser and stops HAR logging.
            await synthetics.close();

            // Launches a new browser and starts HAR logging.
            await synthetics.launch();

            page = await synthetics.getPage();
        } else if (count !== 1) {
            await resetPage(page);
        }

        try {
            /* You can customize the wait condition here. For instance, using 'networkidle2' may be less restrictive.
                networkidle0: Navigation is successful when the page has had no network requests for half a second. This might never happen if the page is constantly loading multiple resources.
                networkidle2: Navigation is successful when the page has no more than 2 network requests for half a second.
                domcontentloaded: Fired as soon as the page DOM has been loaded, without waiting for resources to finish loading. If needed, add an explicit wait with await new Promise(r => setTimeout(r, milliseconds))
            */

            response = await page.goto(nav_url, {waitUntil: ['load', 'networkidle0'], timeout: 30000});
            if (!response) {
                brokenLinkError = "Failed to receive network response for url: " + sanitized_url;
                log.error(brokenLinkError);
                link = link.withFailureReason('Received null or undefined response.');
            }
        } catch (e) {
            brokenLinkError = "Failed to load url: " + sanitized_url + ". " + e;
            log.error(brokenLinkError);
            link = link.withFailureReason(e.toString());
        }

        if (response && response.status() && response.status() < 400) {
            link = link.withStatusCode(response.status()).withStatusText(response.statusText());
            if (captureDestinationPageScreenshotOnSuccess) {
                let screenshotResult = await takeScreenshot(fileName, 'succeeded');
                link.addScreenshotResult(screenshotResult);
            }
        } else if (response) { // Received 400s or 500s
            const statusString = "Status code: " + response.status() + " " + response.statusText();
            brokenLinkError = "Failed to load url: " + sanitized_url + ". " + statusString;
            log.info(brokenLinkError);

            link = link.withStatusCode(response.status()).withStatusText(response.statusText()).withFailureReason(statusString);

            if (captureDestinationPageScreenshotOnFailure) {
                let screenshotResult = await takeScreenshot(fileName, 'failed');
                link.addScreenshotResult(screenshotResult);
            }
        }

        try {
            // Adds this link to the broken link checker report. A link with status code >= 400 is considered broken. Use addLink(link, isBrokenLink) to override this default behavior.
            brokenLinkCheckerReport.addLink(link);
        } catch (e) {
            synthetics.addExecutionError('Unable to add link to broken link checker report.', e);
        }
    }

    try {
        synthetics.addReport(brokenLinkCheckerReport);
    } catch (e) {
        synthetics.addExecutionError('Unable to add broken link checker report.', e);
    }

    log.info("Total links checked: " + brokenLinkCheckerReport.getTotalLinksChecked());

    // Fail the canary if 1 or more broken links were found.
    if (brokenLinkCheckerReport.getTotalBrokenLinks() !== 0) {
        brokenLinkError = brokenLinkCheckerReport.getTotalBrokenLinks() + " broken link(s) detected. " + brokenLinkError;
        log.error(brokenLinkError);
        canaryError = canaryError ? (brokenLinkError + " " + canaryError) : brokenLinkError;
    }

    if (canaryError) {
        throw new Error(canaryError);
    }
};

exports.handler = async () => {
    return await webCrawlerBlueprint();
};
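
The two pure helpers in this file, `extractUrlsFromSitemap()` and `getFileName()`, do not depend on the Synthetics runtime, so they can be exercised locally with plain Node.js. The sketch below copies both functions from the diff and runs them against a made-up sitemap; the `example.com` URLs and the `sampleSitemap` string are assumptions for illustration only and are not taken from this PR.

```javascript
// Copied from the canary script so they can run outside the Synthetics runtime.
function extractUrlsFromSitemap(xml) {
    const urlRegex = /<loc>(.*?)<\/loc>/g;
    const urls = [];
    let match;

    while ((match = urlRegex.exec(xml)) !== null) {
        urls.push(match[1].trim());
    }
    return urls;
}

function getFileName(url, defaultName = 'loaded') {
    if (!url) return defaultName;

    const uri = new URL(url);
    const pathname = uri.pathname.replace(/\/$/, ''); // remove trailing '/'
    const fileName = pathname ? pathname.split('/').pop() : 'index';

    // Remove characters which can't be used in S3
    return fileName.replace(/[^a-zA-Z0-9-_.!*'()]+/g, '');
}

// Hypothetical sitemap content (assumption: not the real site's sitemap).
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/topics/covid-19/ </loc></url>
</urlset>`;

// Each <loc> value is trimmed, then mapped to the name used for its screenshot.
const urls = extractUrlsFromSitemap(sampleSitemap);
const fileNames = urls.map(url => getFileName(url));

console.log(urls);      // [ 'https://example.com/', 'https://example.com/topics/covid-19/' ]
console.log(fileNames); // [ 'index', 'covid-19' ]
```

One thing this makes visible: the regex uses `.`, which does not match newlines, so a `<loc>` element whose URL is wrapped onto its own line would be skipped. If the sitemap generator ever emits that layout, an XML parser would be a safer choice than the regex.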