update prerender to include a content manifest csv output #2268
Merged
Changes from 8 commits
69 commits
2411317 update prerender to include a content manifest csv output (TomWoodward)
e272313 add toc node types (TomWoodward)
a97648c Add lanugage and book slug (TomWoodward)
820f5ba :shirt: (TomWoodward)
6c9fc52 remove debug (TomWoodward)
647bf66 :shirt: (TomWoodward)
c2ed25d Merge branch 'main' into content-manifest (staxly[bot])
0bf7852 :pencil: (TomWoodward)
a17c394 Merge branch 'main' into content-manifest (staxly[bot])
49e3edb Merge branch 'main' into content-manifest (staxly[bot])
4d7bd20 Merge branch 'main' into content-manifest (staxly[bot])
5239c74 Merge branch 'main' into content-manifest (staxly[bot])
ea47213 Merge branch 'main' into content-manifest (staxly[bot])
5b7428a Merge branch 'main' into content-manifest (staxly[bot])
f358a63 Merge branch 'main' into content-manifest (staxly[bot])
75a765d Merge branch 'main' into content-manifest (staxly[bot])
1bcb5cf Merge branch 'main' into content-manifest (staxly[bot])
c501953 Merge branch 'main' into content-manifest (staxly[bot])
0332826 Merge branch 'main' into content-manifest (staxly[bot])
0b2cb94 Merge branch 'main' into content-manifest (staxly[bot])
22b7cc1 Merge branch 'main' into content-manifest (staxly[bot])
e91bf7c Merge branch 'main' into content-manifest (staxly[bot])
7a1eee5 Merge branch 'main' into content-manifest (staxly[bot])
975fcc6 Merge branch 'main' into content-manifest (staxly[bot])
8b71007 Merge branch 'main' into content-manifest (staxly[bot])
a1aa4ab Merge branch 'main' into content-manifest (staxly[bot])
4a02768 Merge branch 'main' into content-manifest (staxly[bot])
6bbb730 Merge branch 'main' into content-manifest (staxly[bot])
24a6423 Merge branch 'main' into content-manifest (staxly[bot])
f8c78fe Merge branch 'main' into content-manifest (staxly[bot])
760eed6 Merge branch 'main' into content-manifest (staxly[bot])
6c8c6aa Merge branch 'main' into content-manifest (staxly[bot])
9744545 Merge branch 'main' into content-manifest (staxly[bot])
d4fd4db Merge branch 'main' into content-manifest (staxly[bot])
92b85c9 Merge branch 'main' into content-manifest (staxly[bot])
af38c32 Merge branch 'main' into content-manifest (staxly[bot])
7901fd4 Merge branch 'main' into content-manifest (staxly[bot])
6211457 Merge branch 'main' into content-manifest (staxly[bot])
2d49d31 Merge branch 'main' into content-manifest (staxly[bot])
b087990 Merge branch 'main' into content-manifest (staxly[bot])
3b6b4e0 Merge branch 'main' into content-manifest (staxly[bot])
535ac31 Merge branch 'main' into content-manifest (staxly[bot])
619f502 Merge branch 'main' into content-manifest (staxly[bot])
c3e26d5 Merge branch 'main' into content-manifest (staxly[bot])
693c49c Merge branch 'main' into content-manifest (staxly[bot])
38ed8ff Merge branch 'main' into content-manifest (staxly[bot])
a24dcc2 Merge branch 'main' into content-manifest (staxly[bot])
d47e334 Merge branch 'main' into content-manifest (staxly[bot])
4d4614d Merge branch 'main' into content-manifest (staxly[bot])
b136ea3 Merge branch 'main' into content-manifest (staxly[bot])
e846e73 Merge branch 'main' into content-manifest (staxly[bot])
f11f645 Merge branch 'main' into content-manifest (staxly[bot])
14c4356 Merge branch 'main' into content-manifest (staxly[bot])
707130a Merge branch 'main' into content-manifest (staxly[bot])
1d8d938 Merge branch 'main' into content-manifest (staxly[bot])
036c699 Merge branch 'main' into content-manifest (staxly[bot])
2c5eef7 Merge branch 'main' into content-manifest (staxly[bot])
706cca3 Merge branch 'main' into content-manifest (staxly[bot])
61de4b4 Merge branch 'main' into content-manifest (staxly[bot])
12c8787 Merge branch 'main' into content-manifest (staxly[bot])
be0f5b6 Merge branch 'main' into content-manifest (staxly[bot])
a0165e8 Merge branch 'main' into content-manifest (staxly[bot])
e5b8aa5 Merge branch 'main' into content-manifest (staxly[bot])
575b733 Merge branch 'main' into content-manifest (staxly[bot])
ad7ef44 Merge branch 'main' into content-manifest (staxly[bot])
522838b Merge branch 'main' into content-manifest (staxly[bot])
1bd512d Merge branch 'main' into content-manifest (staxly[bot])
9214479 Merge branch 'main' into content-manifest (staxly[bot])
2d69aa4 fix import (TomWoodward)
@@ -0,0 +1,60 @@
+import { BookWithOSWebData, ArchiveTreeNode, ArchiveTree } from '../../src/app/content/types';
+import { content } from '../../src/app/content/routes';
+import { writeAssetFile } from './fileUtils';
+import { stripIdVersion } from '../../src/app/content/utils/idUtils';
+import { splitTitleParts } from '../../src/app/content/utils/archiveTreeUtils';
+
+const quoteValue = (value?: string) => value ? `"${value.replace(/"/g, '""')}"` : '""';
+
+export const renderAndSaveContentManifest = async(
+  saveFile: (path: string, contents: string) => Promise<unknown>,
+  books: BookWithOSWebData[]
+) => {
+
+  const rows = books.map(book => getContentsRows(book, book.tree))
+    .reduce((result, item) => ([...result, ...item]), [] as string[][]);
+
+  const manifestText = [
+    ['id', 'title', 'text title', 'language', 'slug', 'url', 'toc type', 'toc target type'],
+    ...rows,
+  ].map(row => row.map(quoteValue).join(',')).join('\n');
+
+  await saveFile('/rex/content-metadata.csv', manifestText);
+};
+
+function getContentsRows(
+  book: BookWithOSWebData,
+  node: ArchiveTree | ArchiveTreeNode,
+  chapterNumber?: string
+): string[][] {
+  const {title, toc_target_type} = node;
+  const [titleNumber, titleString] = splitTitleParts(node.title);
+  const textTitle = `${titleNumber || chapterNumber || ''} ${titleString}`.replace(/\s+/, ' ').trim();
+  const id = stripIdVersion(node.id);
+  const tocType = node.toc_type ?? (id === book.id ? 'book' : '');
+
+  const urlParams = tocType === 'book'
+    ? [node.slug, '']
+    : 'contents' in node
+      ? ['', '']
+      : [node.slug, content.getUrl({book: {slug: book.slug}, page: {slug: node.slug}})];
+
+  const contents = 'contents' in node
+    ? node.contents.map(child => getContentsRows(book, child, titleNumber || chapterNumber))
+      .reduce((result, item) => ([...result, ...item]), [] as string[][])
+    : [];
+
+  return [
+    [stripIdVersion(id), title, textTitle, book.language, ...urlParams, tocType, toc_target_type ?? ''],
+    ...contents,
+  ];
+}
+
+
+// simple helper for local
+const writeAssetFileAsync = async(filepath: string, contents: string) => {
+  return writeAssetFile(filepath, contents);
+};
+export const renderContentManifest = async(books: BookWithOSWebData[]) => {
+  return renderAndSaveContentManifest(writeAssetFileAsync, books);
+};
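The quoting step in the new file can be tried in isolation. The following is a minimal sketch, assuming only the `quoteValue` helper shown in the diff; `toCsv` and the sample rows are hypothetical names and data introduced for illustration. Every field is wrapped in double quotes and embedded quotes are doubled, which is the escaping convention described in RFC 4180.

```typescript
// Quote one CSV field: wrap it in double quotes and double any embedded quotes.
// Empty or missing values become an empty quoted field.
const quoteValue = (value?: string): string =>
  value ? `"${value.replace(/"/g, '""')}"` : '""';

// Hypothetical helper: join rows of fields into CSV text, one record per line.
const toCsv = (rows: string[][]): string =>
  rows.map((row) => row.map(quoteValue).join(',')).join('\n');

// Hypothetical sample rows, not taken from a real book.
const sample = toCsv([
  ['id', 'title'],
  ['abc123', 'Say "hello", world'],
  ['def456', ''],
]);

console.log(sample);
// "id","title"
// "abc123","Say ""hello"", world"
// "def456",""
```

Quoting every field, rather than only those containing commas or quotes, keeps the writer simple at the cost of a slightly larger file.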
@@ -43,11 +43,13 @@ import { getBooksConfigSync } from '../../src/gateways/createBookConfigLoader';
 import createOSWebLoader from '../../src/gateways/createOSWebLoader';
 import { readFile } from '../../src/helpers/fileUtils';
 import { globalMinuteCounter, prepareBookPages } from './contentPages';
-import { SerializedBookMatch, SerializedPageMatch } from './contentRoutes';
+import { SerializedPageMatch } from './contentRoutes';
 import createRedirects from './createRedirects';
 import './logUnhandledRejectionsAndExit';
 import renderManifest from './renderManifest';
-import { SitemapPayload } from './sitemap';
+import { SitemapPayload, renderAndSaveSitemapIndex } from './sitemap';
+import { writeS3ReleaseXmlFile } from './fileUtils';
+import { renderAndSaveContentManfiest } from './contentManifest';
[Review comment on the line above: Import didn't get renamed]

 const {
   ARCHIVE_URL,
@@ -86,7 +88,6 @@ const sqsClient = new SQSClient({ region: WORK_REGION });

 type PageTask = { payload: SerializedPageMatch, type: 'page' };
 type SitemapTask = { payload: SitemapPayload, type: 'sitemap' };
-type SitemapIndexTask = { payload: SerializedBookMatch[], type: 'sitemapIndex' };

 const booksConfig = getBooksConfigSync();
 const archiveLoader = createArchiveLoader({
@@ -288,8 +289,7 @@ async function getQueueUrls(workersStackName: string) {
 class Stats {
   public pages = 0;
   public sitemaps = 0;
-  public sitemapIndexes = 0;
-  get total() { return this.pages + this.sitemaps + this.sitemapIndexes; }
+  get total() { return this.pages + this.sitemaps; }
 }

 function makePrepareAndQueueBook(workQueueUrl: string, stats: Stats) {
@@ -347,11 +347,7 @@ function makePrepareAndQueueBook(workQueueUrl: string, stats: Stats) {

     console.log(`[${book.title}] Sitemap queued`);

-    // Used in the sitemap index
-    return {
-      params: { book: { slug: book.slug } },
-      state: { bookUid: book.id, bookVersion: book.version },
-    };
+    return book;
   };
 }

@@ -371,14 +367,8 @@ async function queueWork(workQueueUrl: string) {
     `All ${stats.pages} page prerendering jobs and all ${stats.sitemaps} sitemap jobs queued`
   );

-  await sendWithRetries(sqsClient, new SendMessageCommand({
-    MessageBody: JSON.stringify({ payload: books, type: 'sitemapIndex' } as SitemapIndexTask),
-    QueueUrl: workQueueUrl,
-  }));
-
-  stats.sitemapIndexes = 1;
-
-  console.log('1 sitemap index job queued');
+  renderAndSaveSitemapIndex(writeS3ReleaseXmlFile, books);
+  renderAndSaveContentManfiest(writeS3ReleaseXmlFile, books);

   return stats;
 }
@@ -463,8 +453,8 @@ async function finishRendering(stats: Stats) {
   const elapsedMinutes = globalMinuteCounter();

   console.log(
-    `Prerender complete in ${elapsedMinutes} minutes. Rendered ${stats.pages} pages, ${
-      stats.sitemaps} sitemaps and ${stats.sitemapIndexes} sitemap index. ${
+    `Prerender complete in ${elapsedMinutes} minutes. Rendered ${stats.pages} pages, and ${
+      stats.sitemaps} sitemaps. ${
       stats.total / elapsedMinutes}ppm`
   );
 }
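Both render-and-save helpers in this diff take the file writer as their first argument, and `queueWork` calls them without `await`. The sketch below illustrates that injection pattern with an in-memory writer, and one possible way to wait for both writes if the caller needs them finished before it returns. `saveIndex`, `saveManifest`, `saveToMemory`, `writeReleaseFiles`, and the `/example/...` paths are all hypothetical stand-ins, not the project's functions.

```typescript
// Same shape as the saveFile parameter in the diff.
type SaveFile = (path: string, contents: string) => Promise<unknown>;

// Hypothetical stand-ins for the two render-and-save helpers.
const saveIndex = async (saveFile: SaveFile, slugs: string[]): Promise<void> => {
  await saveFile('/example/sitemap-index.txt', slugs.join('\n'));
};
const saveManifest = async (saveFile: SaveFile, slugs: string[]): Promise<void> => {
  await saveFile('/example/manifest.csv', slugs.map((slug) => `"${slug}"`).join('\n'));
};

// In-memory writer: records what would have been written, keyed by path.
const written: Record<string, string> = {};
const saveToMemory: SaveFile = async (path, contents) => {
  written[path] = contents;
};

// Run both writes concurrently and resolve only when both have finished.
async function writeReleaseFiles(slugs: string[]): Promise<void> {
  await Promise.all([
    saveIndex(saveToMemory, slugs),
    saveManifest(saveToMemory, slugs),
  ]);
}

const done = writeReleaseFiles(['book-one', 'book-two']).then(() => {
  console.log(Object.keys(written).sort());
});
```

Because the writer is a parameter, the same rendering code can target S3, the local filesystem, or a test double like the one above without modification.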
Review comment: I'm surprised the real (html) title gets used, but maybe the textTitle is used as the display value in the reports?
Reply: Yeah, I doubt that the HTML will be used by anything, but I threw it in there. The text title is intended to be used by reporting; I added the context number in there for the EOC pages.
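The "context number" behaviour mentioned in the reply can be illustrated with a small sketch: a node without its own number inherits the nearest ancestor's number in its text title. This is only an approximation under stated assumptions. `TocNode`, `splitNumberAndTitle`, `textTitles`, and the sample chapter are hypothetical, and `splitNumberAndTitle` works on plain-text titles, whereas the project's `splitTitleParts` is applied to the real (HTML) titles.

```typescript
// Hypothetical, simplified tree node: a title plus optional children.
interface TocNode {
  title: string;
  contents?: TocNode[];
}

// Hypothetical stand-in for splitTitleParts, for plain-text titles only:
// "2.1 Displacement" -> ["2.1", "Displacement"]; "Key Terms" -> [null, "Key Terms"].
const splitNumberAndTitle = (title: string): [string | null, string] => {
  const match = /^(\d+(?:\.\d+)*)\s+(.*)$/.exec(title);
  return match ? [match[1], match[2]] : [null, title];
};

// Build a text title for every node, depth first. A node with no number of
// its own is prefixed with the number inherited from its nearest ancestor.
const textTitles = (node: TocNode, inherited?: string): string[] => {
  const [ownNumber, name] = splitNumberAndTitle(node.title);
  const contextNumber = ownNumber || inherited;
  // The global flag collapses every whitespace run; without it,
  // String.prototype.replace only touches the first match.
  const label = `${contextNumber || ''} ${name}`.replace(/\s+/g, ' ').trim();
  const children = (node.contents || [])
    .map((child) => textTitles(child, contextNumber))
    .reduce((result, item) => result.concat(item), [] as string[]);
  return [label].concat(children);
};

// Hypothetical sample chapter with one numbered section and one unnumbered page.
const sampleChapter: TocNode = {
  title: '2 Kinematics',
  contents: [{ title: '2.1 Displacement' }, { title: 'Key Terms' }],
};

console.log(textTitles(sampleChapter));
// [ '2 Kinematics', '2.1 Displacement', '2 Key Terms' ]
```

In a report, the inherited prefix lets an unnumbered end-of-chapter page such as "Key Terms" be told apart from the same page in another chapter.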