Add Tutorial for Text-Mining Chinese #5809

Merged
merged 12 commits into from
Feb 28, 2025


shiltemann
Member

Tutorial by @Sch-Da, thanks a lot! Cool! (Also pinging @bgruening as an FYI.)

@shiltemann shiltemann requested a review from a team as a code owner February 28, 2025 11:49
@@ -0,0 +1 @@
{"a_galaxy_workflow": "true", "annotation": "", "comments": [{"child_steps": [11], "color": "red", "data": {"title": "Results sorted from the most frequently appearing characters to those that appear the least"}, "id": 8, "position": [1732.6, 163.4], "size": [220, 362], "type": "frame"}, {"child_steps": [0, 1], "color": "red", "data": {"title": "Text upload"}, "id": 0, "position": [0, 91], "size": [239, 440], "type": "frame"}, {"child_steps": [2, 3], "color": "red", "data": {"title": "Text cleaning and layout unification of the uploaded texts for easier comparison"}, "id": 1, "position": [273, 13], "size": [255, 509], "type": "frame"}, {"child_steps": [4, 5], "color": "red", "data": {"title": "Comparison of the two texts"}, "id": 2, "position": [678, 0], "size": [285, 587], "type": "frame"}, {"child_steps": [6], "color": "red", "data": {"title": "Extraction of lines censored with symbol \u00d7 and their counterparts in the second text"}, "id": 3, "position": [990.6, 79.6], "size": [262, 516], "type": "frame"}, {"child_steps": [7], "color": "red", "data": {"title": "Conversion step for further processing"}, "id": 4, "position": [1273, 161], "size": [226, 367], "type": "frame"}, {"child_steps": [8], "color": "red", "data": {"title": "Select only uncensored characters for visualisation"}, "id": 5, "position": [1278.4, 558.6], "size": [224, 261], "type": "frame"}, {"child_steps": [9], "color": "red", "data": {"title": "Summarising and quantifying the results. 
How often did what symbol appear?"}, "id": 6, "position": [1507, 163], "size": [211, 363], "type": "frame"}, {"child_steps": [10], "color": "red", "data": {"title": "Visualisation of results in a wordcloud"}, "id": 7, "position": [1593.2, 549.2], "size": [280, 433], "type": "frame"}], "creator": [{"class": "Person", "identifier": "0000-0001-9536-5587", "name": "Daniela Schneider"}], "format-version": "0.1", "license": "CC-BY-4.0", "name": "Text Mining Differences in Chinese Newspaper Articles", "report": {"markdown": "\n# Workflow Execution Report\n\n## Workflow Inputs\n```galaxy\ninvocation_inputs()\n```\n\n## Workflow Outputs\n```galaxy\ninvocation_outputs()\n```\n\n## Workflow\n```galaxy\nworkflow_display()\n```\n"}, "steps": {"0": {"annotation": "Upload the censored text containing replacement characters like \u2018\u00d7\u2019.", "content_id": null, "errors": null, "id": 0, "input_connections": {}, "inputs": [{"description": "Upload the censored text containing replacement characters like \u2018\u00d7\u2019.", "name": "Input censored text"}], "label": "Input censored text", "name": "Input dataset", "outputs": [], "position": {"left": 21, "top": 180.6051261035269}, "tool_id": null, "tool_state": "{\"optional\": false, \"tag\": null}", "tool_version": null, "type": "data_input", "uuid": "dac86ce1-b481-46e0-ae1a-9d1943f1a328", "when": null, "workflow_outputs": []}, "1": {"annotation": "Upload the uncensored text without replacement characters.", "content_id": null, "errors": null, "id": 1, "input_connections": {}, "inputs": [{"description": "Upload the uncensored text without replacement characters.", "name": "Input uncensored text "}], "label": "Input uncensored text ", "name": "Input dataset", "outputs": [], "position": {"left": 21.7166748046875, "top": 354.55513831055816}, "tool_id": null, "tool_state": "{\"optional\": false, \"tag\": null}", "tool_version": null, "type": "data_input", "uuid": "7b604b7b-b700-4472-b574-8ec55dc59ba2", "when": null, 
"workflow_outputs": []}, "2": {"annotation": "This step uses Regular Expressions to delete all empty spaces (\\s) and show only one character per line (\\1\\n).\n\nThe result is a cleaned and reformatted text showing only one character per line.", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/9.3+galaxy1", "errors": null, "id": 2, "input_connections": {"infile": {"id": 0, "output_name": "output"}}, "inputs": [], "label": "Preprocessing of Text one", "name": "Replace Text", "outputs": [{"name": "outfile", "type": "input"}], "position": {"left": 303.51666259765625, "top": 166.8018061034782}, "post_job_actions": {"RenameDatasetActionoutfile": {"action_arguments": {"newname": "Preprocessed Text one"}, "action_type": "RenameDatasetAction", "output_name": "outfile"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/9.3+galaxy1", "tool_shed_repository": {"changeset_revision": "86755160afbf", "name": "text_processing", "owner": "bgruening", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"infile\": {\"__class__\": \"ConnectedValue\"}, \"replacements\": [{\"__index__\": 0, \"find_pattern\": \"\\\\r\", \"replace_pattern\": null}, {\"__index__\": 1, \"find_pattern\": \"\\\\n\", \"replace_pattern\": null}, {\"__index__\": 2, \"find_pattern\": \"\\\\s\", \"replace_pattern\": \"\"}, {\"__index__\": 3, \"find_pattern\": \"(.)\", \"replace_pattern\": \"\\\\1\\\\n\"}], \"__page__\": 0, \"__rerun_remap_job_id__\": null}", "tool_version": "9.3+galaxy1", "type": "tool", "uuid": "7a765a1c-be84-45ad-838e-58f9d703512c", "when": null, "workflow_outputs": []}, "3": {"annotation": "This step uses Regular Expressions to delete all empty spaces (\\s) and show only one character per line (\\1\\n).\n\nThe result is a cleaned and reformatted text showing only one character per line.", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/9.3+galaxy1", "errors": null, "id": 
3, "input_connections": {"infile": {"id": 1, "output_name": "output"}}, "inputs": [], "label": "Preprocessing of Text two", "name": "Replace Text", "outputs": [{"name": "outfile", "type": "input"}], "position": {"left": 302.25, "top": 363.9517999999626}, "post_job_actions": {"RenameDatasetActionoutfile": {"action_arguments": {"newname": "Preprocessed Text two"}, "action_type": "RenameDatasetAction", "output_name": "outfile"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/9.3+galaxy1", "tool_shed_repository": {"changeset_revision": "86755160afbf", "name": "text_processing", "owner": "bgruening", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"infile\": {\"__class__\": \"ConnectedValue\"}, \"replacements\": [{\"__index__\": 0, \"find_pattern\": \"\\\\r\", \"replace_pattern\": null}, {\"__index__\": 1, \"find_pattern\": \"\\\\n\", \"replace_pattern\": null}, {\"__index__\": 2, \"find_pattern\": \"\\\\s\", \"replace_pattern\": \"\"}, {\"__index__\": 3, \"find_pattern\": \"(.)\", \"replace_pattern\": \"\\\\1\\\\n\"}], \"__page__\": 0, \"__rerun_remap_job_id__\": null}", "tool_version": "9.3+galaxy1", "type": "tool", "uuid": "85a1f86d-ff89-4cd7-b14d-0b4d113ff761", "when": null, "workflow_outputs": []}, "4": {"annotation": "The diff tool compares the two cleaned texts. This version of the output (raw output) is used for the further steps of the analysis. It is less intuitive for users. 
Therefore, the second diff below includes a more visual version of the output (HTML).", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.10+galaxy0", "errors": null, "id": 4, "input_connections": {"input1": {"id": 2, "output_name": "outfile"}, "input2": {"id": 3, "output_name": "outfile"}}, "inputs": [], "label": "Comparison with diff - computer version", "name": "diff", "outputs": [{"name": "diff_file", "type": "txt"}], "position": {"left": 724.0499877929688, "top": 126.28511298824382}, "post_job_actions": {"ChangeDatatypeActiondiff_file": {"action_arguments": {"newtype": "tabular"}, "action_type": "ChangeDatatypeAction", "output_name": "diff_file"}, "RenameDatasetActiondiff_file": {"action_arguments": {"newname": "Comparison - computer version"}, "action_type": "RenameDatasetAction", "output_name": "diff_file"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.10+galaxy0", "tool_shed_repository": {"changeset_revision": "10ef1bf99074", "name": "diff", "owner": "bgruening", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"input1\": {\"__class__\": \"ConnectedValue\"}, \"input2\": {\"__class__\": \"ConnectedValue\"}, \"report_format\": {\"report_format_select\": \"txt_columns\", \"__current_case__\": 1}, \"__page__\": null, \"__rerun_remap_job_id__\": null}", "tool_version": "3.10+galaxy0", "type": "tool", "uuid": "e3a3ad39-15bd-40c3-8226-a6f35a154d92", "when": null, "workflow_outputs": []}, "5": {"annotation": "The diff tool compares the two cleaned texts. 
This version (HTML version) creates an HTML file, which colour codes differences as additions (green) or extractions (red) when comparing the texts.", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.10+galaxy0", "errors": null, "id": 5, "input_connections": {"input1": {"id": 2, "output_name": "outfile"}, "input2": {"id": 3, "output_name": "outfile"}}, "inputs": [], "label": "Comparison with diff - user version", "name": "diff", "outputs": [{"name": "diff_file", "type": "txt"}, {"name": "html_file", "type": "html"}], "position": {"left": 725.2667846679688, "top": 355.64996337890625}, "post_job_actions": {"HideDatasetActiondiff_file": {"action_arguments": {}, "action_type": "HideDatasetAction", "output_name": "diff_file"}, "RenameDatasetActionhtml_file": {"action_arguments": {"newname": "Comparison - HTML version"}, "action_type": "RenameDatasetAction", "output_name": "html_file"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.10+galaxy0", "tool_shed_repository": {"changeset_revision": "10ef1bf99074", "name": "diff", "owner": "bgruening", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"input1\": {\"__class__\": \"ConnectedValue\"}, \"input2\": {\"__class__\": \"ConnectedValue\"}, \"report_format\": {\"report_format_select\": \"html\", \"__current_case__\": 2, \"unified\": \"3\", \"output_format\": \"side-by-side\"}, \"__page__\": null, \"__rerun_remap_job_id__\": null}", "tool_version": "3.10+galaxy0", "type": "tool", "uuid": "ef5da48f-601e-4108-8740-4778f83b5426", "when": null, "workflow_outputs": []}, "6": {"annotation": "This step selects all lines from the diff file that contain the censorship symbol \u00d7.\nThe condition \"ord(c1) == 215\" means that lines in column c1, which contain the censored text, are selected if they match \u00d7. The symbol \u00d7 is unspecific, therefore, the Unicode identifier of the character (215) is used for clarity in this condition.\n\nThis step does not show an output. 
If the filtered symbol is empty in the second text, this file lacks columns to compute the following steps. This is invisible for users but would cause a technical error. The compute step covers this and makes sure all necessary columns exist. It shows the output for both steps (Extracting and Compute) correctly.\n\nAdd another Unicode here if you want to select a different character, for example, '\u25a1' or '\u25b3'.\nYou can get the respective code, for example, on this website:\nhttps://www.mauvecloud.net/charsets/CharCodeFinder.html\nCopy the character you want to filter in the \"input\" window and select \"Decimal Character Codes\" as an output. If you do this for symbol \u00d7, you get 215.", "content_id": "Filter1", "errors": null, "id": 6, "input_connections": {"input": {"id": 4, "output_name": "diff_file"}}, "inputs": [], "label": "Extracting only censored passages", "name": "Filter", "outputs": [{"name": "out_file1", "type": "input"}], "position": {"left": 1022.4000244140625, "top": 286.73333740234375}, "post_job_actions": {"HideDatasetActionout_file1": {"action_arguments": {}, "action_type": "HideDatasetAction", "output_name": "out_file1"}}, "tool_id": "Filter1", "tool_state": "{\"__input_ext\": \"tabular\", \"chromInfo\": \"/opt/galaxy/tool-data/shared/ucsc/chrom/?.len\", \"cond\": \"ord(c1) == 215\", \"header_lines\": \"0\", \"input\": {\"__class__\": \"ConnectedValue\"}, \"__page__\": null, \"__rerun_remap_job_id__\": null}", "tool_version": "1.1.1", "type": "tool", "uuid": "46d04c11-f426-456d-9d97-7882a49ee20c", "when": null, "workflow_outputs": []}, "7": {"annotation": "This step unifies the formatting and adds potentially missing columns, should lines extracted before coming up empty in the second text. 
This ensures the proper number of columns and allows the smooth running of the next steps.", "content_id": "toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.1", "errors": null, "id": 7, "input_connections": {"input": {"id": 6, "output_name": "out_file1"}}, "inputs": [], "label": null, "name": "Compute", "outputs": [{"name": "out_file1", "type": "input"}], "position": {"left": 1280.7999877929688, "top": 288.5184503905876}, "post_job_actions": {"RenameDatasetActionout_file1": {"action_arguments": {"newname": "Censored lines"}, "action_type": "RenameDatasetAction", "output_name": "out_file1"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.1", "tool_shed_repository": {"changeset_revision": "aff5135563c6", "name": "column_maker", "owner": "devteam", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"__input_ext\": \"input\", \"avoid_scientific_notation\": false, \"chromInfo\": \"/opt/galaxy/tool-data/shared/ucsc/chrom/?.len\", \"error_handling\": {\"auto_col_types\": true, \"fail_on_non_existent_columns\": false, \"non_computable\": {\"action\": \"--non-computable-blank\", \"__current_case__\": 3}}, \"input\": {\"__class__\": \"ConnectedValue\"}, \"ops\": {\"header_lines_select\": \"no\", \"__current_case__\": 0, \"expressions\": [{\"__index__\": 0, \"cond\": \"c9\", \"add_column\": {\"mode\": \"R\", \"__current_case__\": 2, \"pos\": \"9\"}}]}, \"__page__\": null, \"__rerun_remap_job_id__\": null}", "tool_version": "2.1", "type": "tool", "uuid": "51d55250-60d0-4bd1-9f8c-0f45815befe0", "when": null, "workflow_outputs": []}, "8": {"annotation": "This step selects only column 9, which contains the uncensored characters from text two. The result is only one column with different rows of Chinese characters. \n\nThis step allows scaling words by frequency the word cloud in the next step. 
meaning characters that appear more often appear bigger, making the results evident at first sight.", "content_id": "Cut1", "errors": null, "id": 8, "input_connections": {"input": {"id": 7, "output_name": "out_file1"}}, "inputs": [], "label": null, "name": "Cut", "outputs": [{"name": "out_file1", "type": "tabular"}], "position": {"left": 1290.2000122070312, "top": 658.2000122070312}, "post_job_actions": {"RenameDatasetActionout_file1": {"action_arguments": {"newname": "Uncensored characters"}, "action_type": "RenameDatasetAction", "output_name": "out_file1"}}, "tool_id": "Cut1", "tool_state": "{\"columnList\": \"c9\", \"delimiter\": \"T\", \"input\": {\"__class__\": \"ConnectedValue\"}, \"__page__\": 0, \"__rerun_remap_job_id__\": null}", "tool_version": "1.0.2", "type": "tool", "uuid": "96a0b94b-6bb4-40ff-ae31-8db437d7026d", "when": null, "workflow_outputs": []}, "9": {"annotation": "", "content_id": "toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.8+galaxy0", "errors": null, "id": 9, "input_connections": {"in_file": {"id": 7, "output_name": "out_file1"}}, "inputs": [], "label": null, "name": "Datamash", "outputs": [{"name": "out_file", "type": "input"}], "position": {"left": 1511.7929886427692, "top": 288.52178870118314}, "post_job_actions": {"RenameDatasetActionout_file": {"action_arguments": {"newname": "Quantified Results"}, "action_type": "RenameDatasetAction", "output_name": "out_file"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.8+galaxy0", "tool_shed_repository": {"changeset_revision": "4c07ddedc198", "name": "datamash_ops", "owner": "iuc", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"__input_ext\": \"input\", \"chromInfo\": \"/opt/galaxy/tool-data/shared/ucsc/chrom/?.len\", \"grouping\": \"9\", \"header_in\": false, \"header_out\": false, \"ignore_case\": false, \"in_file\": {\"__class__\": \"ConnectedValue\"}, \"narm\": false, \"need_sort\": true, \"operations\": [{\"__index__\": 0, \"op_name\": 
\"count\", \"op_column\": \"9\"}], \"print_full_line\": false, \"__page__\": null, \"__rerun_remap_job_id__\": null}", "tool_version": "1.8+galaxy0", "type": "tool", "uuid": "e9c7b1a1-db93-4991-9b48-e9b3bda75edc", "when": null, "workflow_outputs": []}, "10": {"annotation": "This step shows, which characters were censored in the first text. The bigger the word, the more often it appeared in the text.", "content_id": "toolshed.g2.bx.psu.edu/repos/bgruening/wordcloud/wordcloud/1.9.4+galaxy0", "errors": null, "id": 10, "input_connections": {"text": {"id": 8, "output_name": "out_file1"}}, "inputs": [{"description": "runtime parameter for tool Generate a word cloud", "name": "fontfile"}, {"description": "runtime parameter for tool Generate a word cloud", "name": "mask"}, {"description": "runtime parameter for tool Generate a word cloud", "name": "stopwords"}], "label": null, "name": "Generate a word cloud", "outputs": [{"name": "output", "type": "png"}], "position": {"left": 1635.5499877929688, "top": 690.0499877929688}, "post_job_actions": {"RenameDatasetActionoutput": {"action_arguments": {"newname": "Wordcloud of censored characters"}, "action_type": "RenameDatasetAction", "output_name": "output"}}, "tool_id": "toolshed.g2.bx.psu.edu/repos/bgruening/wordcloud/wordcloud/1.9.4+galaxy0", "tool_shed_repository": {"changeset_revision": "54c2d8ebf0cf", "name": "wordcloud", "owner": "bgruening", "tool_shed": "toolshed.g2.bx.psu.edu"}, "tool_state": "{\"background\": \"#000000\", \"color_choice\": {\"color_option\": \"color\", \"__current_case__\": 0, \"color\": \"#00ff00\"}, \"colormap\": null, \"contour_color\": \"#000000\", \"contour_width\": \"0.0\", \"font_step\": \"1\", \"fontfile\": {\"__class__\": \"RuntimeValue\"}, \"height\": \"200\", \"include_numbers\": false, \"margin\": \"2\", \"mask\": {\"__class__\": \"RuntimeValue\"}, \"max_font_size\": null, \"max_words\": \"200\", \"min_font_size\": \"8\", \"min_word_length\": \"0\", \"mode\": null, \"no_collocations\": 
false, \"no_normalize_plurals\": false, \"prefer_horizontal\": \"1.0\", \"random_state\": \"10\", \"relative_scaling\": \"0.9\", \"repeat\": false, \"scale\": \"1.0\", \"stopwords\": {\"__class__\": \"RuntimeValue\"}, \"text\": {\"__class__\": \"ConnectedValue\"}, \"width\": \"400\", \"__page__\": 0, \"__rerun_remap_job_id__\": null}", "tool_version": "1.9.4+galaxy0", "type": "tool", "uuid": "da09e429-e183-4a05-abe2-ef64ade25fce", "when": null, "workflow_outputs": [{"label": "output_graphic", "output_name": "output", "uuid": "0e468eb9-b68c-4840-b5e7-12cbf29e0c9c"}]}, "11": {"annotation": "Sorts the quantified results from those appearing most to those appearing least ", "content_id": "sort1", "errors": null, "id": 11, "input_connections": {"input": {"id": 9, "output_name": "out_file"}}, "inputs": [], "label": null, "name": "Sort", "outputs": [{"name": "out_file1", "type": "input"}], "position": {"left": 1741.3931107130816, "top": 286.9218039599722}, "post_job_actions": {"RenameDatasetActionout_file1": {"action_arguments": {"newname": "Sorted Results"}, "action_type": "RenameDatasetAction", "output_name": "out_file1"}}, "tool_id": "sort1", "tool_state": "{\"__input_ext\": \"tabular\", \"chromInfo\": \"/opt/galaxy/tool-data/shared/ucsc/chrom/?.len\", \"column\": \"2\", \"column_set\": [], \"header_lines\": \"0\", \"input\": {\"__class__\": \"ConnectedValue\"}, \"order\": \"DESC\", \"style\": \"num\", \"__page__\": null, \"__rerun_remap_job_id__\": null}", "tool_version": "1.2.0", "type": "tool", "uuid": "450d11ba-80d4-4367-ac2a-56a18ffeb729", "when": null, "workflow_outputs": [{"label": "output_csv", "output_name": "out_file1", "uuid": "9dfd8862-e7d7-46ae-bbe2-3e333a017764"}]}}, "tags": ["Humanities", "comparison", "text", "diff", "published"], "uuid": "55cf86fe-d044-4ff1-9891-10d89f29b7dd", "version": 2}
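For readers following the diff, the workflow steps above (strip whitespace, put one character per line, compare the two texts, keep lines where `ord(c1) == 215`, count, sort descending) can be sketched in plain Python. This is only a rough illustration under stated assumptions, not the Galaxy tools themselves: it swaps GNU diff for Python's `difflib.SequenceMatcher`, pairs characters by position inside each replaced block (which assumes one replacement symbol per censored character), and the function names and sample strings are hypothetical.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Unicode code point of the censorship symbol "×", as used in the Filter step.
CENSOR_CODE = 215


def one_char_per_line(text: str) -> list[str]:
    """Remove all whitespace and return the remaining characters one by one."""
    return list(re.sub(r"\s", "", text))


def censored_counterparts(censored: str, uncensored: str) -> list[str]:
    """Return the uncensored characters that line up with "×" in the censored text.

    Assumption: difflib stands in for the diff tool, and characters inside a
    replaced block are paired by position.
    """
    a = one_char_per_line(censored)
    b = one_char_per_line(uncensored)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    hits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue
        for left, right in zip(a[i1:i2], b[j1:j2]):
            if ord(left) == CENSOR_CODE:
                hits.append(right)
    return hits


def sorted_counts(chars: list[str]) -> list[tuple[str, int]]:
    """Count each character and sort from most to least frequent."""
    return Counter(chars).most_common()


if __name__ == "__main__":
    # Made-up sample strings, not taken from the newspaper articles.
    censored_text = "甲××乙×丙"
    uncensored_text = "甲日本乙日丙"
    for char, count in sorted_counts(
        censored_counterparts(censored_text, uncensored_text)
    ):
        print(f"{char}\t{count}")
```

To look for a different replacement symbol such as '□' or '△', change `CENSOR_CODE` to that character's decimal code point, mirroring the advice in the Filter step's annotation.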
🚫 [rdjsonl] <GTN:027> reported by reviewdog 🐶
This workflow is missing a test, which is now mandatory. Please see the FAQ on how to add tests to your workflows.

# You can remove the examples below


@book{Ng2022,
🚫 [rdjsonl] <GTN:012> reported by reviewdog 🐶
Missing both a DOI and a URL. Please add one of the two.

series = {Law in Context}
}

@thesis{Schneider2024,
🚫 [rdjsonl] <GTN:012> reported by reviewdog 🐶
Missing both a DOI and a URL. Please add one of the two.

location = {Freiburg}
}

@article{Anon..16.10.1938_5598,
🚫 [rdjsonl] <GTN:012> reported by reviewdog 🐶
Missing both a DOI and a URL. Please add one of the two.

userd = {5598}
}

@article{TKP.16.10.1938_18864,
🚫 [rdjsonl] <GTN:012> reported by reviewdog 🐶
Missing both a DOI and a URL. Please add one of the two.

@shiltemann shiltemann merged commit eacc45a into main Feb 28, 2025
2 of 3 checks passed
@shiltemann
Member Author

shiltemann commented Feb 28, 2025

OK, this should appear online in about 15 minutes or so at https://training.galaxyproject.org/topics/statistics/tutorials/text_mining_chinese/tutorial.html

it is in draft mode for now, but not much would be needed to take it out of draft mode:

  • add DOIs/URLs for the citations
  • review the online version @Sch-Da and check that you are happy with how it looks
  • workflow test (but this can also be added after publication as far as I'm concerned)

@shiltemann shiltemann deleted the text-mining-chinese branch February 28, 2025 12:16