Add InfiniteBenchSum scenario and run spec #3409
Conversation
@run_spec_function("infinite_bench_sum")
def get_infinite_bench_sum_spec(word_lower_bound: float = 0.0, word_upper_bound: float = 100e6) -> RunSpec:
1e8 instead of 100e6 (also, isn't 1e7 enough?)
Yes, 1e7 is enough. Re 1e8 vs 100e6: 100e6 was meant to be easier to read at a glance (100e6 = 100M, 30e3 = 30k, etc.), but I am fine with either.
Currently: changed to 1e7 in the latest change.
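As a quick sanity check of the notation discussion (plain Python, no HELM dependencies):

```python
# 100e6, 1e8, and 100_000_000 all denote the same float value; the
# review settles on 1e7 (ten million words) as a sufficient upper bound.
print(100e6 == 1e8 == 100_000_000.0)  # True
print(int(1e7))                       # 10000000
```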
scenario_spec = ScenarioSpec(
    class_name="helm.benchmark.scenarios.infinite_bench_sum_scenario.InfiniteBenchSumScenario",
    args={
        "word_lower_bound": word_lower_bound,
num_words_lower_bound
or min_num_words
Changed to min_num_words
    class_name="helm.benchmark.scenarios.infinite_bench_sum_scenario.InfiniteBenchSumScenario",
    args={
        "word_lower_bound": word_lower_bound,
        "word_upper_bound": word_upper_bound,
num_words_upper_bound
or max_num_words
Changed to max_num_words
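Putting both review threads together, the revised run spec might look like the sketch below. The `ScenarioSpec` dataclass here is a simplified stand-in for HELM's real one, used only to make the snippet self-contained; the renames (`min_num_words`/`max_num_words`) and the `1e7` default are the ones agreed in the review.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class ScenarioSpec:
    # Simplified stand-in for helm.benchmark's ScenarioSpec.
    class_name: str
    args: Dict[str, float] = field(default_factory=dict)

def get_infinite_bench_sum_spec(min_num_words: float = 0.0, max_num_words: float = 1e7) -> ScenarioSpec:
    # Bounds renamed per the review; default upper bound tightened to 1e7.
    return ScenarioSpec(
        class_name="helm.benchmark.scenarios.infinite_bench_sum_scenario.InfiniteBenchSumScenario",
        args={"min_num_words": min_num_words, "max_num_words": max_num_words},
    )
```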
class InfiniteBenchSumScenario(Scenario):
    """InfiniteBenchSum

    InfiniteBenchbenchmark tailored for evaluating the capabilities of language models to process,
InfiniteBench benchmark
Corrected
class InfiniteBenchSumScenario(Scenario):
    """InfiniteBenchSum
Is there a space between InfiniteBench and Sum?
Yes, corrected
dataset = dataset.map(lambda example: {"prompt": example["context"] + "\n\n" + example["input"]})
dataset = dataset.map(lambda example: {"prompt_wc": len(example["prompt"].split())})
dataset = dataset.filter(lambda example: self.word_lower_bound <= example["prompt_wc"] <= self.word_upper_bound)
can you just do this chained:
(
    dataset.map()
    .map()
    .filter()
)
Yes, changed to this
assert isinstance(dataset, Dataset)

dataset = dataset.map(lambda example: {"prompt": example["context"] + "\n\n" + example["input"]})
Don't need this
Changed
assert isinstance(dataset, Dataset)

dataset = dataset.map(lambda example: {"prompt": example["context"] + "\n\n" + example["input"]})
dataset = dataset.map(lambda example: {"prompt_wc": len(example["prompt"].split())})
def count_words(text: str) -> int:
    return len(re.split(r"\s+", text.strip()))
then do
{"prompt_wc": count_words(example["context"]) + count_words(example["input"])}
Changed
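A self-contained sketch of the suggested helper (standard library only). Note that `re.split` on an empty string returns `['']`, so a guard is added here for the empty-text edge case; that guard is an assumption beyond the review's snippet.

```python
import re

def count_words(text: str) -> int:
    # Split on any run of whitespace; guard the empty/whitespace-only
    # case, where re.split would return [''] and miscount as 1 word.
    stripped = text.strip()
    if not stripped:
        return 0
    return len(re.split(r"\s+", stripped))

# Applied per field, as suggested in the review:
example = {"context": "one  two\nthree", "input": "summarize this"}
prompt_wc = count_words(example["context"]) + count_words(example["input"])
```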
)

# Load the dataset with the specified features
dataset = load_dataset("xinrongzhang2022/InfiniteBench", split="longbook_sum_eng", features=ft)
pin the revision: revision="90f0394333616266d9fe85824ceaf505093cbaa5"
Revision pinned
No description provided.