
The weights of the tokens don't seem to be applied #33

Open
Against-the-Wind opened this issue Jan 26, 2025 · 1 comment

Comments

@Against-the-Wind

In other_impls.py:
def token_weights(string, current_weight):
    a = parse_parentheses(string)
    out = []
    for x in a:
        weight = current_weight
        if len(x) >= 2 and x[-1] == ")" and x[0] == "(":
            x = x[1:-1]
            xx = x.rfind(":")
            weight *= 1.1
            if xx > 0:
                try:
                    weight = float(x[xx + 1 :])
                    x = x[:xx]
                except:
                    pass
            out += token_weights(x, weight)
        else:
            out += [(x, current_weight)]
    return out
Here the weight of every token inside parentheses is multiplied by 1.1, or set explicitly by a trailing :number.
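
For example (a minimal sketch; it only assumes other_impls is importable from the repo root), the parser turns a weighted prompt into (text, weight) pairs:

# Sketch: inspect what the parser produces. The expected output below
# follows from the token_weights code quoted above.
from other_impls import token_weights

pairs = token_weights("a (red:1.3) apple with (shiny) skin", 1.0)
print(pairs)
# [('a ', 1.0), ('red', 1.3), (' apple with ', 1.0), ('shiny', 1.1), (' skin', 1.0)]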
But going through the whole repository, I can't find anywhere these weights are actually applied, for the following reasons:
In other_impls.py:
class ClipTokenWeightEncoder:
    def encode_token_weights(self, token_weight_pairs):
        tokens = list(map(lambda a: a[0], token_weight_pairs[0]))
        out, pooled = self([tokens])
        if pooled is not None:
            first_pooled = pooled[0:1].cpu()
        else:
            first_pooled = pooled
        output = [out[0:1]]
        return torch.cat(output, dim=-2).cpu(), first_pooled
The line tokens = list(map(lambda a: a[0], token_weight_pairs[0])) keeps only each pair's token (a[0]) and discards its weight (a[1]); only the bare tokens ever reach the encoder.
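
For comparison, an encoder that actually used the weights would have to fold them into the embeddings somewhere. Below is a simplified, hypothetical sketch of that missing step, scaling each token's embedding by its weight; this is not this repository's code (ComfyUI, for instance, instead interpolates each weighted embedding toward an empty-prompt embedding):

import torch

class WeightedEncoderSketch:
    # Hypothetical illustration of the missing step, not the repo's code:
    # keep the weights alongside the tokens and scale the embeddings.
    def encode_token_weights(self, token_weight_pairs):
        tokens = [t for t, w in token_weight_pairs[0]]
        weights = [w for t, w in token_weight_pairs[0]]
        out, pooled = self([tokens])  # out: [1, seq_len, dim]
        w = torch.tensor(weights, dtype=out.dtype, device=out.device)
        out = out * w.reshape(1, -1, 1)  # one weight per token position
        first_pooled = pooled[0:1].cpu() if pooled is not None else pooled
        return out[0:1].cpu(), first_pooled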
And in sd3_infer.py:
def get_cond(self, prompt):
    self.print("Encode prompt...")
    tokens = self.tokenizer.tokenize_with_weights(prompt)
    l_out, l_pooled = self.clip_l.model.encode_token_weights(tokens["l"])
    g_out, g_pooled = self.clip_g.model.encode_token_weights(tokens["g"])
    t5_out, t5_pooled = self.t5xxl.model.encode_token_weights(tokens["t5xxl"])
    lg_out = torch.cat([l_out, g_out], dim=-1)
    lg_out = torch.nn.functional.pad(lg_out, (0, 4096 - lg_out.shape[-1]))
    return torch.cat([lg_out, t5_out], dim=-2), torch.cat(
        (l_pooled, g_pooled), dim=-1
    )
So although get_cond calls encode_token_weights, the weights, as shown above, are never used in the encoding process.

@Against-the-Wind
Author

I ran experiments with weights ranging from 0.0 to 10.0 and, as I expected, there was no significant change in the generated images. This further confirms that the weights aren't used when generating images.
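
The same conclusion can be checked directly on the conditioning tensors, without generating any images (hypothetical driver code: "inferencer" stands for whatever object exposes get_cond in sd3_infer.py):

import torch

# The parser strips the "(...:weight)" syntax before tokenizing, so the
# token sequences are identical and the discarded weights are the only
# difference between these two prompts.
cond_a = inferencer.get_cond("a (red:0.0) apple")
cond_b = inferencer.get_cond("a (red:10.0) apple")

print(torch.allclose(cond_a[0], cond_b[0]))  # True -> weights had no effect
print(torch.allclose(cond_a[1], cond_b[1]))  # True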
