Make tokenize tests readable #1868
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1868
Note: Links to docs will display an error until the docs builds have been completed.
❌ 6 New Failures, 2 Cancelled Jobs, 2 Unrelated Failures
As of commit d6cd349 with merge base 3ca0d30:
NEW FAILURES - The following jobs have failed:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
cc: @RdoubleA @joecummings What do you think? With the current lint formatting, working with these tests is really awful. This is a pretty minor fix.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #1868 +/- ##
==========================================
- Coverage 69.69% 67.78% -1.92%
==========================================
Files 308 309 +1
Lines 16147 16291 +144
==========================================
- Hits 11254 11043 -211
- Misses 4893 5248 +355
☔ View full report in Codecov by Sentry.
The lint CI should be changed at this point; otherwise the formatting of `expected_tokens` will still be really bad.
Looks good - I will look into the linter issue. My naive assumption was that `noqa` should work.
A couple of questions about possibly unintended formatting errors:
messages = [
Message(
role="user",
content="Below is an instruction that describes a task. Write a response "
"that appropriately completes the request.\n\n### Instruction:\nGenerate "
"a realistic dating profile bio.\n\n### Response:\n",
"that appropriately completes the request.\n\n### Instruction:\nGenerate "
What are these changes?
Oh, something probably changed because of my local linter changes; will fix.
@@ -311,21 +147,17 @@ def test_tokenizer_vocab_size(self, tokenizer):
assert tokenizer.vocab_size == 128257

def test_tokenize_text_messages(
self, tokenizer, user_text_message, assistant_text_message
self, tokenizer, user_text_message, assistant_text_message
?
Same here.
I assume that's fixed now.
Grr, still failing. Mind if I take a look?
Isn't it about the lines with `# noqa`?
Ah, I see:
Fixed.
One more...
Fixed with flake.
@joecummings Sorry for all the lint failures; I was not able to run `pre-commit run --all-files` because of the current fixes.
@joecummings I think I found a solution: we need to use both `# noqa` and `# fmt: skip`. I really don't like it, though.
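For readers following along, here is a minimal sketch of what combining the two pragmas on one long line could look like. The `toy_tokenize` helper and the token IDs are hypothetical and not taken from this PR; the sketch also assumes the formatter is Black-style and honors `# fmt: skip` when it shares a trailing comment with `# noqa: E501`.

```python
# Illustrative only: `toy_tokenize` and the token IDs below are made up for
# this sketch and are not part of torchtune.
def toy_tokenize(text):
    """Return one fake token ID per word (the word's length)."""
    return [len(word) for word in text.split()]


def test_toy_tokenize():
    # Words of length 1..40, so the expected IDs are simply 1..40.
    text = " ".join("x" * n for n in range(1, 41))
    # Both pragmas trail the same physical line: `# noqa: E501` asks flake8 to
    # ignore the line-length warning, and `# fmt: skip` asks a Black-style
    # formatter to leave the line alone (assumed to work when combined).
    expected_tokens = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40]  # noqa: E501 # fmt: skip
    assert toy_tokenize(text) == expected_tokens
```

Without the formatter pragma, a long list like this would typically be reflowed to one element per line, which is the readability problem discussed here. Black also documents a block-level `# fmt: off` / `# fmt: on` pair, which may be an alternative when several consecutive lines need protecting.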
Oh, I broke something...
I don't know what it is; these tests pass locally and the branch is up to date.
Isn't it related to #1886? Some weird failure with torchao.
Can we restart CI here? Otherwise, I'm not sure how to fix the unrelated torchao stuff.
@felipemello1 @RdoubleA Maybe you can comment on how to fix this torchao thing? It's really strange, and a CI rerun alone probably won't help.
Context
What is the purpose of this PR? Is it to
Please link to any issues this PR addresses.
Changelog
What are the changes made in this PR?
Test plan
Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.
pre-commit install
pytest tests
pytest tests -m integration_test
UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example
and a tutorial example
Will require changes in CI (`pre-commit run` makes `expected_tokens` lists unreadable).