[DO NOT REVIEW YET] Extend paged attention #8237

Draft
wants to merge 39 commits into
base: master
from

Commits on Oct 2, 2024

  1. add the reference test.

    vanbasten23 committed Oct 2, 2024
    b1a26e7
  2. f712b34
  3. cf6dcf5

Commits on Oct 3, 2024

  1. 43453da
  2. create a new extended_paged_attention API with a flag controlling whether we call the kernel or the non-kernel version

    vanbasten23 committed Oct 3, 2024
    bf71c8c
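
The flag-controlled API described in the commit above could look roughly like the following. This is a minimal sketch, not the PR's code: the names `extended_paged_attention_sketch`, `use_kernel`, `_attention_reference`, and `_attention_kernel` are hypothetical, the inputs are plain nested lists for a single head, and paging is left out entirely.

```python
import math


def _attention_reference(q, k, v):
    """Non-kernel path: plain softmax attention for a single head.

    q is [q_len][head_dim]; k and v are [kv_len][head_dim] (nested lists).
    """
    out = []
    for q_row in q:
        scores = [sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_row[c] for w, v_row in zip(weights, v)) / total
            for c in range(len(v[0]))
        ])
    return out


def _attention_kernel(q, k, v):
    """Kernel path placeholder (hypothetical); a real version would launch a kernel."""
    raise NotImplementedError("kernel path is not part of this sketch")


def extended_paged_attention_sketch(q, k, v, use_kernel=False):
    """Hypothetical entry point: a single flag picks the implementation."""
    impl = _attention_kernel if use_kernel else _attention_reference
    return impl(q, k, v)
```

Keeping both paths behind one entry point means a test can call the same function twice, once per flag value, and compare the outputs.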

Commits on Oct 4, 2024

  1. Create a test that calls both the non-kernel extended_paged_attention and the kernel version and compares the results.

    vanbasten23 committed Oct 4, 2024
    da7150b
  2. f485878
  3. added the original paged_attention to torch_xla and made sure torch_xla can call into the local Pallas kernel.

    vanbasten23 committed Oct 4, 2024
    7df596e
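
A comparison test of the kind described above (reference path versus kernel path) can be sketched as follows. Everything here is an assumption for illustration: `attention_naive` stands in for the non-kernel version, `attention_blockwise` stands in for a kernel that walks the KV cache block by block with an online softmax, and `assert_close` is a hypothetical helper. It is pure Python and does not call Pallas or torch_xla.

```python
import math


def attention_naive(q_row, k, v):
    """Reference: softmax(q . k) @ v for one query row."""
    scores = [sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * row[c] for w, row in zip(weights, v)) / total
            for c in range(len(v[0]))]


def attention_blockwise(q_row, k, v, block_size=2):
    """Same result, computed one KV block at a time (online softmax)."""
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * len(v[0])
    for start in range(0, len(k), block_size):
        k_blk = k[start:start + block_size]
        v_blk = v[start:start + block_size]
        scores = [sum(a * b for a, b in zip(q_row, k_row)) for k_row in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old max; exp(-inf) is 0.0
        # on the first block, so the empty accumulator stays empty.
        rescale = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * rescale + sum(weights)
        acc = [a * rescale + sum(w * row[c] for w, row in zip(weights, v_blk))
               for c, a in enumerate(acc)]
        running_max = new_max
    return [a / running_sum for a in acc]


def assert_close(actual, expected, tol=1e-9):
    """Element-wise comparison with a small tolerance."""
    assert len(actual) == len(expected)
    for a, e in zip(actual, expected):
        assert math.isclose(a, e, rel_tol=tol, abs_tol=tol), (a, e)
```

A test then calls both functions on the same inputs and passes the two outputs to `assert_close`. The tolerance is needed because the blockwise path accumulates in a different order than the naive one.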

Commits on Oct 7, 2024

  1. modified the hardcoded number in the test test_extended_paged_attention, and the original paged_attention finishes successfully.

    vanbasten23 committed Oct 7, 2024
    3eb8e33
  2. 830388d

Commits on Oct 9, 2024

  1. 83528de
  2. fbca5cf
  3. ba30b9b
  4. 43c2bf0
  5. Implementing kernel v0

    vanbasten23 committed Oct 9, 2024
    58fe257

Commits on Oct 10, 2024

  1. finished implementing v0. Also added a test that uses 1 query token and verifies that extend_paged_attention generates the same result as the original paged_attention.

    vanbasten23 committed Oct 10, 2024
    669d598
  2. Something was wrong with the test. Now the test test_extended_paged_attention_single_query succeeds.

    vanbasten23 committed Oct 10, 2024
    290ab57

Commits on Oct 11, 2024

  1. added a few more tests.

    vanbasten23 committed Oct 11, 2024
    118fba5
  2. 54f0af1

Commits on Oct 14, 2024

  1. revised v0 implementation. Added a partly finished v1 impl. Also added more tests and experiments, which may be cleaned up later.

    vanbasten23 committed Oct 14, 2024
    3d9e359

Commits on Oct 16, 2024

  1. d282b2e

Commits on Oct 17, 2024

  1. 069ca31

Commits on Oct 18, 2024

  1. 8ce1bb3
  2. 2e839cb
  3. upload everything

    vanbasten23 committed Oct 18, 2024
    6645e7b
  4. fixed some syntax errors

    vanbasten23 committed Oct 18, 2024
    fc0b345

Commits on Oct 20, 2024

  1. afb97ae

Commits on Oct 21, 2024

  1. fix an error

    vanbasten23 committed Oct 21, 2024
    5a6ff8f
  2. added the causal mask

    vanbasten23 committed Oct 21, 2024
    d6b994a
  3. fixed the blocker issue with pltpu.repeat(acc_scale, acc_scale_repeat), caused by the 2nd-to-last dimension being 4 instead of a multiple of 8.

    vanbasten23 committed Oct 21, 2024
    0cca110
  4. c33da03
  5. e8ccd04
  6. fixed a bug

    vanbasten23 committed Oct 21, 2024
    92672e4
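
For the causal mask commit above, one plausible formulation is sketched below. It assumes that the `q_len` query tokens are the last `q_len` tokens of a sequence whose KV cache holds `kv_len` tokens. That layout is an assumption of this sketch, not something the commit message states, and `causal_mask` and `apply_mask` are hypothetical names.

```python
def causal_mask(q_len, kv_len):
    """Return mask[i][j] == True where query i may attend to KV position j.

    Assumes the queries occupy the last q_len positions of the kv_len-token
    sequence, so query i sits at absolute position (kv_len - q_len + i).
    """
    if q_len > kv_len:
        raise ValueError("q_len must not exceed kv_len")
    offset = kv_len - q_len  # absolute position of query token 0
    return [[j <= offset + i for j in range(kv_len)] for i in range(q_len)]


def apply_mask(scores, mask, fill=float("-inf")):
    """Replace disallowed scores so that softmax gives them zero weight."""
    return [[s if keep else fill for s, keep in zip(s_row, m_row)]
            for s_row, m_row in zip(scores, mask)]
```

Under this assumption, a single query token (`q_len == 1`) gets a mask that is `True` everywhere, which is consistent with the earlier single-query test expecting the same result as the original paged_attention.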

Commits on Oct 22, 2024

  1. 9834f06

Commits on Oct 24, 2024

  1. most basic test passed.

    vanbasten23 committed Oct 24, 2024
    35a3c55

Commits on Oct 25, 2024

  1. bb79ead
  2. when we write to o_ref, don't check @pl.when(kv_blk_idx == num_kv_blks - 1). Sometimes lengths[b] is very small, so kv_blk_idx may never reach (num_kv_blks - 1) due to the check @pl.when(kv_blk_idx * compute_blk_size_kv < kv_len)

    vanbasten23 committed Oct 25, 2024
    8305949
  3. 5d28a68
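
The o_ref issue described above can be reproduced with a small pure-Python simulation of the two guards. This sketch assumes the output write was nested inside the length guard, which is my reading of the commit message. The helper names `blocks_that_run` and `output_written` are hypothetical, and no Pallas code is involved.

```python
def blocks_that_run(kv_len, num_kv_blks, compute_blk_size_kv):
    """KV block indices whose body executes under the length guard."""
    return [kv_blk_idx for kv_blk_idx in range(num_kv_blks)
            if kv_blk_idx * compute_blk_size_kv < kv_len]


def output_written(kv_len, num_kv_blks, compute_blk_size_kv, only_on_last_block):
    """Simulate whether the output write ever happens for one sequence."""
    written = False
    for kv_blk_idx in range(num_kv_blks):
        if not (kv_blk_idx * compute_blk_size_kv < kv_len):
            continue  # block starts past the sequence end: body is skipped
        if only_on_last_block and kv_blk_idx != num_kv_blks - 1:
            continue  # old behaviour: only the final block index writes
        written = True
    return written
```

With `kv_len=5`, `num_kv_blks=4`, and `compute_blk_size_kv=16`, only block 0 passes the length guard, so a write restricted to block 3 never happens. Dropping the last-block check lets block 0 write the output.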

Commits on Oct 26, 2024

  1. all tests passed. finally

    vanbasten23 committed Oct 26, 2024
    0c49d5a