{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":92541759,"defaultBranch":"main","name":"benchmark","ownerLogin":"pytorch","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2017-05-26T19:21:12.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/21003710?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1727472679.0","currentOid":""},"activityList":{"items":[{"before":"0f05015140467c80395308f98748c8cb85aab4b6","after":"611bf702394bd24bb76df2ee288b3f3e2c0ce874","ref":"refs/heads/main","pushedAt":"2024-09-28T01:37:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Restore FlexAttention and FlashV3 backward (#2473)\n\nSummary: Pull Request resolved: https://github.com/pytorch/benchmark/pull/2473\n\nReviewed By: xuzhao9\n\nDifferential Revision: D63543625\n\nPulled By: bertmaher\n\nfbshipit-source-id: 1693e15875544bda0f5f6c69daa5597fffd80509","shortMessageHtmlLink":"Restore FlexAttention and FlashV3 backward (#2473)"}},{"before":"2edf80cba57a3be2c2752b15849a02b88d0f2d89","after":"0f05015140467c80395308f98748c8cb85aab4b6","ref":"refs/heads/main","pushedAt":"2024-09-28T01:03:44.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix bug #2458 (#2459)\n\nSummary:\nhttps://github.com/pytorch/benchmark/issues/2458\n\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2459\n\nReviewed By: xuzhao9\n\nDifferential Revision: D63476542\n\nPulled By: kit1980\n\nfbshipit-source-id: 01e9db9cb03d34e82a773897417df2ccda410634","shortMessageHtmlLink":"Fix bug #2458 (#2459)"}},{"before":null,"after":"709a7e7ceef4962cc39c8ff7ba43309603a607ce","ref":"refs/heads/xz9/fix-k8s","pushedAt":"2024-09-27T21:31:19.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"xuzhao9","name":"Xu Zhao","path":"/xuzhao9","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/502017?s=80&v=4"},"commit":{"message":"Upgrade the NVIDIA driver","shortMessageHtmlLink":"Upgrade the NVIDIA driver"}},{"before":"a31c3fe9bca0cfc8e2807f5f185454a0cdbce32d","after":"2edf80cba57a3be2c2752b15849a02b88d0f2d89","ref":"refs/heads/main","pushedAt":"2024-09-26T17:27:32.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add nsys integration\n\nSummary: Add new metric `--metric nsys` to collect nsys trace.\n\nReviewed By: htyu\n\nDifferential Revision: D63274918\n\nfbshipit-source-id: 0536310df6290ea5f5a02d85cc0ad6d342d45dbd","shortMessageHtmlLink":"Add nsys integration"}},{"before":"8a690690cfb12343fbf773b3a3ff0877a1c02676","after":"a31c3fe9bca0cfc8e2807f5f185454a0cdbce32d","ref":"refs/heads/main","pushedAt":"2024-09-26T07:54:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Revert \"Trace enter/exit of TorchFunctionModes (#135422)\" (#136590)\n\nSummary:\nThis reverts commit 7743149b2be4a9eba7e0997ccdc6abe552bec266.\n\nReverts\n* https://github.com/pytorch/pytorch/pull/135503\n* https://github.com/pytorch/pytorch/pull/135502\n* https://github.com/pytorch/pytorch/pull/135422\n\nThis passes this test. Earlier, the getitem would stay like a getitem in the Fx graph. But now the fake tensor propagations fails saying that .item is called. It seems that torch function is not getting triggered while fake tensor propagation.\n\n```\nimport torch\nfrom torch.nn.attention.flex_attention import BlockMask, _mask_mod_signature, _score_mod_signature, flex_attention\nfrom torch._inductor.lowering import make_pointwise, register_lowering\nfrom torch._inductor.virtualized import ops\nfrom torch.nn.attention.flex_attention import create_block_mask\n\ntorch.set_default_device('cuda')\n\nflex_attention = torch.compile(flex_attention, dynamic=False)\n\nprefix_lengths = torch.arange(8)\ndef prefix_lm(b, h, q, kv):\n return prefix_lengths[b] >= kv\n\nmask = create_block_mask(prefix_lm, 8, None, 512, 512, _compile=True)\n```\n\nX-link: https://github.com/pytorch/pytorch/pull/136590\nApproved by: https://github.com/Chillee\n\nReviewed By: atalman\n\nDifferential Revision: D63431470\n\nPulled By: anijain2305\n\nfbshipit-source-id: 60915b30336121b845af71f423582c22a6c65c3f","shortMessageHtmlLink":"Revert \"Trace enter/exit of TorchFunctionModes (#135422)\" (#136590)"}},{"before":null,"after":"8a690690cfb12343fbf773b3a3ff0877a1c02676","ref":"refs/heads/juliagmt/test","pushedAt":"2024-09-25T19:45:17.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"juliagmt-google","name":"Julia Guo","path":"/juliagmt-google","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/153684546?s=80&v=4"},"commit":{"message":"Move FBGEM to 5c3d54f335b1617b5b169061c3b3d59b2a791ebb (#2471)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2471\n\nMoving to 5c3d54f335b1617b5b169061c3b3d59b2a791ebb to enable the TMA kernel by default.\n\nPreviously D63294076 updated the corresponding hash in facebookresearch/FBGEMM. Now using pytorch/FBGEMM instead.\n\nReviewed By: xuzhao9\n\nDifferential Revision: D63363236\n\nfbshipit-source-id: 0758cef786e85fc9946b9e8cf761bec506632667","shortMessageHtmlLink":"Move FBGEM to 5c3d54f335b1617b5b169061c3b3d59b2a791ebb (#2471)"}},{"before":"93b5804775a07da101640e2f083350ace207f610","after":"8a690690cfb12343fbf773b3a3ff0877a1c02676","ref":"refs/heads/main","pushedAt":"2024-09-25T16:29:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Move FBGEM to 5c3d54f335b1617b5b169061c3b3d59b2a791ebb (#2471)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2471\n\nMoving to 5c3d54f335b1617b5b169061c3b3d59b2a791ebb to enable the TMA kernel by default.\n\nPreviously D63294076 updated the corresponding hash in facebookresearch/FBGEMM. Now using pytorch/FBGEMM instead.\n\nReviewed By: xuzhao9\n\nDifferential Revision: D63363236\n\nfbshipit-source-id: 0758cef786e85fc9946b9e8cf761bec506632667","shortMessageHtmlLink":"Move FBGEM to 5c3d54f335b1617b5b169061c3b3d59b2a791ebb (#2471)"}},{"before":"7d30648bc4ef61b82aacbe6201e7937d362e3d28","after":"93b5804775a07da101640e2f083350ace207f610","ref":"refs/heads/main","pushedAt":"2024-09-25T12:54:16.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Do not treat user defined nn module attributes static for dynamic shape infra (#136516)\n\nSummary:\nFixes https://github.com/pytorch/pytorch/issues/136254\n\nTh regression was introduced in https://github.com/pytorch/pytorch/pull/132736 where originally we were trying to fix another regression. This PR and the offending PR together say - \"treat user defined nn module attributes as automatic dynamic, but for cudagraphs they will be considered static\". This avoid recompilations. This can lead to a cudagraph recording, which is ok. This also maintains the state before inline_inbuilt_nn_modules flag was introduced.\n\nX-link: https://github.com/pytorch/pytorch/pull/136516\nApproved by: https://github.com/williamwen42\n\nReviewed By: atalman\n\nDifferential Revision: D63351217\n\nPulled By: anijain2305\n\nfbshipit-source-id: 074bc2af485a7b73eb70876c205a8ef6b611abb3","shortMessageHtmlLink":"Do not treat user defined nn module attributes static for dynamic sha…"}},{"before":"deb2988e38adb307f13ee68a6cc1a20451b438c6","after":"7d30648bc4ef61b82aacbe6201e7937d362e3d28","ref":"refs/heads/main","pushedAt":"2024-09-25T02:45:05.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add release tests (#2470)\n\nSummary: Pull Request resolved: https://github.com/pytorch/benchmark/pull/2470\n\nReviewed By: xuzhao9\n\nDifferential Revision: D63359914\n\nPulled By: atalman\n\nfbshipit-source-id: 5d5dc34da4b72babbada66d4ec0ba49c89a17fba","shortMessageHtmlLink":"Add release tests (#2470)"}},{"before":"926fb10ab06ee93f1677d235f1372ed42e64ca7f","after":"deb2988e38adb307f13ee68a6cc1a20451b438c6","ref":"refs/heads/main","pushedAt":"2024-09-25T01:46:02.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add types to _dynamo/testing.py (#136402)\n\nSummary:\nX-link: https://github.com/pytorch/pytorch/pull/136402\nApproved by: https://github.com/jansel\n\nReviewed By: atalman\n\nDifferential Revision: D63319829\n\nPulled By: bobrenjc93\n\nfbshipit-source-id: c343ada52068d5d973757093c255726684fb6f70","shortMessageHtmlLink":"Add types to _dynamo/testing.py (#136402)"}},{"before":"0ab0e47e176cc363d88f9a909c7762ef85bb0213","after":"926fb10ab06ee93f1677d235f1372ed42e64ca7f","ref":"refs/heads/main","pushedAt":"2024-09-24T22:22:44.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add HSTU ragged attention operator (#2453)\n\nSummary:\nAs the title says.\n\nOn H100:\n```\n$ python run_benchmark.py triton --op ragged_attention\n\n x_val hstu_triton_ragged_attention-latency hstu_triton_ragged_attention_persistent-latency\n----------------- -------------------------------------- -------------------------------------------------\n(8, 4, 512, 2048) 0.0141706 0.0128713\n(8, 4, 512, 2048) 0.0187315 0.0171204\n(8, 4, 512, 2048) 0.0156807 0.0155399\n(8, 4, 512, 2048) 0.0165724 0.0154679\n(8, 4, 512, 2048) 0.0163886 0.0157738\n(8, 4, 512, 2048) 0.0173378 0.0155991\n(8, 4, 512, 2048) 0.0164874 0.0153128\n(8, 4, 512, 2048) 0.0203275 0.0172193\n(8, 4, 512, 2048) 0.0214526 0.0185414\n(8, 4, 512, 2048) 0.0172307 0.0169625\n```\n\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2453\n\nReviewed By: manman-ren\n\nDifferential Revision: D62513596\n\nPulled By: xuzhao9\n\nfbshipit-source-id: 154ef0145ca94ecfeb0b075c9dec01b395683ef2","shortMessageHtmlLink":"Add HSTU ragged attention operator (#2453)"}},{"before":"6a089a45114add4a8044cc74592fe25abf5aed43","after":"0ab0e47e176cc363d88f9a909c7762ef85bb0213","ref":"refs/heads/main","pushedAt":"2024-09-24T16:14:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"bf16xint16_gemm operator: add --transpose option (#2466)\n\nSummary:\n`--transpose` will make this benchmark test a int16 x bf16 mm instead of a bf16 x int16.\n\nThis matters for H100, because the wgmma instruction can take registers only on the LHS. So int16 x bf16 is probably the easier one to support efficiently.\n\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2466\n\nTest Plan:\nIn OSS: ran `python run_benchmark.py triton --op bf16xint16_gemm --transpose`\n\nInternally, ran `buck2 run mode/opt //pytorch/benchmark:triton -- --op bf16xint16_gemm --transpose`\n\nInternally, we run into the issue fixed by https://github.com/triton-lang/triton/pull/4695; but otherwise, they both run.\n\nReviewed By: aakhundov\n\nDifferential Revision: D63294109\n\nPulled By: davidberard98\n\nfbshipit-source-id: 3ea05bb09e62f51c405ae538726caf80e1ba0d63","shortMessageHtmlLink":"bf16xint16_gemm operator: add --transpose option (#2466)"}},{"before":"8d57e930622a25cd5811fe67510d83d0f276d490","after":"6a089a45114add4a8044cc74592fe25abf5aed43","ref":"refs/heads/main","pushedAt":"2024-09-24T00:22:30.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Move FBGEM to fcf36b445a1929c8a22f07c35d3a29b3697867a4 (#2467)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2467\n\nMoving to fcf36b445a1929c8a22f07c35d3a29b3697867a4 to enable the TMA kernel by default.\n\nReviewed By: q10\n\nDifferential Revision: D63294076\n\nfbshipit-source-id: 1cc92615fda45ea966aa7a85a5e93ddce2d77455","shortMessageHtmlLink":"Move FBGEM to fcf36b445a1929c8a22f07c35d3a29b3697867a4 (#2467)"}},{"before":"f2f0b30f5622f83b56fafb95e80514f886dcac8f","after":"8d57e930622a25cd5811fe67510d83d0f276d490","ref":"refs/heads/main","pushedAt":"2024-09-23T22:10:55.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add _dynamo.config.suppress_errors logging (#136379)\n\nSummary:\nX-link: https://github.com/pytorch/pytorch/pull/136379\nApproved by: https://github.com/ezyang\n\nReviewed By: atalman\n\nDifferential Revision: D63250844\n\nPulled By: jovianjaison\n\nfbshipit-source-id: fe9d06fdd46a4b4310364446f8c80e94ec521afb","shortMessageHtmlLink":"Add _dynamo.config.suppress_errors logging (#136379)"}},{"before":"46ab2e2b63810436786291ad5ea95c9ebcb46dad","after":"f2f0b30f5622f83b56fafb95e80514f886dcac8f","ref":"refs/heads/main","pushedAt":"2024-09-22T03:31:04.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Run single iteration when collecting ncu traces\n\nSummary: We assume that NCU will handle the warmup and kernel repeat by itself, so we remove warmup and repeated runs in the Tritonbench framework when running with NCU.\n\nReviewed By: int3\n\nDifferential Revision: D62451609\n\nfbshipit-source-id: d61d8a58500b8009db9d7f93cef730b48b063667","shortMessageHtmlLink":"Run single iteration when collecting ncu traces"}},{"before":"9c7aacc40a132eab6655c66a202e9fce595543da","after":"46ab2e2b63810436786291ad5ea95c9ebcb46dad","ref":"refs/heads/main","pushedAt":"2024-09-20T01:44:56.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"make --dump-ir also dump sass (#2460)\n\nSummary:\nTriton has a `get_sass` utility that disassembles the cubin and gets the sass - in this PR, we try to use `get_sass` when running with `--dump-ir`, if we can find a cubin file.\n\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2460\n\nReviewed By: xuzhao9\n\nDifferential Revision: D62925358\n\nPulled By: davidberard98\n\nfbshipit-source-id: 7aa1e66c43222b776c949ced3fdcabcee30fbb21","shortMessageHtmlLink":"make --dump-ir also dump sass (#2460)"}},{"before":"c1755f53c7e163c4629ce6ae5835265ef5310d00","after":"9c7aacc40a132eab6655c66a202e9fce595543da","ref":"refs/heads/main","pushedAt":"2024-09-19T16:07:03.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Log structured logging overhead to dynamo compile (kinda) (#2454)\n\nSummary:\nX-link: https://github.com/pytorch/pytorch/pull/136142\n\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2454\n\nThis adds structured logging overhead at a per compile basis to compilation metrics.\n\nTo do so, we track the frame_id_frame_compile_id that trace_structured uses to categorize compiles, and use that as the key in our timing table.\n\nImplementation notes:\n- If there's times we call trace_structured without a compile id, the time won't be measured. Not really a good way around that today given the compile id framework of compilation metrics. Strobelight is still the best way to measure on a per job basis.\n- We don't actually measure the time it takes to log the compilation metrics itself. Fundamentally, it's not possible to log this properly if we're storing the logging number *in* compilation metrics, since there's no way to measure it before we do it(unless we want discrepancies between dynamo_compile and tlparse, which seems suboptimal). Hopefully for a large job, the cost of structured_logging compilation metrics itself is small.\n- I wanted to use frame_phase_timing here, but there's a bunch of ids to iron out, and I don't really want to deal with that headache. compilation_time_metrics is sort of what I want, but that isn't by frame/compile id, so it's also a bit off. Putting it into torch.logging as a separate thing so logging tracks its own overhead seems fine, though.\n\nReviewed By: oulgen\n\nDifferential Revision: D62643611\n\nfbshipit-source-id: 9353d1dbb323079e292b9b4786604fc377971e13","shortMessageHtmlLink":"Log structured logging overhead to dynamo compile (kinda) (#2454)"}},{"before":"c97859ebe9fbbc7694d7ded6acdf045cc1ef884b","after":"c1755f53c7e163c4629ce6ae5835265ef5310d00","ref":"refs/heads/main","pushedAt":"2024-09-18T23:49:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Increase multiplier to 3 for Inductor AMP FP16 benchmark correctness check (#135932)\n\nSummary:\nFix https://github.com/pytorch/pytorch/issues/135657.\nAligned with AMP BF16, using multiplier 3 for Inductor AMP FP16 benchmark correctness check\n\nX-link: https://github.com/pytorch/pytorch/pull/135932\nApproved by: https://github.com/CaoE, https://github.com/jgong5, https://github.com/jansel\n\nReviewed By: jeanschmidt\n\nDifferential Revision: D62980154\n\nfbshipit-source-id: e7fb9cd8faea933ff85fb1173620bf0d54ebf9f1","shortMessageHtmlLink":"Increase multiplier to 3 for Inductor AMP FP16 benchmark correctness …"}},{"before":"7e1ba8d5983e4ff31cbf79d0f5dec071d11370cd","after":"c97859ebe9fbbc7694d7ded6acdf045cc1ef884b","ref":"refs/heads/main","pushedAt":"2024-09-18T22:55:01.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"use npt.NDArray instead of np.ndarray in type annotations\n\nSummary:\nX-link: https://github.com/pytorch/pytorch/pull/136288\n\nTo facilitate PSS-2 upgrade, this uses `ndt.NDArray` instead of `nd.ndarray` in type annotations. In Numpy-1.19 (PSS-1) it's an alias to `nd.ndarray` -- a noop.\nIn Numpy-1.24, `ndt.NDArray` a proper generic type, and without this change uses of `nd.ndarray` generate this Pyre type error:\n```counterexample\n Invalid type parameters [24]: Generic type `np.ndarray` expects 2 type parameters.\n```\n\nReviewed By: kit1980\n\nDifferential Revision: D62977370\n\nfbshipit-source-id: f7bc1f621f643e1a1896def4651c5eb132e53f78","shortMessageHtmlLink":"use npt.NDArray instead of np.ndarray in type annotations"}},{"before":"6b1c40bc29de9cba4423bd9993f7fa8e06f67496","after":"7e1ba8d5983e4ff31cbf79d0f5dec071d11370cd","ref":"refs/heads/main","pushedAt":"2024-09-17T19:39:55.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Update mypy to 1.11.2 (#133816)\n\nSummary:\nUpdates mypy to 1.11.1 to improve type inference\n\nX-link: https://github.com/pytorch/pytorch/pull/133816\nApproved by: https://github.com/ezyang\n\nReviewed By: jeanschmidt\n\nDifferential Revision: D62873245\n\nfbshipit-source-id: 4c3e2f3166b2e63ef41405b0368cf8f3af4fe0f3","shortMessageHtmlLink":"Update mypy to 1.11.2 (#133816)"}},{"before":"c709128f583f371fd618e6c9ba88425e76d1be58","after":"6b1c40bc29de9cba4423bd9993f7fa8e06f67496","ref":"refs/heads/main","pushedAt":"2024-09-16T15:16:05.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Remove ignored modes workaround (#135502)\n\nSummary:\nX-link: https://github.com/pytorch/pytorch/pull/135502\nApproved by: https://github.com/anijain2305\nghstack dependencies: #134732, #133137, #135443, #135444, #135422\n\nReviewed By: jeanschmidt\n\nDifferential Revision: D62737308\n\nPulled By: mlazos\n\nfbshipit-source-id: d436c5b5db309e64f17d51dad30a8936567f9a3e","shortMessageHtmlLink":"Remove ignored modes workaround (#135502)"}},{"before":"47413ed5c1aaebebee52c47d7ed274b2bc125cb6","after":"c709128f583f371fd618e6c9ba88425e76d1be58","ref":"refs/heads/main","pushedAt":"2024-09-16T11:39:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"SKIP llama for dynamic size testing (#135960)\n\nSummary:\nRunning Torchbench llama with dynamic size failed with\n```\n File \"/localdisk/leslie/torch_inductor_community/pytorch/torch/fx/experimental/symbolic_shapes.py\", line 4182, in produce_guards\n raise ConstraintViolationError(\ntorch.fx.experimental.symbolic_shapes.ConstraintViolationError: Constraints violated (L['inputs'][0].size()[0])! For more information, run with TORCH_LOGS=\"+dynamic\".\n - Not all values of RelaxedUnspecConstraint(L['inputs'][0].size()[0]) are valid because L['inputs'][0].size()[0] was inferred to be a constant (32).\n```\nSkip this model for marking dynamic dim.\n\nX-link: https://github.com/pytorch/pytorch/pull/135960\nApproved by: https://github.com/ezyang\n\nReviewed By: jeanschmidt\n\nDifferential Revision: D62737135\n\nfbshipit-source-id: 71ef5686e924cfebe0284c986e1cb412b3b499d0","shortMessageHtmlLink":"SKIP llama for dynamic size testing (#135960)"}},{"before":"195d1195914fdb07f83bc54266e637ea38d54233","after":"47413ed5c1aaebebee52c47d7ed274b2bc125cb6","ref":"refs/heads/main","pushedAt":"2024-09-13T14:14:31.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add remote cache time saved to compilation metrics (#2449)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2449\n\nX-link: https://github.com/pytorch/pytorch/pull/135490\n\nRecord remote cache time saved via frame_phase_timing\n\nWe add to the \"phase\" when remote cache hits and saves us time, so that we have a 1:1 correspondence between a frame and time saved.\n\nReviewed By: aorenste\n\nDifferential Revision: D62106921\n\nfbshipit-source-id: 57f84c189fea7a40ad836c7f59f6801d22973c4f","shortMessageHtmlLink":"Add remote cache time saved to compilation metrics (#2449)"}},{"before":"1808fbca648c10552afa1927bfa48eb929411e48","after":"195d1195914fdb07f83bc54266e637ea38d54233","ref":"refs/heads/main","pushedAt":"2024-09-13T14:09:55.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Support fwd_no_grad mode\n\nSummary:\nSome Triton operators have different perf in no_grad mode, since they\ncan avoid saving intermediate results. We want to be able to benchmark them\nboth ways.\n\nReviewed By: xuzhao9\n\nDifferential Revision: D62304098\n\nfbshipit-source-id: 4cc3e6163596fa16570ebccdea38acc519fb5a91","shortMessageHtmlLink":"Support fwd_no_grad mode"}},{"before":"7acad50f066a13490d3f92533c287213152502c3","after":"1808fbca648c10552afa1927bfa48eb929411e48","ref":"refs/heads/main","pushedAt":"2024-09-12T21:19:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"ignore mark_dynamic() in export (#2451)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/benchmark/pull/2451\n\nPreviously we were accomodating `torch._dynamo.mark_dynamic()` for export's dynamic shapes. Here we clean things up and ignore it, requiring users to specify an export input for `dynamic_shapes`.\n\nNote: there's 4 decorators relevant to export, `mark_dynamic, maybe_mark_dynamic, mark_static, mark_unbacked`. User calls that involve export have only been `mark_dynamic()`, and we use `maybe_mark_dynamic` under the hood for `Dim.AUTO`, but we could start using others. One reason I decided to not warn and just silently ignore is these decorators cause the tensors to carry dynamic info, and it'll be hard to tell whether the markers are from export or user calls when re-exporting with the same inputs.\n\ncc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec\n\nX-link: https://github.com/pytorch/pytorch/pull/135536\n\nReviewed By: desertfire, avikchaudhuri\n\nDifferential Revision: D62404528\n\nPulled By: pianpwk\n\nfbshipit-source-id: 1da81cbc71c337b40cc336dfdb93292210ab66aa","shortMessageHtmlLink":"ignore mark_dynamic() in export (#2451)"}},{"before":"48cb7386a7f3da3e406d693f558ab7c528dd7c16","after":"0ae15fa45b4b1c29566cd0c9248afb6b6bfa2ac6","ref":"refs/heads/xz9/add-hstu-ragged","pushedAt":"2024-09-12T19:53:12.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add HSTU ragged attention operator (#2453)\n\nSummary:\nAs the title says.\n\nOn H100:\n```\n$ python run_benchmark.py triton --op ragged_attention\n\n x_val hstu_triton_ragged_attention-latency hstu_triton_ragged_attention_persistent-latency\n----------------- -------------------------------------- -------------------------------------------------\n(8, 4, 512, 2048) 0.0141706 0.0128713\n(8, 4, 512, 2048) 0.0187315 0.0171204\n(8, 4, 512, 2048) 0.0156807 0.0155399\n(8, 4, 512, 2048) 0.0165724 0.0154679\n(8, 4, 512, 2048) 0.0163886 0.0157738\n(8, 4, 512, 2048) 0.0173378 0.0155991\n(8, 4, 512, 2048) 0.0164874 0.0153128\n(8, 4, 512, 2048) 0.0203275 0.0172193\n(8, 4, 512, 2048) 0.0214526 0.0185414\n(8, 4, 512, 2048) 0.0172307 0.0169625\n```\n\n\nDifferential Revision: D62513596\n\nPulled By: xuzhao9","shortMessageHtmlLink":"Add HSTU ragged attention operator (#2453)"}},{"before":"3366be29c01a3135594325606dfdae2332859045","after":"48cb7386a7f3da3e406d693f558ab7c528dd7c16","ref":"refs/heads/xz9/add-hstu-ragged","pushedAt":"2024-09-12T18:03:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"xuzhao9","name":"Xu Zhao","path":"/xuzhao9","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/502017?s=80&v=4"},"commit":{"message":"Fix lints","shortMessageHtmlLink":"Fix lints"}},{"before":"248bc3acb55d806e5135e5eab277b9a1aef7ba38","after":"3366be29c01a3135594325606dfdae2332859045","ref":"refs/heads/xz9/add-hstu-ragged","pushedAt":"2024-09-12T16:44:04.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"xuzhao9","name":"Xu Zhao","path":"/xuzhao9","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/502017?s=80&v=4"},"commit":{"message":"Formatting issue","shortMessageHtmlLink":"Formatting issue"}},{"before":"5c33dece4c5262e74cafec259a6afbd3199c739c","after":"248bc3acb55d806e5135e5eab277b9a1aef7ba38","ref":"refs/heads/xz9/add-hstu-ragged","pushedAt":"2024-09-12T16:43:05.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"xuzhao9","name":"Xu Zhao","path":"/xuzhao9","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/502017?s=80&v=4"},"commit":{"message":"Formatting issue","shortMessageHtmlLink":"Formatting issue"}},{"before":"01b97cc5883f9246b92d270a79fd41973a3630a5","after":"5c33dece4c5262e74cafec259a6afbd3199c739c","ref":"refs/heads/xz9/add-hstu-ragged","pushedAt":"2024-09-12T16:40:50.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"xuzhao9","name":"Xu Zhao","path":"/xuzhao9","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/502017?s=80&v=4"},"commit":{"message":"Bugfix","shortMessageHtmlLink":"Bugfix"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0yOFQwMTozNzo0My4wMDAwMDBazwAAAATDI3V2","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0xMlQxNjo0MDo1MC4wMDAwMDBazwAAAAS0oRx-"}},"title":"Activity · pytorch/benchmark"}