mirror of https://github.com/matrix-construct/construct synced 2024-05-20 03:43:47 +02:00
26 commits

Author SHA1 Message Date
Jason Volk d377674748 ircd::simt: Split vector reduce_add to hadd. 2023-01-01 19:16:06 -08:00
Jason Volk 561be9973a ircd::simt::norm: Barrier for overlapping input and output buffers. (gfx1011) 2022-11-03 18:28:53 +00:00
Jason Volk e89703aa97 ircd::gpt::gpu: Limit dispatch to the number of control frame buffers. 2022-10-30 18:44:50 +00:00
Jason Volk 2609c21913 ircd::gpt::pipe: Enable mutable model; fixes for backpropagation; range stub. 2022-10-18 22:01:35 +00:00
Jason Volk 6a05fcefeb ircd::simt: Consolidate timestamp counter sampling into inline. 2022-10-12 18:53:53 -07:00
Jason Volk 54e3b8f5b4 ircd::simt: Consolidate portables and macros from units into headers. 2022-10-11 02:13:46 +00:00
Jason Volk 831141727b ircd::gpt::gpu: Add assert macro when trapping supported. 2022-10-09 03:29:29 +00:00
Jason Volk c1168fcc30 ircd::gpt: Resolve behavior of opts.limit: 0=analysis, -n=unlimited, n=limited. 2022-10-09 03:13:09 +00:00
Jason Volk 9682f406b3 ircd::gpt::gpu: Mute printf() on unsupporting platforms. 2022-10-06 22:26:19 +00:00
Jason Volk 442dad869d ircd::gpt: Resolve cycle count sampling; add debug log; fix count. 2022-10-06 22:01:41 +00:00
Jason Volk 0917a1f041 ircd::gpt::pipe: Resolve control page sync at ends of sample. 2022-10-06 18:54:29 +00:00
Jason Volk 33afa8a4fc ircd::gpt::gpu: Add global fence between attn and ffnn accumulations (gfx1011). 2022-10-05 20:10:31 +00:00
Jason Volk c4cceb425c ircd::gpt::gpu: Use explicit broadcast for local access. 2022-10-05 20:08:40 +00:00
Jason Volk 331a417656 ircd::gpt::gpu: Fix keywording for OpenCL 2.0+. 2022-10-02 01:30:10 +00:00
Jason Volk e85ed0e0dd ircd::gpt: Remove various cruft. 2022-09-24 16:40:39 -07:00
Jason Volk 6d2da3b4f1 ircd::gpt::task: Refactor generator interface to member functions. 2022-07-01 20:17:56 -07:00
Jason Volk 78848925ee ircd::gpt: Various refactoring. 2022-06-19 20:14:22 -07:00
Jason Volk b7b1328352 ircd::gpt::pipe: Reuse logsm buffer for logexp intermediate values. 2022-06-17 21:11:53 -07:00
Jason Volk 8f90e7c0cd ircd::gpt: Optimizations for matrix multiply. 2021-10-06 13:13:47 -07:00
Jason Volk aea6c79fc2 ircd::gpt: Add top N and target label result register control block. 2021-10-06 13:13:47 -07:00
Jason Volk 8bd78af128 ircd::gpt: Additional task header/interface simplification. 2021-10-06 13:13:47 -07:00
Jason Volk c1f3e580c3 ircd::gpt: Add top_p lmhead selector, quantized for now. 2021-10-06 13:13:47 -07:00
Jason Volk 8a3eeb46f9 ircd::gpt::pipe: Optimize pipeline to cache attention state for generations. 2021-10-06 13:13:47 -07:00
Jason Volk c5f159ad58 ircd::gpt: Cleanup/improve work item related prologues. 2021-10-06 13:13:47 -07:00
Jason Volk ce9abfb321 ircd::gpt::model: Optimize left-attention mask. 2021-10-06 13:13:47 -07:00
Jason Volk 20162fd7d5 ircd::gpt: Splits and renames; various reorg. 2021-09-15 01:44:36 -07:00
Renamed from ircd/gpt_cl.cl