Jason Volk | d377674748 | ircd::simt: Split vector reduce_add to hadd. | 2023-01-01 19:16:06 -08:00 |
Jason Volk | 561be9973a | ircd::simt::norm: Barrier for overlapping input and output buffers. (gfx1011) | 2022-11-03 18:28:53 +00:00 |
Jason Volk | e89703aa97 | ircd::gpt::gpu: Limit dispatch to the number of control frame buffers. | 2022-10-30 18:44:50 +00:00 |
Jason Volk | 2609c21913 | ircd::gpt::pipe: Enable mutable model; fixes for backpropagation; range stub. | 2022-10-18 22:01:35 +00:00 |
Jason Volk | 6a05fcefeb | ircd::simt: Consolidate timestamp counter sampling into inline. | 2022-10-12 18:53:53 -07:00 |
Jason Volk | 54e3b8f5b4 | ircd::simt: Consolidate portables and macros from units into headers. | 2022-10-11 02:13:46 +00:00 |
Jason Volk | 831141727b | ircd::gpt::gpu: Add assert macro when trapping supported. | 2022-10-09 03:29:29 +00:00 |
Jason Volk | c1168fcc30 | ircd::gpt: Resolve behavior of opts.limit: 0=analysis, -n=unlimited, n=limited. | 2022-10-09 03:13:09 +00:00 |
Jason Volk | 9682f406b3 | ircd::gpt::gpu: Mute printf() on unsupporting platforms. | 2022-10-06 22:26:19 +00:00 |
Jason Volk | 442dad869d | ircd::gpt: Resolve cycle count sampling; add debug log; fix count. | 2022-10-06 22:01:41 +00:00 |
Jason Volk | 0917a1f041 | ircd::gpt::pipe: Resolve control page sync at ends of sample. | 2022-10-06 18:54:29 +00:00 |
Jason Volk | 33afa8a4fc | ircd::gpt::gpu: Add global fence between attn and ffnn accumulations (gfx1011). | 2022-10-05 20:10:31 +00:00 |
Jason Volk | c4cceb425c | ircd::gpt::gpu: Use explicit broadcast for local access. | 2022-10-05 20:08:40 +00:00 |
Jason Volk | 331a417656 | ircd::gpt::gpu: Fix keywording for OpenCL 2.0+. | 2022-10-02 01:30:10 +00:00 |
Jason Volk | e85ed0e0dd | ircd::gpt: Remove various cruft. | 2022-09-24 16:40:39 -07:00 |
Jason Volk | 6d2da3b4f1 | ircd::gpt::task: Refactor generator interface to member functions. | 2022-07-01 20:17:56 -07:00 |
Jason Volk | 78848925ee | ircd::gpt: Various refactoring. | 2022-06-19 20:14:22 -07:00 |
Jason Volk | b7b1328352 | ircd::gpt::pipe: Reuse logsm buffer for logexp intermediate values. | 2022-06-17 21:11:53 -07:00 |
Jason Volk | 8f90e7c0cd | ircd::gpt: Optimizations for matrix multiply. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | aea6c79fc2 | ircd::gpt: Add top N and target label result register control block. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | 8bd78af128 | ircd::gpt: Additional task header/interface simplification. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | c1f3e580c3 | ircd::gpt: Add top_p lmhead selector, quantized for now. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | 8a3eeb46f9 | ircd::gpt::pipe: Optimize pipeline to cache attention state for generations. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | c5f159ad58 | ircd::gpt: Cleanup/improve work item related prologues. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | ce9abfb321 | ircd::gpt::model: Optimize left-attention mask. | 2021-10-06 13:13:47 -07:00 |
Jason Volk | 20162fd7d5 | ircd::gpt: Splits and renames; various reorg. | 2021-09-15 01:44:36 -07:00 |