Lines Matching full:tile
136 "expected tile ID to be allocated before conversion to LLVM");
140 /// Creates an alloca matching the size of tile used by `tileOp`. The alloca is
149 // Create an alloca matching the tile size of the `tileOp`.
163 /// Finds or creates an alloca for a spill of a tile.
188 /// hardware tile ID) to ArmSME intrinsics. Currently, this works by assigning
189 /// the op to tile 0, then emitting a full tile swap between ZA and memory
190 /// before + after the tile op.
194 /// // Note: <IN MEMORY TILE> = tile ID >= 16.
195 /// arm_sme.tile_op { tile_id = <IN MEMORY TILE> }
226 /// TileRead, // Omit store after tile operation.
227 /// TileWrite, // Omit load before tile operation.
228 /// TileReadWrite, // Needs both tile load and store.
249 // Tile has a real (hardware) tile. No spills/reloads required.
254 "failed to allocate SME virtual tile to operation, tile value will go "
257 // Step 1. Create an alloca for the tile at the top of the function (if one
264 // Step 2. Assign the op a real tile ID.
265 // For simplicity, we always use tile 0 (which always exists).
277 // Step 3. Emit tile swaps before and after the op.
279 // touches (i.e. a single tile slice).
282 // Swap the contents of ZA and the in-memory tile before the op.
285 // Swap the tile back out to memory again after the op.
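
The three steps in the comments above (allocate memory for the tile, assign tile 0, swap before and after the op) can be modelled in plain C++ outside MLIR. This is only a sketch under stated assumptions: `Tile`, `swapTileSlice`, `swapTile`, and `runOnSpilledTile` are hypothetical names that do not appear in the source, and a tile is modelled as a square matrix of `int` rather than real ZA state. It shows only the shape of the approach: swap the in-memory tile into tile 0, run the op, then swap it back out.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model: a tile is a square matrix; each row is one tile slice.
using Tile = std::vector<std::vector<int>>;

// Swap one slice of the tile held in "ZA" with the same slice in memory.
void swapTileSlice(Tile &za, Tile &tileAlloca, std::size_t slice) {
  std::swap(za[slice], tileAlloca[slice]);
}

// Full in-place swap: visit every tile slice (the role a loop such as
// scf.for plays in the real lowering).
void swapTile(Tile &za, Tile &tileAlloca) {
  for (std::size_t slice = 0; slice < za.size(); ++slice)
    swapTileSlice(za, tileAlloca, slice);
}

// Run `op` on an in-memory tile by temporarily borrowing tile 0.
void runOnSpilledTile(Tile &za0, Tile &tileAlloca,
                      const std::function<void(Tile &)> &op) {
  swapTile(za0, tileAlloca); // In-memory tile is now in ZA; old ZA in memory.
  op(za0);                   // The op only ever sees tile 0.
  swapTile(za0, tileAlloca); // Result goes back to memory; old ZA restored.
}
```

Because both swaps are in place, whatever tile 0 held before the op is preserved across it, at the cost of two full passes over the tile.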
292 /// Extracts a pointer to a slice of an in-memory tile.
307 /// Emits an in-place swap of a slice of a tile in ZA and a slice of a
308 /// tile-sized memref (`tileAlloca`).
327 // Load the new tile slice back from memory into ZA.
331 // Store the current tile slice to memory.
337 /// Emits a full in-place swap of the contents of a tile in ZA and a
338 /// tile-sized memref (`tileAlloca`).
343 // Create an scf.for over all tile slices.
351 // Emit a swap for each tile slice.
362 /// spills and fills around ArmSME ops that use in-memory tile IDs. This can be
430 // Get the base mask for tile based on the element size.
431 // The base mask is just the mask to zero the first tile (of a size).
439 // Zeroing the 8-bit ZA0.B tile is equivalent to zeroing all eight
443 // Zeroing the 16-bit ZA0.H tile is equivalent to zeroing 64-bit
448 // Zeroing the 32-bit ZA0.S tile is equivalent to zeroing 64-bit
454 // setting the bit for that tile.
461 // The actual mask is just the base mask shifted by the tile ID.
462 // This will be folded to a constant after tile allocation.
464 // The shift is just derived from the layout of the tiles, and that the tile
465 // ID is the index of the tile. For example, looking at the 32-bit ZAx.S
469 // * Tile ID -> 0
472 // * Tile ID -> 1
475 // * Tile ID -> 2
478 // * Tile ID -> 3
481 // This holds for all tile sizes.
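
The "base mask shifted by the tile ID" rule described in the comments above can be checked with a small standalone C++ sketch. The function names `baseZeroMask` and `zeroMask` are hypothetical, and the base mask constants are an assumption taken from the Arm `ZERO` instruction's mask layout (one bit per 64-bit tile ZA0.D to ZA7.D), since the matched lines above are truncated and do not show the values themselves. The sketch does not validate that the tile ID is in range for the element width.

```cpp
#include <cstdint>
#include <stdexcept>

// Base mask: the mask that zeroes the first tile of a given element width.
// Bit i selects the 64-bit tile ZAi.D (assumed layout, see lead-in).
std::uint32_t baseZeroMask(unsigned elementBits) {
  switch (elementBits) {
  case 8:
    return 0xFF; // ZA0.B covers all eight 64-bit tiles.
  case 16:
    return 0x55; // ZA0.H covers every second 64-bit tile.
  case 32:
    return 0x11; // ZA0.S covers every fourth 64-bit tile.
  case 64:
    return 0x01; // ZA0.D is a single 64-bit tile.
  default:
    throw std::invalid_argument("unsupported element width");
  }
}

// The actual mask is the base mask shifted left by the tile ID.
std::uint32_t zeroMask(unsigned elementBits, unsigned tileId) {
  return baseZeroMask(elementBits) << tileId;
}
```

For the 32-bit case, tile IDs 0 through 3 yield four masks that do not overlap and together cover all eight bits, which is what the layout argument in the comments relies on.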
516 // Cast tile slice to i32 for intrinsic.
527 // Create 'arm_sme.intr.ld1*.(horiz|vert)' intrinsic to load ZA tile slice.
532 // the input tile to preserve dataflow.
555 // Create 'arm_sme.intr.st1*.horiz' intrinsic to store ZA tile slice.
562 // Cast tile slice to i32 for intrinsic.
598 // Cast tile slice from index to i32 for intrinsic.
610 // Create 'arm_sme.intr.write.(horiz|vert)' to write vector to tile slice.
625 // the input tile to preserve dataflow.
648 // Create an 'all true' predicate for the tile slice.
653 // Zero destination/fallback for tile slice extraction.
657 // Cast tile slice from index to i32 for intrinsic.
661 // Create 'arm_sme.intr.read.(horiz|vert)' to extract the tile slice.
770 // 'arm_sme.outerproduct' with the input tile to preserve dataflow.
815 // 'arm_sme.outerproduct' with the input tile to preserve dataflow.
920 // tile types after conversion.
931 op->emitOpError("unexpected operation with SME tile type after "
968 /* The following are used to lower tile spills/fills */