Documentation

Apple Metal Documentation

Complete guide for setting up and optimizing ManifoldScript on Apple Silicon with Metal.

Prerequisites

  • Mac with Apple Silicon (M1, M2, M3, or later)
  • macOS 14.0 Sonoma or later
  • Xcode 15.0 or later
  • Metal Framework support
  • At least 16GB of unified memory
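The hardware and OS requirements above can be checked with a short script before installing. This is a minimal sketch, not part of ManifoldScript: the function names are hypothetical, it does not check the Xcode version, and it assumes Python runs natively (under Rosetta, `platform.machine()` reports `x86_64`).

```python
import platform
import subprocess

MIN_MACOS = (14, 0)               # macOS 14.0 Sonoma, from the list above
MIN_MEMORY_BYTES = 16 * 1024**3   # 16 GB of unified memory


def parse_version(text):
    """Turn '14.2.1' into (14, 2, 1); an empty string becomes ()."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_prerequisites(machine, macos_version, memory_bytes):
    """Return a list of unmet requirements (empty when all pass)."""
    problems = []
    if machine != "arm64":
        problems.append("not an Apple Silicon (arm64) machine")
    if parse_version(macos_version) < MIN_MACOS:
        problems.append("macOS older than 14.0")
    if memory_bytes < MIN_MEMORY_BYTES:
        problems.append("less than 16 GB of memory")
    return problems


def detect_memory_bytes():
    """Read physical memory via `sysctl -n hw.memsize` (macOS only)."""
    if platform.system() != "Darwin":
        return 0
    out = subprocess.run(["sysctl", "-n", "hw.memsize"],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip())


if __name__ == "__main__":
    issues = meets_prerequisites(platform.machine(),
                                 platform.mac_ver()[0],
                                 detect_memory_bytes())
    print("OK" if not issues else "\n".join(issues))
```

Keeping `meets_prerequisites` free of system calls makes it easy to test with made-up values on any machine.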

Installation

# Install Xcode command line tools
xcode-select --install

# Install ManifoldScript with Metal support
curl -fsSL https://get.manifoldscript.dev/metal | bash

# Verify installation
manifoldscript --version
manifoldscript --check-metal
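
The two verification commands above can also be run from a script, for example in a setup check. The sketch below assumes, without confirmation from the ManifoldScript documentation, that both commands exit with status 0 on success; `verify_install` is a hypothetical helper name.

```python
import shutil
import subprocess


def verify_install(binary="manifoldscript"):
    """Run the two verification commands; return True only if both exit with 0."""
    if shutil.which(binary) is None:
        print(f"{binary} not found on PATH")
        return False
    for flag in ("--version", "--check-metal"):
        result = subprocess.run([binary, flag], capture_output=True, text=True)
        print(f"{binary} {flag}: exit {result.returncode}")
        if result.returncode != 0:
            return False
    return True


if __name__ == "__main__":
    print("install looks good" if verify_install() else "install check failed")
```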

Metal Architecture

ManifoldScript Metal Architecture

graph TD
    A[ManifoldScript Source] --> B[Metal Frontend]
    B --> C[MSL Generation]
    C --> D[Metal Library]
    D --> E[Compute Pipeline]
    E --> F[GPU Execution]
    G[Memory Manager] --> H[Buffer Allocator]
    H --> I[Shared Memory]
    I --> J[Metal Kernels]
    K[Command Queue] --> L[Command Buffer]
    L --> M[Encoder]
    M --> F
    N[Threadgroup] --> O[Simd Groups]
    O --> P[Individual Threads]
    P --> F
    classDef frontend fill:#e1f5fe
    classDef compilation fill:#f3e5f5
    classDef execution fill:#e8f5e9
    classDef memory fill:#fff3e0
    class B,C,D,E compilation
    class F,J execution
    class G,H,I memory
    class K,L,M,N,O,P frontend
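
The threadgroup, SIMD group, and thread levels in the diagram can be made concrete with a little index arithmetic. This sketch assumes a SIMD width of 32 and a one-dimensional dispatch. On a real device the width should be read from the compute pipeline rather than hard-coded, and the names here are illustrative only.

```python
from typing import NamedTuple

SIMD_WIDTH = 32  # assumed; on a real device read the pipeline's threadExecutionWidth


class ThreadPosition(NamedTuple):
    threadgroup: int   # which threadgroup the thread belongs to
    simd_group: int    # which SIMD group inside that threadgroup
    lane: int          # position inside the SIMD group


def threadgroups_needed(total_threads, threads_per_group):
    """Ceiling division: enough threadgroups to cover every thread."""
    return (total_threads + threads_per_group - 1) // threads_per_group


def locate(thread_index, threads_per_group, simd_width=SIMD_WIDTH):
    """Map a flat thread index onto the threadgroup / SIMD group / lane hierarchy."""
    group, within_group = divmod(thread_index, threads_per_group)
    simd_group, lane = divmod(within_group, simd_width)
    return ThreadPosition(group, simd_group, lane)


print(threadgroups_needed(1000, 256))  # 4 threadgroups cover 1000 threads
print(locate(300, 256))                # threadgroup 1, SIMD group 1, lane 12
```

With 1000 threads and 256 threads per threadgroup, the last threadgroup is only partly used, so a kernel dispatched this way needs a bounds check on the thread index.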

Unified Memory Architecture

Apple Silicon Unified Memory with Metal

graph TB
    subgraph "Unified Memory Space"
        A[CPU Access] --> B[Memory Coherency]
        B --> C[GPU Access]
        C --> D[Cache System]
        D --> E[Memory Controller]
    end
    subgraph "ManifoldScript Runtime"
        F[Tensor Allocator] --> G[Page Tables]
        G --> H[Memory Mapping]
        H --> I[Zero-Copy Bridge]
    end
    subgraph "Metal Framework"
        J[MTLBuffer] --> K[MTLTexture]
        K --> L[MTLArgumentEncoder]
        L --> M[Compute Command]
    end
    I -->|Direct Access| J
    E -->|Physical Memory| J
    classDef unified fill:#e3f2fd
    classDef runtime fill:#f3e5f5
    classDef metal fill:#e8f5e9
    class A,B,C,D,E unified
    class F,G,H,I runtime
    class J,K,L,M metal
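
The zero-copy idea in the diagram is that the CPU and GPU address the same physical bytes, so nothing is copied between them. The Python sketch below is only an analogy, not ManifoldScript or Metal code: two typed views share one allocation, and a write through one is visible through the other. The 16 KiB page size matches Apple Silicon. Rounding the allocation up to whole pages is an assumption about how a zero-copy bridge might size its buffers.

```python
PAGE_SIZE = 16 * 1024  # Apple Silicon uses 16 KiB pages


def page_aligned_length(nbytes, page_size=PAGE_SIZE):
    """Round a byte count up to a whole number of pages."""
    return -(-nbytes // page_size) * page_size


# One allocation, sized for 1000 float32 values and rounded up to full pages.
storage = bytearray(page_aligned_length(1000 * 4))

# Two typed views over the same bytes; neither view owns a copy.
cpu_view = memoryview(storage).cast("f")
gpu_view = memoryview(storage).cast("f")

cpu_view[0] = 3.5      # the "CPU" side writes a value
print(gpu_view[0])     # the "GPU" side reads 3.5 from the same memory
print(len(storage))    # 16384 bytes: one full page
```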

Performance Optimization

Memory Optimization

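  • Use shared storage mode (MTLStorageModeShared) for buffers so the CPU and GPU access the same memory without copies
  • Enable heap allocation for large tensors
  • Use texture samplers for 2D data
  • Keep threadgroup memory usage within the device limit (maxThreadgroupMemoryLength)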
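
As an illustration of budgeting threadgroup memory, the sketch below picks the largest power-of-two tile size for a tiled kernel that keeps two float32 tiles resident at once. The 32 KiB budget is an assumption. The real limit should be queried from the device, and the two-tile layout is a hypothetical example rather than something ManifoldScript is known to generate.

```python
THREADGROUP_MEMORY_BYTES = 32 * 1024  # assumed budget; query the device for the real limit


def max_tile_dim(budget_bytes, bytes_per_element=4, tiles=2):
    """Largest power-of-two T such that `tiles` T x T tiles fit in the budget."""
    if tiles * bytes_per_element > budget_bytes:
        raise ValueError("budget too small for even a 1 x 1 tile")
    dim = 1
    # Keep doubling while the next size up still fits.
    while tiles * (dim * 2) ** 2 * bytes_per_element <= budget_bytes:
        dim *= 2
    return dim


tile = max_tile_dim(THREADGROUP_MEMORY_BYTES)
print(tile, 2 * tile * tile * 4)  # tile edge and bytes used by both tiles
```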

Kernel Optimization

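  • Maximize SIMD group utilization by choosing threadgroup sizes that are multiples of the SIMD width
  • Use threadgroup memory to share data between threads in the same threadgroup
  • Enable argument buffers to reduce per-dispatch resource binding overhead
  • Use atomic operations sparingly, since contended atomics serialize threads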
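
SIMD group utilization can be estimated before a kernel is dispatched. The sketch below assumes a SIMD width of 32, which should be read from the compute pipeline on a real device. It computes the fraction of lanes doing useful work for a given threadgroup size, along with the next size that fills every SIMD group. The helper names are illustrative.

```python
SIMD_WIDTH = 32  # assumed; read threadExecutionWidth from the compute pipeline on a real device


def simd_utilization(threads_per_group, simd_width=SIMD_WIDTH):
    """Fraction of SIMD lanes doing useful work in one threadgroup."""
    simd_groups = -(-threads_per_group // simd_width)  # ceiling division
    return threads_per_group / (simd_groups * simd_width)


def round_up_to_simd(threads_per_group, simd_width=SIMD_WIDTH):
    """Smallest multiple of the SIMD width that is >= the requested size."""
    return -(-threads_per_group // simd_width) * simd_width


for size in (48, 64, 100):
    print(size, simd_utilization(size), round_up_to_simd(size))
```

A threadgroup of 48 threads occupies two SIMD groups but fills only three quarters of their lanes, while 64 threads fill both completely.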

Code Example

(manifold metal_ops
  :requirements (:metal :apple_silicon)
  :types tensor - metal_tensor
  
  :action (neural_network 
    :parameters (?input ?weights ?output - tensor)
    :pre (:metal (?input :tensor) (?weights :tensor))
    :eff (:metal (?output :tensor)
      (metal_conv2d ?input ?weights ?output)
      (metal_activation ?output "relu")
    )
  )
)

# Compile with Metal optimization
manifoldscript compile --target=metal --arch=arm64 metal_ops.ms
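
To sanity-check the output of the `neural_network` action, it can help to have a small CPU reference for the two operations it chains. The sketch below is plain Python and makes assumptions the source does not confirm: a single channel, stride 1, no padding, and cross-correlation semantics for `metal_conv2d`. The function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Single-channel 2D cross-correlation, stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(matrix):
    """Element-wise max(x, 0)."""
    return [[max(value, 0) for value in row] for row in matrix]


image = [[5, 1, 2],
         [0, 3, 4],
         [6, 0, 1]]
kernel = [[1, 0],
          [0, -1]]

conv = conv2d_valid(image, kernel)
print(conv)        # [[2, -3], [0, 2]]
print(relu(conv))  # [[2, 0], [0, 2]]
```

Comparing a GPU result against a reference like this on a few small inputs is a quick way to catch layout or padding mismatches.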