Getting Started with ManifoldScript
A comprehensive guide to installing, configuring, and running your first ManifoldScript programs across different GPU platforms.
Prerequisites
System Requirements
- Operating System: Linux (Ubuntu 20.04+), macOS (12.0+), or Windows 10+ (via WSL2)
- GPU: NVIDIA GPU with CUDA 11.0+, Apple Silicon with Metal 3, or AMD GPU with ROCm 5.0+
- Memory: Minimum 8GB RAM, 4GB GPU memory
- Compiler: GCC 9.0+, Clang 12.0+, or MSVC 2019+
Installation
NVIDIA CUDA Installation
Linux (Ubuntu/Debian)
```bash
# Install CUDA toolkit
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda-repo-ubuntu2004-12-0-local_12.0.0-525.60.13-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-12-0-local_12.0.0-525.60.13-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-12-0-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

# Install ManifoldScript
curl -fsSL https://get.manifoldscript.dev/cuda | bash

# Verify installation
manifoldscript --version
manifoldscript --check-gpu
```
Apple Metal Installation
macOS (Apple Silicon)
```bash
# Install Xcode command line tools
xcode-select --install

# Install ManifoldScript
curl -fsSL https://get.manifoldscript.dev/metal | bash

# Verify installation
manifoldscript --version
manifoldscript --check-gpu
```
AMD ROCm Installation
Linux (Ubuntu/Debian)
```bash
# Install ROCm
sudo apt update
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
sudo usermod -a -G render,video $LOGNAME
wget https://repo.radeon.com/amdgpu-install/5.0/ubuntu/focal/amdgpu-install_5.0.50000-1_all.deb
sudo apt install ./amdgpu-install_5.0.50000-1_all.deb
sudo apt update
sudo apt install rocm-dev

# Install ManifoldScript
curl -fsSL https://get.manifoldscript.dev/rocm | bash

# Verify installation
manifoldscript --version
manifoldscript --check-gpu
```
Your First Program
1. Create a Simple Tensor Program
Let's create a simple matrix multiplication program:
```manifoldscript
# Create a file called hello.ms
# Simple matrix multiplication
tensor A[1024, 1024] = random(1024, 1024);
tensor B[1024, 1024] = random(1024, 1024);
tensor C[1024, 1024] = A @ B;

# Print result
debug("Matrix multiplication completed");
debug("Result shape: " + shape(C));
debug("Result sum: " + sum(C));
```
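For readers more familiar with NumPy, the following Python sketch shows the same computation. It is only an illustration of what the program does, not how ManifoldScript executes it (ManifoldScript compiles to native GPU code; NumPy runs on the CPU):

```python
import numpy as np

# Equivalent of: tensor A[1024, 1024] = random(1024, 1024);
rng = np.random.default_rng(0)
A = rng.random((1024, 1024))
B = rng.random((1024, 1024))

# Equivalent of: tensor C[1024, 1024] = A @ B;
C = A @ B

print("Matrix multiplication completed")
print("Result shape:", C.shape)
print("Result sum:", C.sum())  # varies with the random inputs
```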
2. Compile and Run
```bash
# Compile for your GPU platform
manifoldscript compile hello.ms

# Run the compiled program
./hello

# Expected output:
# Matrix multiplication completed
# Result shape: [1024, 1024]
# Result sum: <varies, since A and B are random>
```
Understanding ManifoldScript
Tensor Declaration
```manifoldscript
# Declare tensors with explicit shapes
tensor A[100, 200] = zeros(100, 200);   # Zero matrix
tensor B[100, 200] = ones(100, 200);    # Ones matrix
tensor C[100, 200] = random(100, 200);  # Random values
tensor D[100, 100] = identity(100);     # Identity matrix (always square)

# Infer shapes from expressions
tensor E = A + B;   # Shape: [100, 200]
tensor F = A @ C';  # Shape: [100, 100]
```
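The shape rules above mirror the familiar NumPy ones; as a point of reference (NumPy here is an illustration, not part of ManifoldScript):

```python
import numpy as np

A = np.zeros((100, 200))       # zeros(100, 200)
B = np.ones((100, 200))        # ones(100, 200)
C = np.random.rand(100, 200)   # random(100, 200)
D = np.eye(100)                # identity(100): identity matrices are square

# Shapes inferred from the expressions
E = A + B    # shape (100, 200)
F = A @ C.T  # (100, 200) @ (200, 100) -> shape (100, 100)
```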
Operations
```manifoldscript
# Element-wise operations
tensor G = A + B;  # Addition
tensor H = A - B;  # Subtraction
tensor I = A * B;  # Element-wise multiplication
tensor J = A / B;  # Element-wise division

# Matrix operations
tensor K = A @ B';        # Matrix multiplication
tensor L = transpose(A);  # Transpose
tensor M = inverse(A);    # Matrix inverse (A must be square)

# Reduction operations
tensor sum_A = sum(A);    # Sum of all elements
tensor max_A = max(A);    # Maximum value
tensor mean_A = mean(A);  # Mean value
```
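Note the distinction between `*` (element-wise) and `@` (matrix product), and that `inverse` is only defined for square, non-singular matrices. A NumPy sketch of the same operations (an illustration only, with a small well-conditioned matrix so the inverse exists):

```python
import numpy as np

# Square, diagonally dominant matrix: guaranteed invertible
A = np.random.rand(4, 4) + 4 * np.eye(4)
B = np.random.rand(4, 4)

G = A + B             # addition
I = A * B             # element-wise (Hadamard) product, NOT matmul
K = A @ B.T           # matrix product with B transposed
M = np.linalg.inv(A)  # inverse: square, non-singular matrices only

# Reductions collapse the tensor to a scalar
sum_A, max_A, mean_A = A.sum(), A.max(), A.mean()
```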
GPU Optimization
Memory Management
```manifoldscript
# Optimize memory usage
tensor A[1000, 1000] = random(1000, 1000);
tensor B[1000, 1000] = random(1000, 1000);

# Use temporary tensors efficiently
temp T1 = A @ B;  # Temporary tensor
T1 = T1 + 1.0;    # Reuse the temporary in place

# Explicit memory management
free(A);  # Free memory when done
free(B);
```
Performance Tuning
```manifoldscript
# Use optimized operations
pragma optimize_level = 3;
pragma use_shared_memory = true;
pragma max_threads_per_block = 1024;

# Block operations for better performance
block_size = 32;
tensor C[1024, 1024] = blocked_matmul(A, B, block_size);
```
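The idea behind blocked (tiled) matrix multiplication is that each `block_size × block_size` tile of the inputs can be staged in fast GPU shared memory and reused many times. The Python sketch below illustrates the tiling scheme only; it is not the code the ManifoldScript compiler generates:

```python
import numpy as np

def blocked_matmul(A, B, block_size=32):
    """Tiled matmul: C is accumulated one block_size x block_size tile at a time."""
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m))
    for i in range(0, n, block_size):
        for j in range(0, m, block_size):
            for p in range(0, k, block_size):
                # On a GPU, each pair of input tiles would live in shared memory
                C[i:i+block_size, j:j+block_size] += (
                    A[i:i+block_size, p:p+block_size]
                    @ B[p:p+block_size, j:j+block_size]
                )
    return C

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
assert np.allclose(blocked_matmul(A, B, 32), A @ B)
```

NumPy slicing handles edge tiles automatically, so the sketch also works when the matrix size is not a multiple of `block_size`.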
Common Patterns
Neural Network Layer
```manifoldscript
# Simple neural network layer
function dense_layer(input[batch, in_features],
                     weights[in_features, out_features],
                     bias[out_features]) {
    return input @ weights + bias;
}

# Usage
tensor X[64, 784] = random(64, 784);    # Input batch
tensor W[784, 256] = random(784, 256);  # Weights
tensor b[256] = zeros(256);             # Bias
tensor output = dense_layer(X, W, b);
```
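The layer is a single matrix product plus a bias that broadcasts across the batch dimension. The same computation in NumPy, for reference (an illustration, not ManifoldScript's implementation):

```python
import numpy as np

def dense_layer(x, weights, bias):
    # input @ weights + bias; bias broadcasts over the batch dimension
    return x @ weights + bias

X = np.random.rand(64, 784)   # input batch
W = np.random.rand(784, 256)  # weights
b = np.zeros(256)             # bias
output = dense_layer(X, W, b)  # shape (64, 256)
```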
Convolution Operation
```manifoldscript
# Convolution operation
function conv2d(input[H, W, C_in],
                filters[K, K, C_in, C_out],
                stride = 1,
                padding = 0) {
    H_out = (H + 2 * padding - K) / stride + 1;
    W_out = (W + 2 * padding - K) / stride + 1;

    tensor output[H_out, W_out, C_out];

    # Convolution implementation
    # (indexing assumes padding = 0; for padding > 0 the input must be
    #  zero-padded first, or the index shifted by -padding with bounds checks)
    for h = 0 to H_out-1:
        for w = 0 to W_out-1:
            for c_out = 0 to C_out-1:
                sum = 0.0;
                for k1 = 0 to K-1:
                    for k2 = 0 to K-1:
                        for c_in = 0 to C_in-1:
                            sum += input[h*stride+k1, w*stride+k2, c_in] *
                                   filters[k1, k2, c_in, c_out];
                output[h, w, c_out] = sum;

    return output;
}
```
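A NumPy reference implementation makes the output-size formula easy to check against concrete shapes. This sketch zero-pads the input explicitly before the loops (an illustration of the same algorithm, not ManifoldScript's generated code):

```python
import numpy as np

def conv2d(inp, filters, stride=1, padding=0):
    H, W, C_in = inp.shape
    K, _, _, C_out = filters.shape
    # Same output-size formula as the ManifoldScript version
    H_out = (H + 2 * padding - K) // stride + 1
    W_out = (W + 2 * padding - K) // stride + 1
    if padding:
        inp = np.pad(inp, ((padding, padding), (padding, padding), (0, 0)))
    out = np.zeros((H_out, W_out, C_out))
    for h in range(H_out):
        for w in range(W_out):
            # K x K x C_in patch under the filter window
            patch = inp[h*stride:h*stride+K, w*stride:w*stride+K, :]
            for c in range(C_out):
                out[h, w, c] = np.sum(patch * filters[:, :, :, c])
    return out

x = np.random.rand(8, 8, 3)     # 8x8 image, 3 channels
f = np.random.rand(3, 3, 3, 4)  # 3x3 kernel, 3 -> 4 channels
y = conv2d(x, f, stride=1, padding=1)  # (8 + 2*1 - 3)//1 + 1 = 8
```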
Debugging and Testing
Debugging Tools
```bash
# Enable debug mode
manifoldscript compile --debug hello.ms

# Profile execution
manifoldscript compile --profile hello.ms

# Check memory usage
manifoldscript compile --memory-profile hello.ms

# Generate intermediate representations
manifoldscript compile --emit-ir --emit-asm hello.ms
```
Testing Your Code
```manifoldscript
# Test tensor operations
test "matrix multiplication" {
    tensor A[2, 2] = [[1, 2], [3, 4]];
    tensor B[2, 2] = [[5, 6], [7, 8]];
    tensor C = A @ B;

    expected = [[19, 22], [43, 50]];
    assert(allclose(C, expected, 1e-6));
}

# Performance test
test "performance benchmark" {
    tensor A[1024, 1024] = random(1024, 1024);
    tensor B[1024, 1024] = random(1024, 1024);

    start_time = time();
    tensor C = A @ B;
    end_time = time();

    assert(end_time - start_time < 100.0);  # Less than 100 ms
}
```
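The expected values in the first test can be verified by hand or with NumPy:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = A @ B
# Row 0: [1*5 + 2*7, 1*6 + 2*8] = [19, 22]
# Row 1: [3*5 + 4*7, 3*6 + 4*8] = [43, 50]
assert np.array_equal(C, np.array([[19, 22], [43, 50]]))
```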