Zero-Cost Abstractions
Zero-cost abstractions are one of Rust's most important principles: you can write high-level code with powerful abstractions without paying a performance penalty for them. With optimizations enabled, the compiler lowers these abstractions to machine code that is typically as efficient as hand-written low-level code.
The Principle
“What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.”
- Bjarne Stroustrup
In Rust:
- Well-designed abstractions add no runtime overhead
- The compiler inlines and optimizes aggressively in release builds
- High-level code = Low-level performance
Iterator Chains vs Loops
Manual loop
fn sum_of_squares_manual(numbers: &[i32]) -> i32 {
    let mut sum = 0;
    for i in 0..numbers.len() {
        let n = numbers[i];
        if n % 2 == 0 {
            sum += n * n;
        }
    }
    sum
}
Iterator chain (zero-cost!)
fn sum_of_squares_iterator(numbers: &[i32]) -> i32 {
    numbers.iter()
        .filter(|n| *n % 2 == 0)
        .map(|n| n * n)
        .sum()
}
fn main() {
    let nums = vec![1, 2, 3, 4, 5, 6];
    let manual = sum_of_squares_manual(&nums);
    let iterator = sum_of_squares_iterator(&nums);
    assert_eq!(manual, iterator); // 4 + 16 + 36 = 56
    println!("Result: {}", iterator);
    // In an optimized build, both versions typically compile to equivalent machine code
}
Compiler Optimization
The Rust compiler optimizes iterator chains through:
1. Inlining
fn main() {
    // High-level code
    let result: i32 = (1..=100)
        .filter(|x| x % 2 == 0)
        .map(|x| x * x)
        .sum();
    // After inlining, the optimizer sees the equivalent of:
    let mut expected = 0;
    for x in 1..=100 {
        if x % 2 == 0 {
            expected += x * x;
        }
    }
    assert_eq!(result, expected);
}
2. Loop Fusion
fn main() {
    let data = vec![1, 2, 3, 4, 5];
    // Several operations in one chain
    let result: Vec<_> = data
        .iter()
        .map(|x| x + 1)
        .filter(|x| x % 2 == 0)
        .map(|x| x * 2)
        .collect();
    println!("{:?}", result);
    // Adapters are lazy, so the whole chain runs as a single loop
    // No intermediate collections are allocated
}
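One way to observe the single-pass behaviour is to log when each stage runs. The following is a minimal sketch, not a proof about generated machine code: `trace_pipeline` is a hypothetical helper introduced only for this illustration. If every adapter made its own pass over the data, all "add" entries would come first; instead the entries are interleaved per element.

```rust
use std::cell::RefCell;

/// Runs a map -> filter -> map chain and records the order in which the
/// stages touch each element. `trace_pipeline` is a hypothetical helper.
fn trace_pipeline(data: &[i32]) -> (Vec<i32>, Vec<String>) {
    let log = RefCell::new(Vec::new());
    let result: Vec<i32> = data
        .iter()
        .map(|x| {
            log.borrow_mut().push(format!("add {x}"));
            x + 1
        })
        .filter(|x| {
            log.borrow_mut().push(format!("check {x}"));
            x % 2 == 0
        })
        .map(|x| {
            log.borrow_mut().push(format!("double {x}"));
            x * 2
        })
        .collect();
    (result, log.into_inner())
}

fn main() {
    let (result, log) = trace_pipeline(&[1, 2]);
    println!("{result:?}");
    // Each element travels through the whole chain before the next one starts.
    for entry in &log {
        println!("{entry}");
    }
}
```

The logging closures obviously add work of their own; the point is only the order of the entries, which reflects how lazy adapters pull one element at a time.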
Generic Functions - Zero Overhead
// Generic function
fn print_value<T: std::fmt::Display>(value: T) {
    println!("{}", value);
}
fn main() {
    print_value(42);      // Monomorphized for i32
    print_value("hello"); // Monomorphized for &str
    print_value(3.14);    // Monomorphized for f64
    // The compiler generates 3 separate versions
    // Each version is optimized (and can be inlined) for its concrete type
    // No runtime dispatch; the trade-off is compile time and binary size
}
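To make monomorphization a little more tangible, the sketch below asks each instantiation which concrete type it was compiled for. `instantiated_for` is a hypothetical helper, and the exact strings returned by `std::any::type_name` are documented as not guaranteed, so treat the printed names as diagnostic output only.

```rust
use std::any::type_name;
use std::fmt::Display;

/// Returns the name of the concrete type `T` this copy of the function was
/// generated for. `instantiated_for` is a hypothetical helper.
fn instantiated_for<T: Display>(_value: T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // Three call sites with three different types: three instantiations.
    println!("{}", instantiated_for(42));
    println!("{}", instantiated_for("hello"));
    println!("{}", instantiated_for(2.5));
}
```

Because the type is known at compile time, there is no lookup at run time; the price is one copy of the function per distinct `T`.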
Trait Objects vs Generics
Static Dispatch (zero-cost)
trait Process {
    fn process(&self, value: i32) -> i32;
}
struct AddOne;
struct Double;
impl Process for AddOne {
    fn process(&self, value: i32) -> i32 {
        value + 1
    }
}
impl Process for Double {
    fn process(&self, value: i32) -> i32 {
        value * 2
    }
}
// ✅ Static dispatch - zero cost
fn apply_static<T: Process>(processor: &T, value: i32) -> i32 {
    processor.process(value) // Resolved at compile time
}
fn main() {
    let add = AddOne;
    let result = apply_static(&add, 10); // Direct function call!
    println!("{}", result);
}
Dynamic Dispatch (has a runtime cost)
// ❌ Dynamic dispatch - small runtime overhead
fn apply_dynamic(processor: &dyn Process, value: i32) -> i32 {
    processor.process(value) // Indirect call through the vtable
}
fn main() {
    let add = AddOne;
    let result = apply_dynamic(&add, 10); // &AddOne coerces to &dyn Process
    println!("{}", result);
}
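The difference between the two dispatch styles also shows up in the pointer itself. The sketch below redeclares the `Process` trait so it is self-contained; `run_all` is a hypothetical helper. It compares reference sizes and shows the case where dynamic dispatch earns its cost: a collection holding different concrete types.

```rust
use std::mem::size_of;

trait Process {
    fn process(&self, value: i32) -> i32;
}

struct AddOne;
struct Double;

impl Process for AddOne {
    fn process(&self, value: i32) -> i32 {
        value + 1
    }
}

impl Process for Double {
    fn process(&self, value: i32) -> i32 {
        value * 2
    }
}

/// Applies every step in order. The elements have different concrete types,
/// so the Vec needs trait objects. `run_all` is a hypothetical helper.
fn run_all(steps: &[Box<dyn Process>], start: i32) -> i32 {
    steps.iter().fold(start, |acc, step| step.process(acc))
}

fn main() {
    // A reference to a concrete type is a single pointer ...
    println!("&AddOne:      {} bytes", size_of::<&AddOne>());
    // ... a trait-object reference carries a data pointer and a vtable pointer.
    println!("&dyn Process: {} bytes", size_of::<&dyn Process>());

    let steps: Vec<Box<dyn Process>> = vec![Box::new(AddOne), Box::new(Double)];
    println!("{}", run_all(&steps, 10)); // (10 + 1) * 2
}
```

In other words, dynamic dispatch is not wrong; it is a cost you opt into when you need run-time flexibility.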
Newtype Pattern - Zero Cost
struct UserId(u64);
struct ProductId(u64);
fn get_user(id: UserId) -> String {
    format!("User {}", id.0)
}
fn main() {
    let user = UserId(123);
    // Type safety at zero runtime cost
    // The compiled code is the same as passing a u64 directly
    println!("{}", get_user(user));
}
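The "same as a u64" claim can be checked with `std::mem::size_of`. In this sketch the `#[repr(transparent)]` attribute is an addition that is not part of the example above; it makes the layout guarantee explicit. `user_label` is a hypothetical function name.

```rust
use std::mem::size_of;

// `#[repr(transparent)]` guarantees the wrapper has the same layout as its
// single field. It is an assumption added for this sketch.
#[repr(transparent)]
struct UserId(u64);

#[repr(transparent)]
struct ProductId(u64);

fn user_label(id: UserId) -> String {
    format!("User {}", id.0)
}

fn main() {
    // The wrappers occupy exactly as much memory as the wrapped integer.
    assert_eq!(size_of::<UserId>(), size_of::<u64>());
    assert_eq!(size_of::<ProductId>(), size_of::<u64>());
    println!("{}", user_label(UserId(123)));
    // user_label(ProductId(123)); // would not compile: mismatched types
}
```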
Smart Pointers
Box - Heap Allocation
fn main() {
    let x = Box::new(5);
    // Box is a thin wrapper around a heap pointer
    // Dereferencing costs the same as a raw pointer; the real cost is the heap allocation
    println!("{}", *x);
    // Compiles to simple pointer operations
}
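A quick way to see that `Box` adds no bookkeeping of its own is to compare sizes. This is a minimal sketch; `boxed_square` is a hypothetical helper used only to have something to allocate.

```rust
use std::mem::size_of;

/// Allocates the square of `n` on the heap. `boxed_square` is hypothetical.
fn boxed_square(n: i32) -> Box<i32> {
    Box::new(n * n) // the one heap allocation happens here
}

fn main() {
    // A Box pointing at a sized type is a single pointer.
    assert_eq!(size_of::<Box<i32>>(), size_of::<usize>());
    // A Box is never null, so Option<Box<T>> needs no separate tag.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());

    let b = boxed_square(7);
    println!("{}", *b + 1);
}
```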
Rc - Reference Counting
use std::rc::Rc;
fn main() {
    let data = Rc::new(vec![1, 2, 3]);
    let data2 = Rc::clone(&data);
    // Reference counting has a cost (increment/decrement on clone/drop)
    // But it is small and predictable
    println!("{:?} {:?}", data, data2);
}
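The increment and decrement mentioned above can be observed through `Rc::strong_count`. The sketch below uses a hypothetical helper, `count_handles`, to report the count while a second handle exists and again after it is dropped.

```rust
use std::rc::Rc;

/// Returns the strong count while a second handle is alive, and again after
/// that handle has been dropped. `count_handles` is a hypothetical helper.
fn count_handles() -> (usize, usize) {
    let data = Rc::new(vec![1, 2, 3]);
    let during = {
        let _second = Rc::clone(&data); // bumps the counter, no deep copy
        Rc::strong_count(&data)
    }; // `_second` is dropped here and the counter goes back down
    (during, Rc::strong_count(&data))
}

fn main() {
    let (during, after) = count_handles();
    println!("during: {during}, after: {after}");
}
```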
Data Processing Pipeline - Zero Cost
#[derive(Debug, Clone)]
struct Record {
    id: u64,
    value: f64,
    active: bool,
}
fn process_records(records: &[Record]) -> Vec<f64> {
    records.iter()
        .filter(|r| r.active)          // Zero-cost filter
        .filter(|r| r.value > 100.0)   // Chain filters
        .map(|r| r.value * 1.1)        // Transform
        .collect()                     // Only the output Vec is allocated
}
fn main() {
    let records = vec![
        Record { id: 1, value: 150.0, active: true },
        Record { id: 2, value: 50.0, active: true },
        Record { id: 3, value: 200.0, active: false },
        Record { id: 4, value: 120.0, active: true },
    ];
    let results = process_records(&records);
    println!("Processed: {:?}", results);
    // [165.0, 132.0] (up to floating-point rounding)
    // The compiler optimizes this into a single efficient loop
    // No intermediate collections
}
Benchmark: Iterator vs Loop
use std::time::Instant;
// Squares are accumulated as i64: in i32, x * x overflows for large x
fn benchmark_iterator(data: &[i32]) -> i64 {
    data.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| i64::from(x) * i64::from(x))
        .sum()
}
fn benchmark_loop(data: &[i32]) -> i64 {
    let mut sum = 0i64;
    for &x in data {
        if x % 2 == 0 {
            sum += i64::from(x) * i64::from(x);
        }
    }
    sum
}
fn main() {
    let data: Vec<i32> = (1..=1_000_000).collect();
    // Benchmark iterator
    let start = Instant::now();
    let result1 = benchmark_iterator(&data);
    let duration1 = start.elapsed();
    // Benchmark loop
    let start = Instant::now();
    let result2 = benchmark_loop(&data);
    let duration2 = start.elapsed();
    assert_eq!(result1, result2);
    println!("Iterator: {:?}", duration1);
    println!("Loop: {:?}", duration2);
    // Build with --release; the two timings are usually very close
}
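A single timed run like the one above is noisy, and an optimizer is free to simplify work whose inputs and outputs it can see through. The sketch below is one rough way to reduce both effects with only the standard library: `best_of` and `sum_even_squares` are hypothetical names, squares are accumulated in `i64` to avoid overflow, and a dedicated harness would still be the better tool for serious measurements.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn sum_even_squares(data: &[i32]) -> i64 {
    data.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| i64::from(x) * i64::from(x))
        .sum()
}

/// Times `f` over `runs` repetitions and returns the last result together
/// with the fastest run. `best_of` is a hypothetical helper.
fn best_of<F: Fn(&[i32]) -> i64>(runs: u32, data: &[i32], f: F) -> (i64, Duration) {
    let mut best = Duration::MAX;
    let mut result = 0;
    for _ in 0..runs {
        let start = Instant::now();
        // black_box hides the input and the result from the optimizer.
        result = black_box(f(black_box(data)));
        best = best.min(start.elapsed());
    }
    (result, best)
}

fn main() {
    let data: Vec<i32> = (1..=1_000_000).collect();
    let (result, best) = best_of(10, &data, sum_even_squares);
    println!("result = {result}, best of 10 = {best:?}");
}
```

Taking the fastest of several runs filters out some scheduling noise; it does not replace statistical analysis.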
Closure Optimization
Zero-cost closures
fn apply_operation<F>(data: &[i32], op: F) -> Vec<i32>
where
    F: Fn(i32) -> i32,
{
    data.iter().map(|&x| op(x)).collect()
}
fn main() {
    let numbers = vec![1, 2, 3, 4, 5];
    // The closure's type is known statically, so the call can be inlined
    let doubled = apply_operation(&numbers, |x| x * 2);
    println!("{:?}", doubled);
    // Compiles to direct operations, with no call overhead in optimized builds
}
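Part of why closures are cheap is that a closure is just an anonymous struct holding its captures. The sketch below measures two closures with `std::mem::size_of_val`; `apply` and `closure_sizes` are hypothetical names, and the sizes reflect how rustc lays closures out in practice rather than a documented guarantee.

```rust
use std::mem::size_of_val;

fn apply<F: Fn(i32) -> i32>(value: i32, op: F) -> i32 {
    op(value)
}

/// Returns the sizes of a non-capturing closure and of a closure that owns
/// one `i32`. `closure_sizes` is a hypothetical helper.
fn closure_sizes() -> (usize, usize) {
    let factor: i32 = 3;
    let no_capture = |x: i32| x * 2; // captures nothing
    let with_capture = move |x: i32| x * factor; // owns a copy of `factor`
    let sizes = (size_of_val(&no_capture), size_of_val(&with_capture));
    // Both are passed as ordinary generic arguments, with no boxing.
    assert_eq!(apply(5, no_capture), 10);
    assert_eq!(apply(5, with_capture), 15);
    sizes
}

fn main() {
    let (empty, capturing) = closure_sizes();
    println!("no capture: {empty} bytes, one i32 captured: {capturing} bytes");
}
```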
Const Generics - Compile-time Computation
struct Matrix<T, const ROWS: usize, const COLS: usize> {
    data: [[T; COLS]; ROWS],
}
impl<T: Default + Copy, const ROWS: usize, const COLS: usize> Matrix<T, ROWS, COLS> {
    fn new() -> Self {
        Matrix {
            data: [[T::default(); COLS]; ROWS],
        }
    }
}
fn main() {
    // Dimensions are part of the type and checked at compile time
    let matrix1: Matrix<f64, 3, 3> = Matrix::new();
    let matrix2: Matrix<f64, 5, 5> = Matrix::new();
    // No runtime overhead for size checks
}
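Compile-time dimensions become more useful once an operation changes them. The sketch below extends the `Matrix` type with a `transpose` method, which is an addition made for illustration and not part of the example above: the swapped dimensions appear in the return type, so assigning the result to the wrong shape is rejected by the compiler.

```rust
struct Matrix<T, const ROWS: usize, const COLS: usize> {
    data: [[T; COLS]; ROWS],
}

impl<T: Default + Copy, const ROWS: usize, const COLS: usize> Matrix<T, ROWS, COLS> {
    fn new() -> Self {
        Matrix { data: [[T::default(); COLS]; ROWS] }
    }

    /// Hypothetical addition: the result type has ROWS and COLS swapped.
    fn transpose(&self) -> Matrix<T, COLS, ROWS> {
        let mut out = Matrix::<T, COLS, ROWS>::new();
        for r in 0..ROWS {
            for c in 0..COLS {
                out.data[c][r] = self.data[r][c];
            }
        }
        out
    }
}

fn main() {
    let mut m: Matrix<i32, 2, 3> = Matrix::new();
    m.data[0][2] = 7;
    let t: Matrix<i32, 3, 2> = m.transpose();
    println!("{}", t.data[2][0]);
    // let bad: Matrix<i32, 2, 3> = m.transpose(); // would not compile
}
```

The dimensions are not stored anywhere at run time; the struct is exactly as large as its array.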
Option and Result - Zero Cost
Option
fn find_value(data: &[i32], target: i32) -> Option<usize> {
    for (i, &value) in data.iter().enumerate() {
        if value == target {
            return Some(i);
        }
    }
    None
}
fn main() {
    let numbers = vec![10, 20, 30, 40];
    match find_value(&numbers, 30) {
        Some(index) => println!("Found at index: {}", index),
        None => println!("Not found"),
    }
    // Option is a plain enum stored inline
    // No heap allocation or indirection
}
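The same reasoning applies to `Result`: errors are ordinary return values rather than a separate unwinding mechanism. The sketch below uses a hypothetical `parse_and_double` function to show `Result` with the `?` operator, and adds a size check illustrating that `Option` around a reference needs no extra space.

```rust
use std::mem::size_of;
use std::num::ParseIntError;

/// Parses a string and doubles the number; failure is returned as a value.
/// `parse_and_double` is a hypothetical function.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?; // `?` returns early on Err
    Ok(n * 2)
}

fn main() {
    match parse_and_double("21") {
        Ok(v) => println!("Parsed: {v}"),
        Err(e) => println!("Error: {e}"),
    }
    // References are never null, so `None` reuses the null bit pattern.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
}
```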
Parallel Iterators with Rayon - Near Zero Cost
use rayon::prelude::*; // external crate: add rayon to Cargo.toml
fn sum_sequential(data: &[i32]) -> i64 {
    data.iter().map(|&x| i64::from(x) * i64::from(x)).sum()
}
fn sum_parallel(data: &[i32]) -> i64 {
    data.par_iter().map(|&x| i64::from(x) * i64::from(x)).sum()
}
fn main() {
    // Values stay below 1_000 so the sum of squares fits comfortably in i64
    let data: Vec<i32> = (0..10_000_000).map(|i| i % 1_000).collect();
    let start = std::time::Instant::now();
    let seq = sum_sequential(&data);
    println!("Sequential: {:?}", start.elapsed());
    let start = std::time::Instant::now();
    let par = sum_parallel(&data);
    println!("Parallel: {:?}", start.elapsed());
    assert_eq!(seq, par);
    // On large inputs the parallel version is usually faster;
    // on small inputs the thread coordination overhead can dominate
}
Match Expression - Zero Cost
enum DataType {
    Integer(i64),
    Float(f64),
    Text(String),
}
fn process_data(data: DataType) -> String {
    match data {
        DataType::Integer(n) => format!("Int: {}", n),
        DataType::Float(f) => format!("Float: {:.2}", f),
        DataType::Text(s) => format!("Text: {}", s),
    }
}
fn main() {
    let values = vec![
        DataType::Integer(42),
        DataType::Float(3.14),
        DataType::Text("hello".to_string()),
    ];
    for value in values {
        println!("{}", process_data(value));
    }
    // match branches on the enum discriminant (often via a jump table)
    // No overhead compared with a hand-written if-else chain
}
Best Practices
1. Prefer iterators over manual loops
fn main() {
    let data = [3, -1, 4, -1, 5];
    // ✅ Zero-cost and clear intent
    let sum: i32 = data.iter().filter(|&&x| x > 0).sum();
    // ❌ Verbose and more error-prone
    let mut manual = 0;
    for &x in &data {
        if x > 0 {
            manual += x;
        }
    }
    assert_eq!(sum, manual);
}
2. Use generics for reusable code
// ✅ Generic - monomorphized at compile time
fn process<T: std::fmt::Display>(value: T) {
    println!("{}", value);
}
// ❌ Trait object - indirect call at runtime
fn process_dyn(value: &dyn std::fmt::Display) {
    println!("{}", value);
}
3. Chain operations
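// Item, is_valid and transform are placeholders for your own type and methods
struct Item(i32);
impl Item {
    fn is_valid(&self) -> bool { self.0 > 0 }
    fn transform(&self) -> i32 { self.0 * 2 }
}
fn main() {
    let data = vec![Item(1), Item(-2), Item(3)];
    // ✅ Chaining - one pass, and the compiler can optimize the whole chain
    let result: Vec<i32> = data
        .iter()
        .filter(|x| x.is_valid())
        .map(|x| x.transform())
        .collect();
    // ❌ Intermediate collections - an extra allocation and an extra pass
    let filtered: Vec<_> = data.iter().filter(|x| x.is_valid()).collect();
    let mapped: Vec<i32> = filtered.iter().map(|x| x.transform()).collect();
    assert_eq!(result, mapped);
}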
4. Leverage type system
// ✅ Newtype - compile-time safety, zero runtime cost
struct UserId(u64);
fn get_user(id: UserId) { }
// ❌ Primitive type - easy to mix up with other u64 values
fn get_user_raw(id: u64) { }
Real-World Example: ETL Pipeline
#[derive(Debug, Clone)]
struct Customer {
    id: u64,
    name: String,
    age: u32,
    revenue: f64,
}
fn extract_transform_load(customers: &[Customer]) -> Vec<(String, f64)> {
    customers
        .iter()
        .filter(|c| c.age >= 18)                            // Filter
        .filter(|c| c.revenue > 1000.0)                     // Additional filter
        .map(|c| (c.name.to_uppercase(), c.revenue * 1.1))  // Transform
        .collect()                                          // Load
}
fn main() {
    let customers = vec![
        Customer { id: 1, name: "Alice".to_string(), age: 25, revenue: 5000.0 },
        Customer { id: 2, name: "Bob".to_string(), age: 17, revenue: 2000.0 },
        Customer { id: 3, name: "Charlie".to_string(), age: 30, revenue: 500.0 },
        Customer { id: 4, name: "Diana".to_string(), age: 28, revenue: 3000.0 },
    ];
    let result = extract_transform_load(&customers);
    for (name, revenue) in result {
        println!("{}: ${:.2}", name, revenue);
    }
    // ALICE: $5500.00
    // DIANA: $3300.00
    // The entire pipeline compiles to one efficient loop
    // Single pass over the data
    // Allocations: the output Vec plus one String per kept row
}
Assembly Comparison
Inspect the compiler output with cargo asm or Compiler Explorer (godbolt.org). Compile with optimizations enabled; in a debug build the two versions look quite different:
pub fn iterator_sum(data: &[i32]) -> i32 {
    data.iter().filter(|&&x| x > 0).sum()
}
pub fn loop_sum(data: &[i32]) -> i32 {
    let mut sum = 0;
    for &x in data {
        if x > 0 {
            sum += x;
        }
    }
    sum
}
// With optimizations on, both functions typically compile to the same or near-identical assembly
When DO Abstractions Have a Cost?
1. Dynamic dispatch
trait Handler {
    fn handle(&self);
}
// Runtime overhead from virtual function calls
fn process(handler: &dyn Handler) {
    handler.handle(); // Indirect call, usually not inlined
}
2. Excessive allocations
fn main() {
    let data = vec![1, 5, 10];
    // Each step allocates a new Vec
    let step1: Vec<_> = data.iter().map(|x| x + 1).collect();
    let step2: Vec<_> = step1.iter().map(|x| x * 2).collect();
    let step3: Vec<_> = step2.iter().filter(|&&x| x > 10).collect();
    println!("{:?}", step3);
}
3. Unnecessary clones
fn main() {
    let data = vec![String::from("a"), String::from("b")];
    // Clones the entire collection just to iterate over it
    for item in data.clone() { // ❌
        println!("{}", item);
    }
}
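When the loop only needs to read the elements, borrowing avoids the copy entirely. A minimal sketch, with `total_len` as a hypothetical helper:

```rust
/// Sums the lengths of all strings by borrowing the slice.
/// `total_len` is a hypothetical helper.
fn total_len(items: &[String]) -> usize {
    items.iter().map(|s| s.len()).sum()
}

fn main() {
    let data = vec![String::from("alpha"), String::from("beta")];
    let n = total_len(&data); // borrows: nothing is cloned
    // `data` is still owned here and can be used again.
    println!("{n} bytes across {} strings", data.len());
}
```

Cloning is still the right call when the loop genuinely needs its own copy; the cost only becomes a problem when it is paid by habit.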
Summary
Zero-cost abstractions in Rust:
- ✅ High-level code = Low-level performance
- ✅ Iterators are as fast as loops
- ✅ Generics have no runtime overhead
- ✅ Newtypes have no cost
- ✅ Well-designed abstractions are optimized away
Best practices:
- Prefer iterators and chaining
- Use generics for static dispatch
- Leverage the type system
- Trust the optimizer
- Measure when performance matters
The idea in one line: “Zero-cost abstractions: you don’t pay for what you don’t use!”