41 changes: 41 additions & 0 deletions nghi-docs/00_roadmap.md
@@ -0,0 +1,41 @@
# Rust Learning Roadmap for Ruvector

Welcome to your journey of learning Rust through the `ruvector` codebase! This series of documents is designed to take you from a Rust beginner to understanding the complex systems within this repository.

## The Path

1. **[01_setup_and_basics.md](./01_setup_and_basics.md)**
    * **Goal**: Get your environment ready and understand the basic syntax.
    * **Topics**: Installation, Cargo, Variables, Functions, Basic Types.
    * **Ruvector Context**: Building the project, looking at `ruvector-cli`.

2. **[02_core_concepts.md](./02_core_concepts.md)**
    * **Goal**: Master the unique features of Rust.
    * **Topics**: Ownership, Borrowing, Structs, Enums, Pattern Matching.
    * **Ruvector Context**: Data structures in `ruvector-core`.

3. **[03_error_handling_and_traits.md](./03_error_handling_and_traits.md)**
    * **Goal**: Write robust and reusable code.
    * **Topics**: `Result`, `Option`, Traits, Generics, `thiserror`, `anyhow`.
    * **Ruvector Context**: Error definitions and trait usage across crates.

4. **[04_async_and_concurrency.md](./04_async_and_concurrency.md)**
    * **Goal**: Understand modern asynchronous programming.
    * **Topics**: `async`/`await`, Tokio runtime, Shared State (`Arc`, `Mutex`).
    * **Ruvector Context**: The server implementation and concurrent vector operations.

5. **[05_advanced_systems.md](./05_advanced_systems.md)**
    * **Goal**: Interface with other languages and systems.
    * **Topics**: FFI (Node.js bindings), WASM.
    * **Ruvector Context**: `ruvector-node`, `ruvector-wasm`.

6. **[06_ruvector_architecture.md](./06_ruvector_architecture.md)**
    * **Goal**: Put it all together.
    * **Topics**: Workspace structure, Crate dependencies, Key data flows.

## Prerequisites

* A curiosity to learn!
* Basic programming knowledge in another language (like JavaScript, Python, or C++) is helpful but not strictly required.

Let's get started! Go to **[01_setup_and_basics.md](./01_setup_and_basics.md)**.
72 changes: 72 additions & 0 deletions nghi-docs/01_setup_and_basics.md
@@ -0,0 +1,72 @@
# 01. Setup and Basics

## 1. Setting Up Your Environment

Before we dive into code, let's ensure you have Rust installed.

### Install Rust
Run the following in your terminal:
```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
Follow the on-screen instructions. After installation, restart your terminal and check:
```bash
rustc --version
cargo --version
```

### Clone and Build Ruvector
You are already in the `ruvector` repo. Let's make sure it builds.
```bash
# In the root of the repo
cargo build
```
This might take a while as it compiles all dependencies.

## 2. Rust Basics with Ruvector

Rust uses `cargo` as its build system and package manager.

### The `Cargo.toml` File
Open `Cargo.toml` in the root. This is the workspace definition.
- `[workspace]`: Defines that this repo contains multiple crates.
- `members`: Lists all the crates (e.g., `crates/ruvector-core`, `crates/ruvector-cli`).

### Variables and Mutability
In Rust, variables are immutable by default.
```rust
let x = 5;
// x = 6; // This would cause a compile error!
```
To make them mutable, use `mut`:
```rust
let mut y = 5;
y = 6; // This is okay
```

### Functions
Functions are declared with `fn`.
```rust
fn add(a: i32, b: i32) -> i32 {
    a + b // No semicolon means this is the return value
}
```

### Looking at `ruvector-cli`
Let's look at a real example. Open `crates/ruvector-cli/src/main.rs` (or similar entry point).
You'll likely see a `main` function.
```rust
fn main() {
    // ... code ...
}
```
This is the entry point of the CLI application. It likely parses arguments using `clap` (a popular CLI argument parser library).

## Challenge
1. Navigate to `crates/ruvector-cli`.
2. Try to run it: `cargo run -- --help`.
3. See what commands are available.

## Next Steps
Now that you can build the code and understand the basic syntax, let's dive into the most unique feature of Rust: Ownership.
Go to **[02_core_concepts.md](./02_core_concepts.md)**.
64 changes: 64 additions & 0 deletions nghi-docs/02_core_concepts.md
@@ -0,0 +1,64 @@
# 02. Core Concepts: Ownership, Structs, and Enums

## 1. Ownership and Borrowing
This is what makes Rust unique. It ensures memory safety without a garbage collector.

### The Rules
1. Each value in Rust has a variable that’s called its **owner**.
2. There can only be one owner at a time.
3. When the owner goes out of scope, the value will be dropped.

### Borrowing
You can access data without taking ownership by **borrowing** it using references (`&`).
- `&T`: Immutable reference (read-only).
- `&mut T`: Mutable reference (read-write).

**Rule**: At any given time, you can have *either* one mutable reference *or* any number of immutable references to a value, never both.
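
A minimal sketch of this rule in action. The helper functions `sum` and `scale` are hypothetical names made up for illustration; they are not part of `ruvector`.

```rust
/// Reads the slice through an immutable borrow.
fn sum(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// Modifies the vector in place through a mutable borrow.
fn scale(values: &mut Vec<f32>, factor: f32) {
    for v in values.iter_mut() {
        *v *= factor;
    }
}

fn main() {
    let mut data: Vec<f32> = vec![1.0, 2.0, 3.0];

    // Any number of immutable borrows may coexist.
    let a = &data;
    let b = &data;
    println!("sum via a = {}, sum via b = {}", sum(a), sum(b));

    // `a` and `b` are not used again, so a mutable borrow is now allowed.
    scale(&mut data, 2.0);
    println!("after scaling: {:?}", data);

    // Uncommenting the lines below fails to compile, because `r` (immutable)
    // would still be in use while `data` is borrowed mutably:
    // let r = &data;
    // scale(&mut data, 2.0);
    // println!("{:?}", r);
}
```

The compiler enforces the rule at compile time, so none of this costs anything at runtime.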

### In Ruvector
Look at `crates/ruvector-core`. You will see functions taking `&self` or `&mut self`.
- `&self`: The method borrows the instance immutably.
- `&mut self`: The method borrows the instance mutably (to modify it).
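
To see the two receivers side by side, here is a small sketch. `VectorList` is a hypothetical type invented for this example and does not come from `ruvector-core`.

```rust
/// Hypothetical container, used only to illustrate method receivers.
struct VectorList {
    items: Vec<Vec<f32>>,
}

impl VectorList {
    fn new() -> Self {
        VectorList { items: Vec::new() }
    }

    /// `&self`: read-only access, so callers only need a shared borrow.
    fn len(&self) -> usize {
        self.items.len()
    }

    /// `&mut self`: the method changes the instance, so callers need `mut`.
    fn push(&mut self, v: Vec<f32>) {
        self.items.push(v);
    }
}

fn main() {
    let mut list = VectorList::new(); // `mut` is required to call `push`
    list.push(vec![0.1, 0.2]);
    list.push(vec![0.3, 0.4]);
    println!("stored {} vectors", list.len());
}
```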

## 2. Structs
Structs are custom data types.
```rust
struct Vector {
    id: u64,
    data: Vec<f32>,
}
```

### In Ruvector
Search for `struct` in `crates/ruvector-core`. You'll find the core data structures defining what a vector is, how the index is stored, etc.

## 3. Enums and Pattern Matching
Enums allow a value to be one of several variants.
```rust
enum Command {
    Insert(Vector),
    Delete(u64),
}
```

### Pattern Matching (`match`)
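Rust's `match` is powerful: it destructures the data inside each variant, and the compiler rejects a `match` that misses a variant.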
```rust
match command {
    Command::Insert(vec) => println!("Inserting vector {}", vec.id),
    Command::Delete(id) => println!("Deleting vector {}", id),
}
```

### In Ruvector
Enums are often used for:
- **Errors**: `enum Error { ... }`
- **Configuration**: Different types of distance metrics (e.g., `Euclidean`, `Cosine`).
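
As a sketch of the configuration case, the example below dispatches on a metric enum with `match`. The `DistanceMetric` enum and the `distance` function are assumptions made for illustration; the actual definitions in `ruvector-core` may look different.

```rust
/// Hypothetical metric enum; the real one in `ruvector-core` may differ.
#[derive(Debug, Clone, Copy)]
enum DistanceMetric {
    Euclidean,
    Cosine,
}

/// Computes the distance between two equal-length slices.
fn distance(metric: DistanceMetric, a: &[f32], b: &[f32]) -> f32 {
    match metric {
        DistanceMetric::Euclidean => a
            .iter()
            .zip(b)
            .map(|(x, y)| (x - y) * (x - y))
            .sum::<f32>()
            .sqrt(),
        DistanceMetric::Cosine => {
            let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
            let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
            let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
            // Cosine distance = 1 - cosine similarity.
            1.0 - dot / (norm_a * norm_b)
        }
    }
}

fn main() {
    let a: [f32; 2] = [3.0, 0.0];
    let b: [f32; 2] = [0.0, 4.0];
    for metric in [DistanceMetric::Euclidean, DistanceMetric::Cosine] {
        println!("{:?}: {}", metric, distance(metric, &a, &b));
    }
}
```

If a third variant were added to the enum, the `match` would stop compiling until it handled the new case.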

## Challenge
Find a `struct` definition in `crates/ruvector-core` and identify its fields.
Find a `match` statement and see how it handles different cases.

## Next Steps
Now that we understand how data is structured and owned, let's look at how to handle errors and define shared behavior.
Go to **[03_error_handling_and_traits.md](./03_error_handling_and_traits.md)**.
74 changes: 74 additions & 0 deletions nghi-docs/03_error_handling_and_traits.md
@@ -0,0 +1,74 @@
# 03. Error Handling and Traits

## 1. Error Handling
Rust doesn't use exceptions. It uses the `Result` enum.
```rust
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

### Handling Results
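You cannot use the value inside a `Result` without dealing with the error case, and the compiler warns if you ignore a `Result` entirely.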
```rust
use std::fs::File;

fn main() {
    match File::open("hello.txt") {
        Ok(_file) => println!("File opened!"),
        Err(e) => println!("Error: {}", e),
    }
}
```

### The `?` Operator
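This is syntactic sugar for propagating errors: on `Err`, `?` returns the error from the current function; on `Ok`, it unwraps the value and continues.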
```rust
use std::fs::File;
use std::io::{self, Read};

fn read_username_from_file() -> Result<String, io::Error> {
    let mut s = String::new();
    File::open("hello.txt")?.read_to_string(&mut s)?;
    Ok(s)
}
```
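
To make the shortcut concrete, the sketch below writes the same function twice: once with `?` and once with the early return spelled out. The function names are hypothetical, and string parsing is used instead of file I/O so the example runs anywhere.

```rust
use std::num::ParseIntError;

/// Propagates the error with `?`.
fn parse_dimension(input: &str) -> Result<usize, ParseIntError> {
    let dim = input.trim().parse::<usize>()?;
    Ok(dim)
}

/// Roughly what `?` expands to: return early on `Err`, unwrap on `Ok`.
/// The `From::from` call is how `?` converts between error types.
fn parse_dimension_explicit(input: &str) -> Result<usize, ParseIntError> {
    let dim = match input.trim().parse::<usize>() {
        Ok(value) => value,
        Err(e) => return Err(From::from(e)),
    };
    Ok(dim)
}

fn main() {
    println!("{:?}", parse_dimension("128"));
    println!("{:?}", parse_dimension_explicit("not a number"));
}
```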

### In Ruvector
Ruvector uses libraries like `thiserror` and `anyhow` to simplify error handling.
- `thiserror`: Used in libraries (like `ruvector-core`) to define custom error types.
- `anyhow`: Used in applications (like `ruvector-cli`) for easy error reporting.

Check `crates/ruvector-core/src/error.rs` (if it exists) or look for `#[derive(Error)]`.
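
The sketch below shows the library/application split using only the standard library, so it runs without extra dependencies. `IndexError`, its variants, and `check_dimension` are invented for illustration. The hand-written `Display` and `Error` impls stand in for what `thiserror` derives, and `Box<dyn Error>` stands in for the role `anyhow::Error` usually plays.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type; the variants are invented for illustration.
#[derive(Debug, PartialEq)]
enum IndexError {
    DimensionMismatch { expected: usize, actual: usize },
    NotFound(u64),
}

// `thiserror` generates an impl like this from `#[error("...")]` attributes.
impl fmt::Display for IndexError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IndexError::DimensionMismatch { expected, actual } => {
                write!(f, "expected dimension {}, got {}", expected, actual)
            }
            IndexError::NotFound(id) => write!(f, "vector {} not found", id),
        }
    }
}

impl Error for IndexError {}

/// Library-style function: returns the specific error type.
fn check_dimension(expected: usize, v: &[f32]) -> Result<(), IndexError> {
    if v.len() != expected {
        return Err(IndexError::DimensionMismatch { expected, actual: v.len() });
    }
    Ok(())
}

/// Application-style function: any error type is accepted and boxed.
fn run() -> Result<(), Box<dyn Error>> {
    check_dimension(3, &[1.0, 2.0])?;
    Ok(())
}

fn main() {
    if let Err(e) = run() {
        println!("error: {}", e);
    }
    println!("other variant: {}", IndexError::NotFound(42));
}
```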

## 2. Traits
Traits are like interfaces in other languages. They define shared behavior.
```rust
trait Summary {
    fn summarize(&self) -> String;
}
```

### Implementing Traits
```rust
impl Summary for Vector {
    fn summarize(&self) -> String {
        format!("Vector ID: {}", self.id)
    }
}
```

### Generics
Traits are often used with Generics to write flexible code.
```rust
fn notify<T: Summary>(item: &T) {
    println!("Breaking news! {}", item.summarize());
}
```

### In Ruvector
Traits are everywhere.
- **Storage**: A trait might define how vectors are stored (e.g., in-memory vs. disk).
- **Distance**: A trait might define how to calculate distance between two vectors.
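
A sketch of what the storage case could look like. The trait `VectorStorage`, the type `MemoryStorage`, and the function `dimension_of` are all hypothetical; check the codebase for the traits `ruvector` actually defines.

```rust
use std::collections::HashMap;

/// Hypothetical storage trait; `ruvector` may define this differently.
trait VectorStorage {
    fn put(&mut self, id: u64, data: Vec<f32>);
    fn get(&self, id: u64) -> Option<&[f32]>;
}

/// In-memory backend. A disk-backed type could implement the same trait.
#[derive(Default)]
struct MemoryStorage {
    vectors: HashMap<u64, Vec<f32>>,
}

impl VectorStorage for MemoryStorage {
    fn put(&mut self, id: u64, data: Vec<f32>) {
        self.vectors.insert(id, data);
    }

    fn get(&self, id: u64) -> Option<&[f32]> {
        self.vectors.get(&id).map(|v| v.as_slice())
    }
}

/// Generic over any backend: works with whatever implements the trait.
fn dimension_of<S: VectorStorage>(storage: &S, id: u64) -> Option<usize> {
    storage.get(id).map(|v| v.len())
}

fn main() {
    let mut storage = MemoryStorage::default();
    storage.put(1, vec![0.5, 0.25, 0.125]);
    println!("dimension of 1: {:?}", dimension_of(&storage, 1));
    println!("dimension of 2: {:?}", dimension_of(&storage, 2));
}
```

Because `dimension_of` only depends on the trait, swapping the backend would not require changing it.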

## Challenge
Find a trait definition in the codebase (look for `trait`).
Find where that trait is implemented (`impl TraitName for TypeName`).

## Next Steps
Now that we can handle errors and abstract behavior, let's look at doing things concurrently.
Go to **[04_async_and_concurrency.md](./04_async_and_concurrency.md)**.
49 changes: 49 additions & 0 deletions nghi-docs/04_async_and_concurrency.md
@@ -0,0 +1,49 @@
# 04. Async and Concurrency

## 1. Async/Await
Rust's async model is cooperative. Futures are lazy; they do nothing until polled.

```rust
async fn hello_world() {
    println!("Hello, world!");
}

// Calling `hello_world()` only creates a future; an executor has to run it, e.g.:
// futures::executor::block_on(hello_world());
```
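
To show the laziness without pulling in a runtime, here is a toy `block_on` built only from the standard library. It is a teaching sketch, not how Tokio works internally, and the `add` function with its `ran` flag is made up for the demonstration.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy single-future executor: polls until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

/// Sets `ran` when the body actually executes.
async fn add(a: i32, b: i32, ran: &AtomicBool) -> i32 {
    ran.store(true, Ordering::SeqCst);
    a + b
}

fn main() {
    let ran = AtomicBool::new(false);
    let fut = add(2, 3, &ran); // creates the future, runs nothing yet
    println!("ran before polling: {}", ran.load(Ordering::SeqCst));
    let sum = block_on(fut);
    println!("sum = {}, ran after polling: {}", sum, ran.load(Ordering::SeqCst));
}
```

A real runtime does much more (scheduling many tasks, I/O, timers), but the core loop is the same: poll, and wait to be woken.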

### Tokio
Rust needs a runtime to execute async code. `ruvector` uses `tokio`.
Look at `Cargo.toml` dependencies. You'll see `tokio`.

### In Ruvector
The server component (`crates/ruvector-server`) heavily relies on async to handle multiple connections efficiently.
Handlers for API requests are likely `async fn`.

## 2. Shared State and Concurrency
When multiple threads (or async tasks) need to access the same data, we need synchronization.

### `Arc` (Atomic Reference Counting)
Allows multiple owners of the same data across threads.

### `Mutex` (Mutual Exclusion)
Allows only one thread to access the data at a time.

### The Pattern: `Arc<Mutex<T>>`
This is a common pattern to share mutable state.
```rust
use std::sync::{Arc, Mutex};

let counter = Arc::new(Mutex::new(0));
let c = Arc::clone(&counter); // a second handle to the same counter
```
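
A fuller sketch of the pattern with real threads. The function name `count_in_parallel` is hypothetical; the point is that each thread gets its own `Arc` clone while the `Mutex` serializes the updates.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add 1 to a shared counter
/// `per_worker` times, then returns the final count.
fn count_in_parallel(workers: usize, per_worker: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::with_capacity(workers);

    for _ in 0..workers {
        let counter = Arc::clone(&counter); // one more owner
        handles.push(thread::spawn(move || {
            for _ in 0..per_worker {
                // The lock is released at the end of this statement.
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", count_in_parallel(4, 1_000));
}
```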

### In Ruvector
The vector index itself needs to be accessed by multiple readers (searches) and writers (inserts).
You might see `RwLock` (Read-Write Lock) used instead of `Mutex` to allow multiple concurrent readers.
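
As a sketch of the reader/writer split, the example below guards a plain list of vectors with `RwLock`. The `SharedIndex` alias and the `insert` and `count` helpers are assumptions for illustration; the real index type in `ruvector` is more involved.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared index: a plain list of vectors behind a lock.
type SharedIndex = Arc<RwLock<Vec<Vec<f32>>>>;

/// Writer: takes the exclusive lock to insert.
fn insert(index: &SharedIndex, v: Vec<f32>) {
    index.write().unwrap().push(v);
}

/// Reader: takes a shared lock; many readers can hold it at once.
fn count(index: &SharedIndex) -> usize {
    index.read().unwrap().len()
}

fn main() {
    let index: SharedIndex = Arc::new(RwLock::new(Vec::new()));
    insert(&index, vec![1.0, 0.0]);
    insert(&index, vec![0.0, 1.0]);

    // Several "search" threads read concurrently.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let index = Arc::clone(&index);
            thread::spawn(move || count(&index))
        })
        .collect();

    for reader in readers {
        println!("reader saw {} vectors", reader.join().unwrap());
    }
}
```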

## Challenge
Search for `Arc<` or `RwLock<` in the codebase.
See how the main index is shared between the API server and the background workers.

## Next Steps
We've covered the core Rust features. Now let's see how Rust interacts with the outside world.
Go to **[05_advanced_systems.md](./05_advanced_systems.md)**.
39 changes: 39 additions & 0 deletions nghi-docs/05_advanced_systems.md
@@ -0,0 +1,39 @@
# 05. Advanced Systems: FFI and WASM

Rust compiles to native code and has no garbage collector, so it can be embedded in other language runtimes and in the browser, often to speed up performance-critical paths.

## 1. FFI (Foreign Function Interface) with Node.js
`ruvector` has a Node.js binding (`crates/ruvector-node`).
This allows JavaScript code to call Rust functions directly.

### NAPI-RS
The project uses `napi-rs` to build these bindings.
Look for `#[napi]` macros in `crates/ruvector-node`.
These macros automatically generate the glue code needed for Node.js to understand Rust structs and functions.

## 2. WASM (WebAssembly)
Rust can compile to WebAssembly to run in the browser.
Check `crates/ruvector-wasm`.

### `wasm-bindgen`
This library facilitates communication between WASM and JavaScript.
Look for `#[wasm_bindgen]`.

## 3. Unsafe Code
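Rust guarantees memory safety, but some operations (such as dereferencing raw pointers or calling foreign functions) cannot be verified by the compiler, and you sometimes need them for performance or FFI.
These operations are only allowed in `unsafe` blocks, where you take responsibility for upholding the safety rules yourself.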
```rust
unsafe {
    // scary raw pointer manipulation
}
```
`ruvector` likely minimizes this, but it might exist in performance-critical sections (like SIMD vector operations).
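
A common way to keep `unsafe` contained is to wrap it in a safe function that checks the preconditions first. The sketch below is a hypothetical dot product, not code from `ruvector`, and it uses unchecked indexing rather than SIMD to stay short.

```rust
/// Dot product that skips bounds checks in the hot loop.
/// The length check up front is what makes the `unsafe` block sound.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let mut sum = 0.0_f32;
    for i in 0..a.len() {
        // SAFETY: `i < a.len()` and `a.len() == b.len()`, so both
        // indices are in bounds.
        sum += unsafe { *a.get_unchecked(i) * *b.get_unchecked(i) };
    }
    sum
}

fn main() {
    let a: [f32; 3] = [1.0, 2.0, 3.0];
    let b: [f32; 3] = [4.0, 5.0, 6.0];
    println!("dot = {}", dot(&a, &b));
}
```

Callers only ever see the safe `dot` signature, so the unsafety cannot leak as long as the check and the `SAFETY` reasoning hold.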

## Challenge
1. Go to `crates/ruvector-node`.
2. Find a function marked with `#[napi]`.
3. Imagine how you would call this from JavaScript.

## Next Steps
You have the tools. Now let's look at the map of the castle.
Go to **[06_ruvector_architecture.md](./06_ruvector_architecture.md)**.
37 changes: 37 additions & 0 deletions nghi-docs/06_ruvector_architecture.md
@@ -0,0 +1,37 @@
# 06. Ruvector Architecture

Now that you know Rust, let's understand how `ruvector` is put together.

## The Workspace
`ruvector` is a Cargo Workspace. It consists of multiple crates that work together.

### Key Crates

* **`ruvector-core`**: The brain. Contains the vector index implementation, data structures, and core logic.
* **`ruvector-server`**: The interface. Wraps the core in an HTTP/gRPC server (likely using `axum` or `tonic`).
* **`ruvector-cli`**: The tool. A command-line interface to interact with the database.
* **`ruvector-node` / `ruvector-wasm`**: The bridges. Bindings for other environments.

## Data Flow: Inserting a Vector

1. **Request**: A request comes in (via CLI, HTTP, or Node.js).
2. **Parsing**: The request is parsed into a Rust struct (e.g., `InsertRequest`).
3. **Core**: The `ruvector-core` crate takes over.
    * It might validate the vector dimensions.
    * It adds the vector to the storage (WAL - Write Ahead Log).
    * It updates the HNSW (Hierarchical Navigable Small World) index for fast searching.
4. **Response**: A success result is returned up the chain.
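
The steps above can be sketched in a few lines. Everything here is a simplified assumption: `InsertRequest`'s fields, `InsertError`, and `Core` are invented, a `Vec` stands in for the write-ahead log, and a `HashMap` stands in for the HNSW index.

```rust
use std::collections::HashMap;

/// Step 2: the parsed request (hypothetical shape).
struct InsertRequest {
    id: u64,
    data: Vec<f32>,
}

#[derive(Debug, PartialEq)]
enum InsertError {
    DimensionMismatch { expected: usize, actual: usize },
}

/// Stand-in for the core: a log plus a lookup table instead of WAL + HNSW.
struct Core {
    dimension: usize,
    log: Vec<u64>,                 // placeholder for the write-ahead log
    index: HashMap<u64, Vec<f32>>, // placeholder for the HNSW index
}

impl Core {
    fn new(dimension: usize) -> Self {
        Core { dimension, log: Vec::new(), index: HashMap::new() }
    }

    /// Step 3: validate, record in the log, then update the index.
    fn insert(&mut self, req: InsertRequest) -> Result<(), InsertError> {
        if req.data.len() != self.dimension {
            return Err(InsertError::DimensionMismatch {
                expected: self.dimension,
                actual: req.data.len(),
            });
        }
        self.log.push(req.id);
        self.index.insert(req.id, req.data);
        Ok(()) // Step 4: success travels back up the chain
    }
}

fn main() {
    let mut core = Core::new(3);
    println!("{:?}", core.insert(InsertRequest { id: 1, data: vec![0.1, 0.2, 0.3] }));
    println!("{:?}", core.insert(InsertRequest { id: 2, data: vec![0.1] }));
}
```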

## Data Flow: Searching

1. **Query**: A query vector is received.
2. **Index Search**: The HNSW index is traversed to find the nearest neighbors.
3. **Distance Calculation**: SIMD instructions (via `simsimd` crate) might be used to calculate distances (Euclidean, Cosine) extremely fast.
4. **Filtering**: Results might be filtered based on metadata.
5. **Result**: The top K matches are returned.
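
The search flow can be sketched with a brute-force scan in place of HNSW. This is deliberately not the real algorithm: it compares the query against every item, which is exactly the cost an HNSW index avoids. `Item`, its `tag` field, and `search` are hypothetical names.

```rust
/// Hypothetical stored item: an id, a metadata tag, and the vector.
struct Item {
    id: u64,
    tag: &'static str,
    data: Vec<f32>,
}

fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Scans every item (no HNSW), filters by `tag`, and returns the `k`
/// closest matches as `(id, distance)` pairs, nearest first.
fn search(items: &[Item], query: &[f32], tag: &str, k: usize) -> Vec<(u64, f32)> {
    let mut hits: Vec<(u64, f32)> = items
        .iter()
        .map(|item| (item, euclidean(&item.data, query))) // distance calculation
        .filter(|(item, _)| item.tag == tag)              // metadata filtering
        .map(|(item, dist)| (item.id, dist))
        .collect();
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits.truncate(k); // top K
    hits
}

fn main() {
    let items = vec![
        Item { id: 1, tag: "doc", data: vec![0.0, 0.0] },
        Item { id: 2, tag: "doc", data: vec![3.0, 4.0] },
        Item { id: 3, tag: "img", data: vec![0.1, 0.1] },
        Item { id: 4, tag: "doc", data: vec![1.0, 0.0] },
    ];
    for (id, dist) in search(&items, &[0.0, 0.0], "doc", 2) {
        println!("id = {}, distance = {}", id, dist);
    }
}
```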

## Conclusion
You now have a high-level understanding of `ruvector` and the Rust concepts that power it.
The best way to learn more is to start hacking! Pick a small issue or try to add a tiny feature.

**Happy Coding!**