lightdb


Computationally focused database using pluggable stores

Provided Stores

| Store | Type | Embedded | Persistence | Read Perf | Write Perf | Concurrency | Transactions | Full-Text Search | Notes |
|---|---|---|---|---|---|---|---|---|---|
| HaloDB | KV Store | ✅ | ✅ | ✅ | ✅✅ | 🟡 (Single-threaded write) | 🟡 (Basic durability) | ❌ | Fast, simple write-optimized store |
| ChronicleMap | Off-Heap Map | ✅ | ✅ (Memory-mapped) | ✅✅ | ✅✅ | ✅✅ | ❌ | ❌ | Ultra low-latency, off-heap storage |
| LMDB | KV Store (B+Tree) | ✅ | ✅ | ✅✅✅ | ✅ | 🟡 (Single write txn) | ✅✅ (ACID) | ❌ | Read-optimized, mature B+Tree engine |
| MapDB | Java Collections | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | Easy Java-native persistence |
| RocksDB | LSM KV Store | ✅ | ✅ | ✅✅ | ✅✅✅ | ✅ | ✅ | ❌ | High-performance LSM tree |
| Redis | In-Memory KV Store | 🟡 (Optional) | ✅ (RDB/AOF) | ✅✅✅ | ✅✅ | ✅ | ✅ | ❌ | Popular in-memory data structure store |
| Lucene | Full-Text Search | ✅ | ✅ | ✅✅ | ✅ | ✅ | ❌ | ✅✅✅ | Best-in-class full-text search engine |
| SQLite | Relational DB | ✅ | ✅ | ✅ | ✅ | 🟡 (Write lock) | ✅✅ (ACID) | ✅ (FTS5) | Lightweight embedded SQL |
| H2 | Relational DB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅✅ (ACID) | ❌ (Basic LIKE) | Java-native SQL engine |
| DuckDB | Analytical SQL | ✅ | ✅ | ✅✅✅ | ✅ | ✅ | ✅ | ❌ | Columnar, ideal for analytics |
| PostgreSQL | Relational DB | ❌ (Server-based) | ✅ | ✅✅✅ | ✅✅ | ✅✅ | ✅✅✅ (ACID, MVCC) | ✅✅ (TSVector) | Full-featured RDBMS |

Legend

  • ✅: Supported / Good
  • ✅✅: Strong
  • ✅✅✅: Best-in-class
  • 🟡: Limited or trade-offs
  • ❌: Not supported

In-Progress

SBT Configuration

To add all modules:

libraryDependencies += "com.outr" %% "lightdb-all" % "4.12.0-SNAPSHOT"

For a specific implementation like Lucene:

libraryDependencies += "com.outr" %% "lightdb-lucene" % "4.12.0-SNAPSHOT"

Videos

Watch this Java User Group demonstration of LightDB

Getting Started

This guide will walk you through setting up and using LightDB, a high-performance computational database. We'll use a sample application to explore its key features.

NOTE: This project uses Rapid (https://github.com/outr/rapid) for effects. It's somewhat similar to cats-effect, but focused on virtual threads and simplicity. In a normal project you likely wouldn't call .sync() on every task; this documentation does so only to run each snippet synchronously (blocking).
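
For instance, application code would typically compose Tasks and run them once at the edge of the program. A minimal sketch, assuming the db and Person defined later in this guide:

import rapid.Task

// Compose effects instead of calling .sync() after each step
val program: Task[Person] = db.init.flatMap { _ =>
  db.people.transaction(_.insert(Person(name = "Adam", age = 21)))
}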


Prerequisites

Ensure you have the following:

  • Scala installed
  • SBT (Scala Build Tool) installed

Setup

Add LightDB to Your Project

Add the following dependency to your build.sbt file:

libraryDependencies += "com.outr" %% "lightdb-all" % "4.12.0-SNAPSHOT"

Example: Defining Models and Collections

Step 1: Define Your Models

LightDB uses Document and DocumentModel for schema definitions. Here's an example of defining a Person and City:

import lightdb._
import lightdb.id._
import lightdb.store._
import lightdb.doc._
import fabric.rw._

case class Person(
  name: String,
  age: Int,
  city: Option[City] = None,
  nicknames: Set[String] = Set.empty,
  friends: List[Id[Person]] = Nil,
  _id: Id[Person] = Person.id()
) extends Document[Person]

object Person extends DocumentModel[Person] with JsonConversion[Person] {
  override implicit val rw: RW[Person] = RW.gen

  val name: I[String] = field.index("name", _.name)
  val age: I[Int] = field.index("age", _.age)
  val city: I[Option[City]] = field.index("city", _.city)
  val nicknames: I[Set[String]] = field.index("nicknames", _.nicknames)
  val friends: I[List[Id[Person]]] = field.index("friends", _.friends)
}
case class City(name: String)

object City {
  implicit val rw: RW[City] = RW.gen
}

Step 2: Create the Database Class

Define the database with stores for each model:

import lightdb.sql._
import lightdb.store._
import lightdb.upgrade._
import java.nio.file.Path

object db extends LightDB {
  override type SM = CollectionManager
  override val storeManager: CollectionManager = SQLiteStore
   
  lazy val directory: Option[Path] = Some(Path.of("docs/db/example"))
   
  lazy val people: Collection[Person, Person.type] = store(Person)

  override def upgrades: List[DatabaseUpgrade] = Nil
}

Using the Database

Step 1: Initialize the Database

Initialize the database:

db.init.sync()

Step 2: Insert Data

Add records to the database:

val adam = Person(name = "Adam", age = 21)
// adam: Person = Person(
//   name = "Adam",
//   age = 21,
//   city = None,
//   nicknames = Set(),
//   friends = List(),
//   _id = StringId("zdX8DTpGZyn3MkGJhHaKnUz1cu9ZUdrK")
// )
db.people.transaction { implicit txn =>
  txn.insert(adam)
}.sync()
// res1: Person = Person(
//   name = "Adam",
//   age = 21,
//   city = None,
//   nicknames = Set(),
//   friends = List(),
//   _id = StringId("zdX8DTpGZyn3MkGJhHaKnUz1cu9ZUdrK")
// )

Step 3: Query Data

Retrieve records using filters:

db.people.transaction { txn =>
  txn.query.filter(_.age BETWEEN 20 -> 29).toList.map { peopleIn20s =>
    println(s"People in their 20s: $peopleIn20s")
  }
}.sync()
// People in their 20s: List(Person(Adam,21,None,Set(),List(),StringId(IDmTU51mzoBQCEyaxBuHrwtLEcmHTags)), Person(Adam,21,None,Set(),List(),StringId(KGrBn5aofL4Nr9U3rhfv3dFHFiZQLBBp)), Person(Adam,21,None,Set(),List(),StringId(zKsjLb0Oh67NU7cXuCqefzuYqEkLNYou)), Person(Adam,21,None,Set(),List(),StringId(YtDDj7Lf0ys2sVAl5KbaGwYX1cRJdV41)), Person(Adam,21,None,Set(),List(),StringId(JzoJoBINhzejipsrAYzdaUGVvlxEFW5g)), Person(Adam,21,None,Set(),List(),StringId(5o9UsGhDtjTKVOLvHZCg0Y9CYjoh5g7C)), Person(Adam,21,None,Set(),List(),StringId(SpOTvdzPy3w302cWeQXRvtuVrJFDm13Z)), Person(Adam,21,None,Set(),List(),StringId(9WD5mBb0Y5IXtF2vuDa7fi8Y0pSw0Da0)), Person(Adam,21,None,Set(),List(),StringId(1gh7JtBVdDNqjihBDogvU4NNRGPJsXkb)), Person(Adam,21,None,Set(),List(),StringId(Xa6wUoSrdhjLP2vkbKyiyUjlyWBAz4kD)), Person(Adam,21,None,Set(),List(),StringId(iT74rK8QvkrRf6DrevkvgwQcHRgFuoUE)), Person(Adam,21,None,Set(),List(),StringId(QLvnBifleraDeNmCHkKeIPqyzhnib2Eg)), Person(Adam,21,None,Set(),List(),StringId(SJAjOvPNYLRg5wQ00zxEZUOUES7tCxcP)), Person(Adam,21,None,Set(),List(),StringId(zdX8DTpGZyn3MkGJhHaKnUz1cu9ZUdrK)))
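
Queries can also sort and page. A hedged sketch; the Sort.ByField spelling and .descending helper are assumptions about the query DSL:

import lightdb.Sort

// Hypothetical: the ten oldest people
db.people.transaction { txn =>
  txn.query
    .sort(Sort.ByField(Person.age).descending)
    .limit(10)
    .toList
}.sync()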

Feature Highlights

  1. Transactions: LightDB ensures atomic operations within transactions.

  2. Indexes: Support for various indexes, like tokenized and field-based, ensures fast lookups.

  3. Aggregation: Perform aggregations such as min, max, avg, and sum.

  4. Streaming: Stream records for large-scale queries (see the sketch after this list).

  5. Backups and Restores: Backup and restore databases seamlessly.

  6. Prefix-Scanned File Storage (chunked blobs): Store file metadata under file:<id> and data chunks under data::<id>::<chunk>. Requires a prefix-capable store: RocksDB, LMDB, or MapDB (B-Tree).

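For example, a query's results can be consumed incrementally instead of being materialized as a List. A hedged sketch, assuming the query exposes its results as a rapid Stream via .stream:

db.people.transaction { txn =>
  txn.query
    .stream // assumed: lazily stream matching documents
    .count  // consume without holding every record in memory
}.sync()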

Advanced Queries

Aggregations

db.people.transaction { txn =>
  txn.query
    .aggregate(p => List(p.age.min, p.age.max, p.age.avg, p.age.sum))
    .toList
    .map { results =>
      println(s"Results: $results")
    }
}.sync()
// Results: List(MaterializedAggregate({"ageMin": 21, "ageMax": 21, "ageAvg": 21.0, "ageSum": 294},repl.MdocSession$MdocApp$Person$@184126b2))

Grouping

db.people.transaction { txn =>
  txn.query.grouped(_.age).toList.map { grouped =>
    println(s"Grouped: $grouped")
  }
}.sync()
// Grouped: List(Grouped(21,List(Person(Adam,21,None,Set(),List(),StringId(IDmTU51mzoBQCEyaxBuHrwtLEcmHTags)), Person(Adam,21,None,Set(),List(),StringId(KGrBn5aofL4Nr9U3rhfv3dFHFiZQLBBp)), Person(Adam,21,None,Set(),List(),StringId(zKsjLb0Oh67NU7cXuCqefzuYqEkLNYou)), Person(Adam,21,None,Set(),List(),StringId(YtDDj7Lf0ys2sVAl5KbaGwYX1cRJdV41)), Person(Adam,21,None,Set(),List(),StringId(JzoJoBINhzejipsrAYzdaUGVvlxEFW5g)), Person(Adam,21,None,Set(),List(),StringId(5o9UsGhDtjTKVOLvHZCg0Y9CYjoh5g7C)), Person(Adam,21,None,Set(),List(),StringId(SpOTvdzPy3w302cWeQXRvtuVrJFDm13Z)), Person(Adam,21,None,Set(),List(),StringId(9WD5mBb0Y5IXtF2vuDa7fi8Y0pSw0Da0)), Person(Adam,21,None,Set(),List(),StringId(1gh7JtBVdDNqjihBDogvU4NNRGPJsXkb)), Person(Adam,21,None,Set(),List(),StringId(Xa6wUoSrdhjLP2vkbKyiyUjlyWBAz4kD)), Person(Adam,21,None,Set(),List(),StringId(iT74rK8QvkrRf6DrevkvgwQcHRgFuoUE)), Person(Adam,21,None,Set(),List(),StringId(QLvnBifleraDeNmCHkKeIPqyzhnib2Eg)), Person(Adam,21,None,Set(),List(),StringId(SJAjOvPNYLRg5wQ00zxEZUOUES7tCxcP)), Person(Adam,21,None,Set(),List(),StringId(zdX8DTpGZyn3MkGJhHaKnUz1cu9ZUdrK)))))

Backup and Restore

Backup your database:

import lightdb.backup._
import java.io.File

DatabaseBackup.archive(db.stores, new File("backup.zip")).sync()
// res5: Int = 15

Restore from a backup:

DatabaseRestore.archive(db, new File("backup.zip")).sync()
// res6: Int = 15

File Storage (prefix, chunked)

Prefix-capable stores only: RocksDB, LMDB, MapDB (B-Tree). Metadata lives at file:<id>, chunks at data::<id>::<chunk>, enabling ordered streaming by chunk index.

import lightdb.file.FileStorage
import lightdb.rocksdb.RocksDBStore    // or LMDBStore / MapDBStore
import lightdb.KeyValue
import rapid.Stream
import java.nio.file.Path

object fileDb extends LightDB {
  override type SM = RocksDBStore.type
  override val storeManager: RocksDBStore.type = RocksDBStore
  override val directory = Some(Path.of("db/files"))
  override def upgrades = Nil
}

fileDb.init.sync()

// Use a dedicated KeyValue store for files (prefix-capable manager required)
val fs = FileStorage(fileDb, "_files")

// Write (chunk size = 4 bytes)
val meta = fs.put("hello.txt", Stream.emits(Seq("Hello RocksDB!".getBytes("UTF-8"))), chunkSize = 4).sync()

// Read back
val bytes = fs.readAll(meta.fileId).sync().flatten
println(new String(bytes, "UTF-8")) // Hello RocksDB!

// List and delete
fs.list.sync().map(_.fileName)
fs.delete(meta.fileId).sync()

Full-Text Search (Lucene)

import lightdb._
import lightdb.lucene.LuceneStore
import lightdb.doc._
import lightdb.id.Id
import fabric.rw._
import java.nio.file.Path

case class Note(text: String, _id: Id[Note] = Id()) extends Document[Note]
object Note extends DocumentModel[Note] with JsonConversion[Note] {
  implicit val rw: RW[Note] = RW.gen
  val text = field.tokenized("text", _.text)
}

object luceneDb extends LightDB {
  type SM = LuceneStore.type
  val storeManager = LuceneStore
  val directory = Some(Path.of("db/lucene"))
  val notes = store(Note)
  def upgrades = Nil
}

luceneDb.init.sync()
luceneDb.notes.transaction(_.insert(Note("the quick brown fox"))).sync()
// res8: Note = Note(
//   text = "the quick brown fox",
//   _id = StringId("jG4vbbyuCaioTXX3GpJV8pxfkcVMbxez")
// )
val hits = luceneDb.notes.transaction { txn =>
  txn.query.search.flatMap(_.list)
}.sync()
// hits: List[Note] = List(
//   Note(
//     text = "the quick brown fox",
//     _id = StringId("KFZdfuQF6mqDk4l1xVggnZduOBEqCkdD")
//   ),
//   Note(
//     text = "the quick brown fox",
//     _id = StringId("mioULpUq2rDeJOGqdeGxckoN8HpWL1rY")
//   ),
//   Note(
//     text = "the quick brown fox",
//     _id = StringId("jG4vbbyuCaioTXX3GpJV8pxfkcVMbxez")
//   )
// )
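
To match tokens rather than list everything, apply a filter on the tokenized field. A hedged sketch; the words matcher name is an assumption about the filter DSL:

val foxes = luceneDb.notes.transaction { txn =>
  txn.query.filter(_.text.words("fox")).toList // hypothetical `words` matcher
}.sync()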

Spatial Queries

import lightdb._
import lightdb.doc._
import lightdb.id.Id
import lightdb.spatial.Point
import lightdb.distance._
import lightdb.sql.SQLiteStore
import fabric.rw._
import java.nio.file.Path

case class Place(name: String, loc: Point, _id: Id[Place] = Id()) extends Document[Place]
object Place extends DocumentModel[Place] with JsonConversion[Place] {
  implicit val rw: RW[Place] = RW.gen
  val name = field("name", _.name)
  val loc  = field.index("loc", _.loc) // index for spatial queries
}

object spatialDb extends LightDB {
  type SM = SQLiteStore.type
  val storeManager = SQLiteStore
  val directory = Some(Path.of("db/spatial"))
  val places = store(Place)
  def upgrades = Nil
}

spatialDb.init.sync()
spatialDb.places.transaction(_.insert(Place("NYC", Point(40.7142, -74.0119)))).sync()
// res10: Place = Place(
//   name = "NYC",
//   loc = Point(latitude = 40.7142, longitude = -74.0119),
//   _id = StringId("vPweysyUCPYjwqC4ubdkaaqCHaYJFwl1")
// )
// Distance filters are supported on spatial-capable backends; example filter:
val nycFilter = Place.loc.distance(Point(40.7, -74.0), 5_000.meters)
// nycFilter: Filter[Place] = Distance(
//   fieldName = "loc",
//   from = Point(latitude = 40.7, longitude = -74.0),
//   radius = Distance(5000.0)
// )
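
On a spatial-capable backend the filter plugs directly into a query. A minimal sketch, assuming the SQLite-backed store here accepts distance filters:

val nearby = spatialDb.places.transaction { txn =>
  txn.query
    .filter(_.loc.distance(Point(40.7, -74.0), 5_000.meters)) // same filter shape as nycFilter above
    .toList
}.sync()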

Graph Traversal (Edges)

import lightdb._
import lightdb.doc._
import lightdb.graph.{EdgeDocument, EdgeModel}
import lightdb.id.{EdgeId, Id}
import fabric.rw._
import java.nio.file.Path

case class GPerson(name: String, _id: Id[GPerson] = Id()) extends Document[GPerson]
object GPerson extends DocumentModel[GPerson] with JsonConversion[GPerson] {
  implicit val rw: RW[GPerson] = RW.gen
  val name = field("name", _.name)
}

case class Follows(_from: Id[GPerson], _to: Id[GPerson]) extends EdgeDocument[Follows, GPerson, GPerson] {
  override val _id: EdgeId[Follows, GPerson, GPerson] = EdgeId(_from, _to)
}
object Follows extends EdgeModel[Follows, GPerson, GPerson] with JsonConversion[Follows] {
  implicit val rw: RW[Follows] = RW.gen
}

object graphDb extends LightDB {
  type SM = lightdb.store.hashmap.HashMapStore.type
  val storeManager = lightdb.store.hashmap.HashMapStore
  val directory = None
  val people = store(GPerson)
  val follows = store(Follows)
  def upgrades = Nil
}

graphDb.init.sync()
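
Edges insert like any other document. Querying by an edge's endpoint below is a hedged sketch; the indexed _from field and === matcher are assumptions about EdgeModel's generated fields:

val alice = GPerson("Alice")
val bob   = GPerson("Bob")
graphDb.people.transaction { txn =>
  txn.insert(alice).flatMap(_ => txn.insert(bob))
}.sync()
graphDb.follows.transaction(_.insert(Follows(alice._id, bob._id))).sync()

// Who does Alice follow?
val aliceFollows = graphDb.follows.transaction { txn =>
  txn.query.filter(_._from === alice._id).toList
}.sync()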

Split Collection (storage + search)

import lightdb._
import lightdb.doc._
import lightdb.id.Id
import lightdb.store.split.SplitStoreManager
import lightdb.rocksdb.RocksDBStore
import lightdb.lucene.LuceneStore
import fabric.rw._
import java.nio.file.Path

case class Article(title: String, body: String, _id: Id[Article] = Id()) extends Document[Article]
object Article extends DocumentModel[Article] with JsonConversion[Article] {
  implicit val rw: RW[Article] = RW.gen
  val title = field.index("title", _.title)
  val body  = field.tokenized("body", _.body)
}

object splitDb extends LightDB {
  type SM = SplitStoreManager[RocksDBStore.type, LuceneStore.type]
  val storeManager = SplitStoreManager(RocksDBStore, LuceneStore)
  val directory = Some(Path.of("db/split"))
  val articles = store(Article)
  def upgrades = Nil
}
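
Usage mirrors any other store: documents persist in RocksDB while Lucene indexes the tokenized body for search. A minimal sketch following the same patterns as the sections above:

splitDb.init.sync()
splitDb.articles.transaction(_.insert(Article("Split stores", "storage in RocksDB, search in Lucene"))).sync()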

Sharded / MultiStore

import lightdb._
import lightdb.doc._
import lightdb.id.Id
import lightdb.store.hashmap.HashMapStore
import fabric.rw._

case class TenantDoc(value: String, _id: Id[TenantDoc] = Id()) extends Document[TenantDoc]
object TenantDoc extends DocumentModel[TenantDoc] with JsonConversion[TenantDoc] {
  implicit val rw: RW[TenantDoc] = RW.gen
  val value = field("value", _.value)
}

object shardDb extends LightDB {
  type SM = HashMapStore.type
  val storeManager = HashMapStore
  val directory = None
  def upgrades = Nil
  val shards = multiStore[String, TenantDoc, TenantDoc.type](TenantDoc, key => s"tenant_$key")
}

val tenantA = shardDb.shards("tenantA")
// tenantA: HashMapStore[TenantDoc, TenantDoc] = lightdb.store.hashmap.HashMapStore@159cebb8
tenantA.transaction(_.insert(TenantDoc("hello"))).sync()
// res12: TenantDoc = TenantDoc(
//   value = "hello",
//   _id = StringId("WHldMxQ6dxcjgHfafu92XV4qGuGpWks8")
// )

Stored Values (config flags)

import lightdb._
import fabric.rw._

object cfgDb extends LightDB {
  type SM = lightdb.store.hashmap.HashMapStore.type
  val storeManager = lightdb.store.hashmap.HashMapStore
  val directory = None
  def upgrades = Nil
}

cfgDb.init.sync()
val featureFlag = cfgDb.stored[Boolean]("featureX", default = false)
// featureFlag: StoredValue[Boolean] = StoredValue(
//   key = "featureX",
//   store = lightdb.store.hashmap.HashMapStore@42502db4,
//   default = repl.MdocSession$MdocApp$$Lambda/0x0000000053769240@4d171618,
//   persistence = Stored
// )
featureFlag.set(true).sync()
// res14: Boolean = true
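
Reading the value back is symmetric. A hedged one-liner, assuming StoredValue exposes a get returning a Task:

featureFlag.get.sync() // true, after the set above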

SQL Stores (DuckDB / SQLite)

import lightdb._
import lightdb.doc._
import lightdb.id.Id
import lightdb.sql.SQLiteStore
import fabric.rw._
import java.nio.file.Path

case class Row(value: String, _id: Id[Row] = Id()) extends Document[Row]
object Row extends DocumentModel[Row] with JsonConversion[Row] {
  implicit val rw: RW[Row] = RW.gen
  val value = field("value", _.value)
}

object sqlDb extends LightDB {
  type SM = SQLiteStore.type
  val storeManager = SQLiteStore
  val directory = Some(Path.of("db/sqlite-example"))
  val rows = store(Row)
  def upgrades = Nil
}

sqlDb.init.sync()
sqlDb.rows.transaction(_.insert(Row("hi sql"))).sync()
// res16: Row = Row(
//   value = "hi sql",
//   _id = StringId("RuOBLLb87vBewnsUDQ1nqE6Cq2sXzULT")
// )

Reindex / Optimize / Upgrades

  • store.reIndex() and store.optimize() give backends a chance to rebuild indexes or compact data (sketched below).
  • Database upgrades: implement upgrades: List[DatabaseUpgrade] with your migration steps; LightDB runs them on init (also sketched below).
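
A hedged sketch of both. Calling reIndex()/optimize() directly on a collection and the DatabaseUpgrade member names (label, applyToNew, blockStartup, alwaysRun, upgrade) are assumptions about the API's shape:

import lightdb.upgrade.DatabaseUpgrade
import rapid.Task

// Maintenance hooks, assumed callable on a collection
db.people.reIndex().sync()
db.people.optimize().sync()

// A migration step; register it via `override def upgrades = List(BackfillNicknames)`
object BackfillNicknames extends DatabaseUpgrade {
  override def label: String = "backfill-nicknames"
  override def applyToNew: Boolean = false   // fresh databases don't need it
  override def blockStartup: Boolean = true  // finish before the db is usable
  override def alwaysRun: Boolean = false    // run once, then record completion
  override def upgrade(ldb: LightDB): Task[Unit] =
    Task.unit // a real migration would rewrite existing records here
}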

Clean Up

Dispose of the database when done:

db.dispose.sync()

Conclusion

This guide provided an overview of using LightDB. Experiment with its features to explore the full potential of this high-performance database. For advanced use cases, consult the API documentation.