Everything in one binary
Time-series, graph, geospatial, vector and full-text search, on-device AI, APIs, security and ML — unified in a single ~3.5 MB self-contained binary. One engine, one transaction, one programming model. Here is what GreyCat does.
Unified storage
A durable, transactional store that holds time-series, ordered indexes, sparse lists and spatial data side by side — no separate databases to stitch together.
Native time-series
nodeTime with point-in-time "as-of" resolveAt queries and server-side sampling: temporal storage built into the engine.
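To make the "as-of" idea concrete, here is a minimal Python sketch of the lookup semantics, assuming that "as-of" means the most recent value written at or before the requested time. The TimeSeries class and its set_at / resolve_at methods are hypothetical names for illustration only; they are not GreyCat's nodeTime API.

```python
from bisect import bisect_right


class TimeSeries:
    """Toy in-memory series (hypothetical; not GreyCat's nodeTime)."""

    def __init__(self):
        self._times = []   # timestamps, kept sorted
        self._values = []  # values, parallel to _times

    def set_at(self, t, value):
        """Insert or overwrite the value stored at timestamp t."""
        i = bisect_right(self._times, t)
        if i > 0 and self._times[i - 1] == t:
            self._values[i - 1] = value
        else:
            self._times.insert(i, t)
            self._values.insert(i, value)

    def resolve_at(self, t):
        """Return the latest value at or before t, or None if there is none."""
        i = bisect_right(self._times, t)
        return self._values[i - 1] if i > 0 else None


series = TimeSeries()
series.set_at(10, "a")
series.set_at(20, "b")
```

Querying between two writes returns the earlier one, and querying before the first write returns nothing, which is the behavior a point-in-time read relies on.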
Ordered key index
nodeIndex gives you an ordered key index with O(log n) lookups for fast range and sorted access.
Large sparse lists
nodeList handles large, sparse lists efficiently without paying for the empty space.
Native spatial index
nodeGeo is a native spatial index, so geospatial data lives in the same store as everything else.
Durable transactional store
LMDB-backed and transactional, with online defrag plus full and incremental (delta) backups.
Schema migration
ABI-aware schema migration evolves your data model safely as your types change over time.
Graph & traversal
Model your domain as typed objects and walk the relationships directly — no impedance mismatch between storage, logic and API.
Typed persisted objects
Typed objects are persisted to disk and traversed with dot-notation — the graph is your object model.
No JOINs, no SQL/Cypher
Traverse relationships with a single dot — there are no JOINs and no separate SQL or Cypher query language to learn.
Geospatial
First-class geographic types and geometry, co-located with the rest of your data.
Geo type & calculations
A native geo type with distance, bearing and geohash operations built in.
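As an illustration of what distance and bearing calculations involve, the following Python sketch uses the standard haversine and initial-bearing formulas on a spherical Earth. It is an assumption that these match the formulas GreyCat uses; the function names are hypothetical and are not part of the geo type.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (spherical model)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees (0 = north, 90 = east)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
```

One degree of longitude along the equator comes out at roughly 111.2 km with this radius, a handy sanity check for any geo implementation.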
Geometry primitives
GeoCircle, GeoPoly and GeoBox geometry for spatial regions and containment.
Co-located with your data
Spatial data sits alongside time-series, graph and vectors — queried in the same engine, same transaction.
Vector & embeddings
A built-in vector index and on-device embeddings, so similarity search and AI features need no external vector database or embedding API.
Built-in vector index
VectorIndex uses HNSW approximate nearest-neighbor search with cosine, L2 and squared-L2 distances.
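The three distance options differ in what they measure. The Python sketch below shows the usual textbook definitions, assuming cosine distance is defined as one minus cosine similarity; how VectorIndex defines it internally is not stated here, and the function names are hypothetical.

```python
import math


def squared_l2(a, b):
    """Sum of squared component differences (no square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(squared_l2(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Squared L2 ranks neighbors in the same order as L2 while skipping the square root, which is why it is commonly offered as a cheaper alternative. Cosine distance ignores vector length and compares direction only.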
Tensors with FFT & SIMD
An n-dimensional Tensor type with FFT and SIMD acceleration for numeric workloads.
On-device embeddings
Generate text embeddings in-process via embed / embed_batch on a statically linked llama.cpp, so data never leaves the box.
On-device LLM generation (roadmap)
On-device LLM text generation and chat are defined in the API and on the roadmap. Today the live AI surface is embeddings and vector search.
Full-text & hybrid search
A complete search engine in a library — keyword, vector and hybrid, all C-accelerated and validated by thousands of tests.
One TextIndex, 15 modes
The text_search library's TextIndex covers 15 search modes, including BM25/BM25F, boolean, exact, fuzzy, phonetic, phrase, proximity, prefix, wildcard, semantic (vector) and hybrid search.
Hybrid with RRF
Hybrid search fuses keyword and vector results with Reciprocal Rank Fusion for the best of both worlds.
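Reciprocal Rank Fusion scores each document by summing 1 / (k + rank) over every ranking it appears in. The Python sketch below shows the technique on two small result lists. The constant k = 60 is the value commonly used in the literature; whether GreyCat uses the same constant is an assumption, and the function name is hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document id for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["d1", "d2", "d3"]  # e.g. from a BM25 query
vector_hits = ["d3", "d1", "d4"]   # e.g. from a nearest-neighbor query
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ["d1", "d3", "d2", "d4"]
```

Because only ranks are used, the keyword and vector scores never need to be normalized onto a common scale, which is the main reason RRF is a popular fusion choice.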
33 languages & RAG chunking
33-language tokenization and RAG-style document chunking, C-accelerated and validated by 2,400+ tests.
AI & agents
Turn any function into an agent tool and let external LLMs call GreyCat over a native protocol.
Built-in MCP server
A native Model Context Protocol server — two annotations turn any function into a callable MCP tool.
Auto-generated OpenAPI v3
OpenAPI v3 specs are generated automatically from your function signatures — no hand-written schemas.
Open skills marketplace
An open AI-agent skills marketplace at github.com/datathings/marketplace.
Serving & APIs
Ship the API, the web app and the typed client SDKs from the same binary.
HTTP server & RPC
An HTTP server with JSON-RPC 2.0 and path-RPC, gzip and keep-alive, plus a /files upload API.
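For readers new to JSON-RPC 2.0, this Python sketch builds a request envelope and unwraps a response as the specification describes. The method name "project::hello" is a hypothetical placeholder, and nothing here reflects GreyCat's actual endpoint paths or function names.

```python
import json


def jsonrpc_request(method, params, request_id):
    """Serialize a JSON-RPC 2.0 request object."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def parse_response(body):
    """Return the result of a JSON-RPC 2.0 response, raising on an error object."""
    message = json.loads(body)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"{error['code']}: {error['message']}")
    return message["result"]


# "project::hello" is a made-up method name used only for illustration.
payload = jsonrpc_request("project::hello", ["world"], 1)
```

The body produced here would be sent with an HTTP POST; the id field lets the client match each response to its request.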
Serve your web app
Static web serving from a webroot — ship a full web app from the same binary.
One-command typed SDKs
Generate typed client SDKs for TypeScript, Python, Java, Rust and C with a single command.
Security
Identity, access control and cryptography built in — not bolted on.
RBAC & token auth
Role-based access control via @permission / @role, token authentication and per-user file grants.
Enterprise OIDC SSO
Enterprise OpenID Connect single sign-on using the Authorization Code flow with PKCE and JWKS verification.
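PKCE protects the Authorization Code flow by binding the token request to a secret the client generated up front. The Python sketch below shows the client-side S256 derivation defined in RFC 7636; the function names are hypothetical, and this illustrates the protocol step in general rather than GreyCat's implementation.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes=32):
    """Random URL-safe verifier; 32 random bytes give 43 characters."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier):
    """BASE64URL(SHA-256(verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = code_challenge_s256(verifier)
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.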
Crypto toolbox
SHA-256, HMAC, RSA and UUID primitives, backed by mbedTLS.
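A typical use of these primitives is signing a message with HMAC over SHA-256 and verifying it later. The Python sketch below shows the pattern with the standard library; it illustrates the primitive only and assumes nothing about how GreyCat exposes it. The sign and verify helpers are hypothetical names.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Hex-encoded HMAC-SHA-256 tag for message under key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


tag = sign(b"secret-key", b"payload")
```

The constant-time comparison matters: a plain string comparison can leak how many leading characters of a forged tag were correct.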
Operations
Scheduling, parallelism, telemetry and reflection for running real workloads.
Scheduler & parallel jobs
A cron-like scheduler and parallel jobs with transactional merge strategies.
Telemetry & reflection
Runtime telemetry and reflection to introspect and observe your running application.
CSV schema inference
Auto-infer CSV schemas with Csv::analyze / Csv::generate.
Analytics & ML
Streaming statistics in the standard library and a full algebra library for machine learning and pattern detection.
Streaming statistics
Streaming Gaussian, Histogram, quantizers and sliding windows in the standard library.
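A streaming Gaussian keeps a running mean and variance without storing the samples. The Python sketch below uses Welford's online algorithm, a common way to do this; it is an assumption that GreyCat's Gaussian type works along similar lines, and the class shown is hypothetical.

```python
import math


class StreamingGaussian:
    """Running mean and variance via Welford's algorithm (hypothetical class)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        """Fold one sample into the running statistics in O(1) memory."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    def variance(self):
        """Sample variance (n - 1 denominator); 0.0 with fewer than 2 samples."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

    def std(self):
        """Sample standard deviation."""
        return math.sqrt(self.variance())


g = StreamingGaussian()
for sample in [2, 4, 4, 4, 5, 5, 7, 9]:
    g.add(sample)
```

Updating the mean before accumulating the squared deviation keeps the computation numerically stable compared with tracking the sum and sum of squares directly.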
Neural networks & PCA
The algebra library adds PCA, k-means and neural networks (regression, classification, autoencoder; Dense/LSTM/GRU/Conv2D layers).
Time-series patterns
DTW and SAX time-series pattern detection for motifs and similarity.
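Dynamic Time Warping compares two series while allowing them to stretch or compress in time. The Python sketch below is the classic O(n·m) dynamic-programming formulation with absolute difference as the local cost and no window constraint. It illustrates the technique only; the function name is hypothetical and the algebra library's actual interface is not shown here.

```python
def dtw_distance(a, b):
    """Unconstrained DTW cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two series with the same shape but different pacing, such as [1, 2, 3] and [1, 2, 2, 3], align with zero cost, which is what makes DTW useful for similarity search on time-series.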
Connectors
Native connectors for the systems industrial digital twins depend on.
Data & messaging
PostgreSQL, Apache Kafka, OPC-UA, MQTT and S3 connectors out of the box.
Transfer & mail
SMTP, SFTP/FTP, OpenStreetMap and IFC/BIM connectors for transfer, mail, mapping and building models.
Domain libraries
Weather, solar and power-grid domain libraries for industrial and energy use cases.
Developer experience
A single binary with the tooling you expect from a modern language.
One self-contained binary
A single ~3.5 MB self-contained binary with reproducible per-project version pinning.
Analyzer, LSP & formatter
A static analyzer and LSP (VS Code, JetBrains, Zed) plus an opinionated formatter.
First-class testing
A first-class @test framework built into the language.
Performance & sovereignty
Fast where it counts, scales where you need it, and yours to run.
SIMD/C-accelerated
SIMD/C-accelerated hot paths, with documented CSV ingestion of roughly 1.7 million rows per second.
From Raspberry Pi to terabytes
Scales from ARM/Raspberry Pi to terabytes and billions of nodes on the same engine.
Self-hosted & sovereign
Fully self-hosted with on-device AI — built in Luxembourg (EU).
Many-worlds branching (coming soon)
A many-worlds branching capability for what-if simulation is coming soon.