Georgi Gerganov, Author of llama.cpp and the Acoustic Keylogger
A profile of Bulgarian developer Georgi Gerganov, whose open-source llama.cpp library powers most local LLM inference but remains overshadowed by wrappers like Ollama. The article explores his 30+ projects spanning AI inference, acoustic keylogging, sound-based data transmission, and more.
Many users know Ollama as a tool for running local AI models, unaware that it is essentially a wrapper around llama.cpp, the open-source C/C++ library that performs the actual inference. Despite this fundamental dependence, Ollama receives credit that should go to Georgi Gerganov's underlying engine.
The Ollama Attribution Problem
Meta's recent announcement of multimodal LLaMA support thanked "partners in the AI community" including Ollama but failed to mention llama.cpp. Similarly, VS Code's GitHub Copilot local model support credits Ollama rather than the underlying engine doing the actual work. Gerganov remarked on the oversight with some irony but did not complain.
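More seriously, Ollama violates the terms of the MIT license, which requires the copyright and permission notice to be kept with copies of the software, by failing to credit llama.cpp's authors. The broader LLM enthusiast community also criticizes Ollama for: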
- Minimal contribution back to the parent project
- Misleading marketing claims suggesting full ChatGPT-scale models run on phones (actually only tiny models with slow inference)
- Poor default configurations (originally 2048-token context windows, inadequate for most tasks)
- Improper model naming, marketing inadequate distillates as full LLaMA models
- Forking open protocols like OCI while making them incompatible with standard tools
llama.cpp: The Core Technology
Gerganov began development in March 2023, shortly after Meta released the LLaMA models, building on his earlier whisper.cpp project (started in late 2022) for OpenAI's Whisper speech recognition model. llama.cpp is a C/C++ library for LLM inference that runs on ordinary CPUs, so modern large language models can run on regular computers and Android devices without a dedicated GPU.
The library leverages GGML, a universal tensor algebra library inspired by Fabrice Bellard's LibNC. Key technical achievements include:
- Multi-platform support: x86, ARM, CUDA, Metal, Vulkan, and SYCL
- Quantization: Ahead-of-time model quantization, with CPU kernels optimized for SIMD instruction sets (AVX, AVX2, AVX-512 for x86; Neon for ARM)
- GGUF format (introduced August 2023): Universal binary format storing tensors and metadata, supporting 2- to 8-bit integer quantization, standard floating-point formats, and even an exotic 1.56-bit quantization
In March 2024, developer Justine Tunney contributed optimized matrix multiplication kernels that improved performance for FP16 and 8-bit quantized data types. Tunney also created llamafile, which combines a model and llama.cpp into a single cross-platform executable.
Quantization and the GGUF Format
Quantization in neural networks stores parameters as low-precision integers (8 bits or fewer) instead of 16- or 32-bit floating-point values, dramatically reducing memory requirements and accelerating computation on resource-constrained devices. GGUF's design prioritizes fast model loading while supporting the range of precision levels that edge deployment requires.
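The memory savings can be illustrated with a small Python sketch of symmetric (absmax) block quantization. This is a generic textbook scheme chosen for illustration, not the exact algorithm behind any particular GGUF quantization type; the function names and the example figures are assumptions made for this sketch.

```python
from typing import List, Tuple


def quantize_block(values: List[float], bits: int = 8) -> Tuple[float, List[int]]:
    """Map a block of floats to signed integers that share one scale factor."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0
    quants = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale: float, quants: List[int]) -> List[float]:
    """Recover approximate floats; the error per value is at most about scale / 2."""
    return [scale * q for q in quants]


def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes), ignoring metadata."""
    return n_params * bits_per_weight / 8 / 1e9
```

Under these assumptions, a 7-billion-parameter model needs about 14 GB for its weights at 16 bits per weight and about 3.5 GB at 4 bits per weight. Real quantized formats also store per-block scales, so their effective bits per weight are somewhat higher than the nominal figure.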
The Acoustic Keylogger and Other Projects
Gerganov has created over 30 projects spanning a remarkable range of domains, demonstrating extraordinary productivity and creativity.
AI/ML Projects
- whisper.cpp: High-performance inference for OpenAI's Whisper speech recognition model
- GPT-J: CPU inference implementation
- keytap2 & keytap3: Acoustic keyboard eavesdropping via frequency n-grams, the "acoustic keylogger" referenced in the title. These tools analyze the sound of keyboard typing to determine which keys were pressed.
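The role n-gram frequencies can play in this kind of attack can be sketched in a few lines of Python. The sketch assumes an earlier stage has already turned the recorded keystroke sounds into a handful of equally long candidate transcriptions; it then ranks them by how plausible their character bigrams are under a small reference corpus. This is a toy illustration of the n-gram idea only, not keytap's actual pipeline, and the function names and corpus are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def train_bigrams(corpus: str) -> Counter:
    """Count overlapping character bigrams in a reference text."""
    return Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))


def score(text: str, counts: Counter) -> float:
    """Sum of smoothed log-frequencies of the bigrams in text (higher is more plausible)."""
    denom = sum(counts.values()) + len(counts) + 1
    return sum(
        math.log((counts[text[i:i + 2]] + 1) / denom)
        for i in range(len(text) - 1)
    )


def best_candidate(candidates: Iterable[str], counts: Counter) -> str:
    """Pick the candidate transcription whose bigrams look most like the corpus."""
    return max(candidates, key=lambda c: score(c, counts))


counts = train_bigrams("the cat sat on the mat. the rat ate the oat. ")
guess = best_candidate(["teh cat", "the cat", "hte cta"], counts)
```

Because the scores are sums over bigrams, comparing candidates of different lengths would need normalization; the sketch sidesteps that by assuming equal-length candidates.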
Audio and Communications
- ggwave: Sound-based data transmission library
- Waver: Ultrasonic file and message exchange (iOS and Android apps available)
- r2t2: Data transmission through computer speakers
- wave-share: Browser-based file sharing via sound
- Spectrogram: Real-time audio spectrum visualization
- GGMorse: Real-time Morse code decoding from audio
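Sound-based data transmission of the kind these projects perform can be sketched with simple frequency-shift keying (FSK): each 4-bit nibble is sent as one of 16 tones, and the receiver picks the loudest candidate tone in each time slot. The Python sketch below is a deliberately simplified, noise-free illustration. The sample rate, tone frequencies, and function names are assumptions for this example and do not reproduce ggwave's actual protocol, which, as far as is known, sends several tones at once and adds error correction.

```python
import math
from typing import List

SAMPLE_RATE = 48000    # samples per second (assumed)
SYMBOL_SAMPLES = 480   # 10 ms per symbol
BASE_FREQ = 2000.0     # tone for nibble 0, in Hz (assumed)
FREQ_STEP = 100.0      # SAMPLE_RATE / SYMBOL_SAMPLES, so tones are orthogonal per symbol


def nibble_freq(nibble: int) -> float:
    return BASE_FREQ + nibble * FREQ_STEP


def encode(payload: bytes) -> List[float]:
    """Turn each byte into two tone bursts: high nibble first, then low nibble."""
    samples: List[float] = []
    for byte in payload:
        for nibble in (byte >> 4, byte & 0x0F):
            f = nibble_freq(nibble)
            samples.extend(
                math.sin(2 * math.pi * f * i / SAMPLE_RATE)
                for i in range(SYMBOL_SAMPLES)
            )
    return samples


def tone_power(chunk: List[float], freq: float) -> float:
    """Energy of chunk at freq, computed as a single-bin discrete Fourier transform."""
    re = sum(s * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(chunk))
    im = sum(s * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(chunk))
    return re * re + im * im


def decode(samples: List[float]) -> bytes:
    """Recover bytes by choosing the strongest of the 16 tones in each symbol slot."""
    nibbles = []
    for start in range(0, len(samples) - SYMBOL_SAMPLES + 1, SYMBOL_SAMPLES):
        chunk = samples[start:start + SYMBOL_SAMPLES]
        nibbles.append(max(range(16), key=lambda n: tone_power(chunk, nibble_freq(n))))
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

In this idealized setting, `decode(encode(data))` returns the original bytes. A real acoustic channel would also need synchronization, tolerance to noise and echo, and some form of error correction.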
Text Interfaces
- ImTui: Immediate-mode text UI library
- slack (tui): Terminal Slack client
- hnterm: Hacker News console viewer
- wtf-tui: Dashboard utility interface
Games and Interactive Projects
- hnguessr: Guess Hacker News headlines game
- typing-battles: Multiplayer typing competition
- keytap-challenge: Guess typed text challenge
- the-story: Collaborative word-voting narrative
- wordle-bg: Bulgarian Wordle clone
- Diff Challenge: Bash-based puzzle game
Miscellaneous
- @tweet2btc: Bitcoin price prediction via Twitter polls
- @tweet2doom: Twitter bot playing Doom
- morse-meme: Morse code meme generator
- dot-to-ascii: Graphviz to ASCII converter
- lottery-check: Bulgarian lottery statistics tool
- ImGui-WS: Dear ImGui over WebSockets
Georgi Gerganov remains one of the most impactful yet under-recognized developers in the modern AI ecosystem. His work forms the invisible foundation upon which millions of users interact with large language models locally, even if most of them have never heard his name.