Welcome!
Contact :: Socials
Highlights :: Recent
Orange Words is a Hacker News search application. It's a small hobby project where I tinker with the combination of Hacker News data, search, RAG, and machine learning. (2023/2024)
[How are Embeddings Affecting Traditional Text Search?]
In this post, I provide an informal explanation of lexical, semantic, and hybrid search for text documents. (Spring 2024)
[Venra]
Venra is a Python package which provides a simple, high-level API for vespa.ai. It targets subsets of Vespa's query, document, and system APIs. It aims to encapsulate the complexity of dealing with the Vespa HTTP interfaces, response behaviors, and JSON responses for common client tasks. (2023/2024)
Older Projects, Writing, and Code
[ski - A text based skiing game]
A number of years ago, the code for a variation of this game was published in an Apple BASIC magazine. We eagerly typed it in and played the game for hours on an Apple IIGS. This small Python script attempts to recreate that game.
[Acid-NLU]
A collection of intent datasets for natural language understanding. (Summer of 2020)
[TensorFlow Serving Model Status Probe]
A Kubernetes probe which checks model status for a TensorFlow Serving model. (Winter of 2020)
[sine wave model on Adafruit EdgeBadge]
Tinkering with the Adafruit EdgeBadge and TensorFlow Lite. (Winter of 2020)
[wer - A word error rate util for golang]
wer is a Go package which provides a function for calculating word error rate and word accuracy. It expects a pair of pre-tokenized and optionally pre-processed strings. (Winter of 2020)
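For readers unfamiliar with the metric, here is a minimal Python sketch of word error rate: the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis, divided by the number of reference words. This is not the wer package's code or API. The function names are hypothetical, and splitting on whitespace is an assumption standing in for "pre-tokenized" input.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both strings are already tokenized, with tokens
    separated by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference tokens
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER. Can be negative when the hypothesis has many insertions."""
    return 1.0 - word_error_rate(reference, hypothesis)
```

For example, comparing "the cat sat on the mat" with "the cat sat on mat" counts one deletion against six reference words.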
[Similarity Search and Hashing for Text Documents]
This is a high-level overview of similarity hashing and search for text, circa 2015. It was written while working at Catalyst, in the domain of ediscovery and document search. These techniques have mostly been superseded by learned, dense vector representations à la BERT and other self-supervised language models based on transformer architectures. (Spring of 2015)
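To give a flavor of what similarity hashing means, here is a small Python sketch of MinHash, one well-known technique in this family. Choosing MinHash is an assumption made for illustration, and the post itself may cover different methods. All function names, the word-shingle size, and the use of salted BLAKE2b as the hash family are hypothetical choices.

```python
import hashlib


def shingles(text: str, k: int = 3) -> set[str]:
    """Return the set of k-word shingles of a text (lowercased)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def minhash_signature(items: set[str], num_hashes: int = 64) -> list[int]:
    """For each of num_hashes salted hash functions, keep the minimum hash."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """The fraction of matching positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Two documents that share most of their shingles will agree in most signature positions, so near-duplicates can be found by comparing short signatures instead of full texts.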
[Netropy]
Netropy is a Python interface to the NIST Randomness Beacon. (Spring of 2014)
For a time, with amazing partners, I helped run a local digital newspaper for Post, Texas and Garza County. The site was built on Django and related technology. It is now preserved as a basic HTML archive. (Summer 2011 through Winter 2012)
[Booster]
Booster is an XQuery module which provides an HTTP interface to a portion of the MarkLogic API for administrative tasks. It is intended to reside within the default Admin app server and provide a remotely accessible hook for automated configuration. (Spring of 2011)
[Ocelog]
Ocelog is an experimental HTTP gateway to syslog. Clients make HTTP POST requests to the ocelog server, and those requests are authenticated, validated, and then written to syslog on the local host in standard syslog format. (Winter of 2010)
[Selecting a Language Detection Toolkit]
This is an analysis and write-up we completed as a team at Catalyst, around the Spring of 2009. We were exploring language detection tools to apply to a large corpus of multilingual documents in the domain of ediscovery. (Spring of 2009)
Somewhere around 1998, I stumbled on the expired telnet.org domain and was able to register it. I had nothing to do with the development of telnet, but this was an exciting era when I was learning Linux and related tech. Telnet was still in common use as server and client software. I've run the site and hosted shell accounts ever since, mostly out of nostalgia. (~1998-today)