Death of Generalized Tools? Vector Embeddings and the Future of AI

The Case For and Against: “The Death of Generalized Tools”

The Case For: Generalized tools are relics of a one-size-fits-all era. Their inefficiencies—whether due to bloated features or lack of user empathy—alienate businesses. Hyper-specific solutions win by deeply embedding themselves in niche workflows, unlocking not just loyalty but also pricing power. Take modular ERP systems like Toolkit as an example: they empower SMBs to redefine their processes from scratch, something SAP could never achieve.

The Case Against: Generalized tools survive for a reason: scale and interconnectedness. While niches are attractive, fragmentation introduces complexity. Businesses relying on dozens of niche solutions face “integration fatigue,” creating bottlenecks and inefficiencies that negate their initial benefits. Moreover, niches rarely provide the scale needed for venture-backed returns, leaving them vulnerable to consolidation by larger players.

What I Believe: The tension between generalists and niches isn’t a zero-sum game. The winners will be those who embrace “modular consolidation”—tools that feel hyper-specific but integrate seamlessly into larger ecosystems. These businesses will dominate by offering the adaptability of niche solutions with the scale of generalized systems.

Unstructured Data

Current state of using unstructured data

  • What is Unstructured Data?

    • Information that does not conform to a predefined data model or schema.
    • Comprises 80-90% of all new data generated, offering immense value if harnessed effectively.
    • Its complexity and lack of structure, however, challenge traditional data infrastructure stacks.
      • There’s sometimes a misconception that investing in unstructured data infrastructure is unnecessary because AI models can learn directly from raw data.
        • Models trained on noisy or irrelevant data produce unreliable results
        • Preprocessing steps like data cleansing and normalization are essential in improving model accuracy and reducing computational costs.
          • Preprocessing reduces dimensionality and complexity, leading to faster training and lower resource consumption
    • As organizations increasingly recognize its potential, a new unstructured data stack has emerged, consisting of three core components: data extraction and ingestion, data processing, and data management. (Minimal extraction and text-cleansing sketches follow this outline.)
  • 1. Data Extraction and Ingestion

    • This step captures, extracts, transforms, and optimizes unstructured data for storage and further use.
    • Strawman Argument: “Traditional ETL processes are sufficient for handling unstructured data”
    • Rebuttal: This perspective underestimates the complexities involved in extracting meaningful information from unstructured sources
    • A. Capture and Extract:
      • Sources include social media, customer feedback, emails, and beyond.
      • Techniques: web scraping, API integrations, file parsing.
      • Teams may create custom extractors or rely on pre-built solutions to achieve high extraction accuracy.
      • Tech:
        • Web Scraping and APIs:
          • Tools like Scrapy and BeautifulSoup facilitate web scraping
          • Headless browsers like Puppeteer can handle dynamic content
        • File Parsing:
          • Handling diverse file formats (PDFs, DOCX, images) requires specialized parsers
          • Libraries like Apache Tika provide content detection and extraction
        • Advanced Extraction Tools:
          • Unstructured.io: Uses machine learning to parse complex documents
          • Lume AI: Specializes in natural language understanding to extract insights from textual data
        • Computer Vision in Data Extraction:
          • New startups use advanced computer vision to extract data from visual content
        • Unlike older Intelligent Document Processing (IDP) services using OCR, these modern tools leverage vision models to improve parsing accuracy, particularly for text-dominant modalities used by large language models (LLMs).
    • B. Partition and Optimize:
      • Data is semantically partitioned into smaller, logical units for contextual relevance.
        • Eg. Semantic Segmentation: Topic modeling and clustering algorithms partition data into coherent units
      • Results are formatted in machine-readable structures (e.g., JSON), enabling preprocessing tasks like cleaning and embedding generation.
    • C. Storage Destination:
      • Extracted data is stored in scalable systems like object storage data lakes or databases, ready for use in applications such as Retrieval-Augmented Generation (RAG).
        • Object Storage Systems: Solutions like Amazon S3 or Apache Hadoop’s HDFS provide scalable storage
        • Databases Optimized for Unstructured Data: NoSQL databases like MongoDB or Elasticsearch offer flexible schemas and powerful querying
    • Key Considerations:
      • Extraction Accuracy: Incorporating feedback loops and human-in-the-loop mechanisms can enhance accuracy
      • Performance: Parallel processing and hardware acceleration can address performance bottlenecks
      • Multimodal Support: Handling different data types in a unified pipeline is increasingly important
  • 2. Data Processing

    • Unstructured data undergoes further transformation and analysis to unlock its full utility.
    • Strawman Argument: “Once the data is extracted, processing unstructured data is no different from processing structured data”
    • Rebuttal: This overlooks the unique challenges posed by unstructured data during processing
    • Transformation and Cleansing:
      • Cleansing ensures data consistency, while normalization prepares it for downstream applications.
        • Data Cleansing: Spell correction, stop-word removal, tokenization for text data
        • Normalization: Converting data into a consistent format
        • Feature Engineering: Word embeddings and contextual embeddings transform textual data for machine learning
    • Processing Engines:
      • Categorized by their focus (structured vs. unstructured data), scalability (single-node vs. distributed), and languages (SQL vs. Python)
        • Horizontal Scaling: Distributing workloads across multiple nodes
        • Hardware Acceleration: Utilizing GPUs, TPUs, or FPGAs to accelerate computationally intensive tasks
        • Real-Time Processing: Stream processing systems like Apache Flink or Kafka Streams handle continuous data flows
        • Distributed Computing: Leveraging frameworks for parallel processing
      • Popular engines like Spark, Dask, and Modin cater primarily to structured data, but emerging tools like Daft are gaining attention for their ability to handle multimodal data efficiently in distributed environments.
    • Scalability Challenges:
      • Memory Management: Data streaming and on-the-fly processing can mitigate memory constraints
      • Compute Optimization: Hardware accelerators and optimized algorithms can address compute-intensive tasks
  • 3. Data Management

    • Strawman Argument: “Data management principles are universal; the same strategies used for structured data apply to unstructured data”
    • Rebuttal: Unstructured data introduces complexities in storage optimization, metadata management, and governance
    • The backbone of the unstructured data stack, data management encompasses the organization, storage, and governance of unstructured data.
      • Key Functions:
        • Organizing and storing data to ensure easy retrieval and analysis.
          • Metadata Management: Robust metadata schemas using JSON Schema etc
          • Indexing: Inverted indices for rapid retrieval of unstructured text data
        • Implementing data governance policies for compliance, security, and privacy.
          • Access Control: Role-based and attribute-based access controls
          • Audit Trails: Logging data access and modifications for compliance and forensics
      • Regulatory and Privacy Safeguards:
        • Policies control data access and usage, safeguarding sensitive information while empowering data-driven decision-making.
      • File Formats and Challenges:
        • Apache Parquet, a widely adopted column-oriented format, is prevalent in object storage systems but has limitations:
          • Loading a full page to serve a single-row lookup is inefficient for the random-access patterns common in unstructured data
          • Handling wide columns typical of unstructured data is resource-intensive.
          • Limited encoding options and metadata constraints at the page level hinder performance.
  • Conclusion
    • The unstructured data stack is still in its infancy, but it will mature as companies learn to transform this untapped resource into a competitive advantage. The stack’s evolution will undoubtedly shape the future of data infrastructure.
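To make the extraction-and-partitioning step above concrete, here is a minimal sketch of a web-page ingestion path: fetch a page, strip it to plain text, split it into chunks, and emit JSON records ready for embedding or object storage. It uses requests and BeautifulSoup as stand-ins for whatever extractor you actually run, and the chunk size, overlap, and field names are illustrative assumptions rather than recommendations.

import json

import requests
from bs4 import BeautifulSoup

def extract_text(url: str) -> str:
    # Capture: pull the raw HTML and strip tags down to visible text
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text(separator=" ", strip=True)

def partition(text: str, chunk_size: int = 800, overlap: int = 100) -> list:
    # Partition: naive fixed-size chunks with overlap; a real pipeline would
    # use semantic segmentation (sentence boundaries, topic models, etc.)
    chunks = []
    step = chunk_size - overlap
    for i, start in enumerate(range(0, max(len(text), 1), step)):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append({"chunk_id": i, "text": piece})
    return chunks

def ingest(url: str) -> str:
    # Optimize: emit machine-readable JSON, ready for embedding or a data lake
    records = [{"source": url, **chunk} for chunk in partition(extract_text(url))]
    return json.dumps(records)

# Example usage (hypothetical URL):
# print(ingest("https://example.com/article"))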
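A companion sketch for the cleansing and normalization step under Data Processing: lowercase, strip punctuation, tokenize, and drop stop words so downstream feature engineering or embedding sees consistent input. The tiny stop-word set is an illustrative assumption; real pipelines use full lists or library tokenizers.

import re

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "is"}  # illustrative subset

def cleanse(text: str) -> list:
    # Normalize case, strip punctuation, tokenize, and drop stop words
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return [token for token in text.split() if token not in STOP_WORDS]

print(cleanse("The customer's invoice is overdue, and the team is annoyed!"))
# ['customer', 's', 'invoice', 'overdue', 'team', 'annoyed']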

Tomorrow's Commons

An Innovation Cascade….

The Cutting-Edge Innovation Layer has historically been represented by expensive, institutional-level technology:

  • 1940s: ENIAC at roughly $500K (about $7M adjusted)
  • 1960s: IBM Mainframes at millions per unit
  • 1970s: Early minicomputers at hundreds of thousands

Today this layer is represented by closed-source cloud AI (OpenAI, Anthropic) and is characterized by:

  • Highest performance capabilities
  • Highest operational costs
  • Limited accessibility
  • First-mover advantage in new capabilities

The Commercial Adaptation Layer has historical parallels in:

  • 1980s: Business-grade minicomputers
  • 1990s: Enterprise software solutions
  • 2000s: Early cloud services

Today this layer is represented by open-source cloud AI (Llama, Mistral) and is characterized by:

  • Slightly behind cutting edge
  • More economical pricing
  • Broader accessibility
  • Proven technological approaches

The Mass Adoption Layer has historical examples including:

  • 1990s: Personal computers
  • 2000s: Open source software
  • 2010s: Mobile computing

Today this layer is represented by local inference and is characterized by:

  • Mature technology
  • Minimal operational costs
  • Universal accessibility
  • Maximum deployment flexibility

A key historical pattern shows that technology inevitably flows from expensive/exclusive to affordable/accessible:

  • Mainframes → Personal computers
  • Private networks → Internet
  • Premium software → Open source

Performance gaps between layers decrease over time:

  • Example: Modern $300 smartphone exceeds 1990s supercomputer
  • Example: Free Linux matches/exceeds commercial Unix

In the end state:

  • Mass adoption layer typically ends up with the majority of capabilities
  • Democratization of technology leads to greatest total impact
  • Innovation cycle continues with new cutting-edge developments

Great Election Conversations on Metaculus

I like seeing probability weights on different outcomes on Metaculus, and I’ve recently been following the 2024 US Presidential Election Winner question.

I’m not sure how accurate it is, but it’s fun to watch.

Unlike the other prediction markets – Polymarket, Kalshi, etc. – this one seems more grounded, and the comment section feels like a bunch of data scientists debating how each incident affects the market without the chaos from either party – at least relatively.

Others remind me of betting on sports or gamified gambling on politics without substance.

Try it out and read the comments!

Radiooo Project

The theory goes: limited options (curation mattered, but it was solo dumping, not essential) → universal consensus voting (multiple options, chosen based on actual democracy) → the age of too many good options (tell me what I like).

I BUILT AN ONLINE RADIO THAT I UPDATE EVERY 2 DAYS. I love sharing music w friends and wanted a corner on the internet to do that.

aava.club/songs

Thesis: there is something to be said about curation in the day we live in. We grew up on it. Music discovery was MTV and the billboard top 100. I was plugged into the radio growing up to just know what was considered “good”/”cool”.

Now it’s more general public consensus. TikTok allows for true virality of a snippet of a song. Twitter/Instagram/YouTube allow for open conversations on what’s good/not. What’s hot and what’s not.

Radio Collection

Current state of streaming and radio: Streaming revolutionised music listening by giving users on-demand access to everything — the opposite of radio. But, over time, streaming has started to look a lot like its predecessor. Streaming services now push algorithmically generated playlists and ready-made mixes to soundtrack activities, like working out and cooking. Spotify’s AI-powered voice DJ is a lot like listening to a radio DJ provide context on their curated mix of songs. We even have streaming “stations”! Where is all this heading?

Given the noise, the true winners can be picked. They’re generally agreed on. It became democratic.

But it isn’t inherently democratic. Sometimes we need the “real ones” or the “cultural curators” to tell us what’s good and what isn’t.

NOTICE: Gen Z didn’t grow up with MTV or radio. What was good was decided by consensus instead of top-down.

The recent incredible growth of “Youtube Reaction Channels” is an indication of that. Which leads me to…

We need new methods of content recommendation/curation that’s based on the curator’s taste.

Derrick Gee: a former radio show host, now on TikTok, who has very respectable and professional (but still loving) insight on music. People started flooding in to hear what he proposed. He sort of became a tastemaker for people who wanted to escape their current musical bubble.

He started making playlists (alongside other Spotify playlist makers who became professional discovery helpers).

This was the initial trigger for an inspiration I’ve believed in for a while. It’s not anything that’s new.

One thing that I know I’m good at is galvanizing a direction so that people buy into shit that’s cool.

  • Taste + Momentum + Leading

Radio Image

New twist on online listening:

True Radio (just online format that allows for discovery)

Radio that took true online form

  • The Lot Radio
  • [Dave & Central Cee pass through the booth for a special episode of Victory Lap Balamii](https://www.balamii.com/editorial/dave-central-cee-pass-through-the-booth-for-a-special-episode-of-victory-lap)
  • Lower Grand Radio

Personally, it feels very rewarding because it’s the intersection of all the things that I like. Imagine being Marty – the founder of Poolsuite – and saying this. This is exactly all of my worlds colliding.

Radio Image

I think there is something to be said about the original style (retro desktop) vibe that poolsuite.fm had created.

It could be vinyl, cassette, or CD players. Or it can randomly simulate other stuff. I think it’s incredible.

One thing about this musical experience: I want it to be as fun as humanly possible, and as fully realized on a computer as it can possibly be. FUN AND CULTURED.


Better Incremental Response Modeling


The article introduces the simplified X-Learner (Xs-Learner), a streamlined approach to uplift modeling that is easier to understand and implement than the traditional X-Learner. Uplift modeling goes beyond average treatment effects from A/B testing by estimating how a treatment’s effect varies across different users. Using the Lenta dataset, the author compares meta-learners like the S-Learner, T-Learner, and both versions of the X-Learner, demonstrating that the simplified Xs-Learner often performs as well or better in practice. The article includes Python code examples and evaluates models using Qini curves, concluding that the simplified X-Learner offers practical advantages.

Meta-learners like the S-Learner, T-Learner, and X-Learner are some of the most widely used approaches for uplift modeling. When learning about these approaches, I find that most people find the X-learner model somewhat confusing to understand. In this post, I describe a modified approach I call the simplified X-learner (Xs-learner) that is easier to understand, faster to implement, and in my experience often works as well or better in practice.

Uplift Modeling

A/B testing is a common method used at tech companies to make informed decisions. For example, imagine you want to send out a coupon to users and you want to know how much it will increase the chances of them completing their first order with your service. By running an A/B test, you can determine on average how effective the coupon is. However, you may also want to know for which users the coupon will help you generate higher profits and for which users it will cause you to lose money.

Uplift modeling is a technique that lets us go beyond learning the average effect of a treatment and instead helps us understand how the effect of the treatment varies across your users. This allows us to more efficiently decide which treatment to send to each user.

Meta-learners

Some of the most common approaches for solving uplift problems are known as meta-learners, because they take existing supervised learning algorithms and use their predictions to estimate the treatment effect for each user.

I’ll be demonstrating each of these approaches using a dataset from Lenta, a large Russian grocery store that sent out text messages to its users and measured whether the messages increased their probability of making a purchase. In each of the examples I will be using the following notation:

  • Y: Did the user make a purchase (the outcome variable)
  • T: Did the user receive a text message (the treatment variable)
  • X: All the other information we know about the user, e.g. age, gender, purchase history. (The Lenta dataset has almost 200 features describing each user)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from xgboost import XGBClassifier, XGBRegressor
from sklift.datasets import fetch_lenta
from sklift.viz import plot_qini_curve
from numpy.random import default_rng
rng = default_rng()

We’ll use the sklift package, which has a useful function that helps download the data for the Lenta uplift experiment and do some basic processing of the data.

data = fetch_lenta()
Y = data['target_name']
X = data['feature_names']
df = pd.concat([data['target'], data['treatment'], data['data']], axis=1)
gender_map = {'Ж': 0, 'М': 1}
group_map = {'test': 1, 'control': 0}
df['gender'] = df['gender'].map(gender_map)
df['treatment'] = df['group'].map(group_map)
T = 'treatment'

# Split our data into a training and an evaluation sample
df_train, df_test = train_test_split(df, test_size=0.3, random_state=42)

S-Learner

S-learner is the simplest and easiest to understand of these approaches. With S-learner you fit a single machine learning model using all of your data, with the treatment variable (did you get a text message) as one of the features. You can then use this model to predict “what would happen if the user got the text” and “what would happen if the user did not get the text”. The difference between these two predictions is your estimate of the treatment effect of the text message on the user.

In all my examples, I use XGBoost as a simple and effective baseline ML model that is fast to train and generally works well on many problems. In any real world problem you should be testing more than one type of model and should be doing cross validation to find hyperparameters that work well for your particular problem.

slearner = XGBClassifier()
slearner.fit(df_train[X+[T]], df_train[Y])

# Calculate the difference in predictions when T=1 vs T=0
# This is our estimate of the effect of the coupon for each user in our data
slearner_te = slearner.predict_proba(df_test[X].assign(**{T: 1}))[:, 1] \
            - slearner.predict_proba(df_test[X].assign(**{T: 0}))[:, 1]

One downside of the S-learner model is that there is nothing that tells the model to give special attention to the treatment variable. Often your machine learning model will focus on other variables that are stronger predictors of the outcome and end up ignoring the effect of the treatment, so on average your estimates of the treatment effect will be biased towards 0.

S-learner treatment effect distribution

T-learner

T-learner uses two separate models. The first model looks only at the users who did not receive the coupon. The second model looks only at the users who did receive the coupon. To predict the treatment effect, we take the difference between the predictions of these two models. T-learner essentially forces your models to pay attention to the treatment variable since you make sure that each of the models only focuses on either the treated or untreated observations in your data.

tlearner_0 = XGBClassifier()
tlearner_1 = XGBClassifier()

# Split data into treated and untreated
df_train_0 = df_train[df_train[T] == 0]
df_train_1 = df_train[df_train[T] == 1]

# Fit the models on each sample
tlearner_0.fit(df_train_0[X], df_train_0[Y])
tlearner_1.fit(df_train_1[X], df_train_1[Y])

# Calculate the difference in predictions
tlearner_te = tlearner_1.predict_proba(df_test[X])[:, 1] \
            - tlearner_0.predict_proba(df_test[X])[:, 1]

T-learner treatment effect distribution

Simplified X-learner (Xs-learner)

The simplified X-learner uses 3 models to form its predictions. The first two are exactly the same models we used for the T-learner: one model trained using only the treated observations, and the other model trained using only the untreated observations.

With the T-learner we formed our treatment effect estimates by taking the difference between the predictions of these two models (predicted outcome when treated minus predicted outcome when untreated). The Xs-learner instead takes the actual outcome of each user under the treatment they received and compares it to the predicted outcome had they received the other treatment (actual outcome minus predicted outcome).

# We could also just reuse the models we made for the T-learner
xlearner_0 = XGBClassifier()
xlearner_1 = XGBClassifier()

# Split data into treated and untreated
df_train_0 = df_train[df_train[T] == 0]
df_train_1 = df_train[df_train[T] == 1]

# Fit the models on each sample
xlearner_0.fit(df_train_0[X], df_train_0[Y])
xlearner_1.fit(df_train_1[X], df_train_1[Y])

# Calculate the difference between actual outcomes and predictions
xlearner_te_0 = xlearner_1.predict_proba(df_train_0[X])[:, 1] - df_train_0[Y]
xlearner_te_1 = df_train_1[Y] - xlearner_0.predict_proba(df_train_1[X])[:, 1]

We can’t use these differences directly, because we would not be able to make predictions for any new users since we wouldn’t know the actual outcomes for these new users. So we need to train one more model. This model predicts the treatment effect as a function of the X variables.

# Even though the outcome is binary, the treatment effects are continuous
xlearner_combined = XGBRegressor()

# Fit the combined model
xlearner_combined.fit(
  # Stack the X variables for the treated and untreated users
  pd.concat([df_train_0, df_train_1])[X],
  # Stack the X-learner treatment effects for treated and untreated users
  pd.concat([xlearner_te_0, xlearner_te_1])
)

# Predict treatment effects for each user
xlearner_simple_te = xlearner_combined.predict(df_test[X])

Simplified X-learner treatment effect distribution

Full X-learner

The simplified X-Learner required 3 ML models. The full X-learner as originally proposed by Künzel et al. requires 5 ML models.

Instead of fitting one combined model that predicts the treatment effects for everyone, the full X-learner uses two separate models, one for the treated users and one for the untreated users. This gives us two different models that can predict treatment effects for new users. Künzel et al. recommend taking a weighted average of the two models, with the weights determined by a final propensity score model that predicts the probability of receiving the treatment.

# Define the new models that are not used in the simple version
xlearner_te_model_0 = XGBRegressor()
xlearner_te_model_1 = XGBRegressor()
xlearner_propensity = XGBClassifier()

xlearner_te_model_0.fit(df_train_0[X], xlearner_te_0)
xlearner_te_model_1.fit(df_train_1[X], xlearner_te_1)

# Calculate predictions from both models
xlearner_te_model_0_te = xlearner_te_model_0.predict(df_test[X])
xlearner_te_model_1_te = xlearner_te_model_1.predict(df_test[X])

# Calculate the propensity scores
xlearner_propensity.fit(df_train[X], df_train[T])
xlearner_propensities = xlearner_propensity.predict_proba(df_test[X])[:, 1]

# Calculate the treatment effects as propensity weighted average
xlearner_te = xlearner_propensities * xlearner_te_model_0_te + \
              (1 - xlearner_propensities) * xlearner_te_model_1_te


Comparing the Results

We can compare the performance of each of these models using our held-out test set data. Here I am using Qini plots, which are a common approach for comparing the performance of Uplift models. Similar to an ROC curve, the higher the model’s line goes above the diagonal, the better the performance.

fig, ax = plt.subplots(figsize=(20, 10))

def plot_qini_short(model, label, color, linestyle):
    plot_qini_curve(df_test[Y], model, df_test[T], name=label, 
                    ax=ax, perfect=False, color=color, linestyle=linestyle)

plot_qini_short(slearner_te, 'Slearner', 'blue', 'solid')
plot_qini_short(tlearner_te, 'Tlearner', 'red', 'solid')
plot_qini_short(xlearner_simple_te, 'Xlearner Simple', 'purple', 'solid')
plot_qini_short(xlearner_te, 'Xlearner', 'green', 'solid')
ax.legend(loc='lower right');

For this particular dataset, the simplified X-Learner had the best overall performance.

We shouldn’t draw any strong conclusions about the relative performance of different algorithms from this single example. In my experience, which algorithm works best varies a lot depending on the specific problem you are working on. However, I do think this example demonstrates that the simplified X-Learner (Xs-learner) is one more approach worth considering when working on uplift problems.

References

  • Athey, Susan, and Guido W. Imbens. “Machine learning for estimating heterogeneous causal effects.” №3350. 2015. Link
  • Künzel, Sören R., et al. “Metalearners for estimating heterogeneous treatment effects using machine learning.” Proceedings of the national academy of sciences 116.10 (2019): 4156–4165. Link
  • Gutierrez, Pierre, and Jean-Yves Gérardy. “Causal inference and uplift modelling: A review of the literature.” International conference on predictive applications and APIs. PMLR, 2017. Link

Modern Data Stack this Modern Data Stack that

We’ve arrived at a point where the data landscape is a maze of tools, each serving a very specific purpose but often leading to a tangled web of integrations.

  • The result? An overwhelming number of back-office processes that need to be managed, maintained, and understood just to keep things running.

In traditional data workflows, data cleanup and structuring often happen as a back-office process—an expensive, time-consuming endeavor that demands constant attention.

  • But what if we could flip the script? What if the messy, unstructured data could be cleaned, transformed, and structured the moment it enters your system, right at the edge?

Instead of building a complex ecosystem of tools that need constant upkeep, what if we frontloaded more of these processes directly into our applications?

  • By simplifying the architecture and placing the emphasis on front-loaded processes, we can create a more direct path from data to decision-making—without the detour through a dozen different platforms.

What if, rather than relying on a mess of back-office data tools, we designed our systems to handle data transformation and integration closer to the user-facing side of things?

If we rethink our approach and bring data processes closer to the application layer, we can cut through the clutter and complexity of the data tool market.
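As a rough illustration of this front-loaded approach, here is a minimal sketch of validating and normalizing records at the application boundary instead of in a downstream back-office job. Pydantic is used purely as an example, and the event schema, field names, and normalization rules are hypothetical.

from datetime import datetime

from pydantic import BaseModel, field_validator

class SignupEvent(BaseModel):
    # Hypothetical event: cleaned and structured the moment it enters the system
    email: str
    country: str
    occurred_at: datetime

    @field_validator("email")
    @classmethod
    def normalize_email(cls, value: str) -> str:
        return value.strip().lower()

    @field_validator("country")
    @classmethod
    def normalize_country(cls, value: str) -> str:
        # Upper-case ISO-style codes so downstream joins don't need a cleanup pass
        return value.strip().upper()

# At the edge: normalize (or reject) immediately, and store only structured data
event = SignupEvent(email="  Ada@Example.COM ", country="gb", occurred_at="2024-05-01T12:00:00")
print(event.model_dump())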

Founders <> Open Water Swimming <> Ventures

  • Startups, Open Water Swimming, and Ventures

    • It’s interesting how a lack of resources can reveal who truly has what it takes.
      • When capital is plentiful, it’s easy to mistake luck for skill, or to think a solid business model is the reason behind success when it’s just favorable conditions.
      • The startups that endure are led by founders who not only survive but thrive amid adversity.
        • Providing too much early funding is like handing out boats—it speeds up the journey but hides who’s actually steering.
          • They might reach the next milestone faster, but we lose sight of who’s navigating. Is it a resourceful leader making wise decisions, or someone who would struggle the moment they have to swim on their own?
          • The path for startups, especially those seeking significant venture returns, demands more than a quick ride over calm waters.
            • Boats are helpful for short distances; most of the journey requires genuine swimming.
  • Becoming a Better Swimmer (Founder)

    • This involves:
      • Training: Mastering the techniques, understanding the currents.
      • Mentality: Having the courage to dive in, even when the waters are rough.
      • Experience: Building resilience from overcoming previous challenges.
      • Gear: While sometimes necessary, often it’s the mindset and endurance that matter most.
  • Evaluating Founders and Investments

    • Founders need to ask themselves honestly: Am I ready for this? Who will support me when the seas get rough? A VC like Benchmark isn’t just providing capital; they’re willing to swim alongside you if needed.
    • For investors, the challenge is to discern whether this person can “swim” through their specific market, considering the competition and potential hazards.
      • It takes deep knowledge of the “waters” to make the right judgment.
  • Testing in Calm Waters Before Facing the Open Sea

    • We look for founders who have proven they can swim in smaller, controlled environments—a local lake—before venturing into the vast ocean with them.
    • Programs like YC act as training grounds, transitioning founders from the safety of a pool to the unpredictability of open waters, but the real sea is a different realm entirely.
      • Some adapt seamlessly to the larger challenges, while others find themselves unprepared.
  • Cofounders: The Essential Crew

    • A cofounder with technical expertise is like having a seasoned swimmer who knows when to adjust their stroke and how to navigate changing tides.
    • While it’s possible to go it alone, a cofounder who understands your strengths and weaknesses from the start is invaluable.
      • They know when to shed unnecessary weight, hold onto what’s essential, and keep you afloat when the waves become overwhelming.
        • In the end, survival hinges on wisdom, skill, and time—lessons that only experience can teach.

Robust SQL Query Generator with Substrate

Building a Natural Language to SQL Query Generator

Purpose: To build a system that generates syntactically and contextually correct SQL queries from natural language inputs.

This is my experiment playing around with Substrate, which reduces the complexity of multi-model systems by providing a graph SDK.

Why Do I Love Substrate?

I think Substrate has several compelling advantages:

  • There should be a platform that takes open source models, optimizes them relentlessly, provides an API, and offers the most competitive pricing with great uptime
  • Long-term benefits from economies of scale with GPUs and optimization processes
  • High demand exists currently, with many users requiring high API volumes
  • Potential to train specialized, less powerful models optimized for cost/latency to counter foundation model companies focused primarily on capability

Counterpoints to Consider

While promising, there are some concerns:

  • Sustainability question: Will large model builders become quickly commoditized? Many startups may compete for the same developer dollars
  • Community optimization might outpace proprietary optimizations, similar to creating custom optimized PHP versions in 2001 - technical possibility but challenging business case

Implementation Details

Writing SQL with LLMs presents multiple challenges with hallucinations, not necessarily due to SQL generation itself, but due to contextual misuse.

With larger context windows, the problem becomes more pronounced as dumping all rows and context to prompts consumes excessive tokens for even simple queries.

The idea is to find a combination of Syntax and Context that’s both robust and efficient through:

  1. Mapping of the table being used
  2. Providing NLP-style SQL objects to combine for syntax

Setting Up the Environment

First, let’s set up our development environment by installing the necessary Python packages. We’ll use Pydantic for data validation and schema definition.

pip install pydantic
from pydantic import BaseModel, Field
from typing import Optional, Union, List
from enum import Enum

Defining Column Types and Enumerations

Let’s define enumerations for our database columns and SQL operations to ensure type safety:

class Departments(str, Enum):
    IT = "IT"
    SALES = "SALES"
    ACCOUNTING = "ACCOUNTING"
    CEO = "CEO"

class EmpLevel(str, Enum):
    JUNIOR = "JUNIOR"
    SEMISENIOR = "SEMISENIOR"
    SENIOR = "SENIOR"

class column_names(str, Enum):
    EMPLOYEE_ID = "employee_id"
    FIRST_NAME = "first_name"
    LAST_NAME = "last_name"
    DEPT_ID = "dept_id"
    MANAGER_ID = "manager_id"
    SALARY = "salary"
    EXPERTISE = "expertise"

class TableColumns(BaseModel):
    employee_id: Optional[int] = Field(None, title="Employee ID", description="The ID of the employee")
    first_name: Optional[str] = Field(None, title="First Name", description="The first name of the employee")
    last_name: Optional[str] = Field(None, title="Last Name", description="The last name of the employee")
    dept_id: Optional[Departments] = Field(None, title="Department ID", description="The department ID of the employee")
    manager_id: Optional[int] = Field(None, title="Manager ID", description="The ID of the manager")
    salary: Optional[int] = Field(None, title="Salary", description="The salary of the employee")
    expertise: Optional[EmpLevel] = Field(None, title="Expertise Level", description="The expertise level of the employee")

Defining SQL Syntax Models

Next, we’ll define models for SQL operations, comparisons, logic operators, and ordering:

class sql_type(str, Enum):
    SELECT = "SELECT"
    INSERT = "INSERT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"

class sql_compare(str, Enum):
    EQUAL = "="
    NOT_EQUAL = "!="
    GREATER = ">"
    LESS = "<"
    GREATER_EQUAL = ">="
    LESS_EQUAL = "<="

class sql_logic_operator(str, Enum):
    AND = "AND"
    OR = "OR"

class sql_order(str, Enum):
    ASC = "ASC"
    DESC = "DESC"

class sql_comparison(BaseModel):
    column: column_names = Field(..., title="Table Column", description="Column in the Table")
    compare: sql_compare = Field(..., title="Comparison Operator", description="Comparison Operator")
    value: Union[str, Departments, EmpLevel] = Field(..., title="Value", description="Value to Compare")

class sql_logic_condition(BaseModel):
    logic: sql_logic_operator = Field(..., title="Logic Operator", description="Logic Operator")
    comparison: sql_comparison = Field(..., title="Comparison", description="Comparison")

class SQLQuery(BaseModel):
    sql: sql_type = Field(..., title="SQL Type", description="SQL Type")
    columns: list[column_names] = Field(..., title="Columns", description="Columns to Select")
    table: str = Field(..., title="Table", description="Table Name")
    conditions: List[sql_logic_condition] = Field(..., title="Conditions", description="List of Conditions with Logic")
    order: Optional[sql_order] = Field(None, title="Order", description="Order")
    limit: Optional[int] = Field(None, title="Limit", description="Limit")

Generating SQL Query Structure

Now we’ll create a function to generate the SQL query structure using OpenAI’s GPT-3.5 model:

pip install openai
import openai
import json

openai.api_key = 'your-api-key-here'

def generate_sql_json(question: str) -> dict:
    prompt = f"""
    Generate a JSON structure for an SQL query based on the following question:
    {question}

    Use the following JSON schema:
    {json.dumps(SQLQuery.model_json_schema(), indent=2)}

    Respond only with the JSON structure, nothing else.
    """

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that generates SQL query structures in JSON format."},
            {"role": "user", "content": prompt}
        ]
    )

    return json.loads(response.choices[0].message['content'])

# Example usage
question = "Can you provide me with the amount of employee id and salary in the Account department that has a salary greater than 50000 in descending order?"
json_response = generate_sql_json(question)

# Parse and validate the response
query_formatted = SQLQuery(**json_response)

Formatting the SQL Query

Finally, let’s create a function to format the SQLQuery object into a proper SQL string:

def format_sql_query(query: SQLQuery) -> str:
    # Generate the initial Base Query with no comparisons
    generated_query = f"{query.sql} {', '.join([col.value for col in query.columns])} FROM {query.table}"

    # Check for additional conditions
    if query.conditions:
        # Replace first logical operator with WHERE
        generated_query += " WHERE "

        # For each condition, append it to the query in the correct format
        for i, condition in enumerate(query.conditions):
            if i > 0:
                generated_query += f" {condition.logic} "
            generated_query += f"{condition.comparison.column} {condition.comparison.compare} '{condition.comparison.value}'"

    # if there is an ordering rule, then format and append
    if query.order:
        generated_query += f" ORDER BY {', '.join([col.value for col in query.columns])} {query.order}"

    # if there is a limit, then format and append
    if query.limit:
        generated_query += f" LIMIT {query.limit}"

    return generated_query

# Generate the final SQL query
final_query = format_sql_query(query_formatted)
print(final_query)

This system allows us to generate SQL queries from natural language inputs in a structured and type-safe manner. By using Pydantic models, we ensure that our generated queries adhere to the correct format and data types.

Production Considerations

When implementing this system in a production environment, remember to:

  • Handle potential errors, such as invalid inputs or API failures
  • Add more complex query capabilities (JOINs, nested queries) as needed
  • Implement proper error logging and monitoring
  • Consider rate limiting and API usage optimization
  • Add proper security measures for SQL injection prevention
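For the first of those considerations, here is a minimal, hypothetical sketch of wrapping the generation step with validation and retries: if the model returns malformed JSON or a structure that fails the SQLQuery schema, we try again a few times before giving up. The retry count and the exceptions handled are assumptions, not a production recipe.

import json

from pydantic import ValidationError

def generate_sql_safely(question: str, max_attempts: int = 3) -> SQLQuery:
    # Retry generation until the response parses and validates against SQLQuery
    last_error = None
    for attempt in range(max_attempts):
        try:
            json_response = generate_sql_json(question)
            return SQLQuery(**json_response)
        except (json.JSONDecodeError, ValidationError) as err:
            # Log and retry; a real system would also emit metrics/alerts here
            last_error = err
            print(f"Attempt {attempt + 1} failed: {err}")
    raise RuntimeError(f"Could not generate a valid query: {last_error}")

# Example usage, reusing the question and formatter defined above:
# query = generate_sql_safely(question)
# print(format_sql_query(query))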

Retrofitting Access

It blows my mind that people can’t project future AI progress onto existing workflows.

Tools like Cursor are already crazy useful.

Now apply the next frontier of models, 10M+ token context windows, 1M token output, triple the tokens/sec, etc.

We’re just getting started.

LLM for Any Website

I played around with building a Python-based Streamlit app that lets you chat with any website through RAG.

https://replit.com/@AmentiKumera/websitechatter#main.py

https://github.com/amenti4k/summary

Anomaly Detection in Timeseries Data

Description

For general use, or for internal needs: a prospecting tool for monitoring time-series trends, identifying inflections/anomalies, and presenting filters. I imagine it as a tool that lets you put in a metric or query, generates a basic time-series prediction, and then notifies you if it’s out of bounds given input bounds and alert thresholds. I was playing around with the idea last night (attached) and hosting it on Streamlit for easy interactions.
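As a rough sketch of the idea (not the attached notebooks themselves), here is a minimal Isolation Forest pass over a daily metric that flags out-of-bounds points; the synthetic DAU-like series and the contamination rate are illustrative assumptions.

import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical daily metric (e.g. DAU) with a few injected anomalies
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=120, freq="D")
values = 1000 + 50 * np.sin(np.arange(120) / 7) + rng.normal(0, 20, 120)
values[[30, 75, 110]] += [400, -350, 500]
series = pd.DataFrame({"date": dates, "value": values})

# Fit an Isolation Forest on the raw values; a real pipeline would add
# lag/rolling features so seasonality isn't flagged as anomalous
model = IsolationForest(contamination=0.03, random_state=0)
series["anomaly"] = model.fit_predict(series[["value"]]) == -1

# Alert on anything flagged as out of bounds
print(series[series["anomaly"]][["date", "value"]])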

Timeseries_Anomaly_Detection_to_Streamit (1).ipynb

PS: I used data like DAU, funding rounds, acquisition amounts, NYC taxi riders, etc., as I thought they’re similar/relevant to the type of data most of the readers here ingest…

IPYNB Files

[Detection_through_ADTK_+Isolation_Forest (4).ipynb](https://nbviewer.org/github/amenti4k/timeseries-anomaly-detection/blob/main/Detection_through_ADTK+Isolation_Forest(4).ipynb)

Detection_through_Darts.ipynb

Detection_through_lagllama.ipynb

Detection_through_Transformers.ipynb

Mockup

Screenshot 2024-04-25 at 6.21.57 PM.png

Gists for Visibility

https://gist.github.com/amenti4k/43da3e70407c7933ca2833667455bb18

https://gist.github.com/amenti4k/988d73dc8a0dd4fc535427b789827cbe

https://gist.github.com/amenti4k/dc286ae3dd187f4414a2c4c99b6deac0

https://gist.github.com/amenti4k/9a251ca1f1eed1dc79eb1b698175d97f

https://gist.github.com/amenti4k/63f48863ebc49246b938589e1e8f37c4

Weekly Reading Roll


Last Updated: Week Ending 03-15-2024

Mountain Dew’s Twitch AI Raid

  • I’m split on how I feel about this. Incredible way of marketing to the right audience by cornering true fans. However, I worry how intrusive this could get.
  • Are we entering a new era of affiliate marketing and product placements?
  • “During the live period, the RAID AI will crawl all concurrent livestreams tagged under Gaming looking solely for MTN DEW products and logos. Once it identifies the presence of MTN DEW, selected streamers will get a chat asking to opt-in to join the RAID. Once you accept, the RAID AI will keep monitoring your stream for the presence of MTN DEW, if you remove your DEW, you’ll be prompted to bring it back on camera, if you don’t, you’ll be removed from our participating streamers.”

[Abstractions Rule Everything Around Me](https://benjaminschneider.ch/writing/aream.html) - Benjamin Schneider

  • “I realized that people came up with some of the abstractions most impactful in our everyday lives without ever referring to either! The more you notice all the abstractions you interact with, the more coming up with useful abstractions starts to look something humans are just generally interested in — and pretty good at.”

[Yudkowsky vs Hanson on FOOM: Whose Predictions Were Better?](https://www.lesswrong.com/posts/gGSvwd62TJAxxhcGh/yudkowsky-vs-hanson-on-foom-whose-predictions-were-better) - 1a3orn

  • I alternate between worried and excited with all the recent AI this/that debates — esp. around AGI or interpretability voids. It was fun looking back at debates in the rationalist community over the course of ML’s development and what they got right/wrong. This is a good summary of Eliezer and Hanson’s predictions.

[Are you serious?](https://visakanv.substack.com/p/are-you-serious) - Visakan Veerasamy

  • “So the point is to take the work seriously but you don’t take yourself too seriously. There’s a riff about this in Stephen Pressfield’s War of Art, where he talks about how amateurs are too precious with their work: ’The professional has learned, however, that too much love can be a bad thing. Too much love can make him choke. The seeming detachment of the professional, the cold-blooded character to his demeanor, is a compensating device to keep him from loving the game so much that he freezes in action.’”
  • “I’m still publishing. That’s the litmus test. Are you publishing, whatever publishing means to you? I want to see it!”

[Resignation Letter](https://www.espn.com/pdf/2016/0406/nba_hinkie_redact.pdf) - Sam Hinkie

  • clarity, brevity, and specificity in summarizing his objectives
  • “A competitive league like the NBA necessitates a zig while our competitors comfortably zag. We often chose not to defend ourselves against much of the criticism, largely in an effort to stay true to the ideal of having the longest view in the room.”

[Why Generative AI Is Mostly A Bad VC Bet](https://investinginai.substack.com/p/why-generative-ai-is-mostly-a-bad) - Rob May

  • Surprisingly early (Jan 7) call on why LLM Startups might not be the move. + I like Rob

When the cost of something trends towards zero because of new technology:

  1. You will get an explosion of that good.
  2. That good will decline in value and defensibility
  3. The economic complements to that good that see increased demand as a result of the explosion in the original good will be the place to invest.

[THE NEXT ACT OF THE GVASALIA BROTHERS CIRCUS:](https://www.sz-mag.com/news/2023/07/op-ed-the-next-act-of-the-gvasalia-brothers-circus/) Eugene Rabkin

  • “It sounds bizarre, like a desperate couture attempt at streetwear, or worse, like a Marie Antoinette playing-at-shepherdess scenario.”

    “This is just the latest chapter in the Gvasalia circus, which, sadly, the fashion commentariat cannot get enough of.”

[a Nirav or a Naval](https://auren.substack.com/p/a-nirav-or-a-naval-that-is-the-question) - Auren Hoffman

It’s very important to realize what you’re changing or chasing. You have the ability to revolutionize a bunch of things as you’re deffo an outsider. Never discredit that. And don’t let the fact that you sometimes appear as an insider to gain clout make you inherently an insider that’s un-opinionated, dull, and unable to influence a tectonic change.

[Superlinear Returns](http://paulgraham.com/superlinear.html) - Paul Graham

  • “always be learning. If you’re not learning, you’re probably not on a path that leads to superlinear returns.”

[Why Do Rich People In Movies Seem So Fake?](https://sundogg.substack.com/p/why-do-rich-people-in-movies-seem) - Michella Jia

  • “If you are excellent in the first way, it behooves you to control the contexts in which you perform — and if you can control these contexts well, you also come off well. As for the second form of excellence, it often appears latent until catastrophe or circumstance forces a change of context. In this sense, the second type of excellence is much more difficult to spot.”

[Telomeres: Everything You Always Wanted To Know](https://www.notion.so/Daily-Log-Fall-2023-ee985cd122004f9fb8e4dabd25ee4b69?pvs=21) - Nintil

  • “The usual function ascribed to telomeres is as an anti-cancer mechanism: if a cell begins dividing too much then its telomeres will progressively shorten and it will stop dividing (or die). To overcome this, cancers end up reactivating telomerase to keep their telomere length.”

[An Extremely Opinionated Annotated List of My Favorite Mechanistic Interpretability Papers](https://www.neelnanda.io/mechanistic-interpretability/favourite-papers) - Neel Nanda

  • “The core thing to take away from it is the perspective of networks having legible(-ish) internal representations of features, and that these may be connected up into interpretable circuits. The key is that this is a mindset for thinking about networks in general, and all the discussion of image circuits is just grounding in concrete examples. On a deeper level, understanding why these are important and non-trivial claims about neural networks, and their implications.”

Are LLM Eval Benchmarks Pseudo-Science?

LLM Evaluation Platforms and Methodologies

Core Components of Evaluation

The modern LLM evaluation landscape consists of four key elements:

  • Evaluation runs to measure model performance
  • Adversarial testing sets designed to break models
  • Capability to generate new adversarial test sets
  • Benchmarking against other models

A particular focus is placed on testing models in real-world scenarios for regulated industries where error tolerance is minimal, with the goal of becoming “a trusted third party when it comes to evaluating models.”

The Challenge

Current challenges with LLM evaluation stem from several factors:

  1. Non-deterministic Behavior
    • LLMs don’t guarantee consistent outputs for identical inputs
    • Companies need rigorous testing for:
      • Topic adherence
      • Result reliability
      • Hallucination monitoring
      • PII detection
      • Unsafe behavior identification
  2. Enterprise Requirements
    • Raw LLMs don’t generate revenue in their current form
    • Need substantial tech & domain training for business alignment
    • Enterprise clients willing to pay for business-aligned solutions

Evaluation Dimensions

Traditional Testing Methods

  • Academic benchmarks
  • Human evaluations

Key Areas of Focus

  • Sideways testing of normal modes
  • High-priority harm areas:
    • Self-harm
    • Physical harm
    • Illegal items
    • Fraud
    • Child abuse

Major Evaluation Platforms

1. Open LLM Leaderboard / HELM

Maintainer: Hugging Face & Stanford

  • Provides sortable model comparisons
  • Focuses on academic benchmarks
  • Covers core scenarios: Q&A, MMLU, MATH, GSM8K
  • Used primarily by general AI developers
  • Shows some community disillusionment with academic eval metrics

2. Hallucinations Leaderboard

Maintainer: Hugging Face

  • Evaluates hallucination propensity across various tasks
  • Includes comprehensive assessment areas:
    • Open-domain QA
    • Summarization
    • Reading Comprehension
    • Instruction Following
    • Fact-Checking

3. Chatbot Arena

Maintainer: Together AI, UC-Berkeley, Stanford

  • Features anonymous, randomized model battles
  • Provides dynamic, head-to-head comparisons
  • Praised by experts like Karpathy for real-world testing

4. MTEB Leaderboard

Maintainer: Hugging Face & Cohere

  • Focuses on embedding tasks
  • Covers 58 datasets and 112 languages
  • Essential for RAG applications

5. Artificial Analysis

Maintainer: Independent startup

  • Comprehensive benchmarking across providers
  • Helps with model and hosting provider selection
  • Considers cost, quality, and speed tradeoffs

6. Martian’s Provider Leaderboard

Maintainer: Martian

  • Daily metrics collection
  • Focus on inference provider performance
  • Optimizes for cost vs. rate limits vs. throughput

7. Enterprise Scenarios Leaderboard

Maintainer: Hugging Face

  • Evaluates real-world enterprise use cases
  • Covers Finance, Legal, and other sectors
  • Currently in nascent stage

8. ToolBench

Maintainer: SambaNova Systems

  • Focuses on tool manipulation
  • Evaluates real-world task performance
  • Valuable for plugin implementation

Key Evaluation Considerations

Prompt Engineering

  • Prompt sensitivity varies by model
  • Selection process affects comparison validity
  • Documentation crucial for reproducibility

Output Evaluation

  • Generated text vs. probability distribution
  • Implications for different stakeholders:
    • Researchers
    • Product developers
    • AGI developers

Data Contamination

  • Training vs. evaluation data relationship
  • Impact on generalization assessment
  • Limited access to training data complicates evaluation

Future Directions

Multimodal Evaluation

  • Growing need for mixed-modality benchmarks
  • Long-context evaluation challenges
  • Need for innovative methodologies

Recommendations

  1. Adopt “multiple needles-in-haystack” setup
  2. Develop automatic metrics for complex reasoning
  3. Focus on real-world application scenarios
  4. Balance automatic and human evaluation methods

Evaluation Landscape Insights

  1. Market Control: Benchmark control influences market direction
  2. Early Stage: Field remains highly dynamic and undefined
  3. Complexity: Evaluation complexity approaches model complexity
  4. Human Factor: Evaluations subject to human preferences
  5. Stakeholder Diversity: Different needs for researchers vs. practitioners

Conclusion

The LLM evaluation landscape continues to evolve rapidly. Success requires balancing multiple approaches and considering various stakeholder needs. The field presents significant opportunities for innovation in evaluation methodologies and benchmarks.


Beyond Prompts

  • All I want to do is steer towards acceptable results rather than just tweaking prompts and hoping for the best — a judicious balance of constraints and freedom.
  • Finding interaction patterns that give more calibrated control could be key. How can we discover interfaces that unlock deeper and more tailored integrations between users and generative models beyond sentence prompts? This could significantly augment creative and knowledge work.

The Curse of Indirection

  • Current interfaces for working with generative AI models are indirect—we manipulate models mainly through text prompts and conversations. This adds friction and distance between the user’s intent and the model’s output.
  • Current text prompts place generative models at arm’s length, like trying to steer a car from the passenger seat. More integrated, direct ways of manipulating models could improve workflows and provide a proper driver’s seat for precise guidance.
  • Quoting Kate Compton’s Casual Creators theory “the possibility space should be narrow enough to exclude broken artifacts… but broad enough to contain surprising artifacts as well. The surprising quality of the artifacts motivates the user to explore the possibility space in search of new discoveries, a motivation which disappears if the space is too uniform”
  • Context menus inside documents that let users branch out of their current vertical by highlighting text/keywords could be a way to overcome indirection.
    • Example of a process that might generate better value: [Hyperlinks on results](https://www.notion.so/re-engineering-prompt-interfaces-6a572f1089c64981884d0558338a0f7b?pvs=21). Clicking helps expand the topic based on the prompt being discussed; clicking back minimizes it. Word exploration through clickable words that function as ever-expanding tree toggles.

      right?!

  • Even within the space of text-based interaction, we want to keep the lineage of information that changed over time instead of overwriting the fact. For example, using Rich Hickey’s perspective on information updating: “If my favorite color was red and now it’s blue, we don’t go back and change the fact that my favorite color was red to be blue – that’s wrong. Instead, we add a new, updated fact that my favorite color is now blue, but the old fact remains historically true.”
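A minimal sketch of Hickey’s append-only idea, using my own hypothetical fact-store shape: facts are never overwritten, only appended with a timestamp, so the “current” value is simply the latest entry while the full lineage stays queryable.

from datetime import datetime, timezone

class FactLog:
    """Append-only store: updates add new facts instead of overwriting old ones."""

    def __init__(self):
        self._facts = []  # list of (timestamp, attribute, value)

    def assert_fact(self, attribute: str, value: str) -> None:
        self._facts.append((datetime.now(timezone.utc), attribute, value))

    def current(self, attribute: str):
        # The present value is simply the most recent assertion
        matches = [fact for fact in self._facts if fact[1] == attribute]
        return matches[-1][2] if matches else None

    def history(self, attribute: str):
        # Old facts remain historically true and queryable
        return [(ts, value) for ts, attr, value in self._facts if attr == attribute]

log = FactLog()
log.assert_fact("favorite_color", "red")
log.assert_fact("favorite_color", "blue")
print(log.current("favorite_color"))   # blue
print(log.history("favorite_color"))   # both entries, red then blue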

Exploring Latent Spaces

  • Moving through latent spaces quickly and viscerally is an alternative to conversational prompts. This ties to the idea that we lack tactile, direct ways to guide text generation. New interaction techniques that let users directly manipulate directions and vectors in latent space could unlock more creative possibilities. Traversing latent idea spaces via prompts resembles blind navigation through text adventure games. New interaction paradigms could make exploration more immersive, like being in the driver’s seat and using these vectors to explore and discover in latent space.
    • What most traditional apps that sit on top of large amounts of data do is take a commodity in a database and layer curation and recommendation on top of it, in ways that are more usable and friendly than just giving people a search box and pushing them out of the door. Is there a way to add horizontal expansion to search instead of vertical digging that requires reformatting inserts? How do you break out of hierarchical directories that don’t scale (i.e., Yahoo’s directory) — even when the hierarchies are just ranked search results from the users’ prompts?
  • I keep referring back to the COVID times when I started using Roam Research for my note-taking. I was in college and had time. Back-propagation by directly playing around with interfaces. I didn’t start out as a programmer, so I’ve always wondered how to intuitively control end products to change the source code. Further extending this with what I said about language: how can we use the newfound ability of coding on command to back-propagate information processing? So, instead of going into my weather app and searching for when it’s warm enough to leave my apartment in cold NYC December without a jacket, I’d move the temperature slider to a higher degree to back-propagate the dates.

I keep thinking of what the nested knowledge graphs of roam.research would look like if they were autogenerated instead of us manually generating interlinks. Learning would be awesome!

Screenshot 2023-12-05 at 9.09.20 PM.png

  • This is my Roam network graph from covid, when I had the time to interlink notes. It was fun and useful, but it never ended up working for me because of the intensive writing process: typing “[[ ]]” whenever I wanted to interlink topics and having to manually remember what to even link.
  • It would be especially useful when the tool picks up on notes that are proper names that need to be clarified further…
    • It can help me notice connections between ideas in my notes that I wouldn’t have even thought to make myself, even if I were trying to find interesting notes to link together. With a smarter system, a similar interface could even automatically discover and show links from your notes to high-quality articles or online sources that you may not have seen yet, automatically crawling the web on your behalf. A minimal sketch of this auto-linking idea follows.
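
A minimal sketch of that auto-linking idea, assuming nothing fancier than TF-IDF so it stays self-contained; the note texts and the threshold are made up, and a real system would swap in semantic embeddings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

notes = {
    "Roam workflows": "Manually interlinking notes in Roam with [[ ]] brackets is tedious.",
    "Auto-linking idea": "Similar notes could be interlinked automatically instead of manually.",
    "Vector embeddings": "Embeddings place semantically similar notes near each other.",
}

titles = list(notes)
# TF-IDF keeps the sketch self-contained; a sentence-embedding model would give better links.
vectors = TfidfVectorizer().fit_transform(notes.values())
similarity = cosine_similarity(vectors)

THRESHOLD = 0.1  # tune to control how aggressively links get suggested
for i, a in enumerate(titles):
    for j, b in enumerate(titles):
        if i < j and similarity[i, j] > THRESHOLD:
            print(f"suggest link: [[{a}]] <-> [[{b}]]  (score {similarity[i, j]:.2f})")
```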

Going back to the analogy of driving cars: in addition to giving you the seat and a steering wheel, it’s giving you a windshield to look through and see where you want to maneuver!!


Balancing Guardrails and Possibilities

  • As noted in the initial thoughts, providing guardrails for safety while preserving expansive capability spaces is an important challenge. At the end of the day, we need windshields for a reason! Permitting expansive possibility spaces risks accidents or misuse, or even just the drifting of attention. However, back-propagating user edits to tune outputs could strike this balance.

Anyways, tying this back to current professional parsing tools: finance, legal, and medical sectors deal with highly complex, structured data sets. Outside of the commonly cited reasons for working with these data structures (the high-stakes decisions, the regulatory compliance and precision needed, and the room for automation/personalization), I think the complexity offers fertile ground for experimenting with innovative interaction models to manage, interpret, and manipulate such data effectively, given its pre-existing formatting.

Let me know if you would like me to elaborate or focus on any part of this synthesis further! I’ll leave with this: Static information media severely limits what ideas we can express and manipulate. We’re limited by how much we can conveniently represent, and so much thinking is still trapped inside the head. Dynamic, interactive media could empower entirely new realms of thinking.

Anecdotally, one summer at fb messenger my main role was message search and surfacing results well. I looked at the best way to give the searcher something they want, even before they realize they want it. It all started with looking at simple descriptive stats about usage of in-chat searches. People commonly searched for numbers, emails, passwords, or dates/locations. What if there’s a way to use people’s current usage flow to add layers that guide toward more discovery, instead of just waiting for the user-guided flow?

Obviously the worry here is not to overwhelm the user by providing buttons/flows they didn’t ask for. But I believe there is a world where it can empathetically be done!

Another worry might be tools like Harvey or Hebbia using users’ VDRs to bring up knowledge graphs and predefined prompts, which might seem intrusive and insecure. I hope the only things standing between our current state and when this becomes the norm are some time and better enterprise AI security systems.

These are just wonderings I’ve written down to help me visualize my thoughts. Regardless, let me know what you think or which lines/topics you’d want to explore further.

Early AI Meditations 3

Tweet on my mind

https://twitter.com/blader/status/1640387925912477698

  • 🧠 AI Memory - LLMs are great reasoning engines, not great at memory. Major opportunity for players to provide infra to help with this. Likely will be verticalized
    • Problem
      • In-context learning works; however, you need to elegantly select the right context you’d like your model to have.
      • Similarity search only goes so far. Most solutions only return top-N results and lack a way of connecting ideas (a minimal sketch of this retrieval pattern follows after this list).
    • Solutions today
      • Similarity search via Pinecone, Weaviate, etc.
    • Hypothesis
      • Different verticals will need different knowledge graph expertise. Law vs medical vs sales vs product vs user research. Verticalized players will likely emerge
    • Notes:
      • OpenAI mentions better memory on their plugin’s next steps - “Integrating more optional services, such as summarizing documents or pre-processing documents before embedding them, could enhance the plugin’s functionality and quality of retrieved results. These services could be implemented using language models and integrated directly into the plugin, rather than just being available in the scripts.”
  • 🏗️ LLM Coordinators (ex: LangChain) - Organizing, customizing and providing modularity to LLM applications
    • Problem
      • Developers need ways to customize how their product consumes and instructs language models.
      • All developers run into the same friction when building apps: prompt templating, retries, parsing output.
    • Solution today
    • Notes
      • Libraries like LangChain make it easier to work with LLMs. It’s unclear how much OpenAI and other companies will strategically build product into the space. Ex: LangChain and LlamaIndex are great at document loading. Developers now need to choose if they load docs through them or use an OpenAI plugin.
    • Model swapping and finer-tuned control over agents are definitely needed.
  • 🌆 Internal company APIs - Proprietary Plugins for internal company use
    • Notes
      • Plugins are a beautiful way for LLMs to chat with external-facing apps. A cute and demo-worthy example of this is ChatGPT booking a dinner reservation.
    • Hypothesis
      • My hypothesis is that companies will have an internal LLM that carries out instructions with internal facing apps and plugins.
      • While large enterprises might do this themselves to start, my hypothesis is that mid-market/SMB companies will outsource this to products that do it for them
    • Example applications
      • Some companies are so massive that it’s difficult to know what is going on around the org. It would be great if there was an LLM that was watching a feed and only alerted me of what I needed
      • Trained specifically on a company’s code base and could make recommendations
      • Could train product marketing to better articulate how code works
      • Keep technical documentation up to date
      • This will be similar
  • 🤖 No code ways to make your own apps - Big opportunity to empower people to make their own apps powered by AI.
    • Problem
      • Non-technical people have great ideas, but can’t build apps to execute them
    • Hypothesis
      • Low-code and no-code have already been around, but the barrier to entry is still too high. As English becomes a programming language, more SMB owners will build apps that have a solid use case
      • Micro-SaaS acquisition could likely heat up here. If not to purchase a company, then for a startup that can execute better to run with the idea.
  • 🎯 Offshoots of Plugin Store - Apple AI App Store
    • Notes
      • OpenAI decided to use an open API specification format, where every API provider hosts a text file on their website saying how to use their API.
      • This means even this plugin ecosystem isn’t a closed-off one that only a first mover controls
    • Hypothesis
      • Most of the infrastructure and support we see around the Apple App Store will likely follow for the plugin store
  • 🔐 LLM Privacy - The Signal of LLMs
    • Notes
      • The company to crack a private LLM (Ex: Get the reasoning power of an LLM but with complete privacy) will gain massive traction.
    • Hypothesis
      • This is a horizontal feature that would likely be extremely attractive to OpenAI and other providers
  • Can we reduce the security threats present in the way we treat LLMs?
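
To make the 🧠 AI Memory bullet concrete, here is a minimal sketch of the pattern most of these tools implement today: embed chunks, pull the top-N by similarity, and stuff them into the prompt. The `embed()` function is a toy hashing stand-in, and the in-memory list stands in for a vector store like Pinecone or Weaviate; none of this is any vendor’s actual API.

```python
import re
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy bag-of-words hashing embed; stands in for a real embedding model or API."""
    vec = np.zeros(dim)
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# In production this index lives in a vector store (Pinecone, Weaviate, ...).
documents = [
    "Contracts must be reviewed by legal before signing.",
    "The sales team updated the Q3 pricing tiers.",
    "Medical records require HIPAA-compliant storage.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, top_n: int = 2) -> list[str]:
    """The 'top-N results' step: rank stored chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: float(q @ pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_n]]

question = "What do we need to do before signing a contract?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)   # this prompt is what you'd actually send to the LLM
```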

WTF Are GPUs?

A deep dive into the GPU supply and demand bottleneck that’s shaking up AI development.

All credit goes to Clay Pascal and the GPU Utils team for their research, analysis, and excellent article.


Last week, a fantastic article dropped, exploring the wild world of GPU supply and demand. If you’ve got the time, I highly recommend giving it a full read. But if you’re strapped for time or just want the highlights (with my own twist), stick around.

Curious about what the community thinks? Check out this Hacker News thread for some lively discussions.

GPU Journey


Why Should You Care?

Let’s cut to the chase: GPUs are the backbone of modern AI. They’re the horsepower behind the Large Language Models (LLMs) that are revolutionizing everything from chatbots to data analysis. Without these powerful chips, we’d be stuck in the digital Stone Age.

But here’s the kicker—GPUs are running out. This shortage isn’t just a minor hiccup; it’s a massive roadblock that’s:

  1. Slowing Down Innovation: Fewer GPUs mean slower progress in building and training new AI models.
  2. Creating Gatekeepers: Access becomes limited to a privileged few, effectively gatekeeping AI development.

We’re caught in a vicious cycle where GPU scarcity leads to hoarding, which in turn exacerbates the shortage. This is dangerous territory, pushing innovation costs sky-high and concentrating power in the hands of tech giants like the FAANG companies.

The startup scene is buzzing like never before, and while that’s awesome, we need to keep our eyes on the real-world constraints. Otherwise, we risk pouring time and energy into endeavors hamstrung by hardware limitations.


The GPU Supply and Demand Journey Through the Lens of ChatGPT

  1. ChatGPT Is a Hit: Users can’t get enough of it. It’s probably raking in over $500 million in annual recurring revenue.
  2. Powered by GPT-4 and GPT-3.5: These are the brains behind ChatGPT.
  3. GPUs Fuel These Models: And they need a ton of them. OpenAI wants to roll out more features but is hitting a wall due to GPU shortages.
  4. Shopping for GPUs: OpenAI buys loads of Nvidia GPUs via Microsoft Azure, specifically the Nvidia H100—the latest and greatest.
  5. Making the H100 GPUs:
    • Fabrication: Nvidia uses TSMC for manufacturing.
    • Packaging: Utilizes TSMC’s CoWoS (Chip on Wafer on Substrate) tech.
    • Memory: Relies primarily on SK Hynix for HBM3 (High Bandwidth Memory).

But OpenAI isn’t the only player in the game. Other companies are jumping on the AI bandwagon, eager to train their own large models. Some have legit use cases; others are riding the hype train. This surge in demand is pushing GPU prices skyward and leading to hoarding behavior.

The Rush to Build New LLMs

  1. The Big Idea: Companies recognize huge opportunities in AI. They want to train LLMs on their own data, either for internal use or to sell access.
  2. The GPU Quest: They know they need GPUs—lots of them.
  3. Cloud Hopping: They approach the big clouds (Azure, Google Cloud, AWS) but find out that getting a substantial GPU allocation is like finding a unicorn.
  4. Plan B: They turn to other providers like CoreWeave, Oracle, Lambda, and FluidStack. Some even consider buying GPUs outright.
  5. Acquiring the Goods: They manage to secure GPUs one way or another.
  6. Chasing Product-Market Fit: Now comes the hard part—making something people actually want.

The problem? Unlike OpenAI, which achieved product-market fit with smaller models before scaling up, these companies need to outdo OpenAI right off the bat. That means needing more GPUs from the get-go.


The Demand Side: What’s Fueling the Frenzy?

Key Insights and Questions

  • Is There a GPU Shortage?
    Absolutely. For companies looking to snag hundreds or thousands of H100s, Azure and GCP are basically tapped out. AWS isn’t far behind.

  • Who’s Hoarding Thousands of GPUs?
    • Startups Training LLMs: OpenAI (via Azure), Anthropic, Inflection (via Azure and CoreWeave), Mistral AI.
    • Cloud Service Providers (CSPs): The big three—Azure, GCP, AWS—as well as Oracle, CoreWeave, and Lambda.
    • Big Players: Tesla, Jane Street, ByteDance, Tencent, and others.
  • Which GPUs Are in Demand?
    Mainly the Nvidia H100s. They’re the fastest for both training and inference of LLMs. Specifically, the 8-GPU HGX H100 SXM servers.

  • What’s the Damage ($$$)?
    One DGX H100 (with 8 H100 GPUs) will set you back about $460,000, including $100k in required support.

  • How Many GPUs Are We Talking About?
    • GPT-4 Training: Likely used between 10,000 to 25,000 A100s.
    • Meta: Around 21,000 A100s.
    • Tesla: Approximately 7,000 A100s.
    • Stability AI: About 5,000 A100s.
    • Falcon-40B Model: Trained on 384 A100s.
    • Inflection: Used 3,500 H100s for a GPT-3.5 equivalent model; aiming for 22,000 by December.
    • Cloud Capacities:
      • GCP: ~25,000 H100s.
      • Azure: Probably between 10,000–40,000 H100s, mostly allocated to OpenAI.
      • CoreWeave: Around 35,000–40,000 H100s based on bookings.
  • What Are Startups Ordering?
    For fine-tuning: dozens to low hundreds. For full-scale training: thousands.

Why Not Use AMD GPUs or Other AI Chips?

Great question. George Hotz (the guy who hacked the first iPhone and was briefly a Twitter intern) argues that the only way to start an AI chip company is by starting with the software. Many AI chip companies failed because they didn’t develop decent frameworks for their hardware. Nvidia, on the other hand, built CUDA—a robust software stack that became the industry standard.

AMD is playing catch-up but is accelerating its efforts. Companies like MosaicML are exploring training LLMs with AMD GPUs: Training LLMs with AMD MI250 GPUs and MosaicML.

How Many H100s Do Companies Want?

  • OpenAI: Might want around 50,000.
  • Inflection: Wants 22,000.
  • Meta: Possibly 25,000, maybe even 100,000+.
  • Big Clouds: Approximately 30,000 each for Azure, GCP, AWS, and Oracle.
  • Private Clouds: Lambda, CoreWeave, and others might want a combined 100,000.
  • Other Startups: Anthropic, Helsing, Mistral, Character.ai might want around 10,000 each.

Totaling that up gets us to about 432,000 H100s. At roughly $35,000 a pop, we’re talking about $15 billion worth of GPUs. And that doesn’t even include Chinese giants like ByteDance, Baidu, and Tencent, who will want plenty of H800s (the Chinese-market version of the H100).
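A quick back-of-the-envelope check of those numbers (the one assumption is that Meta is counted at the high-end 100k figure, which is what makes the total land at ~432k):

```python
# Back-of-the-envelope check of the ~432k / ~$15B figures above.
# Assumption: Meta is counted at the high-end 100,000 estimate rather than 25,000.
demand = {
    "OpenAI": 50_000,
    "Inflection": 22_000,
    "Meta (high end)": 100_000,
    "Big clouds (Azure, GCP, AWS, Oracle at ~30k each)": 4 * 30_000,
    "Private clouds (Lambda, CoreWeave, etc.)": 100_000,
    "Other startups (Anthropic, Helsing, Mistral, Character.ai at ~10k each)": 4 * 10_000,
}

total_gpus = sum(demand.values())      # 432,000
total_cost = total_gpus * 35_000       # at roughly $35k per H100

print(f"{total_gpus:,} H100s -> ${total_cost / 1e9:.1f}B")   # 432,000 H100s -> $15.1B
```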

Financial firms like Jane Street, JP Morgan, Two Sigma, and Citadel are also entering the fray, deploying anywhere from hundreds to thousands of GPUs.


The Supply Side: Why Can’t We Just Make More GPUs?

Key Insights and Questions

  • Can Nvidia Use Other Manufacturers Besides TSMC?
    Not in the short term. They’ve partnered with Samsung in the past, but for H100s, it’s TSMC or bust. Future collaborations with Intel or Samsung might happen, but they won’t solve today’s shortages.

  • How Long Does Production Take?
    Roughly six months from the start of production to a ready-to-ship H100.

  • Who’s Making the Memory?
    Nvidia primarily uses HBM3 memory from SK Hynix. Samsung might be in the mix, but Micron is likely not a player for H100s.

  • What Other Components Are Bottlenecks?
    • Metals: Copper, Tantalum, Gold, Aluminum, Nickel, Tin, Indium, Palladium.
    • Silicon: The cornerstone of semiconductors.
    • PCBs: The backbone that holds all components together.
  • Who Sells H100s?
    • OEMs: Dell, HPE, Lenovo, Supermicro, and Quanta.
    • GPU Clouds: CoreWeave and Lambda buy from OEMs and rent to startups.
    • Hyperscalers: Azure, GCP, AWS, and Oracle work more directly with Nvidia but still involve OEMs.
  • Build or Colocate?
    Building your own datacenter is time-consuming, expensive, and requires specialized expertise. Most opt for colocation to get up and running faster.

  • How Do the Big Clouds Stack Up?
    • Oracle: Less reliable infrastructure but offers more hands-on support.
    • Networking Differences: AWS and GCP have been slower to adopt InfiniBand, which is crucial for high-performance computing.
    • Availability: Azure’s H100s are mostly dedicated to OpenAI; GCP is struggling to get H100s.
  • Nvidia’s Allocation Preferences
    Nvidia seems to prefer not to give large allocations to companies developing competing chips (like AWS Inferentia and Trainium, Google TPUs, or Azure’s Project Athena).

  • Who Uses Which Cloud?
    • OpenAI: Azure.
    • Inflection: Azure and CoreWeave.
    • Anthropic: AWS and GCP.
    • Cohere: AWS and GCP.
    • Hugging Face: AWS.
    • Stability AI: CoreWeave and AWS.
    • Character.ai: GCP.
    • X.ai: Oracle.
    • Nvidia: Azure.

So, How Do You Get More GPUs?

The ultimate gatekeeper is Nvidia. They control the allocations, and they care a lot about who the end customer is. Cloud providers might score extra units if Nvidia likes the customer—preferably well-known brands or startups with solid reputations.

Nvidia is less inclined to help out companies directly competing with them in the AI chip arena. But if you show them the money—commit to a big deal, have a low-risk profile—you might just get a bigger slice of the pie.


The Road Ahead

The GPU shortage is real and isn’t going away anytime soon—likely persisting through the rest of 2023. This scarcity is shaping the AI landscape in significant ways, potentially consolidating power among the giants and making it harder for newcomers to break in.

If you’re as fascinated by this unfolding saga as I am and want more in-depth analysis, consider signing up to get notified about LLM Utils’ new posts.


Stay tuned, stay savvy, and let’s navigate this wild GPU ride together.

Case Against Bloated MVPs

I used to constantly fall prey to this, especially when working on projects in college, where neither in-depth conversations with consumers nor deep-rooted industry knowledge identifying a real gap was present. So I’d build products/tools that I thought were “cooool” and “useful”.

Send it out after so many hours of iteration. Product is done. You’re on a high… and then the random people you sent it to stop coming to the site. There are no users and the website is a ghost town.

It was a bloated MVP!

So below are my lessons:

  • replace the phrase Minimum Viable Product with Minimum Viable “Thing”
    • what’s the thing that solves the problem and delivers value
    • i believe the commoditization of engineering and product building has left us able to build insanely “cool” looking tools without the necessity of the product existing, thus the “bloated mvp”
    • “is anyone getting anyyy value out of it? or do you have 0 users?”
  • think of your #1 most compelling value prop story for your MVP
    • tell the story with as much specific context as possible
  • generate specific value to 1–10 specific people you know, consistent with your specific value prop story earlier
    • the logo/ux can come in later
    • the aim of an mvp should be how can i get the first user and give them enough value to convince myself of the product i want to build

      Screenshot 2024-02-04 at 8.30.53 PM.png

  • coding the saying into an algorithm: use a greedy search of where value can be given to someone. wherever maximum value can be generated is where entry should be made, then even slight doubling down of features → further sales → a product → money etc…
  • what’s the most manual way i can give value to a person, then when it’s actual value, i can scale it to a product
    • be a manual value consultant in your area — until you can’t scale it anymore because of need → then go on building the mvp

if all this is true, why are we then still building bloated mvps?

  • Value Prop Blindness: You don’t really understand what problem you’re solving, and you don’t realize how important it is to pass this sanity check before building anything
  • Cargo Culting: You want to build up your self-image as a “founder”, and you have a mental image of founders building products, so you set out to build a product
  • Social Permissivity: The startup community hasn’t yet picked up on the idea that a Minimum Viable Product typically shouldn’t be an actual product, so you get to plow ahead in the wrong direction without feeling socially pressured by your startup-peers to course correct until it’s too late
  • Sense of Control: Working on product design and engineering makes you feel (wrongly) like you know what you’re doing and you’re making tangible progress
  • Fun: Product design is fun. Engineering is also fun.

Early AI Meditations 2

Tweet on my mind

https://twitter.com/karpathy/status/1642607620673634304

  • Managed Retrieval Engines
    • Problem
      • Semantic search gets you 90% of the way there for easy questions & answers, but only 30-40% for hard Q&A
      • The hard part is understanding which documents are relevant to the query you give to the LLM
    • Why this is interesting to me
      • I see two routes document retrieval could go
        • Route #1 (Horizontal Retrieval): One general engine is really good at document retrieval across industries and domains (Law, Medical, Real Estate, etc.). It has a reasoning engine that tells it where to look
        • Route #2 (Verticalized Retrieval): Specialized retrieval engines are needed that are experts at traversing law documents, which are different from medical, real estate, etc.
      • I’m unsure which way it will go! I’m currently leaning towards #2
    • Notes
      • Metal (Managed Retrieval) just announced an integration with LangChain
      • This topic likely deserves its own essay in the future. Here’s the TLDR of that essay already:
        • Full-stack retrieval goes like this:
          1. You have a raw corpus of documents (Held in the cloud)
          2. You split them into semantically meaningful chunks (With LangChain or other text splitters)
          3. You convert them into some vector representation for easy comparison and searching (Using OpenAI’s embeddings)
          4. You store those vectors (using Pinecone or Weaviate)
          5. You retrieve certain documents based on the task at hand (Metal?)
        • I’m unsure how much of that stack a Pinecone.io is going to want to take vs a company like Metal.
    • Hypothesis
      • The winner of this space will go full-stack and take over more document management / retrieval workflows
  • Developer Monetization with OpenAI Plugins
    • Problem
      • In a marketplace you need adoption incentives on both sides to drive overall health. Without monetization for the plugin supply side (developers), it’s hard to get the demand
      • OpenAI has 100M+ users, now they need to incentivize developers to build & maintain plugins
    • Hypothesis
      • There might be a use case for micro-transactions (very hesitant to use that word) for plugin use that happens through OpenAI
      • LLM plugin access will become a standard feature line on pricing tiers for virtually every company, starting at the top (enterprise/mid-market) and working its way down to SMBs as more SMB-friendly tools get built
    • Notes:
      • I shudder at the words ‘micro-transactions’ because, with all the talk over the past few years, we have yet to see them happen in a material way
      • Plugin user level auth will make this seamless
  • Plugin Translators Dev Shops
    • Problem
      • Businesses will want their services to be accessible to LLMs, but they won’t all have the skills required to create, maintain, and develop plugins
    • Hypothesis
      • There will be shops that specialize in creating and maintaining plugins for companies. A small dev shop could likely ‘translate’ thousands of APIs at a time
      • Monetization incentives (above) will drive this
      • PSO (Plugin Store Optimization) will evolve out of too much supply
    • Notes
      • An early look at what this world will look like:

      https://twitter.com/matchaman11/status/1641502642219388928

  • Unstructured Data > Structured
    • Problem
      • Insights and data are valuable to businesses, but only when you have access to a source that the general market doesn’t. The harder a valuable piece of data is to grab, the more attractive it is
      • Many valuable pieces of data sit within unstructured text-based sources. It’s notoriously tedious and difficult to extract insights from them
        • Ex: Public filings, public records PDF, transcriptions
    • Hypothesis
      • There will be an addition to the data-service industry (like CBInsights) enabled by LLMs. BUT you won’t hear about it, because suppliers know that their data’s value is derived from its scarcity. It’s not in their best interest to tell you how it’s gathered
    • Examples
      • Tech Extraction from Job Descriptions
      • Community Moderation & Analytics (Discord/Slack/Support)
        • Analytics: You have better ways to classify and report on conversations & requests in your community. Businesses would 100% pay for this if you give recommendations on how to increase health.
        • Moderation: Users post questions to the wrong channel. It would be nice to clean those up by going through them, classifying them, and moving them. Or stopping users from posting them altogether
  • Reflection
    • Problem
      • LLMs are good, but not always on their first draft of a response
    • Solution
      • It’s super easy to ask them, “are you sure?” and get a better answer back. This has been shown to increase answer quality across a variety of benchmarks (a minimal sketch follows after this list).
    • Notes
      • Unfortunately, reflection increases cost and latency since you’re making another API call. This isn’t a problem for every use case, since some users are time-insensitive.
    • Resource: Great Video on the topic

      https://www.youtube.com/watch?v=5SgJKZLBrmg

  • Drag & Drop LLM/Chain Builders
    • Problem
      • Not everyone is technical. Even if you are, it’s sometimes easier to drag and drop rectangles on a screen than write code
    • Opportunity
      • Create no code tools that string together LLM calls
      • Basically no-code LangChain
    • Notes
    • Hypothesis
      • I don’t think this will be as big as it may seem. Simple no-code use cases are easier (à la Zapier), but going deep requires technical ability (à la Bubble/Webflow)
      • Opinion: It’s interesting eye-candy but I wouldn’t recommend investing here
  • Elad Gil: Species Level Take Over - Link
    • Notes
      • This is just thought-candy, but I thought Elad had an interesting framework to think about different tiers
      • “For AI to move from merely another technology risk (in the long line of tech risks we have survived and benefited from on net) to a potentially existential species-level risk (all humans can die from this), up to two technological breakthroughs need to happen (and (2) below - robotics, may be sufficient):”
        1. The AI needs to start coding itself and evolve: tool → digital life transition.
        2. Robotics need to advance: Digital→real world of atoms transition.
      • Species Level Competition

    Untitled
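
Looping back to the Reflection item above, here is the minimal sketch promised there, using the OpenAI Python client (>=1.0); the model name, prompts, and question are placeholders, and the second call is exactly the extra cost and latency mentioned in the notes.

```python
from openai import OpenAI

client = OpenAI()   # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4"     # placeholder; any chat model works

def ask(messages: list[dict]) -> str:
    response = client.chat.completions.create(model=MODEL, messages=messages)
    return response.choices[0].message.content

def ask_with_reflection(question: str) -> str:
    history = [{"role": "user", "content": question}]
    first_draft = ask(history)

    # Second pass: ask the model to double-check itself.
    # This extra call is exactly the added cost and latency mentioned in the notes.
    history += [
        {"role": "assistant", "content": first_draft},
        {"role": "user", "content": "Are you sure? Re-check your answer and correct any mistakes."},
    ]
    return ask(history)

print(ask_with_reflection("How many H100 GPUs does one DGX H100 contain?"))
```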

Phoebe Philo

Phoebe A “cheaper” Philo

Quoting @yosoymichael from Twitter: ‘Phoebe Philo didn’t stop at “Open your purse!” She said, “Sell your house, rob a bank, and do some credit card fraud too!”’ When the long-awaited email dropped, I’m sure some Tumblr-age “Phiophites” gasped.

https://framerusercontent.com/images/S8FlVn4oJ7brWwttfQAZYumtN2Y.png

Background

To contextualize Phoebe Philo, we need to step back to “old Celine” and the legacy Phoebe had on “chic minimalism”. She’s the mommy of what’s now the TikTok cringe of “quiet luxury”.

Phoebe Philo, after serving as the creative director at Chloé for five years, left in 2006, succeeding her friend and predecessor, Stella McCartney, who had departed in 2001. Philo carried forward the growth momentum initiated by McCartney, garnering a dedicated customer base. Her exit from Chloé, cited as a choice to prioritize her family, was unexpected given her career peak. (Glance at her namesake brand’s MUM necklace)

https://framerusercontent.com/images/ArUPqnLF7ZHAvVq5gLVwPKylyEM.png

Her absence led to the rise of brands with a similar aesthetic, like Victoria Beckham in Spring/Summer ‘09, The Row in Autumn/Winter ‘07, and H&M’s COS in Spring/Summer ‘08. These brands emerged during Philo’s hiatus, their establishment beginning in her absence, with COS, backed by H&M’s resources, moving the fastest.

Philo’s influence is undeniable; even lesser-known to the general public, her design essence impacted the fashion realm. She made a return, not with her own label but as the creative director at Celine in 2008. Her debut show was Spring/Summer ‘10. Despite challenges and competition, Philo’s distinct, consumer-focused designs set her apart. Her time at Celine further cemented her influence, elevating the brand’s financial standing in the industry.

Let’s quickly go over what made “old Celine” the golden days and birthed a cult following.

I. The Hallmarks of Philo’s Design Aesthetic:

Palette Choices

  • Monochromatic Mastery: Philo often favored a muted, monochromatic palette – a deliberate choice that exudes a sense of timeless elegance.
  • Emphasis on Neutrals: The use of beige, white, black, and navy became almost synonymous with her tenure at Céline.

Materiality & Texture

  • Tactile Luxury: From the buttery leathers to crisp cottons, the materials scream luxury but in a whispered, understated tone.
  • Material Interplay: Often paired contrasting materials, like wool with silk, to create depth and intrigue.

II. Silhouette & Structure:

Oversized Elegance

  • Effortless Oversizing: Philo championed the oversized silhouette, proving that volume can, paradoxically, highlight femininity.
  • Tailored Fluidity: Despite the ample fabrics, there was always a tailored element, whether in a cinched waist or a carefully draped fold.

Functional Femininity

  • Pockets and Comfort: Her designs often incorporated large pockets, an ode to practicality without compromising on elegance.
  • Ease of Movement: Flowing trousers, loose blouses, and drop-shoulder coats allowed for unrestricted movement.

III. Iconic Pieces & Collections:

The ‘Old Céline’ Trope

  • The Trapeze Bag: A beautifully structured bag with wings, it quickly became an ‘It’ item under her direction.
  • Glove Shoes: The V-cut shoe design, both in flats and heels, became a footwear phenomenon, emphasizing comfort and chicness.

    https://framerusercontent.com/images/B7cDbrMCzgEDXk7u8pxE9wqugY.png

The 2015 Spring Collection

  • The Modernist Touch: Philo’s play on proportions, asymmetry, and tunics over trousers presented a fresh take on layering.
  • Subtle Femininity: Pieces like the knit dress with flowing strands heralded a new, confident femininity.

https://framerusercontent.com/images/gqvG2IiOAWy60kakhfs4bwTvLGs.png

Her protégés show how far her roots extend. She knows she’s Phoebe Philo, and she birthed all of them.

Daniel Lee - After working under Philo at Céline as the Director of Ready-to-Wear Design, he took the helm at Bottega Veneta in 2018. Under his leadership, the brand saw a significant rejuvenation in its aesthetic and became a favorite among fashion enthusiasts and celebrities.

Naza Yousefi - She was a former accessories designer at Céline during Philo’s tenure and later founded the handbag label Yuzefi, which has become quite popular.

Peter Do is a notable designer who once mentioned that he was influenced by Philo. After studying at FIT in New York, he worked at Derek Lam and then joined Céline under Phoebe Philo, although he never worked directly with her. He later founded his own eponymous label, which is known for its tailored pieces and minimalist aesthetic reminiscent of Philo’s work. He’s now heading the comeback of Helmut Lang.

Rok Hwang - The founder of the brand Rokh worked at Céline under Philo. His label showcases deconstructed pieces, precision tailoring, and unique details that hint at his experience under Philo’s mentorship.

Lucie and Luke Meier - While they were never her direct subordinates, they’ve exhibited aesthetic affinities with her. The duo, currently at the helm of Jil Sander, bring a minimalist and thoughtful design approach to their collections.

Gabriela Hearst - Although she did not work directly under Philo, Gabriela Hearst’s design ethos, which is sustainability-driven with a minimalist touch, has often been compared to Philo’s work.

New Collection

The elephant in the room - Price

Let’s use the brand’s currently most expensive bag, the ‘XL Cabas’, as an example of how baffling Philo’s pricing is. It’s a huge tote bag in calf leather (“calf leather,” remember that) that’s selling for $8,500. Now let’s go back to the mid-2000s, when an average bag from a fashion house cost about $700–$1,500. Philo’s famous ‘Paddington’ bag in leather, when she was the creative head of Chloé, was around $1,500 at the time. Even the most expensive bag from a fashion house would have been made out of exotic leather and cost around $4,200.

The price was a positioning necessity. Through it, Philo placed herself at the upper echelon of fashion, both in brand positioning and price. It feels like LVMH’s attempt to have an uber-luxury house of its own (à la Chanel and Hermès). To be fair, it’s the price point with the higher growth potential and less saturation. Hermès has fought to keep LVMH’s stake at just 17%, so LVMH is trying to go around it. And although Phoebe Philo doesn’t have the same demand power due to a lack of “heritage”, it’s hard to find a designer who carries the same amount of weight as Philo.

https://framerusercontent.com/images/118RaiPMlXDIjxw0XTWWeiL48So.png

“Affordable Luxury” is sort of a weird thing to say for people that know Phoebe. These commenters are unserious and should really stop weighing in on these subjects. As @shannon_sense put it, “even if everything was reduced by 30% it would still be too expensive for most people.” There’s sooo much to be said about the ways people evaluate luxury fashion from their own perspectives rather than from the perspective of luxury customers.

I disagree with people that say it’s extremely expensive. It’s not, at least for Philo’s perceived standing. Chloé and Céline customers might not have been prepared for the astronomical prices; I was expecting a higher price. The prices are more mixed than I expected. I really was anticipating only high-priced items at first to establish the price point of the brand. But the range was a pleasant surprise: entry-point products are usually a good idea and allow most people to get in on the action. I just didn’t expect so many so early.

Look, I’m not saying I can afford this, but I’m not meant to. It’s meant to be something to look forward to and then achieve, not just buy. Pricey, but it’s a well-studied price point, so I think it’s smart for the branding and positioning.

However, I disagree with people who say that, similarly to Daniel Lee at Burberry right now, Philo must first acclimate her old customers (used to her Chloé and Céline prices), as well as her new and potential ones, to this new pricing under her own name, so that the bridge between “want” and “closet” (actually buying the pieces and ensembles) becomes easier to walk.

Also, between the time I started writing this and now that I’ve finished, over 50% of the products on Philo’s website have sold out. Her target audience is ready to spend their money on her, and that’s who she focused on. Touché.

That leads me to this Rabkin post reacting to the initial release announcement about the mythical nature of Philo and luxury at large.

https://framerusercontent.com/images/STQWP8HRgjtAQEMq2GWx55byGO4.png

Phoebe Doesn’t Exist

This aura of this collection just selling out without any ads, runway, posters, influencers, or anything the typical “high-fashion” world is used to, reminds me of Eugene Rabkin’s blog “Philo doesn’t exist”. Quoting directly, “the collection will be revealed [and it was] not in real life but through images – simulacra – and will be sold online, a hyperreal way of shopping. No one will have any direct contact with the clothes – arguably the only piece of reality here – until they will get a box at their home. Until then, no one will know how the materials feel, how the garments fit, or their true colors. We will not get an insight into Phoebe Philo’s work process, because she does not give interviews. We will never really know who designed the collection, how it was designed, and what it really looks like. The entire thing is a simulation.”

https://framerusercontent.com/images/q4NtVoDu28AzqTmBcXjqf6yLc.png

To drive his point further, in 1991 Baudrillard wrote three articles about the first Gulf War: The Gulf War Will Not Take Place, The Gulf War Is Not Really Taking Place, The Gulf War Did Not Take Place. Of course he did not mean that there was no military action happening in Kuwait; what he argued was that our only experience of the war was through a narrow channel of highly mediated messages that have only tenuous relationship to the reality on the ground.

In other words, we live in a simulation – via screens, through social media, soaked in a semiotic system created by the vast leisure industry – entertainment, news, advertising, and so on. Similarly, when the Phoebe Philo collection came out this fall The Gulf War did not take place. Phoebe Philo does not exist.

https://framerusercontent.com/images/zFQg8rwLjPqgf2TG112xeMRtabA.png

Right Time?

In a 2006 statement responding to the idea of creating her own brand, she said it wasn’t the right time, and that call was right on the money. Now that we are once again in an awful financial situation, with what seems like an impending recession, good old minimalism rises again, and to Philo this seems like the “right time” (along with Arnault’s backing, ofc). Looking at the correlation of fashion and recession:

  • Chanel’s rise around the Depression era of the late 1920s and ’30s.
  • The 1991 recession and the rise of Helmut Lang and Jil Sander.
  • The 2008 recession and Philo’s rise at Celine. At the time, she was sweeping away the excess of aughts fashion with a confident new minimalism. Tapping similar instincts now, she has an even bigger following to rely on.
  • In 2023, the cringe “old-money” aesthetic and the elevated basic.

The messaging from the website seems spot on: “Our aim is to create a product that reflects permanence.”

Subtle Wardrobe Direction

This Phoebe Philo feels like The Row had an affair with Rick Owens, and the gay son was Bottega and the thot daughter was Loewe. Very chic.

Everything on the website seems relaxed, less trendy, dignified. It’s reminiscent of older couture, when designs were for women over 40 and aspirational. I love that she’s separating the girl from the woman. This new collection is about women. No influencers, no celebrities, just design and great products! The most important thing about it is intrinsic value. So much of fashion is about what other people like and not what the customer actually likes. Now it’s swinging back to the customer, and I’m all for it.

“The ultimate modern wardrobe from a dissatisfied woman” says it all. And I like the collection… with caveats. But what really excites me is the idea that this collection could… maybe… free other designers from the crushing cycle of, as Horyn puts it, “chasing growth.” That chase has literally killed some of our greatest modern designers, and driven others to breakdowns. If this new Phoebe Philo augurs a new model, I’m all for it.

https://framerusercontent.com/images/7boj6gZH4QGP4di0Suv4CDsgHI0.png

Rather than attempt radical change, Phoebe Philo’s new collection offers women a subtle way to evolve their wardrobes. Having pushed boundaries before, Philo understands most women reach a point where overhaul is replaced by nuance. Adding special pieces allows self-expression, not reinvention. Philo knows women harbor hidden boldness behind practicality. Her clothes enable this duality. Witness trousers with a teasing back zipper, or a toothpick pendant necklace for discreet utility. Philo relates to the life stage where less becomes more. Her “edits” resonate by providing the special over the sweeping. Limited availability complements crafting a wardrobe across seasons, not discarding it each time. For women seeking expression through subtlety, Philo provides the perfect avenue in this new collection. Its allure is in evolutionary, not revolutionary, dressing.

https://framerusercontent.com/images/O4yIoar63I1aLA4X9Q65Sv540.png

“And my chick in that new Phoebe Philo / So much head, I woke up to Sleepy Hollow” – Ye

Suddenly Popular LLMOps

Sometimes, all of a sudden, micro-markets emerge. They can be triggered by all sorts of things, for example an external change (COVID) or a new technical capability (LLMs). The current LLMOps/PromptOps space is an instructive example. Over the last year, the number of developers experimenting with AI model APIs has 1000X’d.

The cycle to date has been something like this:

Models at scale have emergent behaviors that are magical and shocking. Consumers experienced DALLE2 and ChatGPT, and a small number of LLM products gained real traction rapidly (Copilot, Jasper, Midjourney, Character). Startups have flocked to leverage these capabilities, and VCs are funding them like it’s 2021. Many incumbent technology leadership teams are excitedly, anxiously resourcing AI projects. These developers all start by tinkering: they try different prompts, chain together model API calls, connect to other non-AI services, and integrate with input data sources (a rough sketch of that tinkering follows below). OSS frameworks such as LangChain and LlamaIndex, and a significant cohort of YC companies, have already emerged to solve some piece of this problem. A million developers are trying to do the same thing: experiment and ship a prototype. Entrepreneurial developers see an opportunity. The billion-dollar question is whether all this interest leads to any durable market.
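
To make that tinkering concrete, here is roughly what each of those developers ends up hand-rolling before (or instead of) reaching for LangChain or LlamaIndex: prompt templating, a retry loop, and output parsing. `call_llm` and its canned response are stand-ins that keep the sketch self-contained.

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real model API call (OpenAI, Anthropic, etc.)."""
    return '{"sentiment": "positive", "confidence": 0.92}'

TEMPLATE = (
    "Classify the sentiment of this review. "
    "Respond with JSON containing the keys 'sentiment' and 'confidence'.\n{review}"
)

def classify(review: str, retries: int = 2) -> dict:
    prompt = TEMPLATE.format(review=review)        # prompt templating
    for _ in range(retries + 1):
        raw = call_llm(prompt)
        try:
            return json.loads(raw)                 # output parsing
        except json.JSONDecodeError:
            prompt += "\nReturn valid JSON only."  # retry with a nudge
    raise ValueError("model never returned valid JSON")

print(classify("The checkout flow was fast and painless."))
```

The billion-dollar question above is essentially whether this fifteen-line pattern, multiplied across a million developers, adds up to a durable market.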

The history of software features many legendary companies that started with an elephant of a vision almost too big to take the first bite of (Figma, who collapsed several categories of software and put them into the collaborative web to solve end-to-end for product designers). But it is also populated by companies that iterated to platforms, starting with a timely wedge (Hubspot, which expanded from SMB content marketing to the only real contender to Salesforce).

We believe great companies can emerge from the morass of spaces like “LLMOps.” But those that do will be teams that see the wedge for what it is, rather than misreading immediate momentum and interest as durable value. The distance between GitHub stars and Twitter likes and at-scale deployments and six-figure enterprise contracts is very far. Solving an easy but acute problem in a temporary market, faster than others do, can be a smart entry point to get momentum. All things are possible for a startup with momentum, money, and the right management team.

When everyone sees the same needs, the bar for understanding those needs and executing on them goes up. The question is not, “Do developers want LLMOps?” but instead, “Which segment of those users do I focus on? What do they really need, and in what order? What will make the product easy to adopt, and what objections will I face? What architecture will support those users, and what compounding advantages can I build?”

AI is a landscape of shifting sands. What developers want today is not what they’ll want in six months, and what they need to build demos is not what they need in production, is not what they’ll need for integration into existing products. But demos could be the path to distribution. Marching in lockstep with customers along the path to market maturity requires being even more “niche” in an already small market because there are segments even now. The closer you are aligned with where some set of customers are today, the more customer trust you can build, the more likely you are to find demand others don’t understand, the better chance you have of building a very important company.

At the beginning of a market, no one really knows what user needs are. Founders who have solved a problem themselves, ahead of the crowd (or a previous iteration of it) have some advantage. But because the market is evolving, founders who are learning from customers, who launch and then have the resolution of conversation necessary to really develop a product, have even more advantage. I’ve often been surprised how common it is for startups to have an insufficient depth of understanding of customer problems, or to misread the signals from customers. Especially when working with friends and early adopters, people are inclined to be nice. If a smart and charismatic team describes a high-level problem they face reasonably accurately, they’ll nod assent, nicely. “Would you like to lose weight?” is a very different question than, “will you lose weight by eating ⅓ fewer calories, not drinking socially, and prioritizing workouts four days a week?” Customers want to solve problems. They may not picture the roadblocks to adoption and tradeoffs. They may not be willing to be directly skeptical. Here is where increasing resolution of conversations, forced prioritization and asking for the sale all provide better signal.

Sometimes, emboldened by the strength of immediate need, and feeling the pressure to raise money and execute quickly in a noisy market, founders will be quickly drawn to “defensible technical depth” as their narrative to investors. The risk is that they’re not yet sure it’s true, but they say it enough to convince themselves of a world model that’s wrong. Counterintuitively, recognizing that no part of solving the immediate problem is hard forces a more useful ongoing search and paranoia. Defensibility is overstated for most early-stage startups. It is wrongfully sought by investors, too.

The problem with the “sell picks and shovels during a gold rush” analogy is that picks and shovels are fungible, and software products are not. Eventually, defaults emerge. The risk of solving easy problems is that they’re easy for other people to solve too. They can be solved by incumbents with a distribution advantage, or by other startups.

Leadership even in “temporary markets” is a valuable position, and “easy problems” can still be good entry points for startups to leverage. Almost any growing problem ends up deeper than it first appears.

Paris Spring ‘24 Men’s Quick Reflections

The most Rick a Rick show has been since pre-pandemic.

https://framerusercontent.com/images/nVXMKVDRdvfbMXfX9JoeY1ygMxo.png

Uncertain future at Lanvin; is the re-see the new show?

Prada slime. Prada shorts?

https://framerusercontent.com/images/RbbslAHLmoqLFKRP38WAejxGVno.png

Wait 032c makes clothes? Yes, and you’re late.

Louis Vuitton — what’s a king to a god, what’s a show to a spectacle, what’s a spectacle to JAY-Z.

  • I liked the role it had during the LV show, but also questioned the deviation between playing with actual great design while embracing hip hop (i.e, Fear of God) vs this (which seemed slightly off tbh).

lol pusha t’s coke music being played for so many of the men’s runways

It’s the year of Jonathan Anderson.

Gucci has to be changing entire comms strategy by Sept.

Zegna as the less cool Fear of God!

The Row moved all operations to Paris

It’s Rhude to owe people money lol

https://framerusercontent.com/images/u3Ky14FMyXhXKMjonVXhxyADiqU.png

Lemaire is great, I wish they talked more!

Did anyone talk about Saint Laurent?

Why would you wear Hermes RTW as a man when there is Loro Piana?

A magazine curated by sacai - the bluest blue

Everyone is on ozempic fr - empty box in their fridge at posh hotel Le Bristol

https://framerusercontent.com/images/mF8egMeGqxEdCUoweGVO6mdnQzo.png

Does anybody read boring reviews?

If you walked for Junya i hope your hair is doing ok!

https://framerusercontent.com/images/olyOX5LtcEbDz25Ou6Vuyif0E2o.png

The Alaïa net flats are the shoes of the summer

https://framerusercontent.com/images/ex81jaUEjEj7taWt5jNk7nU.png

Kenzo aw man…

Jacquemus dropped the ball

Radio Sites I LOVE

This is a link to Ethio FM 107.8.

And here is a preview:

Momodou Lamin Jallow

Go listen to this guy. A true generational talent that’s at his peak right now. I can’t think of anyone that’s at this potential level at the moment.

I feel personally attached because of the following reasons:

  • There was a time where my most played genre was UK Hip Hop
  • I’m Ethiopian and love the traditional African sounds/instruments
  • I love it when words are emphasized and you can not only hear the words but feel them

But he’s had a three-album RUNN now:

1. Common Sense “Came in a black Benz, left in a white one I’m just a hoodlum I came with bonsam”

2. Big Conspiracy “We run from ‘rales with the mullianis They can’t see my face, I’m like a hijabi I gotta stack bread ‘cause I’m building my army They know I’m so solid, they callin’ me Harvey I get all the ‘usna and all of the narnis”

3. Beautiful and Brutal Yard (BABY) “He weren’t the same when I saw him again, he’s a real shapeshifter Used to pray facing qibla I just chill in my sector, you know I’m Hannibal Lecter Put that boy in his Pampers, us man, we’re not rampers Post outside, we’re campers, come to your uni campus Splash him, John the Baptist, I don’t need no accomplice Maybe only a driver, turn that man to a diver That day, it was raining, put on the windscreen wiper”

Also unrelated to Hus, but general music commentary – it’s truly magical how the three biggest artists in the world can combine forces to make a song THAT boring. Hopefully Utopia isn’t that bad.

Men's Week Review [Ongoing]

Vetements

Guram Gvasalia explained: “But when we still live in the real world, with Apple’s headset yet to be released, we wanted to create a physical object that would give the look and feel of an AI generated image.” The point of the exercise, much like what Jacobs was getting up to with his analog 1980s designs, was to champion the human. “At its core,” Gvasalia continued, “the collection is actually anti-AI, as quality can only be done by human hands.”

The resulting silhouettes are hyperbolic the way clothes in virtual reality are, especially the pants which puddle at the floor like poured taffy.

I wish it was more monochromatic; I feel like when you’re playing with proportions that aren’t familiar, dropping a lot of colors into the mix might overwhelm the person into saying no.

Granted, most of my favorite looks were the black coats shown above.

Rick Owens

Wtf man that was intense — I guess he really did evoke an emotion with it.

I f*cking love the fact they only had one color - BLACK.

Incredible.

Ok, on a more nuanced note: I liked the tops most of the time, and I loved the hoods on pieces you typically don’t see them on. I have to say these were some of the more interesting suit-adjacent looks, ones that could breathe some life back into the death of suit dressing (people were turning suit pants into shorts, for God’s sake).

I get the aesthetics of the shoes are a bit different now, but truly didn’t feel the pants/shoes.

Rick Owens said: “This morning when it was raining I was almost hoping it kept raining during that show. That no one would turn up. Then we’d have that same vibe, that emptiness, which is what I loved about those shows. It was like ‘even under these circumstances, we’re going to forge ahead and run it even if nobody shows up. We’re still gonna do this. Because we’re unstoppable.” Yet this was mindful consumption, contradiction with a cause, fashion with a position. Bang! It was beautiful, for the damned. I felt overwhelmed at the end, wtf.

The tops in all of this and the loopy loop are sickkk, especially looks 3 and 5. Overall, 23, 24, and 32 are great as well.

After a few years of wondering why Rick is considered one of the Goats, I finally get it — I get it. Ricky fucking Owens man.

JW Anderson

V underwhelming to be honest, but idk what to expect from a guy whose label is just his name and isn’t notorious yet.

The figures of everyone involved looked similar and cohesive in an odd way, but there wasn’t a piece that grabbed my attention that much.

SHORTS: the disproportioned shorts are my personal favorites, and it’s clear they’re being used repeatedly, as he’s found something that even he is amazed by. Looks: 6, 11, 40, 41, and 42.

Aside from that, v underwhelming and chill…

Wales Bonner

I want to like it, and I don’t. That kind of bums me out.

Maybe it’s because the spring/summer line is quite different from the fall/winter lines that I usually have a large affinity towards.

I want to learn more of her story and how her designs differ, with the aim of understanding her work in depth.

The coat at the end and some of the tracksuits were incredible, making me think that there’s more here than meets the eye.

Glenn Martens: Vanguard of the Modern Silhouette?

https://framerusercontent.com/images/OnPQvEeSSqoIysx3qDfwyBbJVG8.png

Three nuanced ensembles from Glenn Martens, shot by Luis Alberto Rodrigues, showcasing his work across Diesel, Jean Paul Gaultier, and Y/Project.

Martens’ rapid rise illustrates how hungry the industry is to anoint a new savior. He’s trying to bring avant-garde experimentation back to the high-fashion mainstream, after years of minimalism dominating luxury fashion. Yes, he modernizes Y/Project and Courrèges with a youthful energy. What I worry about is: does his seemingly recycled aesthetic lack true creativity? I wonder if he’s overhyped and underwhelming, given the industry is too eager to coronate the next big thing before they are ready.

Drawing inspiration from the fluidity of architecture, Martens creates designs reminiscent of the natural mountainous vistas of his homeland. It comes as no surprise that he shares the podium with JW Anderson, another proponent of sculptural fashion. Both are trailblazers: Anderson leads LOEWE and JW Anderson, while Martens’ indelible mark is felt across multiple maisons.

Manifesto of Martens’ Aesthetics:

  • Layered intricacies.
  • Thoughtful twists.
  • Bold prints.
  • Precise folds and jagged edges.
  • The art of the oversized and asymmetrical.

In a world obsessed with commodified luxury, Martens tactfully navigates. With a touch of pragmatism, he merges forward-thinking experiments from Y/Project with Diesel’s vast outreach. The result? An exploration that satisfies both fashion elites and mass markets.

L’Artiste’s Choice?

Y/Project, under Martens, strikes a chord even in its offbeat notes. While some ensembles soar, others provoke thought, but each carries Martens’ unmistakable signature. An echo of the late Yohan Serfaty’s vision, Martens has ushered in a new era. He juxtaposes Serfaty’s legacy with his lavish, fun, and slightly audacious style. Influenced by occidental aesthetics, Martens has iteratively evolved the brand, deftly playing with proportions, silhouettes, and fabrics.

Y/Project pulsates with a unique cadence. Sometimes it’s harmoniously in sync, while occasionally it stumbles. But there’s undeniable novelty. As aptly described by “Fashionlover4”, it’s akin to a “fashion student’s wet dream”. While Yohan Serfaty laid its foundations, Martens has expanded its horizon with gender-fluid, enigmatic designs evocative of the iconic Rick Owens.

Resurgence of the Denim Giant

Diesel’s renaissance under Martens is nothing short of remarkable. Martens dips into the brand’s golden era of the ’90s and early ‘00s, stirring nostalgia while redefining Diesel’s modern identity. The brand’s playful metamorphosis from “For Successful Living” to “For Sucsexful Living” post Martens’ intervention is emblematic of his audacious touch.

Yet, as Martens flirts with Diesel’s denim legacy, there are challenges. While the runway flaunts luxury, the racks sometimes reveal impractical flamboyance. The harmony between Diesel’s essence and Martens’ flair needs fine-tuning. But, given time, there’s no doubt Martens will blend his experimental spirit with Diesel’s rich denim history, offering fresh takes on classic staples. With a dash of patience and Martens at the helm, fashion enthusiasts worldwide can anticipate a reimagined denim dynasty. The future might be denim, and it’s couture.

His exaggerated proportions and gender-bending styling cover familiar ground already charted by predecessors like Martin Margiela, Ann Demeulemeester and Raf Simons. But, unlike those designers’ radical conceptual garments, Martens’ oversized blazers and trench coat dresses are tame in comparison. Yes, his subversion of masculine and feminine codes challenges binaries. But in 2022, gender fluidity in fashion is practically mainstream. Younger talents like Harris Reed and Telfar Clemens are doing more to expand definitions of identity and expression through clothing.

Where Martens does excel is in his digital-print tailoring and knitwear. Martens once shared, “My design realm is vivacious, whimsical, and a tad provocative.” He honors the brand’s essence while continually exploring new forms and dimensions, mirroring an artist rediscovering age-old hues. Pieces like the anatomical prints from the spring 1996 “Cyberbaba” collection exemplify his masterful reinvention. His pixelated and blurred suiting fabrics, often in bright hues, have a hyper-modern vibrancy. The asymmetrical color-blocked knits he designs are actually quite imaginative in their use of graphic color and texture. Martens clearly has an affinity for digitally manipulated textiles and colors. When applied to Y/Project’s signature oversized tailoring and body-hugging knits, the results bring an edgy, hypermodern look to life. His custom fabrics point to the potential he has to develop a more distinct design identity.

Looks

Spring 24

I like how he’s redefining what wearability could look like. The issue I constantly face is when he doubles down on playing with the structure and color. It’s overwhelming and far from pleasing.

  • Highlights:
    • I’ve fallen in love with the buttoned boot

https://framerusercontent.com/images/EC3nHezNOeR4v9PQbX1LCds.png

https://framerusercontent.com/images/tq9R9sSkQjESH5B8t54T8gd4.png

The later end of the show consisted of fabrics that look like they’ve been dried papier-mâché style straight out of the washer. I’m a fan of the tops, skirts, and pants in this texture. Including denim!

https://framerusercontent.com/images/S483sdj2BBz8XD0Nj2RpkCMoWwQ.png

https://framerusercontent.com/images/GtbwhQrq1hr3n6xSRoQuW6wRaT8.png

Fall 23

  • This feels like the official incorporation of denim bleeding from Diesel into Y/Project’s work.
  • Highlights:
    • Denim bleeding into other fabrics in a tasteful manner

https://framerusercontent.com/images/vibc4llgK2HmWFz4J8LmBRKUmM.png

Feels like not knowing where the clothes end and the shoes start

Early AI Meditations 1


Tweet on my mind

https://twitter.com/blader/status/1640387925912477698

  • 🧠 AI Memory - LLMs are great reasoning engines, not great at memory. Major opportunity for players to provide infra to help with this. Likely will be verticalized
    • Problem
      • In-context learning works, however you need to elegantly select the right context you’d like your model to have.
      • Similarity search only goes so far. Most solutions just take the top-N results and lack a way to connect related ideas.
    • Solutions today
      • Similarity search via Pinecone, Weaviate, etc.
    • Hypothesis
      • Different verticals will need different knowledge graph expertise. Law vs medical vs sales vs product vs user research. Verticalized players will likely emerge
    • Notes:
      • OpenAI mentions better memory on their plugin’s next steps - “Integrating more optional services, such as summarizing documents or pre-processing documents before embedding them, could enhance the plugin’s functionality and quality of retrieved results. These services could be implemented using language models and integrated directly into the plugin, rather than just being available in the scripts.”
  • 🏗️ LLM Coordinators (ex: LangChain) - Organizing, customizing and providing modularity to LLM applications
    • Problem
      • Developers need ways to customize how their product consumes and instructs language models.
      • All developers run into the same friction when building apps: prompt templating, retries, parsing output (a minimal sketch of this boilerplate follows this list).
    • Solution today
    • Notes
      • Libraries like LangChain make it easier to work with LLMs. It’s unclear how much OpenAI and other companies will strategically build product into the space. Ex: LangChain and LlamaIndex are great at document loading. Developers now need to choose whether they load docs through them or use an OpenAI Plugin.
    • Model swapping, finer tuned control over agents, definitely needed.
  • 🌆 Internal company APIs - Proprietary Plugins for internal company use
    • Notes
      • Plugins are a beautiful way for LLMs to chat with external-facing apps. A cute, demo-worthy example of this is ChatGPT booking a dinner reservation.
    • Hypothesis
      • My hypothesis is that companies will have an internal LLM that carries out instructions with internal facing apps and plugins.
      • While large enterprise might do this themselves to start, my hypothesis is that Mid market/SMB will outsource this to products that do it for them
    • Example applications
      • Some companies are so massive that it’s difficult to know what is going on around the org. It would be great if there was an LLM that was watching a feed and only alerted me of what I needed
      • Trained specifically on a company’s code base and could make recommendations
      • Could train product marketing to better articulate how code works
      • Keep technical documentation up to date
      • This will be similar
  • 🤖 No code ways to make your own apps - Big opportunity to empower people to make their own apps powered by AI.
    • Problem
      • Non-technical people have great ideas, but can’t build apps to execute them
    • Hypothesis
      • Low-code and no-code have been around for a while, but the barrier to entry is still too high. As English becomes a programming language, more SMB owners will build apps that have a solid use case
      • Micro-SaaS acquisition could likely heat up here. If not to purchase a company, then for start up that can execute better to run with their idea.
  • 🎯 Offshoots of Plugin Store - Apple AI App Store
    • Notes
      • OpenAI decided to use an open API specification format, where every API provider hosts a text file on their website saying how to use their API.
      • This means even this plugin ecosystem isn’t closed off in a way that only a first mover controls
    • Hypothesis
      • Most of the infrastructure and support we see around the apple app store will likely follow the plugin store
  • 🔐 LLM Privacy - The Signal of LLMs
    • Notes
      • The company to crack a private LLM (Ex: Get the reasoning power of an LLM but with complete privacy) will gain massive traction.
    • Hypothesis
      • This is a horizontal feature that would likely be extremely attractive to OpenAI and other providers
  • Can we reduce the security threats present in the way we treat LLMs?
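
To make the coordinator friction above concrete, here is a minimal, hypothetical sketch of the boilerplate the LLM Coordinators bullet describes: a prompt template, a retry loop with backoff, and output parsing. `call_llm` is a stand-in for whatever model API you actually use; none of this is LangChain’s interface.

```python
import json
import time

PROMPT_TEMPLATE = """You are a helpful assistant.
Extract the company name and sentiment from the review below.
Respond with JSON only, e.g. {{"company": "Acme", "sentiment": "pos"}}.

Review: {review}"""


def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (OpenAI, Anthropic, a local model, etc.).
    # Returns a canned response so the sketch runs on its own.
    return '{"company": "Acme", "sentiment": "pos"}'


def run_with_retries(review: str, max_retries: int = 3) -> dict:
    prompt = PROMPT_TEMPLATE.format(review=review)  # prompt templating
    for attempt in range(max_retries):
        raw = call_llm(prompt)
        try:
            return json.loads(raw)                  # output parsing
        except json.JSONDecodeError:
            time.sleep(2 ** attempt)                # backoff, then retry
    raise ValueError("model never returned valid JSON")


if __name__ == "__main__":
    print(run_with_retries("Acme's support team was fantastic."))
```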

Weekly Reading Roll


Last Updated: "Week Ending": 03-15-2024

Mountain Dew’s Twitch AI Raid

  • I’m split on how I feel about this. Incredible way of marketing to the right audience by cornering true fans. However, I worry how intrusive this could get.
  • Are we entering a new era of affiliate marketing and product placements?
  • “During the live period, the RAID AI will crawl all concurrent livestreams tagged under Gaming looking solely for MTN DEW products and logos. Once it identifies the presence of MTN DEW, selected streamers will get a chat asking to opt-in to join the RAID. Once you accept, the RAID AI will keep monitoring your stream for the presence of MTN DEW, if you remove your DEW, you’ll be prompted to bring it back on camera, if you don’t, you’ll be removed from our participating streamers.”

[Abstractions Rule Everything Around Me](https://benjaminschneider.ch/writing/aream.html) - Benjamin Schneider

  • “I realized that people came up with some of the abstractions most impactful in our everyday lives without ever referring to either! The more you notice all the abstractions you interact with, the more coming up with useful abstractions starts to look something humans are just generally interested in — and pretty good at.”

[Yudkowsky vs Hanson on FOOM: Whose Predictions Were Better?](https://www.lesswrong.com/posts/gGSvwd62TJAxxhcGh/yudkowsky-vs-hanson-on-foom-whose-predictions-were-better) - 1a3orn

  • I alternate between worried and excited with all the recent AI this/that debates — esp. around AGI or interpretability voids. It was fun looking back at debates the rationalist community has had over the course of ML’s history and what they got right/wrong. This is a good summary of Eliezer and Hanson’s predictions.

[Are you serious?](https://visakanv.substack.com/p/are-you-serious) - Visakan Veerasamy

  • “So the point is to take the work seriously but you don’t take yourself too seriously. There’s a riff about this in Stephen Pressfield’s War of Art, where he talks about how amateurs are too precious with their work: ’The professional has learned, however, that too much love can be a bad thing. Too much love can make him choke. The seeming detachment of the professional, the cold-blooded character to his demeanor, is a compensating device to keep him from loving the game so much that he freezes in action.’”
  • “I’m still publishing. That’s the litmus test. Are you publishing, whatever publishing means to you? I want to see it!”

[Resignation Letter](https://www.espn.com/pdf/2016/0406/nba_hinkie_redact.pdf) - Sam Hinkie

  • clarity, brevity, and specificity in summarizing his objectives
  • “A competitive league like the NBA necessitates a zig while our competitors comfortably zag. We often chose not to defend ourselves against much of the criticism, largely in an effort to stay true to the ideal of having the longest view in the room.”

[Why Generative AI Is Mostly A Bad VC Bet](https://investinginai.substack.com/p/why-generative-ai-is-mostly-a-bad) - Rob May

  • Surprisingly early (Jan 7) call on why LLM Startups might not be the move. + I like Rob

When the cost of something trends towards zero because of new technology:

  1. You will get an explosion of that good.
  2. That good will decline in value and defensibility
  3. The economic complements to that good that see increased demand as a result of the explosion in the original good, will be the place to invest.

[THE NEXT ACT OF THE GVASALIA BROTHERS CIRCUS:](https://www.sz-mag.com/news/2023/07/op-ed-the-next-act-of-the-gvasalia-brothers-circus/) Eugene Rabkin

  • “It sounds bizarre, like a desperate couture attempt at streetwear, or worse, like a Marie Antoinette playing-at-shepherdess scenario.”

    “This is just the latest chapter in the Gvasalia circus, which, sadly, the fashion commentariat cannot get enough of.”

[a Nirav or a Naval](https://auren.substack.com/p/a-nirav-or-a-naval-that-is-the-question) - Auren Hoffman

It’s very important to realize what you’re changing or chasing. You have the ability to revolutionize a bunch of things because you’re deffo an outsider. Never discredit that. And don’t let the fact that you sometimes appear as an insider to gain clout turn you into an actual insider: un-opinionated, dull, and unable to influence a tectonic change.

[Superlinear Returns](http://paulgraham.com/superlinear.html) - Paul Graham

  • “always be learning. If you’re not learning, you’re probably not on a path that leads to superlinear returns.”

[Why Do Rich People In Movies Seem So Fake?](https://sundogg.substack.com/p/why-do-rich-people-in-movies-seem) - Michella Jia

  • “If you are excellent in the first way, it behooves you to control the contexts in which you perform — and if you can control these contexts well, you also come off well. As for the second form of excellence, it often appears latent until catastrophe or circumstance forces a change of context. In this sense, the second type of excellence is much more difficult to spot.”

[Telomeres: Everything You Always Wanted To Know](https://www.notion.so/Daily-Log-Fall-2023-ee985cd122004f9fb8e4dabd25ee4b69?pvs=21) - Nintil

  • “The usual function ascribed to telomeres is as an anti-cancer mechanism: if a cell begins dividing too much then its telomeres will progressively shorten and it will stop dividing (or die). To overcome this, cancers end up reactivating telomerase to keep their telomere length.”

[An Extremely Opinionated Annotated List of My Favorite Mechanistic Interpretability Papers](https://www.neelnanda.io/mechanistic-interpretability/favourite-papers) - Neel Nanda

  • “The core thing to take away from it is the perspective of networks having legible(-ish) internal representations of features, and that these may be connected up into interpretable circuits. The key is that this is a mindset for thinking about networks in general, and all the discussion of image circuits is just grounding in concrete examples. On a deeper level, understanding why these are important and non-trivial claims about neural networks, and their implications.”

Vector Embeddings - Hype from Excess Dry Powder?


Embeddings – A Hype Cycle Fueled by Excess Dry Powder?

At the bottom of everything I’m trying to do here is one question: are AI companies worth leaving everything behind and betting on?

Reflexivity Framework

I want to start off with the idea of reflexivity, since the best investors put their earnings and future (skin in the game) on the line by predicting how the future goes. This is relevant because I’m looking at whether there is anything material in the AI space that will change the career direction I take. Investors don’t base their decisions on reality, but rather on their perceptions of reality. A framework is essential when looking at new technologies and the ecosystems they create.

However, their actions from these perceptions have an impact on reality, or fundamentals, which then affects investors’ perceptions and thus prices. The process is self-reinforcing and tends toward disequilibrium, causing prices to become increasingly detached from reality – i.e., crypto.

People get used to things. People think about the world through the lenses provided by the status quo of the things they use. Then when the world changes, sometimes whole new ideas are possible. The strongest example of this is probably the Web. It enabled all kinds of ideas that people didn’t think of before. The network of interconnected computers provided a new mental model for them to work from to invent new things.

Social media didn’t immediately come with the web. Why not? It takes time for the new reflexive part of an innovation to arrive. To understand what is fully possible under the new technology paradigm, some people need to have worked in it natively for a few years so that they begin to break down the status quo way of thinking.

Defensibility in Building

From a technical perspective it’s a huge breakthrough that will have lasting impacts. As technology makes doing more stuff faster and easier, it’s increasingly difficult to find areas of long-term defensibility in business models. The key position investors seem to be taking is that “context layers” that take these generative tools and put them into some point solution of a workflow is the place to make a bet.

It’s hard to make these defensible for two reasons:

First, there will be too many players because the barriers to entry are low and that drives a competitive dynamic that is unfavorable to investors.

Second, they risk competition with the foundation models themselves as those models improve. Not only could OpenAI boot your company off of their API, but they could also improve upon their model faster than you can build out the middle layer - rendering your improvements useless in a matter of days with a massive new update.

Sometimes, emboldened by the strength of immediate need, and feeling the pressure to raise money and execute quickly in a noisy market, **founders will be quickly drawn to “defensible technical depth”** as their narrative to investors. The risk is that they’re not yet sure it’s true, but they say it enough to convince themselves of a world model that’s wrong. Counterintuitively, recognizing that no part of solving the immediate problem is hard forces a more useful ongoing search and paranoia. **Maybe defensibility is overstated for most early-stage startups. At seed stage, the only defensibility is the quality of the founders.** Also, it’s actually irresponsible not to leverage GPT – similar to mobile/cloud. Startups are often a spread trade on new innovations before wider adoption. This is especially salient with a general-purpose technology like LLMs — a rising tide.

Most $10B+ companies seem defensible now, but it took them several years… execution is the only real moat, and defensibility is wrongfully sought by investors, too. The problem with the “sell picks and shovels during a gold rush” analogy is that **picks and shovels are fungible, and software products are not.** Eventually, defaults do emerge. The risk of solving easy problems is that they’re easy for other people to solve too. They can be solved by incumbents with a distribution advantage, or by other startups. That’s why leadership even in “temporary markets” is a valuable position, and “easy problems” can still be good entry points for startups to leverage. Almost any growing problem ends up deeper than it first appears – but you have to be cognizant of this going in.

Unstructured Data > Structured

  • Insights and data are valuable to businesses, but only when you have access to a source that the general market doesn’t. The harder a valuable piece of data is to grab, the more attractive it is
  • Many valuable pieces of data sit within unstructured text-based sources. It’s notoriously tedious and difficult to extract insights from them
    • Ex: Public filings, public records PDF, transcriptions

Full-stack retrieval goes like this (sketched in code below):

  1. You have a raw corpus of documents (Held in the cloud)
  2. You split them into semantically meaningful chunks (with LangChain or other text splitters)
  3. You convert them into some vector representation for easy comparison and searching (Using OpenAI’s embeddings)
  4. You store those vectors (using Pinecone or Weaviate or Chroma)
  5. You retrieve certain documents based on the task at hand (Metal?)

I’m unsure how much of that stack a Pinecone.io is going to want to take vs a company like Metal – which is a current YC company.
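
A minimal, self-contained sketch of steps 2–5 above. The hashed bag-of-words `embed` function and the in-memory store are deliberately crude stand-ins for OpenAI embeddings and Pinecone/Weaviate/Chroma; the point is the shape of the pipeline, not the components.

```python
import hashlib
import numpy as np

DIM = 256


def chunk(text: str, size: int = 200) -> list[str]:
    # Naive fixed-size chunking; real splitters respect sentence/semantic boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed(chunk_text: str) -> np.ndarray:
    # Stand-in embedder: hashed bag-of-words. Swap in a real embedding model here.
    vec = np.zeros(DIM)
    for token in chunk_text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


class InMemoryVectorStore:
    """Stand-in for Pinecone/Weaviate/Chroma: store vectors, return top-k by cosine."""

    def __init__(self):
        self.vectors, self.texts = [], []

    def add(self, text: str):
        self.texts.append(text)
        self.vectors.append(embed(text))

    def query(self, question: str, k: int = 3) -> list[str]:
        q = embed(question)
        scores = np.array(self.vectors) @ q  # cosine similarity (vectors are normalized)
        top = np.argsort(scores)[::-1][:k]
        return [self.texts[i] for i in top]


if __name__ == "__main__":
    store = InMemoryVectorStore()
    for doc in ["The lease terminates on June 1.", "Rent is due monthly.", "Tulips are flowers."]:
        for c in chunk(doc):
            store.add(c)
    print(store.query("When does the lease end?", k=2))
```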

Contextualizing

We can’t build unique models, but we can change the data through embeddings and update them affordably. Off-the-shelf OpenAI embeddings plus cosine ranking are subpar after the initial wow factor. So to improve models on private data, we need to fine-tune models with domain data, incorporate keyword search, and use multiple ranking methods.
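
As a hedged sketch of what “multiple ranking methods” could mean in practice, here is one way to blend a keyword-overlap score with a vector-similarity score instead of relying on cosine ranking alone; the scoring functions and the `alpha` weight are illustrative, not tuned.

```python
def keyword_score(query: str, doc: str) -> float:
    # Fraction of query terms that appear verbatim in the document.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)


def hybrid_rank(query: str, docs: list[str], vector_scores: list[float],
                alpha: float = 0.6) -> list[str]:
    # alpha weights the semantic (vector) score against exact keyword overlap.
    blended = [
        (alpha * v + (1 - alpha) * keyword_score(query, d), d)
        for d, v in zip(docs, vector_scores)
    ]
    return [d for _, d in sorted(blended, reverse=True)]
```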

Problem

  • Semantic search gets you 90% of the way there for easy questions & answers, but only 30-40% for hard Q&A
  • The hard part is understanding which documents are relevant to the query you give to the LLM

Why this is interesting to me

  • I see two routes document retrieval could go
    • Route #1 (Horizontal Retrieval): One general engine is really good at document retrieval across industries and domains (Law, Medical, Real Estate, etc.). It has a reasoning engine that tells it where to look
    • Route #2 (Verticalized Retrieval): Specialized retrieval engines are needed that are experts at traversing law documents, which differ from medical, real estate, etc.
  • I’m unsure which way it will go! I’m currently leaning towards #2

The winner of this space will go full-stack and take over more document management / retrieval workflows

Vector Embeddings - learned transformations that translate one dimensional space into another (usually lower-dimensional) one while trying to lose as little meaningful information as possible

Most places don’t bother to define embeddings in general; instead they describe the properties of the embeddings they want to use. Some want compression, some want cosine similarity.

At the end of the day, any medium that enters these systems has to be converted to a numerical vector. These conversions might be image encoders, NLP text encoders, audio encoders, etc. Not only do embeddings allow us to analyze and process vast amounts of data, they also have the added benefit of being language-agnostic. Embeddings are modality-independent, and anything that can be an input can be embedded.

The ability to vector match lets you have outcomes like “pink spiky fruit” mapping to dragonfruit, instead of exact word matching that might only surface “spiky fruit” or “pink fruit”, etc. Put simply, vector matching adds context. Even if you have things that don’t traditionally share the same words, embedding reduces them to points in a space where small amounts of nuance matter, and we can match them to the right concept — meaning we can vector embed and capture meaning.

Given these vectors are essential, there is a need for vector databases that allow for storage, indexing, and serving.

Vector search libraries help developers search through large collections of vectors for clusters or nearest neighbors. Popular ones include Google’s ScaNN and Facebook’s Faiss. Vector search libraries are great for vector search, but they’re not databases and have trouble at large scale.
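
A minimal Faiss sketch of the nearest-neighbor search these libraries provide (assumes `faiss-cpu` and `numpy` are installed; the random matrices stand in for real embeddings, and vectors are L2-normalized so inner product behaves like cosine similarity):

```python
import numpy as np
import faiss  # pip install faiss-cpu

dim = 128
corpus = np.random.random((10_000, dim)).astype("float32")  # pretend embeddings
faiss.normalize_L2(corpus)                                   # cosine via inner product

index = faiss.IndexFlatIP(dim)   # exact (flat) inner-product index
index.add(corpus)

query = np.random.random((1, dim)).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)   # top-5 nearest neighbors
print(ids[0], scores[0])
```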

Con

There are currently a thousand “load embedding vectors into a vector database and selectively load results into the context window” startups right now; it’s crazy.

Gap in Market

Pros

There’s an issue with open-source alternatives: they don’t let users simply log in with GitHub, spin up an index, and upload their vectors.

Features and Integration

One pain point that we noticed with a lot of existing vector stores is that they often involve connecting to an external server that stores the embeddings. While that is fine for putting applications into production, it does make it a bit tricky to easily prototype applications locally.

They found that these were mostly geared to other use-cases and access patterns, like large-scale semantic search. Additionally, they were often a hassle to set up and run, especially in a development environment.

Since Chroma is deployed locally, it will have lower latency than a managed cloud service, which pays a network round-trip penalty.
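
A rough sketch of that local-first prototyping loop, based on Chroma’s early in-process client; exact method names may have shifted across versions, so treat this as illustrative rather than canonical:

```python
import chromadb  # pip install chromadb

# Runs fully in-process: no external server to stand up while prototyping.
client = chromadb.Client()
collection = client.create_collection("notes")

collection.add(
    documents=["Diesel runway notes", "Vector DB market map", "Mouth taping log"],
    ids=["doc1", "doc2", "doc3"],
)

# Chroma embeds the query text itself (with a default embedding function)
# and returns the nearest stored documents.
results = collection.query(query_texts=["embedding infrastructure"], n_results=2)
print(results["documents"])
```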

Cons

The issue with unmanaged — self hosted — vector databases:

  • Self-hosted vector databases are a big step up from vector search libraries, but they still require significant configuration from engineering teams to scale without affecting latency or availability. They don’t come with any security guarantees (i.e. GDPR or SOC 2 Type 2) and leave you with the operational overhead of maintaining additional infrastructure, monitoring additional services, and troubleshooting when things break. Solving these problems is where managed vector databases come into play.

Open Sourcing

Pro

  • The ability to move at the pace of AI. We don’t know what it’s bringing or the dimensional shifts it’s going to take. So to keep up with the directional momentum of the ground moving underneath, letting users determine and help evolve the databases might be a better way.
  • AI is a landscape of shifting sands. What developers want today is not what they’ll want in six months, and what they need to build demos is not what they need in production, is not what they’ll need for integration into existing products. But demos could be the path to distribution.

Con

It’s like anything else: the risk-adjusted returns are great enough to justify the most probable outcome… at least in someone’s book. Not all of these companies are being built to generate cash flows; at least a few are grinding until they can be acquired by someone who has a vision for how to extract value. Big fan of LangChain, fwiw.

The projects are highly technical, so if laypeople want to use them, they pay $$$ for a layperson-friendly dive into them.

Another thing I suspect is that if you get major corps to use your tech, their lives depend on your team, so they’ll “donate”. This is actually tax-deductible for them.

Product Progression

Vector databases naturally sit at a critical point in the machine learning toolchain; any company with a lot of customers there would be well positioned to expand along that toolchain with new products. In particular, we can easily imagine a future where Pinecone begins offering a model hosting service, allowing them to manage the entire vector data pipeline.

Eventually, to win, they can become a truly seamless database for storing, indexing, and serving unstructured data. Bring your data:

  • Vectorize
  • Index
  • Partition
  • Store
  • Query

Eventually, becoming an OLAP (online analytical processing) for unstructured data.

Every team wants to know the best way to leverage retrieval, how to chunk and embed their documents, which model they should use, how to ensure the retrieved data is relevant to the query — Chroma will answer these questions.

Bigger Fitting:

A modular and flexible framework for developing AI-native applications.

“The real power comes when you are able to combine [LLMs] with other things.”

LangChain aims to help with that by creating… a comprehensive collection of pieces you would ever want to combine… a flexible interface for combining pieces into a single comprehensive ‘chain’.
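
This is not LangChain’s actual interface, just a tiny sketch of the underlying idea of “combining pieces into a single chain”: each piece is a callable over shared state, and a chain is plain composition.

```python
from typing import Callable

Step = Callable[[dict], dict]


def chain(*steps: Step) -> Step:
    # A "chain" is just function composition over a shared state dict.
    def run(state: dict) -> dict:
        for step in steps:
            state = step(state)
        return state
    return run


def retrieve(state: dict) -> dict:
    state["context"] = f"(docs relevant to: {state['question']})"
    return state


def build_prompt(state: dict) -> dict:
    state["prompt"] = f"Answer using {state['context']}\nQ: {state['question']}"
    return state


def fake_llm(state: dict) -> dict:
    state["answer"] = "stubbed model output"   # stand-in for a real model call
    return state


qa = chain(retrieve, build_prompt, fake_llm)
print(qa({"question": "What does the fall 23 show borrow from Diesel?"})["answer"])
```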

Edge over others:

Pessimism

Qdrant and Weaviate clearly didn’t market as well as Pinecone. James Briggs did an amazing job and should be the hottest DevRel in the space right now! Their blogs have high recall and the Learn series is often recommended.

Hype Cycle and Market

The billion dollar question is whether all this interest leads to any durable market.

The cycle to date has been something like this:

  • Models at scale have emergent behaviors that are magical and shocking.
  • Consumers experienced DALLE2, ChatGPT, and a small number of LLM products gained real traction rapidly (Copilot, Jasper, Midjourney, Character).
  • Startups have flocked to leverage these capabilities, VCs are funding them like it’s 2021
  • Many incumbent technology leadership teams are excitedly, anxiously resourcing AI projects.

Great companies can emerge from the morass of spaces like “LLMOps.” But those that do will be teams that see the wedge for what it is, rather than misreading immediate momentum and interest for durable value. The distance between GitHub stars and Twitter likes, and at-scale deployments and six-figure enterprise contracts, is very far.

The question is not, “Do developers want LLMOps?” but instead, “Which segment of those users do I focus on? What do they really need, and in what order? What will make the product easy to adopt, and what objections will I face? What architecture will support those users, and what compounding advantages can I build?” (good Hebbia starter email)

Current Worry Among Every Thinking Person

Too many people are hunting for a neat strategic narrative of “which layer of the stack endures,” telling some clean story about “data moats,” or wringing their hands that large labs or incumbents are going to win the core modalities (text, code, image, etc.). This kind of hand-wringing is folly; the history of software markets is nondeterministic.

I believe the huge amount of value creation and capture available out of the box for creative product folks is incredibly promising for startups. Time and effort are better spent understanding customer problems deeply, understanding the state of the art, and leveraging the latter for the former. Who wins is based partly on market structure, but also partly on who the players are, their execution, and how they redraw the software category lines.

Intellectual Honesty Required: “thin shims on foundation model APIs” have fallen prey to technical arrogance. Copilot quickly became essential because the company figured out how to fit “passive” prediction into coding workflows in a way that made sense to developers. People building from the models/tools up (vs. from the customer back) are often unwilling to focus enough to do that last mile to make a product useful for customers. What’s needed is extreme amounts of CUSTOMER FUCKING CENTRICITY and building backwards.

  • Think about whether it will matter for the use case once models improve
  • Incorporate private data/customer data in the model context to improve outputs
  • Assume that incumbents in your space will at least adopt surface-level generative AI features and think about how you can go beyond those.
    • Advantages of the incumbent:
      • Distribution
      • Prop. Data
      • Capital
      • Talent
    • Advantages of the startup
      • Speed
      • Focus
      • Centralization of data
      • Less reputational risk
  • Think about the right insertion point for your product and try to go deep into workflows while minimizing disruptions but bringing out the full value of AI.

Props to them

Developer Marketing: Clever move by chroma. Marketing is the key vector for DB companies’ success, particularly for Vector DBs as we are still in the hacker/experimental phase. Developers value familiarity and ease of use over technical features

I don’t think the billions of LLM developers need to worry about scale. Chroma is moving vector infra out of data centers to the edge and your file system in an AI-first ecosystem. The reason LangChain hackers preferred Chroma was that it was easy to use locally. Once you need to connect to an endpoint for scale, the complexity comes. There might be an inability to scale…

Serving a customer’s needs well – in this case usually developers and larger companies wanting to integrate AI to their systems – is often more important (and harder) to think about than defensibility. In many cases defensibility emerges over time - particularly if you build out a proprietary data set or become an ingrained workflow – which Chroma is likely to follow, or create defensibility via sales or other moats.

The less building and expansion of the product you do after launch, the more vulnerable you will be to other startups or incumbents eventually coming after you and commoditizing you. Pace of execution and ongoing shipping post-v1 matter a lot to building one of the forms of defensibility above.

Well, is DATA the new moat — building toward these proprietary data sets???

Another thing to think about while servicing smaller customers on their ML dev journey is the graduation issue – will they get so big that they want to host it themselves, and can we scale alongside them (i.e., the Stripe phenomenon)?

Enterprise document management

  • By default, internal company documents (slides, docs, emails, messages, APIs) are not optimized for LLMs

  • Big companies will need custom solutions to organize all of their internal documents for LLMs to parse and retrieve

  • A company will emerge as “the first place for your LLMs to ingest your documents”

  • Unstructured might be the front runner

Tape Your F***ing Mouth



A Graduation Towards Nasal Breathing.

There are a few occasions where one can honestly say: “well, this changed my life.” This is one of those times.

Jesus, I fucked myself over by having the worst possible foundation for the most critical life action — breathing. I was breathing wrong all my life! Even better, I didn’t know about it. If you’re a poor breather, you might not even know what good breathing feels like until you experience it.

Triggers For Realization

Little did I know at the time that what led me to confront my mouth breathing was two seemingly uncorrelated incidents. 1. People constantly telling me that I breathe heavily and audibly when sitting. I just excused it as having bad sinuses. Eventually, one of my friends found it distracting, and showed me the meme below. ![Nasal Breathing](/assets/images/breathing.png) 2. I knew I sort of snored, but I wasn’t aware of the magnitude until I got to live with two other friends in a London flat. Looking back, having roommates who were light sleepers was a blessing in disguise. They were waking up multiple times because I snored so hard. We looked at possible reasons for my snoring, and I tried nasal decongestant sprays and nasal strips that didn’t help.

Having three sleep monitoring apps on my phone tracking the consistency of my snoring, I noticed the sounds mainly came from my mouth. So combining the noise’s source, and the two factors above, I decided, fuck it, I’m taping my mouth shut when I go to bed and see what happens.

Changing My Breathing Behaviour — An Experiment

Let me tell you, it was HARD! The first few nights, I actually felt sick, like those times you go to bed with a horrible flu. I would constantly wake up after having ripped the tape off; magically, it would be off and pasted on my headboard. The tracking apps showed a clear indication of snoring sounds starting immediately after the tape came off, and nothing beforehand.

It took a couple of weeks to get used to, but afterwards, I couldn’t even go to sleep without it. I was using Amazon’s [SomniFix Mouth Tape](https://www.amazon.com/Sleep-Strips-SomniFix-Breathing-Nighttime/dp/B076CQ1NR8), or sometimes just plain band-aids when I ran out. I didn’t want to have a day where I slept with my mouth open.

My sleep quality improved drastically: my mouth wasn’t dry when waking up, and zero snoring. I kept feeling better, less tired, and more centered/aware of my body. Grogginess and generally low levels of energy were apparent when I forgot to mouth-tape or ran out.

I decided to stop mouth breathing entirely — even when awake and without tape. It took longer, and more active peripheral monitoring, than expected. Quickly, it became the norm, and air coming in through my mouth felt unnatural and cold.

Hindsight Research

Now common knowledge, Cottle (1958) states at least 30 health benefits of nasal breathing. These include humidification and cleansing of the air, regulation of the direction and velocity of the airstream, 50% more resistance to the airstream leading to more oxygen uptake compared to the mouth, increased circulation of blood oxygen, etc. More benefits are listed in another study by Graham T (2012).

One key improvement that’s less commonly known is the mixture of air with nitric oxide in the nasal turbinates (tiny shelf-like, bone-like structures in the nose). Nitric oxide is commonly known as an environmental pollutant, right? It always had a bad connotation in my head. How is this useful then? I did a quick deep dive, and in 1998 three scientists received the Nobel Prize for discovering nitric oxide’s role as a signalling molecule in the cardiovascular system.

Additionally, it’s a potent bronchodilator and vasodilator (thanks to my 8 months of med school: dilation is expansion, so dilation of the bronchioles and dilation of our vaso (vessels), i.e., blood vessels). Expansion is important in increasing the amount of air absorbed. And where do the enzymes producing this nitric oxide exist? IN THE FUCKING NOSE!

Current State

It’s now been 8 months since my first tape, and mouth breathing feels like genuine torture. I’ve been strict about nasal breathing, and I’m adding deeper/healthier breathing techniques on top of it.

However, the only times I end up slipping into mouth breathing are when exercising. It might be the shift of focus from the body to the game/run, or the fact that I just need a lot more oxygen than the nose affords (double breathing), but I’m constantly mindful of closing my mouth when exercising. I’ll report back with updates in a few months, as this adjustment is likely to take longer.

Additional Quotes

According to Lundberg (2008):

“Nitric oxide gas from the nose and sinuses is inhaled with every breath and reaches the lungs in a more diluted form to enhance pulmonary oxygen uptake via local vasodilatation. In this sense nitric oxide may be regarded as an ‘aerocrine’ hormone that is produced in the nose and sinuses and transported to a distal site of action with every inhalation.”

Chang (2011) named nitric oxide the ‘mighty molecule’ and noted that it is an active component of the cardiovascular, endocrine, and immune systems, and is extremely versatile and significant within and throughout the human body. That nitric oxide plays a significant role in cardiovascular health is evidenced by the fact that one of the Nobel Prize winners mentioned earlier wrote a book titled ‘No More Heart Disease: How Nitric Oxide Can Prevent – Even Reverse – Heart Disease and Strokes.’

Conclusion

Dude, mouth breathing single-handedly almost ruined my quality of life. I’d go as far as saying MOUTH BREATHING IS FUCKING CHRONIC! Sadly, the adverse effects aren’t as common knowledge as they ought to be. It takes a conscious effort, and a slightly uncomfortable one, I might add, to change a habit as ingrained as breathing patterns. Although I took a brute-force approach of just closing/taping my mouth whenever possible, there are breathing retraining methods. Some that I found helpful in hindsight include the [Buteyko Method](https://buteykoclinic.com/the-buteyko-method/) and the [Papworth Method](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2094294/).


Notice how you breathe; if there are signs of mouth breathing, try the methods linked above, and if all else fails, tape your fucking mouth!!

Taleb's Antifragility Review



Content

  • My Take
  • Summary
  • Summarizing My Summary LOL
  • Random Quotes

My Take

In Antifragile, Nassim Taleb introduces a new paradigm for looking at risk — a very optimistic one. Taleb approaches risks by focusing on the strength of our infrastructure to face them instead of predicting them. The concept is a thoughtful and well-explained way of convincing you to embrace nature’s stressors. At its core, it argues: with chaos, things can fall, so become resilient, or better yet, feed on it to become better. Antifragile is a fun and interesting read. That is, even if you question what Taleb is saying — and sometimes you really should — he forces you to examine your own biases and assumptions.

Nassim Taleb’s academic works, including [Dynamic Hedging](https://www.fooledbyrandomness.com/dynamichedging.pdf), are industry standards for hedging and complex systems. In his writing, tweets, and the few podcasts I’ve heard, Taleb resists categorization. If I had to pin him down, I’d call him an anti-guru guru who reached that level organically. His ideas are usually sound, but his iconoclastic nature usually ends up side-lining them.

Contextualizing Taleb’s nature while reading the book helps you put up with his antics, because he’s smart, funny, and fearless and tackles consequential topics. His message is rebellious and critical of many aspects of current knowledge, demands better data behind popular claims, and is mindful of the meaning of the tools we use to know things in all fields. I love his constant digs at professors who have little stake in the equations they sell as “neat”. He makes fun of them because using their tools in real settings is obviously pretty dangerous, and often infeasible. There’s some fun, good-humored name-calling, which is relatively less dangerous than the propaganda spewed out by his targets. “Be skeptical,” Taleb says, especially of media and people who like to forecast, because we are so bad at it — and it is easy to create plausible stories for why an event (especially a Black Swan) has happened, after it has happened. This makes people who aren’t so smart sound very smart to the uninitiated, which then leads to harm if we come to rely on these prediction-lovers (which we often do).

His iconoclastic nature becomes more apparent with his grandiose overstatements; the book jumps in and out of personal and systemic-level societal criticisms. Antifragile jumps around from anecdote to technical analysis to pearls of wisdom, making it a bit of a mess to keep up with. His arguments usually go out of bounds, talking about preferences between the metric and imperial systems, the hedonic treadmill of more materialism, the dual-sex strategy, procrastination, and other random topics discussed in relative depth with little relation to antifragility — other than becoming adjacent examples of fragility. At points, it’s fair to question whether this was a book about antifragility, or whether antifragility was simply one of the many, many random topics.

Taleb constantly gives advice on his general worldview with only thin threads linking it back to antifragility. The principle of antifragility is so obviously strong that Taleb seems to have struggled to stop himself from applying it everywhere, distorting some arguments to support his worldview. That fills the book with confident claims you might disagree with — but it keeps the book entertaining rather than pure academic literature. Don’t get me wrong: it’s a very interesting book, and some parts are even illuminating, but I didn’t see a strong common thread joining some of the ideas into a potent message. Some of the ideas are scattered, and nuggets could get missed or wrongly attributed. Then again, I can only imagine the level of restraint needed to limit the application of such a strong and dynamic heuristic. It’s original thinking! That is what inspired me to write the summary and grouping below, to concretely pin down Taleb’s arguments and solutions.

Summary

Defining Antifragility

Core to Taleb’s worldview is that humans are horrible at predicting the future, yet ***black swan events are inevitable***. When modeling real-world events, where the unknown is far greater than the known, we can’t put too much emphasis and trust on our risk-assessment models. Thus, Taleb argues, one should shy away from assessments of risk and pay close attention to ***inoculation against*** risk in the first place.

A Talebian theory to initiate this review with is **System Fragility**: humans build and optimize systems for average use rather than for extreme scenarios. As a result, these systems break at times. So, we should ***design our systems to benefit from this volatility*** — imitating nature.

A core distinction in reading this book is the admittedly simple triad of fragility, robustness, and antifragility. Taleb uses cool ancient examples to explain **Fragile, Robust, and Antifragile.** Damocles, who dines with a sword dangling over his head, is fragile: a small stress to the string holding the sword will kill him. The Phoenix, which dies and is reborn from its ashes, is robust: it always returns to the same state after a massive stressor. But the Hydra demonstrates antifragility: when one head is cut off, two grow back. Fragile things are harmed by volatility, robust things resist it, and ***antifragile things benefit from volatility.*** It’s a very powerful model that should be a cornerstone in starting to fundamentally understand complex systems.

Antifragility Matters

Nature is a recurring demonstration of antifragility that Taleb uses throughout, ranging from the human body to earth’s ecology. We, as a society, ***have evolved a tendency to try to reduce the normal swings of life.*** Taleb mentions the tendency to overmedicate and overdose with drugs such as Prozac to smooth out the normal mood swings we experience. Instead of understanding how to benefit from the swings life throws around, humans have the tendency to retreat to predicting when swings will come, estimating how big they could be, and lowering/preventing their occurrence.

“This is the central illusion in life: that randomness is risky, that it is a bad thing — and that eliminating randomness is done by eliminating randomness.” The argument is: don’t focus on prevention! Embrace these swings. Taleb says he is not against intervention as such, however. It’s just that he often sees too much ***naive intervention. (??)***

Signal vs. Noise

The author says we often intervene because we listen too much to the news, to the noise, rather than ***looking at the substance and at the long-term repercussions.*** And the shorter the time frame over which you observe an event, the more noise you will perceive. People with too much smoke and too many complicated tricks and methods in their brains start missing elementary, very elementary things. People in the real world can’t afford to miss these things; otherwise they “crash the plane”. Unlike researchers, normies were selected for survival, not complications. He alludes to less is more in action: the more studies, the less obvious elementary but fundamental things become; ***activity, on the other hand, strips things to their simplest possible model.*** I guess this idea effectively carries over into his next book, Skin in the Game. Connecting these: the more skin you have in the system, the clearer the signals get compared to speculating from the outside — preventing naive intervention and building towards antifragile systems. It’s essential to be mindful of the signal/noise ratio, and to be very selective, since the vast majority are just trying to stay relevant and get eyeballs, which tends to lead to a very noisy stream of output.

Personal Levels

Humans become better after traumatic accidents. When we grow out of frustrations and hardships, we also show antifragility. Taleb states a loser is someone who is embarrassed by mistakes and tries to rationalize them away instead of introspecting and becoming better with the new piece of information. We could say a loser’s ego is not antifragile. Thus, honesty with oneself and one’s ego is probably the most important step you can take to make your “system” antifragile. Taleb invokes stoic principles on multiple occasions as ways of handling randomness and becoming more antifragile.

Generalizing, Nassim Taleb says it’s better to prepare with failure in mind than to try to predict how and when failure will happen and how to avoid it. I don’t think there’s anything fundamentally wrong with his thesis. However, points of contention can arise in discussions about how to harness antifragility. For example, I don’t know if I agree that applying the stoic technique of “practicing poverty” helps reduce the fragility that comes from being afraid of losing your wealth, as success brings fragility alongside itself. But I love his allusion that, in life, antifragility is reached by “not being a sucker”.

Developing Antifragile Systems

Ok, now we’ve established the need to develop antifragile systems, how can we prepare for it?

On principle, the first step toward antifragility consists in ***decreasing downside rather than increasing upside;*** that is, lowering exposure to negative Black Swans and letting natural antifragility work by itself. Taleb’s main emphasis is on ***minimal intervention*** and reliance on the self-healing abilities of organic systems.

Barbell Strategy - Situating Appropriately

The barbell strategy is a practical way of prioritizing decreasing downside rather than increasing upside: cover your downside while increasing your upside. You play it very safe on one side so that you can take more risks on the other. If the risky part plays out badly, you’re still OK. If a black swan event makes the risks pay off big, you profit handsomely.

I guess with the Barbell Strategy, Taleb tries to reinforce the idea that ***antifragility is the combination of aggressiveness plus paranoia*** — clip your downside, protect yourself from extreme harm, and let the upside, the positive Black Swans, take care of itself. An example from the book would be putting most of your money in safe investments and 10% in highly lucrative ones. Or, you can take a very safe day job while you work on your literature. You balance the extreme randomness and riskiness of a writing career with a safe job.

Taking the example a bit further, he steers close to the subject of generally avoiding mediocrity. “Do crazy things (break furniture once in a while), like the Greeks during the later stages of a drinking symposium, and stay “rational” in larger decisions. Trashy gossip magazines and classics or sophisticated works; never middlebrow stuff. Talk to either undergraduate students, cab drivers, and gardeners or the highest caliber scholars; never to middling-but-career-conscious academics. If you dislike someone, leave him alone or eliminate him; don’t attack him verbally.”

Additionally, another principle is the ***introduction/acceptance of small and constant “stressors”***. Humans tend to do better with acute than with chronic stressors, particularly when the former are followed by enough time for recovery, which allows the stressors to do their jobs as messengers. Think weight lifting. Alluding to medicine, Taleb takes the slightly controversial stance that we should do nothing to those experiencing mild volatility but be wildly experimental with those experiencing extreme volatility. Again, the issue with these methods might start from their lack of universal applicability. On principle, Taleb has important and key points, but applicability can’t be universally uniform!

Options - Diversifying Against Risk

Taleb says that ***options*** are a great way to ***make the system more resistant to shocks.*** The more options you have, the more ways you have to respond to black swans and unforeseen events. An option is what makes you antifragile and allows you to benefit from the positive side of uncertainty, without a corresponding serious harm from the negative side.

According to ***Jensen’s inequality,*** if you have favorable asymmetries, or positive convexity, options being a special case, then in the long run you will do reasonably well, outperforming the average in the presence of uncertainty. The more uncertainty, the more role for optionality to kick in, and the more you will outperform. This property is very central.
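
For reference, the inequality Taleb is leaning on, stated for a convex (option-like) payoff f and an uncertain outcome X:

```latex
% Jensen's inequality for a convex payoff f:
\mathbb{E}\!\left[f(X)\right] \;\ge\; f\!\left(\mathbb{E}[X]\right)
```

The gap between the two sides only widens under a mean-preserving spread of X, which is the sense in which more uncertainty makes optionality worth more.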

Expanding into entrepreneurship, and going contrary to Peter Thiel’s advice in [Zero to One](https://thepowermoves.com/zero-to-one-summary/), Taleb seems to slightly ***mock plans and business plans*** and takes the example of a few successful corporations that started out doing something completely different from what they ended up being successful for. That’s why you invest in people, not in business plans: successful entrepreneurs must be able to change course. I sort of agree with this; initially strong business plans are an exemplification of the founders’ ability to formulate things, not convincing on their own. (Reminds me of the [Venture or Substance](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1533384) paper, which argues against the viability of business plans in determining venture success, from the B145 class I took senior year.)

Codifying the process of building antifragility into systems:

  1. Look for optionality (many options), and rank them according to their relative optionality
  2. Find open-ended pay offs, not closed ones
  3. Invest in people who can change careers 6 times if needed, not business plans
  4. Apply the barbell principle and limit your downside

Taleb claims collaboration gives us huge amounts of optionality, but of course we can’t see it until it’s already happened. He advises actively spending more time around other people and collaborating with them.

Options are thus vectors of antifragility. Instead of predicting what’s going to happen, ***options position you in a way that, whatever happens, all you have to do is evaluate it once you have all the information and make a rational decision***. On this, I thought the following quote from the book was an interesting take: “If you ‘have optionality,’ you don’t have much need for what is commonly called intelligence, knowledge, insight, skills, and these complicated things that take place in our brain cells. For you don’t have to be right that often. All you need is the wisdom to not do unintelligent things to hurt yourself (some acts of omission) and recognize favorable outcomes when they occur.” (*The key is that your assessment doesn’t need to be made beforehand, only after the outcome.*) Although I agree with his initial principles, I’m again worried about how far he takes his interpretations of optionality.

Removal and Decision Making

Taleb argues the ***solution to most things is removing things, not adding things***. The greatest — and most robust — contribution to knowledge consists “in removing what we think is wrong — subtractive epistemology.” Disconfirmation is more rigorous than confirmation.

He mentions a cool point of view on “giving many reasons” for anything: robust, strong decisions require just one single reason. When people cram in too many reasons, it’s usually because they are putting up smoke screens.

The error of thinking you know exactly where you are going and assuming that you know today what your preferences will be tomorrow has an associated one. It is the illusion of thinking that others, too, know where they are going, and that they would tell you what they want if you just asked them. Accordingly, these decision-making principles work alongside optionality before making the decision.

The theory extends further with possibly the funniest paragraph in the entire book: “I would add that, in my own experience, a considerable jump in my personal health has been achieved by removing offensive irritants: the morning newspapers (the mere mention of the names of the fragilista journalists Thomas Friedman or Paul Krugman can lead to explosive bouts of unrequited anger on my part), the boss, the daily commute, air-conditioning (though not heating), television, emails from documentary filmmakers, economic forecasts, news about the stock market, gym “strength training” machines, and many more.”

Summarizing My Summary LOL

Important systems should be built to benefit from volatility. Getting into the nitty-gritty details and having “skin in the game” helps clear smoke from fire and prevents naive intervention, eventually building antifragile systems. In building these antifragile systems, it’s important to first decrease downside rather than increase upside; that is, lower exposure to negative Black Swans and let natural antifragility work by itself. Taleb’s main emphasis is on minimal intervention. The key is combining aggressiveness with paranoia. Additionally, having more options positions you in a way that, whatever happens, all you have to do is evaluate the scenario once you have all the information and make a rational decision. Even in life, you don’t end up in optimal scenarios through rigorous planning, but through better alignment of values and methods of maximizing alternatives.

Random Quotes…

…but cool (some I agree with, others I find just entertaining)

Drink no liquid that isn’t at least a thousand years old (wine, water, coffee). Eat nothing invented or re-engineered by humans.

Something being marketed is necessarily inferior, otherwise it would not need to be aggressively marketed. Marketing beyond conveying information is insecurity.

The pursuit of meaning within Big Data has brought about many more spurious and random relationships than meaningful understanding. The false relationships will grow much faster than the real one, simply because chance allows so many more of them to be found.

“The best way to verify that you are alive is by checking if you like variations. Remember that food would not have a taste if it were not for hunger; results are meaningless without effort, joy without sadness, convictions without uncertainty, and an ethical life isn’t so when stripped of personal risks.”

“Ancestral life had no homework, no boss, no civil servants, no academic grades, no conversation with the dean, no consultant with an MBA, no table of procedure, no application form, no trip to New Jersey, no grammatical stickler, no conversation with someone boring you: all life was random stimuli and nothing, good or bad, ever felt like work. Dangerous, yes, but boring, never.”

“f*** you money”— a sum large enough to get most, if not all, of the advantages of wealth (the most important one being independence and the ability to only occupy your mind with matters that interest you) but not its side effects, such as having to attend a black-tie charity event and being forced to listen to a polite exposition of the details of a marble-rich house renovation. The worst side effect of wealth is the social associations it forces on its victims, as people with big houses tend to end up socializing with other people with big houses. Beyond a certain level of opulence and independence, gents tend to be less and less personable and their conversation less and less interesting.”

Tulipmania & Taming Irrationality


My quest to understand market speculation — rather than relying on lazy quips about “animal spirits” or irrationality.

Children given to Moloch

“Tulpenwoede” An Intro

Once in the Netherlands, tulips were worth more than real estate. Legend has it that in the 1630s, a sailor was thrown in a Dutch jail for eating a tulip, thinking it was an onion. At the time, the sailor’s gluttony would have fed an entire crew.

Does an asset price rising to crazy heights with the help of ordinary investors hoping to avoid missing out sound familiar? It represents one of the earliest instances of asset prices deviating from intrinsic values.

During the period, wild speculation and the euphoria of the masses led to "irrationally exuberant" spending on these bulbs. Ordinary citizens, down to the lowest dregs of society, were trading tulips. Properties were converted to cash and invested in flowers; houses and land were sold at ruinously low prices to raise money at the tulip markets. These tulips were loved for their deep, bright colors and exotic appeal; their prices didn't swing because of changes in production costs, nor because the flowers found some new utility. Their popularity coincided with the Dutch Golden Age, when the republic was one of the world's leading economic powerhouses.

Per David Roos, after the 1620s depression, "…the Dutch enjoyed a period of unmatched wealth and prosperity. Newly independent from Spain, Dutch merchants grew rich on trade through the Dutch East India Company. With money to spend, art and exotica became fashionable collector items. That's how the Dutch became fascinated with rare "broken" tulips, bulbs that produced striped and speckled flowers."

Describing the period, historian Mike Dash notes that Dutch artisans worked long hours for low wages. "When the day's work was done and they could finally go home, it was to cramped and sparsely furnished one or two-room houses that were in such short supply the rents were high…to people trapped in an existence such as this, the idea that one could earn a good living by planting bulbs and sitting back to watch them grow must have been irresistible."

After the bubonic plague ravaged the country, there was a labor shortage, leading to higher wages and extra income for those who worked. The plague also brought a widespread lowering of risk aversion. The Dutch were fine indulging in speculation, knowing that each day could be their last; a post-plague "mood of fatalism and desperation" aided speculation and reckless spending. The rich accelerated prices even higher by buying rare breeds of tulips. Combining these factors, tulips grew in popularity as a way for people with disposable income to acquire wealth for the first time in many years.

Tulips were being sold for more than 10x the annual income of skilled artisans, and people kept pouring life savings into tulip contracts anyway. When confidence was at its peak, everyone imagined their passion for tulips would last. I'd imagine those profiting from trading bulbs could not resist telling family and friends of their good fortune.

There are stories of a man who sold his house in the town of Hoorn for three bulbs — the hallmarks of what is often called the first speculative bubble in history.


Calming Hand Taming Irrationality

Like all bubbles, this one burst: in 1637, groups of auctioneers lowered prices and found no buyers. Due to the lack of interest, the market disappeared entirely within days. However, investors acted more rationally than the legend suggests. According to Nicolaas Posthumus, a Dutch historian, serious tulip financiers generally did not participate in the speculative markets. The "mania" was largely self-contained within smaller circles and pushed by "casual traders".

It is easy to claim that bubbles are irrational. They seem to represent a deviation of prices from fundamental values and contradict basic economic theory. But there has been little attempt to understand the details of how speculation and government intervention are intertwined.

Earl Thompson argues the market for tulips was an efficient response to the government's conversion of futures contracts into options contracts. This was a deception by government officials hoping to make a quick profit. The conversion meant investors who had bought the right to buy tulips in the future were no longer obliged to buy them. If the market price wasn't to their liking, investors had the option to pay a fine and walk away. This inflated tulip contract prices, which then collapsed when the government saw sense and canceled the conversion. The spot and futures prices themselves weren't especially volatile; Tulipmania was largely a contractual artifact. Contrary to popular interpretations, there was no actual "mania."
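To make Thompson's point concrete, here's a toy payoff comparison in Python. The strike price, spot prices, and fine fraction below are illustrative assumptions (accounts differ on the exact fine the decree set), not numbers from his paper.

```python
# Hypothetical payoff comparison: tulip futures vs. the converted "options".
# Strike, spot, and fine fraction are illustrative assumptions, not historical data.

def futures_buyer_payoff(spot: float, strike: float) -> float:
    """Buyer is obliged to buy at the strike, whatever the spot price."""
    return spot - strike

def converted_contract_payoff(spot: float, strike: float, fine_rate: float) -> float:
    """After the conversion, the buyer can walk away by paying a small fine."""
    fine = fine_rate * strike
    return max(spot - strike, -fine)

strike = 100.0       # guilders agreed in the contract (made up)
fine_rate = 0.035    # assumed fine as a fraction of the strike

for spot in (160.0, 100.0, 20.0):
    print(
        f"spot={spot:>6.1f}  futures={futures_buyer_payoff(spot, strike):>7.1f}  "
        f"converted={converted_contract_payoff(spot, strike, fine_rate):>7.1f}"
    )
# With downside capped at the fine, buyers can bid contract prices far above
# any plausible spot price, which is Thompson's explanation for the "mania".
```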

The critical concept in preventing an actual "mania" is thus the government's calming hand. In the Dutch case, corrupt officials realized the ruse they were pursuing would cause a mania, and halting the conversion was a calming hand applied to a problem of their own making. Nowadays, the government buoys speculators through unconventional monetary policies like quantitative easing — printing money to buy government bonds and mortgage securities.

Nowadays, the calming hand of governments takes the form of unconventional monetary policies that are meant to dampen speculation rather than encourage it. However, economists worry that investors have come to rely on this calming hand of central banks. Unconventional monetary policy has been attacked for promoting further financial gaming: when it is taken away, speculative urges return. Central bankers might feel pleased with themselves for having tamed "animal spirits," but market uncertainty edges back in the weeks after monetary policy intervention.

This reliance has developed to a point where now, without regular interventions, markets become increasingly skittish. Central banks used quantitative easing and other monetary policies to save the world from financial meltdown. But easy money repressed, rather than extinguished, speculative practices. To feel comfortable halting these unconventional policies, central banks must ensure that the probabilities of nasty-tail risks have fallen. But can they ever do that? Hmmm…?


Spotify Projects


Project I: Wrapped All Year Around

I made an app that continuously updates my Spotify Wrapped and shows it to me all year round…

(Screenshots of the app, July 2022)

I used Python, Spotify API, Google Sheets, and Glide to build the app.

The Python code is here if you want to try it yourself. The Google/Spotify API authentication can get tricky, so look up tutorials if you get stuck.
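If you just want the gist, here's a minimal sketch (not my exact script) using the spotipy library to pull top tracks. The client ID/secret and redirect URI are placeholders, and the CSV step stands in for the Google Sheets + Glide part of my actual pipeline.

```python
# Minimal sketch of the "Wrapped all year round" idea using spotipy.
# Placeholders below (client id/secret, redirect URI) are not real credentials.
import csv
import spotipy
from spotipy.oauth2 import SpotifyOAuth

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    redirect_uri="http://localhost:8888/callback",
    scope="user-top-read",
))

# "short_term" ~ last 4 weeks, "medium_term" ~ 6 months, "long_term" ~ years.
top = sp.current_user_top_tracks(limit=50, time_range="medium_term")

with open("top_tracks.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["rank", "track", "artist", "popularity"])
    for i, item in enumerate(top["items"], start=1):
        writer.writerow([i, item["name"], item["artists"][0]["name"], item["popularity"]])
# In the real app the CSV step is replaced by a push to Google Sheets,
# which Glide then renders as a phone-friendly UI.
```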

Project II: Custom Banger Song Recommendation

Exploring new music and finding artists or songs that I like is just so damn good! Especially when the source is unexpected. I'm a song chaser more than an artist loyalist. Even with artists I like, only some of their songs appeal to me, while the rest are admittedly just mid/bad.

I’ve liked vibes and specific sounds more.

To make the song-finding process easier, I ran a few machine learning models on a playlist of 50 of my favourite songs, which I manually rated to build the training dataset.

The result is a ~1.5k-song playlist scoured from all over Spotify, full of songs in close proximity to my eclectic taste. Surprisingly, it seems to be working – I've liked almost every song I've heard so far.

Will update soon, but the code is linked here
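In the meantime, here's a rough sketch of the idea (not the exact linked code): pull Spotify audio features for the rated songs, fit a simple model, and score candidate tracks. The track IDs, model choice, and cutoff are placeholders and assumptions.

```python
# Rough sketch of the recommender: train on my manually rated songs,
# then score candidate tracks by their Spotify audio features.
import spotipy
from spotipy.oauth2 import SpotifyOAuth
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["danceability", "energy", "valence", "tempo",
            "acousticness", "instrumentalness", "speechiness"]

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="user-library-read"))

def feature_matrix(track_ids):
    """Fetch audio features for a list of track IDs (max 100 per call)."""
    feats = sp.audio_features(track_ids)
    return [[f[name] for name in FEATURES] for f in feats if f]

# rated = {track_id: my_rating_1_to_5}, built by hand from the 50-song playlist.
rated = {"TRACK_ID_1": 5, "TRACK_ID_2": 2}        # placeholder IDs
candidates = ["TRACK_ID_3", "TRACK_ID_4"]         # placeholder IDs

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(feature_matrix(list(rated)), list(rated.values()))

scores = model.predict(feature_matrix(candidates))
picks = [tid for tid, s in zip(candidates, scores) if s >= 4.0]  # assumed cutoff
print(picks)
```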

And below is the playlist.

Summer Playlist

And finally, my summer playlists that I've banged through Minnesota! I'll keep on updating till the end of summer – removing and adding songs till we finish!

Invisible Hand Actually Malevolent?


If you’re not familiar with the original SSC essay, read my summary below before reading my thoughts on top here…

Invisible Hand Actually Malevolent?

A Review of Meditations on Moloch!

Children given to Moloch

Moloch is about the triumph of incentives over values. The triumph of instrumental goals over terminal goals. The Nash equilibrium where the system settles into a steady state is Moloch. The source of most evil. A trap people can't get out of, as they are forced to think and act locally, falling prey to the competitive forces that maximize individual outcomes instead of preferring cooperation in service of the god of our values. Moloch appears at any point where multiple agents have similar levels of power and different goals. Moloch exemplifies unfortunate competitive dynamics.

Deep down, nobody actually wants it to keep going this way, not even the winners. It's a hedonic treadmill for civilization. Left unchecked, it will sacrifice all our values, everything we really care about. "Sacrifice values to get ahead." It is not necessarily greed; at some point, "getting ahead" becomes necessary.

"Coordination problems create perverse incentives" is a very basic tenet of economics, and that is essentially what the post boils down to. However, this economics-101 sentence is dull and uninspiring and doesn't really tell the entire story. Scott Alexander takes a poetic approach to introducing the concept to those unfamiliar with it: he is like a lecturer who has jazzed up "Week 4 - Coordination Problems" with a poetic personification, but with little engagement with the economics literature on such problems. To do so, Alexander uses Allen Ginsberg's poem, which serves as the post's underlying theme and is referenced throughout. Even with my familiarity with coordination problems, I still found the poem itself esoteric, and I don't think referencing it helped explain the concept. On the surface, it leaves the impression of writing things that sound intellectually rigorous as opposed to writing something that is actually intellectually rigorous. For the most part Alexander avoids this, but the Moloch material is more dubious.

In Ginsberg's poem, Moloch isn't just a literal god, nor a set of equations. Moloch is part of human nature — one we're horrified by. Scott Alexander does a good job of building the image of Moloch in our world. It gives off a vague yet powerful sense of knowing, and it gives one a shorthand answer to why things happen: Moloch! "What is Moloch? The demon god of Carthage, and to him we say Carthago delenda est."

Where do we go from here? Per SSC, to defeat Moloch, we need an agent on our side that holds human values: "Elua", or the "Gardener", who will optimize for what we like. The essay reads as ominous. Scott takes Ginsberg's poem and retells it: nature has fucked us over, and reason is the only thing that can save us from it. This reminds me of Bucky Fuller's quote: "You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete."

Alexander's bias is along the lines of "AI is the looming existential threat that will kill us all". The first AI to hit singularity level will outstrip everything around it in intelligence, and so would truly be a singular entity with no competition. This seems, to Alexander, not just a utopia, but the only viable way of escaping the Malthusian trap. I'm assuming this relates to a good superintelligence: the only thing that will save us from a bad one is a good one that sides with us. A battle between the evil god Moloch and an alternative god Elua, a superintelligence whose values are aligned with humans.

It's tempting, and intellectually satisfying, to look at a set of problems, extract a meta-problem, and then propose a solution: by solving the meta-problem, you solve all of its instances too. However, the effectiveness of the solution depends on how well the abstraction fits the instances, and on whether unintended consequences overshadow the benefits. The singular autocrat may stop our races to the bottom, but it can also implement policies we're not particularly happy about.

In Alexander's case, he just wants a mechanism to stop competition from inevitably sliding into local optimization traps; he isn't necessarily advocating for an ideal utopia. But surely our super-intelligent AI overlord would be tempted to stray outside those bounds and look for other ways to help humanity out. The AI is far smarter than we are and has the wellbeing of all of humanity in its purview. How long until it decides it knows, with certainty, that it can manage our happiness better than we can?

So, what then?

I guess for Marx, capitalism was Moloch, and communism was the solution. While the god-like powers of a super-intelligent AI could potentially solve communism's information problem, it can't know what is in people's hearts. It will provide a target for the power-hungry to attempt to co-opt, and in defending itself it is likely to crush the freedom and flourishing it was supposed to nurture. There's a fatal flaw which has been demonstrated time and again by attempted instantiations of communism: there are people who will go to unimaginable lengths to secure power. They outcompete anyone who is mild-mannered, and eventually the whole system collapses. Although it's hard to predict how this would play out under our new AI overlord, I can predict it will happen ad nauseam. Maybe the AI will detect and prevent subversion, but, as with autocrats' attempts, that is hard to do without clamping down on freedom in general.

Similarly, one might argue there won't be coordination problems if everything is ruled by one royal dynasty / one political party / one recursively self-improving artificial intelligence. To begin with, royal dynasties and political parties are not singletons by any stretch of the imagination. Infighting is Moloch. Getting to absolute power required sacrificing a lot to Moloch during the wars between competing dynasties and political systems. But even if we assume an immortal, benevolent human dictator, a dictator only exercises power through his keys to power, and he has to constantly fight off competition for it. Stalin didn't start the Great Purge for shits and giggles, and the Derg didn't assassinate literate and opposing politicians in Ethiopia for nothing; it's a tried and true strategy used by rulers throughout history. Royal succession, infighting within parties, and interactions between individual modules of the AI are all sacrifices to Moloch. The hope with artificial superintelligence is that, because the design space of possible AIs is so wide, we can perhaps pick one that is sub-agent stable and free of mesa-optimization, and also more powerful than all other agents in the universe combined by a huge margin. If no AI can satisfy these conditions, we are just as doomed. Even then, there's the fragility of the outcome: there's a huge risk of disutility if we happen to get an unfriendly artificial intelligence.

For the Unabomber, the method of stopping Moloch was the destruction of complex technological society and, with it, all complex coordination problems. I put this solution in the primitivist bucket, where one assumes all problems become simple if we make our lifestyle simple. But that's not defeating Moloch; it's completely and unconditionally surrendering to Moloch in its original form of natural selection. The goals are mismatched: avoiding Moloch is an instrumental goal, while the terminal goal is to promote human well-being, and in primitive societies people starve, get sick, and watch most of their kids die. Additionally, this doesn't work in the long term; even if you reduced the entire planet to the Stone Age, there would be a competition to see who gets out of the Stone Age first, which is what got us here in the first place.

A lot of the rationalist community is focused on AI, which makes sense in light of the existential risk of unaligned AI. However, looking for projects focused on non-AI solutions to countering or defeating Moloch, I ran across Game B. Game B seems to be a discourse around creating social norms that defeat Moloch. So far it seems to me like a group of people who are trying to improve the world by talking to each other about how important it is to improve the world. It prompts the same kind of question one might ask about AI safety: "What are all those AI safety people talking about? Can you please give me three specific examples of how they propose safety mechanisms should work?" I haven't seen easy answers or a good link for them.

Do Moloch and Elua co-exist? Aren't they one? An enforcer god (Moloch) for the prize (Elua). Would we even want Elua's values if we didn't have to strive for them? Anyway, let's finish off with this beautiful depiction by Dostoevsky of the pessimism of utopia: *"Shower upon him every earthly blessing, drown him in a sea of happiness, so that nothing but bubbles of bliss can be seen on the surface; give him economic prosperity, such that he should have nothing else to do but sleep, eat cakes and busy himself with the continuation of his species, and even then out of sheer ingratitude, sheer spite, man would play you some nasty trick. He would even risk his cakes and would deliberately desire the most fatal rubbish, the most uneconomical absurdity, simply to introduce into all this positive good sense his fatal fantastic element. It is just his fantastic dreams, his vulgar folly that he will desire to retain, simply in order to prove to himself--as though that were so necessary - that men still are men and not the keys of a piano"* - Notes from Underground

Biblical Moloch

Summary of Initial Passage

Introducing The Beast

In Part I, the essay situates the main issue and character at play, Moloch, by illustrating him through Allen Ginsberg's poem and the multipolar traps that exist within society. In response to C.S. Lewis' question "What does it? Earth could be fair, and all men glad and wise. Instead we have prisons, smokestacks, asylums... Sphinx of cement... eats up their imagination?", the poem responds: "Moloch does it." This part sets the theme of the essay by introducing us to Moloch, the humanized version of civilization's failure modes that we can almost "see". Through Bostrom's example of a dictator-less dystopia, Alexander introduces the lack of strong coordination mechanisms. From a god's-eye view, we could optimize these systems (especially ones filled with hardship) through simple agreements; however, no agent within the system is able to "effect the transition without great risk to themselves".

To further illustrate these coordination issues, Alexander uses ten real-world examples of multipolar traps: the Prisoner's Dilemma; the fish-farming story (one sneaky farmer finds a way not to pay for treating the shared pond, and the entire system follows); the Malthusian trap (rats on an island are happy and "play music" until overpopulation depletes resources, making it hard to exist at all, let alone play music); the two-income trap (a second income becomes the norm without increasing quality of life once everyone does it); agriculture (a less enjoyable way of living, but we are overpopulated, so we need it); arms races (especially expensive nuclear standoffs that drain budgets that could go to better use); cancer (individual cells over-replicating until they kill the host itself); and the "race to the bottom", where polities are pushed to be more competitive than is optimal for the development of the societies they lead.

There are also categories of multipolar traps where competition is regulated by an exterior source, e.g. social stigmas. Education: current methods are bad, but social signaling perpetuates the system. Science: research funding, peer review, and statistical significance tests are flawed, but added rigor reduces the rewards a scientist gets from those flawed methods. Government corruption. Congress: "From a god's-eye-view, every Congressperson ought to think only of the good of the nation. From within the system, you do what gets you elected."

Questioning Our Motives

In this part, Scott questions why we, as evolved and cognizant humans, fall into these traps. The answer: hard-coded incentives. He expands on why it's hard to switch these incentives. Because of these competitions, everyone's "relative status is about the same as before, but everyone's absolute status is worse than before." Incentives drive us collectively; they are like the terrain that determines the shape of the river. Building canals by altering the terrain is possible, but hard nonetheless. Incentives are hard to change, especially the ones hard-coded into humanity. It's because of these incentives that things like Vegas exist: not because it optimizes civilization, but because it "exists because of a quirk in dopaminergic reward circuits".

Retardants Of Our Downfall

Given the beast and our inability to resist it, how have we not bottomed out yet? Part 3 discusses this by nominating reasons for the deceleration of our downfall. If everything seems rather bleak, what keeps our incentives from charging us rapidly downhill? "Why do things not degenerate..." There are four basic reasons for the slowed, but inevitable, downfall. Excess Resources - we haven't yet reached the critical breaking point the Malthusian rats experienced. Physical Limitations - there are literal physical limits to how far we can run downhill (e.g. the number of babies a woman can bear). Utility Maximization - "We've been thinking in terms of preserving values versus winning competitions, and expecting optimizing for the latter to destroy the former." However, maximizing utility sometimes requires values to be upheld, although the equilibrium is fragile (e.g. CSR to be seen as a good firm; greed doesn't bear capitalism, capitalism bears greed in people). Coordination - although the lack of coordination is the main cause of these traps, subtle but potent coordination systems, especially social codes, are strong enough to keep us out of traps by "changing our incentives".

Tech Is An Accelerant

In this part, Alexander takes away the sliver of hope that these four brakes introduce by adding a new dimension: time. He points out that the acceleration of technology hastens the blow along this dimension, painting grim dystopian futures where tech/AI eliminates each of the four brakes from Part 3.

We will reach these multipolar traps eventually, even if slowly. Time is a relative but key scale, and thus a dimension worth discussing, and it is compressed further by accelerating growth in technology. Technology can break the brakes of Part 3: it can reduce or remove physical limitations, for example; it erodes the utility-maximization brake as the need for human values shrinks; and it unlocks coordination at a new level. Alexander further dramatizes the dimension of time and its exacerbation by technology using AI dystopian futures: "The last value we have to sacrifice is being anything at all, having the lights on inside. With sufficient technology we will be 'able' to give up even the final spark."

Once The Genie’s Out The Box, There’s No Going Back.

Gnon - nature, and its god - operates within Newton's third law: every action necessitates a reaction. Gnon is basically Nick Land's version of Moloch. Violating nature's laws through civilization leads to Gnon's wrath and our downfall. Gnon is a punishing god with no escape.

**Reality Is Seemingly Sad**

The future is bleak, and Gnon is just another exemplification of Moloch. Submitting to them and following the "natural order of things" isn't going to make you "free". There is no order! It's always downfall.

**Alternatives To Inevitable Downfall?**

So what now? Given that Moloch/Gnon or whatever wants us, and everything we value (i.e. art, science, love, philosophy, consciousness), dead, defeating them should be a high priority. Alexander alludes to Bostrom's Superintelligence, in which the design of an intelligent machine creates a feedback loop of it out-intelligencing itself. Our action plan, then, should be designing an intelligence that is smarter than us but still keeps human values. Contrary to the hubris of expecting a god to wall us off if we submit to him, Alexander proposes a transhumanist movement that is "rather actionable": remove God from the picture entirely. As he puts it, "I am a transhumanist because I do not have enough hubris not to try to kill God."

**Un-incentivized Incentivizer!**

Elua – the god of "... free love and all soft and fragile things", and of mostly human values – still exists. Even if the god seems weaker without worshippers, he exists. As long as Moloch, the god to whom you can throw the things you love in exchange for power, exists, the offer is irresistible. Elua is the stronger god, and the one we should help.

Expansion of Mobile Money in Ethiopia

Summary

I looked at Ethiopia's current business climate for mobile payment solutions that are financially inclusive. The current reach of mobile money has left untouched the unbanked population that would have benefited most from the services. Over the course of the paper, I look at regulatory reforms following 2018's government change to establish grounds for additive services between EthioTel and mobile money incumbents that include the unbanked population. Then, I look at the economic viability, market size, and economic considerations of a symbiotic relationship built on the telecom giant's infrastructure. Finally, I look at how current sociocultural complexes could be navigated and could benefit from the solutions suggested.

Introduction

Saying Ethiopia's economy is cash-dominated would be an understatement. Only 31% of the population has a bank account, making financial services in rural areas close to impossible. Borrowing and other financial services take place through mediocre Micro-Financing Institutions (MFIs) and local savings clubs. (A) Mobile money is making a significant impact in bridging the digital divide between developed and developing countries, letting millions of poor people use their devices to transfer money, pay for goods, and access sophisticated financial services. (Dermish, 2007) The recent regulatory climate of the Abiy revolution facilitates the formation of mobile money services that don't require bank accounts. In this context, a partnership between EthioTel and the incumbents holds great potential. With the right endorsement and orientation, it could reach unbanked regions, improve saving, and fuel growth in Ethiopia.

Regulatory Environment

Since coming to power in mid-2018, Ethiopia's Prime Minister Abiy has promised to "open up" the economy and loosen the grip of its state-owned monopolies. (A) Ethiopia's highly regulated macroeconomic environment includes state ownership of the sole telecommunication provider, EthioTel. The commitment to liberalization started with the partial privatization of EthioTel and Ethiopian Airlines, Africa's biggest flagship carrier. (A)

PM Abiy's move also included an extensive overhaul of the financial sector. To boost noncash payments, the government announced that the successful Kenyan mobile payment solution, M-Pesa, would enter Ethiopia. (A) However, government doors were shut before completion of the deal. The sudden move was aimed at keeping foreign fintech from reaping the business benefits and potential of the Ethiopian market; M-Pesa was also seen as likely to stifle local innovation. (A) Soon after, the House passed a bill in September 2019 authorizing non-financial institutions, including EthioTel, to engage in mobile money services. The liberalization of EthioTel to private investors and its newfound ability to participate in financial services allow partnerships with existing mobile money companies to emerge. A symbiotic relationship would merge EthioTel's wide-reaching network and user base with the payment infrastructure, institutional bank relationships, and payment agents of M-Birr and CBE-Birr. There are multiple advantages to employing existing local firms for mobile money solutions. One is the ability to prioritize unbanked regions, as urban regions are already within their user base. Second, these firms participate in developing social values of saving and investing. All the while, transaction trend data could be used to inform policy decision-making in the future.

Economic Considerations

Ethiopia's economy has been growing at double-digit rates over the past 10 years and will continue to thrive in 2021-24. (A) It also has high levels of FDI that will incentivize the government to press forward with similar reformist agendas. However, operational mobile payment platforms have had limited growth. None of the service providers holds a banking license that would allow it to provide the service directly to customers – essential for unbanked citizens. So, platforms have been targeting banked, urban users who saw limited utility. M-Birr, Ethiopia's first mobile money service, built on two banks and state microfinancing firms, has only 1.2 million users. Similarly, CBE-Birr (affiliated with the Commercial Bank of Ethiopia), Hello Cash (from the Cooperative Bank of Oromia) (A), and Amole (operated by Awash Bank) (A) have had hampered growth. (A)

Amidst all of this, EthioTel has grown tremendously over the past 7 years, reaching 44% of the population, while smartphone internet penetration lags. EthioTel's widely available SMS-capable SIMs will have a hand in driving better reach and inclusion of financial services – even without internet access. However, conflicts of interest will arise if EthioTel decides to proceed with mobile payment services on its own – even after partial privatization. Instead, EthioTel's partnership should provide the SMS infrastructure needed to support M-Birr, CBE-Birr, and others in providing financial inclusion to the non-banked. Mobile money has the potential to reach unbanked people with phones, most of whom are under the government safety net. Ethiopia's Ministry of Finance could see significantly better efficacy by delivering the Productive Safety Net Program's financial assistance through mobile payments, in contrast to cash, where funds often get embezzled. The bill passed also requires a minimum of 50 million Birr and at least 10 shareholders to apply for a mobile money service license. This hurdle makes entrants more trustworthy, accountable, and sizable enough for healthy competition. Thus, strong capital markets and venture money going into mobile money, which has proven a lucrative investment in other African nations, will be of great benefit. To compete with more prominent incumbents (M-Birr and CBE-Birr), startups should form coalitions and agreements with banks. That would strengthen their reach and their ability to support transactions backed by assets.

Social and Cultural Considerations

Ethiopians are generally skeptical of innovation. They have a hard time trusting newer institutions, and legacy ones prevail – even with sub-par offerings. A past survey of rural banked communities indicates that most people would rather walk an average of 3-4 miles to a bank branch, only to find non-functional ATMs, than use mobile payment methods. Mistrust emanates from the perception that mobile money is independent of government control. Thus, endorsement from financial institutions, backing from EthioTel, and advocacy from government bodies would go a long way in reassuring communities.

Plus, Ethiopians are recognized for their short-term orientation, especially in rural areas. The saying "Worrying doesn't take away tomorrow's troubles; it takes away today's peace" is usually taken out of context to oppose a saving culture, and the lack of financial inclusion doesn't help. The government's repeated attempts to improve saving could benefit from mobile money solutions. Past studies of other African countries with mobile money solutions have shown an improvement in the likelihood of saving of 10.9%. (A)

Finally, entrepreneurship has been growing over the past five years due to increased backing from the government, a high number of STEM graduates, and rising jobless rates. Technological innovation has been on a steep climb, building Addis Ababa's "Sheba Valley". However, a major impediment in the new ecosystem is the lack of payment gateways that support the audiences these startups are targeting. Current APIs don't support the non-banking population, significantly limiting the market size and the ability to develop economies of scale. The start of this service would spur growth in companies that offer online services, including e-commerce and delivery, further fueling the ecosystem.

Are Algorithmic Stablecoins Possible? A Deep Dive Into Crypto's Holy Grail

TL;DR

After extensive research and analysis, I conclude that non-collateralized algorithmic stablecoins are possible, but with a major caveat: they need to earn their way to becoming fully algorithmic rather than starting that way from day one. Through analysis of historical attempts, game theory modeling, and statistical validation, I show that the optimal path is starting with partial collateralization and gradually reducing it as market confidence grows - similar to how the US dollar evolved away from the gold standard. FRAX’s fractional-algorithmic approach demonstrates this is viable, maintaining remarkable stability while reducing collateral requirements based on market demand.

Introduction

The holy grail of crypto has always been building the perfect digital money - a currency that’s stable enough for everyday use but free from government control. Bitcoin showed us that decentralized digital money is possible, but its volatility makes it impractical for buying coffee or getting paid a salary. Stablecoins emerged as a solution, but most rely on traditional financial system collateral, defeating the purpose of crypto’s promise of true decentralization.

This led me down a rabbit hole: are truly decentralized, non-collateralized algorithmic stablecoins actually possible? Or are they just a pipe dream?

It’s a deceptively complex question that touches on game theory, monetary policy, market psychology, and mechanism design. After diving deep into historical attempts, analyzing their failures, and studying successful approaches, I’ve developed a perspective I’m excited to share.

The Quest for Ideal Money

Before we can determine if algorithmic stablecoins are possible, we need to define what “ideal money” actually looks like. Drawing from economist John Nash’s work, I argue ideal money needs to solve the fundamental conflict between short-term and long-term interests in creating a stable digital currency.

Breaking this down, money serves three core functions:

  1. Unit of Account - A consistent way to measure value
  2. Store of Value - A reliable way to save wealth over time
  3. Medium of Exchange - An efficient way to transact

The key insight is that these functions need to work across different time horizons. USD works great as a medium of exchange in the short term, but inflation erodes its store of value over decades. Gold maintains long-term value but is impractical for daily transactions.

Ideal money would excel at all three functions both in the short and long term. It would also need to be:

  • Independent from government control
  • Globally scalable
  • Capital efficient

This is a high bar! But it gives us a framework to evaluate different approaches.

Why Focus on Algorithmic Stablecoins?

Through process of elimination, I found that non-collateralized algorithmic stablecoins are theoretically the closest to ideal money:

  • Bitcoin/ETH: Great for decentralization but too volatile
  • Fiat-backed stablecoins (USDC): Stable but rely on traditional banking
  • Crypto-collateralized stablecoins (DAI): Capital inefficient due to overcollateralization
  • Commodity-backed stablecoins: Hard to scale, tendency toward centralization
  • Algorithmic stablecoins: Potentially stable, scalable, and truly decentralized

The problem? Building them is HARD. Really hard. I identified three core challenges:

  1. Actually maintaining stability - The mechanisms need to reliably keep the peg
  2. Building sufficient network effects (Lindy Effect) - Need to become “money-like” enough that people trust them
  3. Overcoming the “Paradox of Stability” - Need speculation to grow but speculation creates instability

Learning from Failed Attempts

To understand if these challenges can be overcome, I analyzed two prominent approaches and their real-world implementations:

The Rebase Approach (Ampleforth)

Ampleforth tried to maintain stability by automatically adjusting everyone’s wallet balances based on the price. If AMPL is trading at $2, everyone’s balance doubles but each token is worth $1. Sounds clever right?

The problem is this doesn’t actually create stability - it just masks volatility. Your wallet might show 100 AMPL tokens worth $1 each today and 50 AMPL tokens worth $1 each tomorrow, but your purchasing power still fluctuated! The stability is an illusion.
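To see why, here's a toy rebase calculation in Python. It ignores Ampleforth's smoothing window and price deadband, so it's a sketch of the idea rather than the protocol's exact rule: supply scales with price, every balance scales pro rata, and the dollar value of your holdings is untouched by the rebase itself.

```python
# Toy rebase: supply expands/contracts toward the $1 target,
# but each holder's dollar exposure is unchanged by the rebase itself.
# (Real AMPL smooths the adjustment over several days and skips a small deadband.)

def rebase(balance: float, price: float, target: float = 1.0):
    """Return (new_balance, new_price) assuming a perfect, instant adjustment."""
    factor = price / target
    return balance * factor, price / factor

balance, price = 100.0, 2.0            # 100 AMPL trading at $2
print("before:", balance, "AMPL *", price, "=", balance * price, "USD")

balance, price = rebase(balance, price)
print("after: ", balance, "AMPL *", price, "=", balance * price, "USD")
# Value is $200 in both cases: the unit changed, but the volatility didn't go anywhere.
```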

Through game theory analysis, I showed how the incentives ultimately lead to speculation rather than true stability.

The Seigniorage Shares Approach (Basis)

Basis tried a multi-token model where “bond” and “share” tokens would absorb volatility to keep the main stablecoin pegged. When price is high, new stablecoins are minted and given to shareholders. When price is low, bonds are sold at a discount to remove stablecoins from circulation.
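To see the moving parts, here's a stripped-down sketch of a seigniorage-shares loop in Python. The bond discount, contraction sizing, and redemption ordering are simplified assumptions in the spirit of Basis, not its actual spec.

```python
# Stripped-down seigniorage-shares loop in the spirit of Basis.
# Discounts, sizing, and expiry handling are simplified assumptions.
from collections import deque

stable_supply = 1_000_000.0
bond_queue = deque()  # FIFO: oldest bonds are redeemed first when expansion resumes

def contract(price: float, bond_discount: float = 0.8):
    """Price below peg: sell bonds at a discount to pull stablecoins out of circulation."""
    global stable_supply
    burn = stable_supply * (1.0 - price)        # naive sizing of the contraction
    bond_queue.append(burn / bond_discount)     # each bond promises 1 stablecoin later
    stable_supply -= burn

def expand(price: float):
    """Price above peg: mint new stablecoins, paying bondholders before shareholders."""
    global stable_supply
    mint = stable_supply * (price - 1.0)
    stable_supply += mint
    while bond_queue and mint > 0:
        paid = min(bond_queue[0], mint)
        mint -= paid
        bond_queue[0] -= paid
        if bond_queue[0] <= 0:
            bond_queue.popleft()
    return mint                                  # leftover newly minted coins go to shareholders

contract(0.95)            # peg breaks downward: bonds issued
dividend = expand(1.05)   # peg recovers: bondholders repaid first, then shareholders
print(round(stable_supply), len(bond_queue), round(dividend))
```

The circular dependency in the bullet list below falls straight out of this loop: bonds only pay off if a future expansion happens, so contraction only works while people still believe in growth.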

While more sophisticated, I identified fatal flaws in the mechanism:

  • Bonds expire after 5 years, creating dangerous cliffs
  • Lack of fungibility in bonds reduces their effectiveness
  • Circular dependency in incentives (need faith in future growth to maintain current stability)

The project ultimately shut down due to regulatory concerns, but the fundamental economic issues would have likely caused problems anyway.

A Better Way: The FRAX Approach

After seeing how pure algorithmic approaches failed, I analyzed FRAX’s hybrid “fractional-algorithmic” design. Rather than starting fully algorithmic, FRAX begins fully collateralized and algorithmically reduces the collateral ratio based on market demand.

The key innovations:

  1. Market-Driven Collateral Ratio - When demand is high and price is above peg, collateral requirements automatically decrease. When confidence falls, collateral increases.

  2. Programmatic Market Operations - Similar to how central banks conduct open market operations, but fully automated and transparent.

  3. Progressive Decentralization - Starts with training wheels (collateral) but systematically removes them as the system proves itself.

The game theory checks out - there are clear incentives for arbitrageurs to maintain the peg while the collateral provides a confidence backstop. The mechanism allows for a gradual building of trust rather than requiring it from day one.
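To make the first mechanism concrete, here's a rough Python sketch of a FRAX-style collateral-ratio controller and mint pricing. The step size, deadband, and price prints are my own illustrative assumptions, not FRAX's exact parameters.

```python
# Sketch of a FRAX-style collateral ratio controller and mint pricing.
# STEP, BAND, and the price series are illustrative assumptions.

STEP = 0.0025          # how much the ratio moves per refresh (assumed)
BAND = 0.005           # ignore tiny deviations around the $1 peg (assumed)

def update_collateral_ratio(cr: float, frax_price: float) -> float:
    """Above peg: the market trusts the algorithmic share, so lower CR; below peg: raise CR."""
    if frax_price > 1.0 + BAND:
        cr -= STEP
    elif frax_price < 1.0 - BAND:
        cr += STEP
    return min(max(cr, 0.0), 1.0)

def mint_cost(cr: float, fxs_price: float, amount: float = 1.0):
    """Minting 1 unit requires `cr` dollars of collateral plus (1 - cr) dollars of the share token."""
    collateral_in = cr * amount
    shares_in = (1.0 - cr) * amount / fxs_price
    return collateral_in, shares_in

cr = 1.0                                  # start fully collateralized
for price in (1.01, 1.01, 1.02, 0.99):    # hypothetical price prints
    cr = update_collateral_ratio(cr, price)
print("collateral ratio:", round(cr, 4))
print("to mint 1 unit (collateral, shares):", mint_cost(cr, fxs_price=5.0))
```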

Statistical Validation

To validate these theoretical arguments, I conducted statistical analysis comparing volatility across different stablecoin designs. The results were striking:

  • FRAX showed volatility levels comparable to fully-collateralized stablecoins despite much lower collateral requirements
  • Failed algorithmic stablecoins like Basis and Ampleforth showed significantly higher volatility
  • FRAX’s price movements were more correlated with established stablecoins than other algorithmic attempts

This empirically supports the theory that the fractional-algorithmic approach can deliver true stability.
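For anyone who wants to replicate the comparison, the core computation is simple. Here's a sketch with pandas; the CSV file name and column layout are placeholders for whatever daily price history you pull.

```python
# Sketch of the volatility comparison across stablecoin designs.
# "stablecoin_prices.csv" is a placeholder: one date column plus one
# daily closing-price column per coin (e.g. FRAX, USDC, DAI, AMPL).
import pandas as pd

prices = pd.read_csv("stablecoin_prices.csv", index_col="date", parse_dates=True)

daily_returns = prices.pct_change().dropna()
ann_vol = daily_returns.std() * (365 ** 0.5)      # annualized volatility per coin
peg_deviation = (prices - 1.0).abs().mean()       # average distance from the $1 peg

summary = pd.DataFrame({"ann_volatility": ann_vol, "avg_peg_deviation": peg_deviation})
print(summary.sort_values("ann_volatility"))
```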

Conclusion: Evolution Over Revolution

So are non-collateralized algorithmic stablecoins possible? Yes, but they have to earn their way there rather than starting from zero.

The key insight is that money is fundamentally about trust. The US dollar didn’t start as pure fiat currency - it evolved from gold-backing as faith in the system grew. Similarly, algorithmic stablecoins need to build trust before removing their collateral training wheels.

FRAX shows this is possible by:

  1. Starting with full collateral to bootstrap confidence
  2. Systematically reducing collateral as market demand proves sustainability
  3. Maintaining clear incentives and transparency throughout the process

The end goal of fully algorithmic stablecoins may be achievable, but the path there is through evolution rather than revolution. We need to recognize that while code is law, money is ultimately a social technology built on trust.

Looking Forward

This research opens up exciting future directions:

  • How can we optimize the collateral reduction process?
  • What role will these systems play in the broader financial system?
  • Can we create better price oracles and stability mechanisms?

But the core conclusion remains: algorithmic stablecoins are possible if we take the right approach. By learning from past failures and embracing progressive decentralization, we can work toward truly ideal money.

The dawn of algorithmic money isn’t a matter of if, but when and how. And that’s pretty exciting.


This post summarizes research I conducted for my undergraduate thesis. For the full academic analysis including detailed game theory modeling and statistical validation, check out the full paper [link].


Content Cannon I


Content Cannon Of The Week (OTW)

Blogs/Pods Songs Other

Blogs/Pods

Beware of the Casual Polymath

I find myself curious about a broad range of topics. And I'm not alone in this. Most of us get caught up in interesting rabbit holes, which we explore in depth. In doing so, the label "polymath" starts floating around: sometimes through self-labeling, either as an explanatory adjective or for prestige, and sometimes through others calling us polymaths because of the seemingly vast variety of topics we appear to have expertise in.

My point is that we should not trust or glorify people on the basis of their apparent “Universal Genius”. Having a variety of interests is no more a sign of generalized intelligence than being able to walk and chew gum. And if someone does appear to have accomplishments in a variety of domains with fungible currency, their total status should not be a sum or multiple, but merely the status of their single most impressive feat.

The blog argues against glorifying the casual polymath and shows how generalizing expertise across multiple disciplines is faulty. An expert in one domain can master an unrelated skill and then be perceived as having a general intelligence that extrapolates to yet another subject matter. If I'm good at data science, finance, and epidemiology, should you trust my opinions on politics as well?

So go read your SaaS/Meta-Science/Aerospace blog and revel in the genuine joy of intellectual curiosity. As Tyler Cowen would say, I’m just here to lower the status of polymaths.

Songs

Chance The Rapper

On this week's song list is Chance the Rapper. Chance made ripples in the industry with his **Acid Rap** album, full of hallucinatory revelations and the "cigarette burns of his journey". Acid Rap earned both critical and popular acclaim. Chance wasn't trying to be alternative; his work pulled freely from his inspirations, from elements of soul and gospel to blues-rock, jazz, and even house music.

The album propelled him into the rap stratosphere, and his gospel aura on Kanye's "Ultralight Beam" from The Life of Pablo gave him a buzz that seemed to dominate his next project, which was met with mixed reviews. The Coloring Book album started creating rifts between Chance and his core fanbase, which were pushed to a breaking point with his 2019 album The Big Day. The Big Day popularized the phrase "Chance fell off", referring to the stature and artistry he had shown with his initial releases. Part of the reason is that fans felt the hunger Chance the Rapper had while forming his foundations, fan base, and identity was gone.

Chance has definitely grown up since 2013 and has a family now, so that really affects the music, as does his environment. He's in a different state now. He's probably never going to tap into that side of him again, at least not for a while… but who knows, music is forever changing, and artists are forever changing.

His fans have been begging for a rebirth for a while now, and it seems like he's heard! Chance has been releasing singles over the past month, ramping up for a new project. Just as the phrase "falling off" followed him, the phrase "Chance is making a comeback" is making noise around the music scene again. I'm excited, and indeed there's a freshness in these tracks that gets me excited!

Enjoy - and watch the video as it’s an essential piece.

Other Findings

I was high when thinking of this, but anyways: the way one spends their Sundays is an indication of where one is in life.

Retention Campaign at RaiseMe

Thirty percent of college students drop out of university. Contrary to people's perception that incompetence is to blame, fifty-one percent cite finances. During my summer at RaiseMe - a startup for student success - two other interns and I launched Retention. Through the process, I learned to leverage diversity of thought, to iteratively build and remodel projects, and to take initiative.

During calls with students at risk of dropping out, I realized the need for RaiseMe's platform to extend beyond getting students to college, to ensuring they stay on track after enrollment. Constraints on human resources within the startup meant taking charge of building the extension ourselves. I convinced the other two interns and got a green light from our manager to dedicate extra time to the cause.

My teammates were an American who attended a prestigious university, a second-generation immigrant who went through community college, and myself, an international student who was part of a revolution in education. The difference in perspectives among the three of us was an edge: varied frames of reference covered for our blind spots. We advanced in building statistical models that accounted for meaningful variables, even in unfamiliar contexts.

A significant share of the signal for student success came from factors like registering for orientation and classes on time and attending community events and meetups. Pilots with Wayne State University revealed unforeseen elements; models had to be redrawn and tested multiple times. Planning, building, checking, and adjusting metrics revealed insights, and I came to understand the power of continuous iteration and design thinking in making real products. More than the end product of the project, I learned that teams are more than the arithmetic sums of their parts: they are powerhouses that harbor diversity of thought. Plus, I developed the habit of repeatedly going back to the drawing board and iterating on projects for realistic outcomes. Lessons that I take with me, even after that summer.
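Conceptually, the modeling looked something like the sketch below. The feature names, data file, and model choice are hypothetical stand-ins for illustration, not RaiseMe's actual schema or pipeline.

```python
# Conceptual sketch of the retention model: a binary "retained" outcome
# predicted from early engagement signals. Features and data are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("student_engagement.csv")   # placeholder file
features = ["registered_orientation_on_time", "registered_classes_on_time",
            "community_events_attended", "unmet_financial_need"]

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["retained"], test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
print(dict(zip(features, model.coef_[0].round(2))))   # which signals matter most
```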