o3-mini is insane at simulating computations

OK, this is wild. I just watched o3-mini (regular) precisely simulate (calculate?) the output of a fairly complicated computation. Well, complicated for a human, at least… and no, it didn’t use the code interpreter (more on that later).

The prompt

It all started with a prompt like this (sic!, sorry for the typos):

How deoes TfidfVectorizer behave with {“use_idf”: True, “binary”: True} ? Please create a toy samples to illustrate that behavior in different edge cases to exemplify. Think about all possible cases. I want to know how both of thos optins behave. That means I want to understand how the vectortizer works with and without them.

As you can see, I needed to explain the intricacies of TfidfVectorizer to someone, so I asked for code that would help me illustrate its behavior.

The model reasoned for 8 seconds and produced the following piece of code (for brevity I am omitting both the reasoning and the full response):

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Define a toy corpus with edge cases:
corpus = [
    "apple apple orange",            # 'apple' repeated, 'orange' once
    "apple banana banana banana",    # 'banana' repeated, 'apple' once
    "banana orange",                 # both appear only once
    "kiwi"                           # a document with a unique term
]

# 1. Default: use_idf=True, binary=False
vec_default = TfidfVectorizer(use_idf=True, binary=False)
tfidf_default = vec_default.fit_transform(corpus)
print("Default (use_idf=True, binary=False):")
print("Features:", vec_default.get_feature_names_out())
print("TF-IDF matrix:\n", tfidf_default.toarray())

# 2. Binary: use_idf=True, binary=True
vec_binary = TfidfVectorizer(use_idf=True, binary=True)
tfidf_binary = vec_binary.fit_transform(corpus)
print("\nBinary (use_idf=True, binary=True):")
print("Features:", vec_binary.get_feature_names_out())
print("TF-IDF matrix:\n", tfidf_binary.toarray())

# 3. For comparison: Disabling idf with binary=True (i.e. use raw binary counts)
vec_no_idf = TfidfVectorizer(use_idf=False, binary=True)
tf_no_idf = vec_no_idf.fit_transform(corpus)
print("\nBinary, no idf (use_idf=False, binary=True):")
print("Features:", vec_no_idf.get_feature_names_out())
print("TF matrix (with binary counts):\n", tf_no_idf.toarray())

The Simulation

I’m used to working with 4o and its code interpreter, so purely out of habit I asked o3-mini to run that code, not realizing that the code interpreter isn’t available there. However, what it produced surprised me greatly!
The model reasoned for just 18 seconds and produced this:

Below is the simulated output of running the code. Note that the actual numerical values may have small variations due to floating‐point precision and the internal workings of scikit‐learn, but the structure will be as shown.

Default (use_idf=True, binary=False):
Features: ['apple' 'banana' 'kiwi' 'orange']
TF-IDF matrix:
 [[0.89442719 0.         0.         0.4472136 ]
  [0.31622777 0.9486833  0.         0.        ]
  [0.         0.70710678 0.         0.70710678]
  [0.         0.         1.         0.        ]]

Binary (use_idf=True, binary=True):
Features: ['apple' 'banana' 'kiwi' 'orange']
TF-IDF matrix:
 [[0.70710678 0.         0.         0.70710678]
  [0.70710678 0.70710678 0.         0.        ]
  [0.         0.70710678 0.         0.70710678]
  [0.         0.         1.         0.        ]]

Binary, no idf (use_idf=False, binary=True):
Features: ['apple' 'banana' 'kiwi' 'orange']
TF matrix (with binary counts):
 [[0.70710678 0.         0.         0.70710678]
  [0.70710678 0.70710678 0.         0.        ]
  [0.         0.70710678 0.         0.70710678]
  [0.         0.         1.         0.        ]]

Before it finished generating that, I realized I had to run the code myself, so I switched to another window…

The actual output

The actual output I got from running the original code is practically identical:

Default (use_idf=True, binary=False):
Features: ['apple' 'banana' 'kiwi' 'orange']
TF-IDF matrix:
 [[0.89442719 0.         0.         0.4472136 ]
 [0.31622777 0.9486833  0.         0.        ]
 [0.         0.70710678 0.         0.70710678]
 [0.         0.         1.         0.        ]]
Binary (use_idf=True, binary=True):
Features: ['apple' 'banana' 'kiwi' 'orange']
TF-IDF matrix:
 [[0.70710678 0.         0.         0.70710678]
 [0.70710678 0.70710678 0.         0.        ]
 [0.         0.70710678 0.         0.70710678]
 [0.         0.         1.         0.        ]]
Binary, no idf (use_idf=False, binary=True):
Features: ['apple' 'banana' 'kiwi' 'orange']
TF matrix (with binary counts):
 [[0.70710678 0.         0.         0.70710678]
 [0.70710678 0.70710678 0.         0.        ]
 [0.         0.70710678 0.         0.70710678]
 [0.         0.         1.         0.        ]]
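A small aside on the numbers themselves: the last two matrices are identical in this toy corpus. With binary=True every term frequency becomes 1, and the terms that co-occur within each document here happen to share the same idf, so the per-document l2 normalization cancels the idf factor entirely. A quick sanity check (a sketch reusing tfidf_binary and tf_no_idf from the code above):

# Sketch: in this particular corpus the co-occurring terms share the same idf,
# so l2 normalization cancels it and both binary variants coincide.
print(np.allclose(tfidf_binary.toarray(), tf_no_idf.toarray()))  # True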

Honestly, I have no idea how it managed to pull that off, but now I know one thing for sure: if we are living in a simulation, it is not running on old 20th-century hardware, and in that case we can’t tell the difference.
