I’m messing around with OpenAI’s gpt-oss:20b. Why? No clue, really. It’s their first open-weight model since GPT-2, and I’m curious. Instead of using fancy tools like ChatGPT or Copilot, I get to fiddle with it directly. It’s kinda based on GPT-4 but says it’s got a knowledge cutoff of June 2024. Impressive? Maybe. It can even do web searches. Cool or creepy? You decide.
Now, why did I test it with something only 10-year-olds should be sweating over? The 11+ exam in the UK, for those scratching their heads abroad. It’s the entrance test for selective schools, grammar schools and the like. My kid is doing it soon, so why not see if this AI can outsmart my child? Spoiler: my son is safe.
Hardware talk — bear with me. I’m using an RTX 5080. It’s a beast, but not beastly enough for our AI friend, it seems. With 16GB of VRAM, my system started sweating and calling in backup from the CPU. I’m thinking of upgrading to an RTX 5090. More power, less stress. Fingers crossed.
So, the grand ‘Is it smarter than a 10-year-old?’ showdown. I tossed a practice 11+ test at it. “Here you go, brainiac,” I said (in my head). It spat out answers slowly, after a think-tank session of about 15 minutes. The result? Nine correct answers out of 80, or about 11%. Not exactly Einstein level, right?
Here’s where it got entertaining, or frustrating, you pick. It nailed some questions and blundered others, with real head-scratchers like misplaced answers and moments where it completely lost the plot. The riddle? In its reasoning it headed toward the right answer, then derailed somewhere between thinking and sharing.
“Oh, wait, got sidetracked!” Right, second attempt. I pumped up the settings and let it take its sweet time. A whole hour, folks. It did better, but it still didn’t get everything right. And some responses made zero sense, like handing back a DIY quiz instead of answers. Creative, if nothing else.
Lessons learned? Increasing the context length helps. More memory for the model to work with, more success, less hair-pulling for me. But is it all worth it? It’s time-consuming and hardware-taxing. Running it turned my setup into a surprise sauna. How nice.
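For anyone wanting to try the context-length trick, here’s a rough sketch of how it might look. Big assumptions: that the model is served by a local Ollama instance on its default port (the `gpt-oss:20b` tag is Ollama-style), and that 16384 tokens is a sensible window. That number is just a placeholder, not the setting I used. The helper names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, num_ctx: int = 16384) -> dict:
    """Build the JSON payload for one non-streaming generation."""
    return {
        "model": "gpt-oss:20b",
        "prompt": prompt,
        "stream": False,  # one JSON object back instead of a stream of chunks
        "options": {"num_ctx": num_ctx},  # context window size, in tokens
    }


def ask(prompt: str, num_ctx: int = 16384, timeout: float = 3600.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    body = json.dumps(build_request(prompt, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    # Generous timeout: a long test can keep the model thinking for a while.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Something like `ask(test_text, num_ctx=32768)` would then request a bigger window. Bear in mind that a bigger window also eats more memory, so on a 16GB card it will probably lean on the CPU even harder.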
At the end of the day, while AI isn’t passing the 11+ anytime soon, the experiment expanded my own cortex a tiny bit. Plus, hey, the office needed heating. Always look on the bright side, right?