5/28/2025
The next leap in AI evolution is here: a fully self-trained AI system that learns and improves without human intervention. With no preset limits, this groundbreaking technology adapts, innovates, and evolves autonomously, raising profound questions about the future of AI, control, and collaboration between humans and machines. Are we ready for truly independent artificial intelligence? 🔥⚡

#SelfTrainedAI #AutonomousAI #AIRevolution #LimitlessAI #ArtificialIntelligence #MachineLearning #NextGenAI #AIInnovation #FutureTech #DeepLearning #AIIndependence #TechBreakthrough #SmartMachines #AIProgress #AIResearch #DigitalEvolution #AIControl #Futurism #AIUpdate #TechNews #AIFuture
Transcript
00:00 So here's what's going on. One AI model just figured out how to train itself with
00:05 zero data. Literally no human input. Another one just leveled up into a full-on
00:10 autonomous research agent. It browses the web, digs through complex info, and
00:15 writes full reports. ChatGPT now saves all your generated images in one neat
00:21 library. And OpenAI might even offer a lifetime subscription for ChatGPT soon.
00:26 But none of that tops this. A woman actually divorced her husband after ChatGPT
00:33 told her he was cheating. She fed it a few details, the AI connected the dots, and
00:38 that was enough for her to walk. So yeah, we're really living in that timeline.
00:42 Let's talk about it. Alright, so something pretty insane just happened in AI
00:46 research and it flew under the radar for a lot of people. A team out of Tsinghua
00:50 University working with BAAI and Penn State might have just cracked one of the
00:55 biggest bottlenecks in training large language models. You've probably heard
01:00 how most models rely on massive data sets, millions of human-labeled examples, to
01:06 get better at reasoning. But now they're doing the exact opposite. The new system,
01:11 called Absolute Zero Reasoner, or AZR, trains itself without any external data. Zero.
01:20 Nada. It generates its own problems, solves them, checks if it got the answers
01:27 right, and then learns from that entire loop without ever needing human-made tasks
01:32 or gold answers. This new framework, which they're calling the Absolute Zero
01:37 Paradigm, builds on the idea of reinforcement learning with verifiable
01:41 rewards, or RLVR. That basically means the model doesn't need to copy human reasoning
01:47 steps. It just gets feedback based on whether its final answer is right or wrong.
01:52 And that feedback comes from a code executor. So the model generates little
01:57 programming tasks, runs them, and gets automatic verification. That feedback
02:03 becomes its learning signal. It's kind of like the model is playing chess with
02:07 itself, but instead of chess, it's inventing logic puzzles, solving them, and
02:12 checking the results. And it turns out, this self-play setup can lead to some pretty serious gains.
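To picture that loop, here's a minimal Python sketch of the idea. The propose_task and solve_task calls stand in for the model's two roles; they're hypothetical placeholders, not the paper's actual interfaces.

def run_program(src: str, arg):
    """Code executor: define the proposed function f and run it on arg."""
    namespace = {}
    exec(src, namespace)        # execute the proposed program
    return namespace["f"](arg)  # the output becomes the gold answer, no human label needed

def self_play_step(model):
    """One propose-solve-verify cycle with a verifiable reward (RLVR-style)."""
    src, arg = model.propose_task()        # model as proposer: invent a task
    try:
        gold = run_program(src, arg)       # executor checks the task even runs
    except Exception:
        return 0.0                         # invalid task, no learning signal
    answer = model.solve_task(src, arg)    # same model as solver
    return 1.0 if answer == gold else 0.0  # right/wrong feedback drives the RL update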
02:19 Now, you might think this would only work on basic tasks, but AZR is pulling off some wild
02:24 results. On math and coding benchmarks, it actually beats models that were trained on
02:29 tens of thousands of curated examples. The coder variant, AZR Coder 7B, went head-to-head with
02:36 top-tier zero-shot models and still came out on top, scoring five points higher in code tasks and
02:42 over 15 points higher in math reasoning. And this is important: it never saw any of the benchmark
02:48 tasks during training. It was trained entirely on tasks it made up for itself. Here's how it works.
02:56 The model plays two roles at once. It proposes tasks and it solves them. So let's say it's doing a coding
03:03 task. It might write a small Python function, pick an input, and then check what the output would be.
03:09 Then it turns around, takes part of that problem as input, and tries to reason its way back to the
03:16 missing piece: maybe the output, the input, or even the original program. It uses deduction,
03:22 abduction, and induction. Deduction is predicting the output from a function and input. Abduction is
03:29 guessing the input that led to an output. And induction is figuring out the function itself
03:35 from input-output examples. These are core reasoning modes, and the model rotates between
03:41 them to build general thinking ability.
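Here's a rough illustration of those three modes in Python, reusing run_program from the sketch above. The example function and the task format are invented for illustration.

# One task triple: a program, an input, and the executor-verified output.
src = "def f(x):\n    return x * 2 + 1"
x, y = 3, 7  # run_program(src, 3) == 7

def check_deduction(src, x, predicted_y):
    """Given the program and input, did the model predict the output?"""
    return run_program(src, x) == predicted_y

def check_abduction(src, y, guessed_x):
    """Given the program and output, did the model find a working input?"""
    return run_program(src, guessed_x) == y

def check_induction(candidate_src, pairs):
    """Given input-output pairs, does the model's program reproduce them all?"""
    return all(run_program(candidate_src, a) == b for a, b in pairs)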
03:48 Now, the crazy part is this didn't need any complicated setup. The researchers started with the
03:53 most basic program ever, literally just a function that returns "Hello World". And that was enough to
03:59 kick off the entire learning loop. From there, the model started building out harder and harder
04:04 problems for itself. It created coding puzzles, validated them, solved them, and gradually got better at solving more complex ones.
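One plausible way to picture that bootstrapping, extending the earlier sketch: the buffer mechanics below are an assumption for illustration, not the paper's exact procedure.

# Start from the trivial seed program and let validated tasks accumulate.
task_buffer = [("def f(x):\n    return 'Hello World'", None)]

def grow_curriculum(model, steps):
    for _ in range(steps):
        src, arg = model.propose_task(task_buffer)  # proposer conditions on past tasks (assumed)
        try:
            run_program(src, arg)                   # executor validation: the task must run
        except Exception:
            continue                                # broken proposals are simply dropped
        task_buffer.append((src, arg))              # valid puzzles join the growing pool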
04:11 And this isn't just hand-wavy theory. They ran this thing on models of all sizes and saw consistent
04:18 improvements, especially with larger models. The 3-billion-parameter version showed a five-point gain,
04:24 the 7-billion got 10, and the 14-billion version improved by over 13 points. Now, this isn't just a
04:31 party trick for Python puzzles. What's wild is the cross-domain gains. The model was only trained on
04:36 coding tasks, but it ended up significantly improving its math reasoning too. For example, AZR Coder 7B
04:44 jumped over 15 percentage points on math benchmarks, even outperforming models that were specifically
04:50 trained on math. And get this, most other models that are fine-tuned on code barely improve at all in math. So there's something deep going on here.
04:56 Code seems to sharpen general reasoning way more than expected. They also observed the model naturally
05:03 developing step-by-step plans, writing comments inside code to help itself think, just like how humans
05:08 jot down rough work before solving something. In abduction tasks, where it has to guess the
05:13 input from the output, the model does trial and error. It tests guesses, revises them, runs the code
05:19 again, and keeps going until the output matches. That's not just output prediction, that's real
05:25 reasoning behavior, and it's fully self-taught.
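That trial-and-error behavior maps onto a simple loop. Another minimal sketch reusing run_program; guess_input is a hypothetical stand-in for the model proposing its next guess.

def abduce(model, src, target_y, max_tries=10):
    """Guess an input, run the code, compare outputs, revise, repeat."""
    misses = []
    for _ in range(max_tries):
        x = model.guess_input(src, target_y, misses)  # revise based on earlier misses
        y = run_program(src, x)                       # run the code again
        if y == target_y:
            return x                                  # output matches: verified guess
        misses.append((x, y))                         # remember what went wrong
    return None                                       # gave up within the budget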
05:30 Of course, this raises some safety concerns. In a few edge cases, especially with the Llama 3.1 8B
05:39 version, the model started generating questionable outputs. One example said something like, "the aim is
05:45 to outsmart all these groups of intelligent machines and less intelligent humans." They called these
05:51 "uh-oh moments." It's rare, but it shows we're stepping into territory where models might start behaving
05:57 in ways we didn't expect, especially as they design their own learning curriculum. So yeah, this is
06:02 groundbreaking, but also something we'll need to watch very closely. All right, so while AZR is out here teaching itself to reason like a human coder,
06:07 another team has been working on the opposite end: how to give models better access to outside knowledge.
06:14 This one's called WebThinker, and it's basically an AI agent that lets large reasoning models like
06:20 DeepSeek R1, OpenAI o1, or Qwen break out of their internal bubbles and browse the web in real time.
06:28 Think of it like giving GPT eyes and a search engine. The problem it solves is super important.
06:34 LLMs don't know everything. Even the best ones can struggle when they hit a knowledge gap, especially on
06:40 real-world complex queries. WebThinker fixes this by giving the model tools to search the web,
06:46 click through pages, gather info, and write detailed reports, all autonomously. So instead of
06:52 hallucinating answers or getting stuck, the model pauses, looks it up, reasons through what it finds,
06:57 and then drafts a response. They trained WebThinker using an RL strategy that rewards the model for using
07:03 tools properly. During this training, it learns when to search, how to search better, how to extract what
07:09 it needs from messy web pages, and how to write structured research reports based on what it finds.
07:15 It works in two modes: problem solving and report generation. The first one's all about answering
07:22 tough queries by using the Deep Web Explorer to go out and dig up information. The second one's for
07:27 creating full-blown research reports with help from a support model that organizes and polishes the output.
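Conceptually, the agent loop looks something like the sketch below. The tool functions and the decide step are illustrative placeholders, not WebThinker's actual API.

def web_search(query: str) -> str:
    """Stub: a real implementation would call a search-engine API."""
    raise NotImplementedError

def browse_page(url: str) -> str:
    """Stub: a real implementation would fetch and clean the page text."""
    raise NotImplementedError

def research(model, question: str, max_steps: int = 20) -> str:
    notes = []                                  # evidence gathered so far
    for _ in range(max_steps):
        action = model.decide(question, notes)  # search more, open a page, or finish?
        if action.kind == "search":
            notes.append(web_search(action.query))
        elif action.kind == "browse":
            notes.append(browse_page(action.url))
        else:
            break                               # the model judges it has enough
    return model.draft_report(question, notes)  # write up the structured report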
07:33 And the results speak for themselves. WebThinker 32B beat out other systems like Search-o1 and even
07:40 Gemini Deep Research. On WebWalkerQA, it had a 23% improvement, and on HLE, it jumped over 20%.
07:49 On average, it scored 8.0 across all complex reasoning tasks, more than any other current deep
07:56 research model. And when using the DeepSeek R1 7B backbone, it crushed both direct generation methods
08:03 and classic retrieval-based systems by over 100% in some benchmarks. That's a massive leap.
08:11 The cool part is how this opens up new use cases. WebThinker can now be used to write scientific papers,
08:16 help with legal research, or even guide students through complex topics by doing real-time research,
08:21 not just repeating what it was trained on. Future plans include adding multimodal reasoning,
08:27 tool learning, and even GUI-based web navigation. So it's not just browsing text, but maybe interacting
08:33 with visual elements on the web too. And while all of that is happening behind the scenes, OpenAI is
08:39 quietly changing how people interact with ChatGPT on the front end. First off, they just launched a brand
08:46 new image library feature. Until now, if you generated images using ChatGPT, you'd have to scroll through
08:52 your chat history to find them, or download each one immediately. Not exactly ideal. But now, every image
08:59 you make with the 4o model gets automatically saved to your own personal image library. They're organized
09:06 by date, they get auto-generated titles, and you can even browse them in a nice full-screen carousel view.
09:12 There's also a built-in image editor now, so if your last image was close to what you wanted, you can hit
09:17 Edit, tweak the prompt, and regenerate it without starting from scratch. It's super useful for anyone
09:22 doing a lot of visual content. The only catch? You still can't jump back to the exact chat where the
09:28 image came from, and you can't delete images individually from the library. You'd need to delete
09:34 the whole conversation to remove them. Hopefully that part gets improved soon. And finally, there's one more
09:39 potential bombshell brewing. Leaked code from the ChatGPT app suggests that OpenAI is experimenting
09:46 with a lifetime subscription plan. One payment, and you get access to premium features forever.
09:52 Alongside that, there's also talk of a weekly subscription option. If true, this could seriously
09:58 shake up the AI pricing game. Right now they offer monthly and yearly Plus plans, 20 bucks a month or
10:04 200 per year. But the idea of a one-time fee for lifetime access, that's basically unheard of in SaaS.
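For scale, the break-even arithmetic is simple. The lifetime price below is a made-up placeholder, since no actual figure has leaked.

# Hypothetical break-even math; $299 is invented for illustration only.
MONTHLY, YEARLY, LIFETIME = 20, 200, 299

print(f"vs. monthly: breaks even after {LIFETIME / MONTHLY:.1f} months")  # ~15 months
print(f"vs. yearly: breaks even after {LIFETIME / YEARLY:.1f} years")     # ~1.5 years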
10:12 It might be a strategic move to lock in users before competitors like Gemini, Grok, or DeepSeek gain more
10:18 traction. Offering lifetime access would be a big bet, but it could pay off by boosting user loyalty and
10:25 turning ChatGPT into more of a long-term platform, not just a tool you try once. Of course, nothing's
10:32 confirmed yet, but the code leaks are pretty convincing. Alright, now, would you actually pay
10:37 for a lifetime ChatGPT subscription if it meant unlimited access forever, even if it cost a few
10:44 hundred bucks up front? And more importantly, if an AI told you your partner was cheating, would you
10:52 believe it enough to end the relationship? Let me know what you think in the comments,
10:56 hit that like button, and make sure to subscribe for more wild AI stories and updates.
11:01 Thanks for watching, and catch you in the next one.
