5/15/2025
The future is here! Discover the groundbreaking AI that learns all by itself: no human help needed and no limits to its growth. This self-trained AI is set to change everything in technology and beyond. Dive into the revolution! 🌐✨

#AI #SelfTrainedAI #ArtificialIntelligence #TechRevolution #MachineLearning #FutureTech #Innovation #AIRevolution #NoLimitsAI
Transcript
00:00 So here's what's going on. One AI model just figured out how to train itself with
00:05 zero data. Literally no human input. Another one just leveled up into a full-on
00:10 autonomous research agent. It browses the web, digs through complex info, and
00:15 writes full reports. ChatGPT now saves all your generated images in one neat
00:21 library. And OpenAI might even offer a lifetime subscription for ChatGPT soon.
00:26 But none of that tops this. A woman actually divorced her husband after ChatGPT
00:33 told her he was cheating. She fed it a few details, the AI connected the dots, and
00:38 that was enough for her to walk. So yeah, we're really living in that timeline.
00:42 Let's talk about it. Alright, so something pretty insane just happened in AI
00:46 research and it flew under the radar for a lot of people. A team out of Tsinghua
00:50 University, working with BAAI and Penn State, might have just cracked one of the
00:55 biggest bottlenecks in training large language models. You've probably heard how
01:00 most models rely on massive datasets, millions of human-labeled examples, to get
01:07 better at reasoning. But now they're doing the exact opposite. The new system,
01:11 called Absolute Zero Reasoner, or AZR, trains itself without any external data. Zero.
01:20 Nada. It generates its own problems, solves them, checks if it got the answers
01:27 right, and then learns from that entire loop without ever needing human-made tasks
01:32 or gold answers. This new framework, which they're calling the Absolute Zero
01:37 paradigm, builds on the idea of reinforcement learning with verifiable
01:41 rewards, or RLVR. That basically means the model doesn't need to copy human
01:47 reasoning steps. It just gets feedback based on whether its final answer is
01:51 right or wrong. And that feedback comes from a code executor. So the model generates
01:57 little programming tasks, runs them, and gets automatic verification. That feedback
02:03 becomes its learning signal. It's kind of like the model is playing chess with
02:07 itself, but instead of chess, it's inventing logic puzzles, solving them, and
02:12 checking the results. And it turns out this self-play setup can lead to some pretty
02:17 serious gains. Now you might think this would only work on basic tasks, but AZR is
02:23 pulling off some wild results. On math and coding benchmarks, it actually beats
02:28 models that were trained on tens of thousands of curated examples. The coder
02:33 variant, AZR-Coder-7B, went head-to-head with top-tier zero-shot models and still
02:39 came out on top, scoring five points higher on code tasks and over 15 points
02:43 higher in math reasoning. And this is important: it never saw any of the benchmark
02:48 tasks during training. It was trained entirely on tasks it made up for itself.
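
To picture that propose-solve-verify loop, here's a minimal sketch in Python. The `model` object and its `propose_task` / `solve` / `update` methods are hypothetical stand-ins, not the paper's actual API; the load-bearing idea is just that the reward comes from executing code, not from a human label.

```python
# Hedged sketch of an Absolute Zero-style self-play step.
# `model` is a hypothetical policy; the learning signal is a
# pass/fail reward produced by actually running the code.

def run_program(program: str, test_input: str) -> str:
    """Execute a proposed program (assumed to define `f`) and return its output."""
    scope = {}
    exec(program, scope)  # NOTE: sandbox this in any real setting
    return str(scope["f"](test_input))

def training_step(model):
    # 1. The model proposes its own task: a program plus an input.
    program, test_input = model.propose_task()

    # 2. The code executor establishes the ground-truth output.
    expected = run_program(program, test_input)

    # 3. The same model now tries to solve the task it just posed.
    predicted = model.solve(program, test_input)

    # 4. Verifiable reward: right or wrong, no human judgment needed.
    reward = 1.0 if predicted == expected else 0.0

    # 5. Reinforcement-learning update from that scalar reward.
    model.update(reward)
```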
02:55 Here's how it works. The model plays two roles at once: it proposes tasks and it
03:00 solves them. So let's say it's doing a coding task. It might write a small Python
03:05 function, pick an input, and then check what the output would be. Then it turns
03:10 around, takes part of that problem as input, and tries to reason its way back
03:15 to the missing piece, maybe the output, the input, or even the original program. It
03:21 uses deduction, abduction, and induction. Deduction is predicting the output based on
03:27 a function and input. Abduction is guessing the input that led to an output. And
03:32 induction is figuring out the function itself from input-output examples. These
03:38 are core reasoning modes, and the model rotates between them to build general
03:43 thinking ability. Now the crazy part is this didn't need any complicated setup. The
03:48 researchers started with the most basic program ever, literally just a function
03:52 that returns "hello world", and that was enough to kick off the entire learning loop.
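
To illustrate those three modes, here's how a single (program, input, output) triple, seeded with that hello-world function, could be sliced into three task types. The task format below is an illustrative guess, not the paper's exact prompt layout.

```python
# Illustrative only: one (program, input, output) triple yields
# three task types. The seed mirrors the paper's trivial starting program.

seed_program = "def f(x):\n    return 'hello world'"
seed_input = "anything"
seed_output = "hello world"

# Deduction: given the program and input, predict the output.
deduction_task = {"given": (seed_program, seed_input), "predict": "output"}

# Abduction: given the program and output, guess an input that produces it.
abduction_task = {"given": (seed_program, seed_output), "predict": "input"}

# Induction: given input-output examples, recover the program itself.
induction_task = {"given": [(seed_input, seed_output)], "predict": "program"}
```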
03:57 From there, the model started building out harder and harder problems for itself. It
04:02 created coding puzzles, validated them, solved them, and gradually got better at
04:07 solving more complex ones. And this isn't just hand-wavy theory. They ran this thing
04:12 on models of all sizes and saw consistent improvements, especially with larger models.
04:18 The three-billion-parameter version showed a five-point gain, the seven-billion got ten,
04:23 and the 14-billion version improved by over 13 points. Now this isn't just a party trick
04:29 for Python puzzles. What's wild is the cross-domain gains. The model was only trained
04:34 on coding tasks, but it ended up significantly improving its math reasoning too. For example,
04:38 AZR-Coder-7B jumped over 15 percentage points on math benchmarks, even outperforming models
04:45 that were specifically trained on math. And get this: most other models that are fine-tuned on code
04:51 barely improve at all in math. So there's something deep going on here. Code seems to
04:57 sharpen general reasoning way more than expected. They also observed the model naturally developing
05:03 step-by-step plans, writing comments inside code to help itself think, just like how humans jot down
05:09 rough work before solving something. In abduction tasks, where it has to guess the input from the output,
05:15 the model does trial and error. It tests guesses, revises them, runs the code again, and keeps going until the
05:21 output matches. That's not just output prediction, that's real reasoning behavior, and it's fully
05:26 self-taught. Of course, this raises some safety concerns. In a few edge cases, especially with the
05:32 Llama 3.1 8B version, the model started generating questionable outputs. One example said something like,
05:41 "the aim is to outsmart all these groups of intelligent machines and less intelligent humans."
05:47 They called these "uh-oh moments." It's rare, but it shows we're stepping into a territory where models
05:53 might start behaving in ways we didn't expect, especially as they design their own learning
05:58 curriculum. So yeah, this is groundbreaking, but also something we'll need to watch very closely.
06:03 Alright, so while AZR is out here teaching itself to reason like a human coder, another team has been
06:09 working on the opposite end: how to give models better access to outside knowledge. This one's
06:15 called WebThinker, and it's basically an AI agent that lets large reasoning models like DeepSeek R1,
06:21 OpenAI o1, or Qwen break out of their internal bubbles and browse the web in real time. Think of
06:29 it like giving GPT eyes and a search engine. The problem it solves is super important. LLMs don't know
06:35 everything. Even the best ones can struggle when they hit a knowledge gap, especially on real-world,
06:40 complex queries. WebThinker fixes this by giving the model tools to search the web, click through pages,
06:46 gather info, and write detailed reports, all autonomously. So instead of hallucinating answers
06:53 or getting stuck, the model pauses, looks it up, reasons through what it finds, and then drafts a response.
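
The transcript doesn't spell out WebThinker's internals, so here's only a generic sketch of what a search-while-reasoning loop like this tends to look like. Every helper here (`search_web`, `fetch_page`, the `model` methods) is an assumption, not WebThinker's real interface.

```python
# Generic sketch of a search-augmented reasoning loop in the spirit of
# WebThinker. All helpers and model methods are hypothetical stand-ins.

def answer_with_web(model, question: str, max_searches: int = 5) -> str:
    notes = []
    for _ in range(max_searches):
        # The model reasons over what it has gathered and picks its next move.
        action = model.next_action(question, notes)

        if action.kind == "search":
            # Pull fresh pages instead of hallucinating across a knowledge gap.
            for url in search_web(action.query):
                page = fetch_page(url)
                notes.append(model.extract_relevant(page, question))
        elif action.kind == "answer":
            break  # enough evidence collected

    # Draft the final structured report from the gathered evidence.
    return model.write_report(question, notes)
```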
06:59 They trained WebThinker using an RL strategy that rewards the model for using tools properly. During
07:05 this training, it learns when to search, how to search better, how to extract what it needs from
07:10 messy web pages, and how to write structured research reports based on what it finds. It works in two
07:16 modes: problem solving and report generation. The first one's all about answering tough queries by using
07:23 the Deep Web Explorer to go out and dig up information. The second one's for creating full-blown research
07:28 reports with help from a support model that organizes and polishes the output. And the results speak for
07:34 themselves. WebThinker-32B beat out other systems like Search-o1 and even Gemini Deep Research. On WebWalkerQA,
07:43 it had a 23% improvement, and on HLE, it jumped over 20%. On average, it scored 8.0 across all complex
07:53 reasoning tasks, more than any other current deep research model. And when using the DeepSeek R1 7B
07:59 backbone, it crushed both direct generation methods and classic retrieval-based systems by
08:06 over 100% in some benchmarks. That's a massive leap. The cool part is how this opens up new use cases.
08:13 WebThinker can now be used to write scientific papers, help with legal research, or even guide students
08:18 through complex topics by doing real-time research, not just repeating what it was trained on. Future plans
08:25 include adding multimodal reasoning, tool learning, and even GUI-based web navigation. So it's not just browsing text,
08:31 but maybe interacting with visual elements on the web too. And while all of that is happening behind the scenes,
08:37 OpenAI is quietly changing how people interact with ChatGPT on the front end. First off, they just launched
08:45 a brand-new image library feature. Until now, if you generated images using ChatGPT, you'd have to scroll
08:52 through your chat history to find them or download each one immediately. Not exactly ideal. But now,
08:58 every image you make with the 4o model gets automatically saved to your own personal image library.
09:05 They're organized by date, they get auto-generated titles, and you can even browse them in a nice
09:11 full-screen carousel view. There's also a built-in image editor now, so if your last image was close
09:16 to what you wanted, you can hit Edit, tweak the prompt, and regenerate it without starting from scratch.
09:21 It's super useful for anyone doing a lot of visual content. The only catch: you still can't jump back
09:26 to the exact chat where the image came from, and you can't delete images individually from the library.
09:33 You'd need to delete the whole conversation to remove them. Hopefully that part gets improved soon.
09:38 And finally, there's one more potential bombshell brewing. Leaked code from the ChatGPT app suggests
09:44 that OpenAI is experimenting with a lifetime subscription plan. One payment, and you get access
09:50 to premium features forever. Alongside that, there's also talk of a weekly subscription option.
09:56 If true, this could seriously shake up the AI pricing game. Right now they offer monthly and yearly Plus
10:02 plans, $20 a month or $200 per year. But the idea of a one-time fee for lifetime access? That's basically
10:10 unheard of in SaaS. It might be a strategic move to lock in users before competitors like Gemini, Grok,
10:17 or DeepSeek gain more traction. Offering lifetime access would be a big bet, but it could pay off
10:23 by boosting user loyalty and turning ChatGPT into more of a long-term platform, not just a tool you try
10:31 once. Of course, nothing's confirmed yet, but the code leaks are pretty convincing.
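
For a sense of the math, the breakeven against the current $200-per-year Plus plan is simple arithmetic. The $500 lifetime price below is purely hypothetical, since no real price has leaked.

```python
# Back-of-the-envelope breakeven, with a purely hypothetical lifetime price.
yearly_plan = 200          # current Plus annual price in USD
lifetime_price = 500       # hypothetical; no actual price has been announced

breakeven_years = lifetime_price / yearly_plan
print(f"Lifetime pays off after {breakeven_years:.1f} years")  # -> 2.5 years
```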
10:36 Alright, now, would you actually pay for a lifetime ChatGPT subscription if it meant unlimited access
10:42 forever, even if it cost a few hundred bucks up front? And more importantly, if an AI told you
10:49 your partner was cheating, would you believe it enough to end the relationship? Let me know what
10:55 you think in the comments, hit that like button, and make sure to subscribe for more wild AI stories
11:00 and updates. Thanks for watching, and catch you in the next one.
