  • 5/29/2025
Big moves in the AI world! OpenAI has officially dropped CODEX AGENT, marking a new chapter in AI development. Meanwhile, Manus AI rolls out an exciting upgrade boosting performance and usability. Plus, the latest from Claude 3.8 Sonnet brings advanced features and smarter interactions. πŸ”₯πŸ’‘ Stay tuned as we break down these major AI advancements and what they mean for the future of technology! πŸŒπŸ€–

#OpenAI #CodexAgent #ManusAI #Claude3 #AIUpgrade #ArtificialIntelligence #TechNews #AIInnovation #MachineLearning #AIDevelopment #FutureTech #AITrends #SmartTech #AIRevolution #TechUpdate #AICommunity #Innovation #NextGenAI #AI2025 #DigitalTransformation

Transcript
00:00All right, so a lot just dropped in the AI space and honestly it's one of those
00:06weeks where everything is shifting at once. We've got OpenAI finally launching
00:10Codex inside ChatGPT, Manus AI stepping into image generation in a way that
00:15actually feels intelligent, Google opening up more about its search future,
00:20and Anthropic silently laying the groundwork for what looks like their
00:24biggest Claude upgrade yet. There's a lot going on so let's talk about it.
00:29All right, so let's get into Codex, OpenAI's new software engineering agent that
00:35just dropped as a research preview. If you're using ChatGPT Pro, Team, or
00:40Enterprise you've probably already seen it pop up in the sidebar. Codex works more
00:44like a full-stack dev running inside a secure cloud-based sandbox. It operates
00:49entirely within its own isolated environment. No internet access, no
00:53external APIs, nothing leaking out. You just connect it to your GitHub repo and
00:58from there it starts handling actual engineering tasks without needing
01:03constant supervision. You can throw tasks at it like writing new features, fixing
01:08bugs, running tests, cleaning up messy code, or even digging through your code base to
01:12answer specific questions. It spins up a virtual environment, loads everything in,
01:17and then handles the task from end to end. That includes setting up and running
01:21tests, applying linters, checking types, the whole routine. And the best part is you can
01:26watch it work in real time. Terminal logs, test results, status updates, they're all
01:31right there so you always know what's going on. The model running behind all
01:35this is codex-1, which is a specialized version of OpenAI's o3, fine-tuned
01:41specifically for software development. They didn't just train it on general
01:45purpose data, they used reinforcement learning on real coding tasks, pull
01:50request patterns, and team workflows. The result is a model that writes clean,
01:54structured code, understands project layouts, and mirrors the way human
01:59engineers actually work. In internal benchmarks, it hit 75% pass@1 accuracy
02:06on SWE-bench Verified tasks. That's a noticeable jump over o3-high, which
02:11landed at 67%. And what's nice is it doesn't need a ton of configuration to be
02:17useful. Sure, you can give it AGENTS.md files to help it navigate your repo more
02:22efficiently, but even if you don't, it still figures things out. It respects your
02:27architecture, follows your naming patterns, and can juggle multiple tasks without
02:32stepping on its own toes.
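
For context, an AGENTS.md is just a Markdown file you check into the repo with instructions the agent reads before it starts working. A minimal, hypothetical example (the sections and commands here are purely illustrative, not a required schema) could look like:

    # AGENTS.md (hypothetical example)

    ## Setup
    - Install dev dependencies: pip install -e ".[dev]"

    ## Checks to run before committing
    - Tests: pytest -q
    - Lint: ruff check .
    - Types: mypy src/

    ## Conventions
    - Keep new modules under src/, mirroring the existing layout.
    - snake_case for functions, PascalCase for classes.
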
02:36Once Codex finishes whatever you've assigned, it doesn't just give you some output and call it done. It commits the changes
02:41directly within its sandbox and includes logs and references so you can trace
02:45exactly what it did and why. From there, you've got options. You can review the output,
02:50make tweaks, turn it into a pull request, or pull it down locally and keep working
02:54from there. If you prefer working from the terminal, Codex CLI is probably the
02:59better fit. It's the open source version you run locally, and now it uses Codex
03:03Mini by default. That's a smaller, faster model based on o4-mini, optimized for low
03:09latency workflows. It's great for everyday tasks, renaming variables, writing test cases,
03:14refactoring functions, stuff that takes time but doesn't require deep focus. You
03:20can leave it running in your terminal almost like a quiet assistant who's
03:24always ready to help out when things get repetitive. For API usage, pricing is
03:29pretty straightforward. Codex Mini costs $1.50 per million input tokens and $6 per
03:36million output tokens. Plus there's a 75% discount on cached prompts. So if you're
03:42repeating similar tasks, the costs drop significantly.
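
To put those rates in perspective, here's a quick back-of-the-envelope calculation in Python. Only the per-million rates and the 75% cached-input discount come from the pricing above; the token counts in the example are made-up numbers for illustration:

    # Rough Codex Mini cost estimate using the rates mentioned above.
    INPUT_RATE = 1.50 / 1_000_000    # dollars per input token
    OUTPUT_RATE = 6.00 / 1_000_000   # dollars per output token
    CACHED_DISCOUNT = 0.75           # 75% off cached input tokens

    def estimate_cost(input_tokens: int, output_tokens: int, cached_input_tokens: int = 0) -> float:
        fresh_input = input_tokens - cached_input_tokens
        cost = fresh_input * INPUT_RATE
        cost += cached_input_tokens * INPUT_RATE * (1 - CACHED_DISCOUNT)
        cost += output_tokens * OUTPUT_RATE
        return cost

    # Example: 200k input tokens (150k of them served from cache) and 20k output tokens.
    print(f"${estimate_cost(200_000, 20_000, cached_input_tokens=150_000):.4f}")  # roughly $0.25
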
03:48Codex is part of OpenAI's push to turn ChatGPT into a workspace built around task-specific agents. There's
03:54Operator for browsing, Sora for video, Deep Research for analysis, and now Codex for
04:00software development. Access to Codex is generous for now, though rate limits are
04:04coming. The idea is simple. You assign real coding tasks and Codex handles them
04:08while you stay focused. It's built to feel like part of your team, understanding
04:13your project, following your standards, and quietly taking care of what slows you
04:17down. Alright, now while OpenAI keeps expanding its agent lineup inside ChatGPT, over
04:23in China, something wild just happened. Manus AI, the autonomous agent from
04:28Monica, also known as Butterfly Effect AI, has introduced an advanced image
04:33generator that's on a completely different level. And it's not just another model that
04:37turns prompts into pretty pictures. It's a full-blown visual problem solver
04:41built right into an autonomous agent framework. Let's say you ask it for a
04:46modern Scandinavian living room. Manus won't just throw together random
04:51furniture. It first analyzes your intent. Are you designing a catalog, creating ad
04:55visuals, or drafting a room layout? Then it builds a strategy. It uses layout engines to
05:01arrange space, style detectors to match the look, and browser tools to pull design
05:06trends or brand guidelines. It might even select real IKEA furniture, consider
05:10spatial relationships, apply color theory, and ensure everything fits the purpose.
05:14The system is built on a multi-agent architecture, where separate modules
05:20handle planning, execution, and verification. They run independently but
05:24collaborate like a design team, allowing Manus to work through complex
05:28workflows, not just one-off prompts.
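
We don't get to see Manus's internals, but the planning/execution/verification split described here maps onto a familiar pattern. Here's a minimal sketch of that kind of pipeline in Python; the module boundaries, names, and the dummy image backend are all assumptions for illustration, not Manus's actual architecture:

    # Illustrative plan -> execute -> verify pipeline; not Manus's real API.
    from dataclasses import dataclass

    @dataclass
    class Plan:
        layout: str   # e.g. "open-plan living room, sofa on the north wall"
        style: str    # e.g. "modern Scandinavian, light wood, muted palette"
        purpose: str  # e.g. "catalog hero image"

    def make_plan(prompt: str) -> Plan:
        # Stand-in for the planning module: infer intent and build a strategy.
        return Plan(layout="open-plan living room", style="modern Scandinavian",
                    purpose="catalog hero image")

    def render(plan: Plan) -> bytes:
        # Stand-in for the execution module: a real system would call an image
        # generation backend here. Empty bytes keep the sketch runnable.
        return b""

    def verify(plan: Plan, image: bytes) -> bool:
        # Stand-in for the verification module: check style, layout, brand rules.
        return True

    def generate(prompt: str, max_attempts: int = 3) -> bytes:
        plan = make_plan(prompt)
        for _ in range(max_attempts):
            image = render(plan)
            if verify(plan, image):
                return image
        raise RuntimeError("could not satisfy the plan within the attempt budget")

    image = generate("a modern Scandinavian living room")
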
05:33That's why it can deliver things like product campaigns, architecture mock-ups, or platform-ready visuals, all consistent,
05:38brand-aware, and usable. It's already being tested for e-commerce, product
05:42visualization, marketing content, and even architectural planning, like generating
05:47full interiors from blueprints. The big limitation? It's still in closed beta and
05:53available by invitation only. So, unless you're part of a select test group,
05:57you can't use it yet. Now let's shift to Claude. Anthropic's been relatively quiet,
06:02but behind the scenes, they're cooking up something huge. There's been a bunch of
06:06internal leaks about a new model possibly named Claude 3.8 or Claude 4. We're seeing
06:12names like Neptune appear in their config files. And yeah, Neptune being the eighth planet is
06:18probably not a coincidence if we're talking versioning. Publicly, they denied the rumors
06:23after that win four months of mass contest went viral, but come on. There's too much back-end
06:27evidence pointing to something real. One of the standout leaks even showed internal tools with
06:32redacted model names and Easter eggs tied to upcoming versions. And The Information confirmed that
06:38Anthropic is prepping upgraded versions of both Claude Sonnet and Claude Opus. The big thing with this
06:45new wave of Claude models is what Anthropic calls true agentic behavior. That means the model can
06:52switch autonomously between reasoning and acting without user prompts. It doesn't just generate
06:57an answer in one shot. It breaks down the problem, builds a plan internally, then switches into action
07:03mode to call tools, search data, or run code. If something goes wrong mid-task, it backtracks,
07:09rethinks, and tries again.
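
None of this is confirmed implementation detail, but the behavior being described, interleaving internal reasoning with tool calls and retrying when something fails, is easy to picture as a loop. A rough, hypothetical sketch (the reason/act functions are placeholders, not Anthropic's API):

    # Hypothetical reason -> act -> observe loop with backtracking; placeholder logic only.
    def reason(goal: str, history: list) -> dict:
        # Stand-in for the model's planning step: decide the next action, or finish.
        if history:
            return {"done": True, "answer": f"answer to {goal!r} after {len(history)} steps"}
        return {"done": False, "tool": "search", "args": {"query": goal}}

    def act(action: dict) -> dict:
        # Stand-in for a tool call (web search, database query, code execution).
        return {"ok": True, "result": "observation from the tool"}

    def agent(goal: str, max_steps: int = 10) -> str:
        history = []
        for _ in range(max_steps):
            action = reason(goal, history)
            if action["done"]:
                return action["answer"]
            outcome = act(action)
            if not outcome["ok"]:
                # Something went wrong mid-task: record the failure so the next
                # reasoning step can backtrack and pick a different route.
                history.append(("failed", action))
                continue
            history.append(("observed", outcome["result"]))
        raise RuntimeError("ran out of steps before reaching an answer")

    print(agent("summarize the latest Claude release notes"))
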
07:15That's a real agent, one that reasons like Gemini, but potentially with more precision in task delegation. And this actually mirrors what OpenAI's o3 model already does inside
07:21ChatGPT, where it can browse, run code, and iterate before showing you the final result. But Anthropic's take
07:27might offer better transparency or control depending on how they deploy it. For example, with the upcoming Claude
07:33update, developers might see the full breakdown of thoughts, tool calls, and revisions in the
07:39background, not just the polished final response. And they're not stopping there. Anthropic is also
07:45investing in making these agents work better with complex tool chains, possibly building integrations
07:50with search, databases, and APIs all inside one flow. That's a direct response to what Google's doing
07:56with its own AI Mode inside Search. CEO Sundar Pichai was recently on the All-In podcast and the big
08:03question came up whether Google is being disrupted by ChatGPT, Perplexity, and other AI-native tools that
08:11are rapidly eating into traditional search behavior. Pichai didn't seem rattled at all. His take was that
08:18disruption isn't inevitable unless you ignore it. He sees it more as a shift, one that Google is
08:25actively adapting to rather than resisting. And the numbers show they're already moving. Over 1.5 billion
08:32users have engaged with Gemini-powered AI Overviews inside Google Search. These aren't just summaries
08:40or snippets. It's an AI layer baked directly into search results designed to give more context, answer
08:47follow-up questions, and reduce the need to click through multiple pages. It's a way to keep users in
08:52Google's ecosystem while still giving them something closer to an AI chat experience. But they're not
08:59stopping there. Google is preparing to launch something called AI Mode, which will turn search
09:04into a full-on conversational experience. It's not just query and result anymore. You'll be able to ask a
09:10question, get a response, follow up with more context, refine your query, and get deeper answers
09:15all inside the search interface. Basically, it turns search into a Gemini-powered assistant with memory
09:22across turns. And this isn't some distant roadmap item. It's already been confirmed and will be showcased
09:28in more detail at Google I/O. That said, Google's position isn't bulletproof. Apple recently signaled
09:36it may replace Google search in Safari with a more AI native system, possibly its own or powered by
09:42another provider like OpenAI. That kind of move could hit Google hard, especially on mobile, where
09:47Safari holds massive market share. The moment that news surfaced, Google stock took a noticeable hit.
09:54Investors clearly understand what's at stake here. Still, Pichai isn't new to these shifts. He pointed out
10:00that people had similar concerns when mobile search took off and again when TikTok started pulling younger
10:06audiences away from YouTube. In both cases, Google adapted, integrated features from rising platforms,
10:12and kept their core products alive. Pichai's betting they can do the same again, this time by making
10:17search smarter, more conversational, and more useful than anything that exists in AI chat apps today.
10:25So while yes, Google is under pressure, especially from Apple and OpenAI, they're not standing still.
10:31They're building, integrating, and reshaping how search works to stay relevant in an AI-first world.
10:37Now the question is, with Claude, Gemini, Codex, and Manus evolving fast, who's actually building the
10:43smartest agent right now? Let me know in the comments, drop a like if this gave you something to think about,
10:50and subscribe if you want to stay ahead of where all this is going.
10:53Thanks for watching, and I'll see you in the next one.
