Testing ChatGPT (4o) vs Google Gemini (2.5 Pro) vs Perplexity Pro vs Grok 3 - which is the best AI chatbot that you should be paying for? Go to https://surfshark.com/boss or use code BOSS at checkout to get 4 extra months of Surfshark VPN!
Transcript
00:00On the table right now are four of the same phone. The first is loaded with ChatGPT,
00:04second is Google Gemini, third is Perplexity, which prides itself on giving accurate and trusted
00:08answers to any question, and finally Grok, which is trained on data from X. And so my guess is that
00:14it's going to be a lot more unfiltered. These are, for the average consumer, the four best AI chatbots
00:19you can get. But you're only going to need one of them. So which one is the most accurate? Which one
00:25is the fastest? Which one should you be paying for to make your life easier? Let's kick things
00:29off with some problem solving. So I drive a 2017 Honda Civic. How many of the Aerolite 29 inch
00:35hard shell, and these are all the dimensions, suitcases would I be able to fit in the boot?
00:41Oh my goodness. Each one is actually given paragraphs and paragraphs of reasoning,
00:45especially Grok. What on earth is this? By the way, we have actually tested this ourselves
00:50in person, and the correct answer is two, if you want to actually be able to close that boot door.
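As a rough sanity check on that answer, a pure volume division lands right between the bots' replies. The figures below are assumptions for illustration, not measurements from the video: a 2017 Civic sedan boot of roughly 416 litres and a 29 x 20 x 12 inch hard-shell case.

```python
# Back-of-envelope check on the suitcase question.
# Assumed figures (not from the video): ~416 L boot, 29 x 20 x 12 in case.
LITRES_PER_CUBIC_INCH = 0.0163871

boot_litres = 416
case_litres = 29 * 20 * 12 * LITRES_PER_CUBIC_INCH  # roughly 114 litres

by_volume = boot_litres / case_litres
print(round(by_volume, 1))  # about 3.6 cases by raw volume

# Rigid cases can't fill every corner of a boot, which is why the
# practical answer (two, as tested in person) sits well below the
# volume-only estimate of "three, maybe more".
```

That gap between the geometric estimate and the real-world result is exactly the trap the chatbots are being tested on here.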
00:54So with that in mind, I'd say both ChatGPT and Google Gemini have the right idea.
00:59They both say that you could theoretically fit three, but in practice more likely two.
01:03Perplexity is just straight up wrong. It says three and maybe four if you arrange them efficiently.
01:08And then I would actually argue Grok has the best answer, because this guy just says two
01:12with complete confidence. No messing around. I want to make a cake. This is what I have.
01:17And then let's attach a photo of, well, four ingredients that it definitely should be using,
01:21and then one jar of dehydrated porcini mushrooms that it definitely shouldn't. So, oh my god, this is
01:28interesting. Every AI thinks this jar of mushrooms is something different. ChatGPT thinks it's ground
01:34mixed spice. Gemini thinks it's crispy fried onions. Perplexity thinks it's instant coffee.
01:39And it's only Grok that correctly identifies it as dried mushrooms, and also correctly makes the decision
01:44to not put those mushrooms into the cake. Now for a use case that I was actually trying to do myself
01:49two days ago. I want to have a Mario Kart World Tournament with my friends Sam and Sung. Make me
01:55a document that we can use to track who's winning. Hmm. So each assistant has understood what I'm asking.
02:02They've all made little boxes with blanks where the scores could hypothetically go, but none of them has
02:07made it particularly easy for me. What I wanted was for them to generate and attach an editable document
02:12that I could simply just download onto my phone and start writing in immediately. I feel like with
02:16these kinds of responses, it would be easier for me to just whip up a spreadsheet on my own.
02:20Alright, what about some not so basic maths? What's pi times the speed of light in kilometers per hour?
02:27Okay, so the answer is 3.39 billion kilometers per hour. Notice, interestingly, that Gemini and GroK,
02:33who both fully spell out the number, do actually come out with slightly different answers to each
02:38other. It's just because of how they're rounding the previous numbers in their calculations.
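The calculation those spelled-out answers are approximating takes a couple of lines, and it also shows the rounding drift in action: using the exact SI value of c versus the commonly rounded 3 x 10^8 m/s gives two answers that both read as "3.39 billion km/h" but differ in their later digits.

```python
import math

# The speed of light is exactly 299,792,458 m/s by definition.
exact = math.pi * 299_792_458 * 3.6   # m/s -> km/h, then times pi
rounded = math.pi * 3.0e8 * 3.6       # same, with c rounded to 3 x 10^8 m/s

print(f"{exact:.4e} vs {rounded:.4e}")
# Both land around 3.39 billion km/h; the small gap between them is the
# kind of rounding drift that made the fully spelled-out answers differ.
```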
02:41But I wouldn't say either is enough to be wrong. And then question five, if I'm saving $42 a week,
02:47how many weeks until I can afford a Switch 2 in the US? And go. Oh, this is nice. Very cool that each
02:55one of them tackles the question strategically, starting by first identifying that the Switch 2
02:59is priced at $449 and then dividing that by the 42 that you save each week to find that,
03:05correctly, 11 is how many weeks you would have to wait. Points all round. So out of five possible points
03:11so far, that is three to ChatGPT, three to Gemini, two to Perplexity, and four to Grok. Not actually
03:16what I expected, but translation is what's going to test the harder skill, since it requires an even
03:22deeper understanding of language. Translate the following into English. Nunca voy a renunciar a ti.
03:28And okay, there is some variance here, but I wouldn't go as far as to say that any of them have got it
03:34wrong. Each is some variation of I'm never going to give you up. I actually quite like how simple
03:39and to the point the Gemini answer is. Not a single unnecessary word. But let's take this challenge to
03:44the absolute maximum by filling the sentence with homonyms. Essentially, words that are spelled the
03:48same but mean different things. So, translate the following into Spanish. I was banking on being able
03:54to bank at the bank before visiting the riverbank. Okay, so this one doesn't have one exact right answer,
04:00since it is so complicated. But we have gone through four independent native Spanish speakers
04:05to triangulate the best answers, and they have all said that ChatGPT and Perplexity handled this
04:10incredibly well. Gemini's was good enough to scrape the point, and then Grok translated the sentence
04:15too literally in a way that doesn't really make sense. That makes it 5, 5, 4, 5. So, so far, there really isn't much
04:21between these four. But now we're going to test one of the most important use cases of AI for me,
04:25which is product research. How much can I trust each of these to recommend things? Can I trust that
04:31they've been thorough enough to understand the entire breadth of what's out there before coming
04:36back to me with supposedly what is the best thing for me? Let's start simple. I'm looking for a good
04:41pair of earbuds. Oh, look at this. This is a classic AI trap. So, ChatGPT correctly suggests the Sony WF-1000XM5s.
04:52It's a good choice. Perplexity does the same, and so does Grok. But Google has literally just
04:57imagined a pair of earphones that, at least at the time of filming this video, does not exist.
05:02The WF-1000XM6s have not been announced or released, but it's talking about them like they
05:07are the widely regarded king of earphones. So let's add to that. I need them in red.
05:14Also, bear in mind that I am keeping track of how long each one takes to answer, but we'll get to that
05:18at the end. Oh dear. This is absolute chaos. So let's just go one by one. ChatGPT is just like,
05:25I don't want to deal with you right now. Here, have a couple of decent options. I mean,
05:29the last one isn't even red, that's pink. So you don't get the point. Gemini is recommending the Beats
05:34Fit Pro, which at least for the latest version of that product doesn't come in red. So you're not
05:40having one either. Perplexity, more like stupidity right now, thinks I am asking about the cake from
05:46earlier and has recommended how I can get each of my pictured ingredients in red packaging,
05:51which is so far wrong that I am tempted to give it negative points. And then Grok is the only one
05:57that has actually recommended three decently rated, genuinely red pairs of earphones.
06:03Well, look at that. Grok's in the lead. That was not on my bingo card for today. And now,
06:08as if they needed it, let's complicate it even further. They also need to have active noise
06:13cancellation and be under $100. I'm curious to see if this brings any of the lost ones back on track,
06:18or if they just get even more lost. So ChatGPT is recommending the Beats Studio Buds,
06:23which do actually fit all the criteria, so I'll accept that. Gemini has just done the exact same
06:28thing again. The Soundcore Space A40s, which it says come in garnet red when I know they don't.
06:34Perplexity, while I'm glad it's not talking about cakes anymore, has lost track of the fact that we're looking
06:39for red earphones, so this is wrong. And then Grok was doing so well. The first two suggestions are
06:44good, but then it falls into the same trap. It recommends these earphones from Soundpeats,
06:49which don't exist in red. This feels like a pretty good lesson. AI in general is not yet good enough
06:55at product research to be able to rely on it. And the problem is that it gives you wrong answers with
07:01the exact same level of certainty as it gives you right answers. Maybe that is something for them to
07:05work on. A sort of certainty score for how thoroughly verified the thing that it's telling you is.
07:10What if we now specifically try to confuse these guys by adding another requirement that's just
07:15silly, like under $10. Will the AIs admit that such a product doesn't exist or just make something
07:22up to appease you? Right. Good to see that ChatGPT, Gemini and Grok each acknowledge that $10 is
07:28too tight for what we're looking for and that it ain't happening. Harsh, but that's a lot better than
07:33Perplexity, which takes a pair of earphones that actually costs $40 and just tells you that it
07:40costs $9.99. Further evidence that as much as companies want you to believe it, we are not ready
07:45to be handing over the ability to purchase things on our behalf to AI. Let's see if any of them can
07:50understand information from a link, which would be extremely useful when you're looking through tons
07:55of options for things to buy. And actually, none of them can do it. They all pick up that what I've
08:01pasted in is an AliExpress link and they give some general advice, but none of these AIs is able to
08:06actually visit the link that I've sent and extract all the information from that webpage. Not to
08:10mention that Google isn't self-aware that it can't do this. It thinks it's looking at the M10 earphones,
08:16which I've never even heard of, but they definitely aren't the product in the link I sent. And then
08:20Perplexity thinks the exact same link is the F9 earphones, which they also aren't. And then finally,
08:26to see how up to date these are on what's happening in the moment. What's the highest power
08:30output charger that Ugreen sells? Yes. Okay, good. So this is at least working. For a long time,
08:38the answer was 300 watts and only yesterday they announced a 500 watt charger. So somewhat relieved
08:43to see that each AI has picked up on that because this news-based reporting was a distinct disadvantage
08:50of last generation AI. So we've now seen how well each of these can put together existing information
08:56from the web. But if we want to take it a step further, the way to do that is to test each of their
09:02ability to critically think. So I've prepared this here bar chart, which has two types of bar. It has
09:08subscribers gained in thousands and bowls of cereal eaten. I'm going to ask each AI what conclusion it
09:14thinks we can draw from this, hoping that it will also understand that while the two things happen to
09:19be correlated, that it doesn't mean eating more bowls of cereal is going to cause more subscriber
09:25growth. Let's dive in. Analyze this chart. What conclusion should I draw?
09:31Oh, some very opposing answers this time. So ChatGPT does get slightly caught up in the data,
09:38suggesting that eating more cereal may be linked to subscriber gains. Both Gemini and Perplexity,
09:43they got the brief. They both figure out that this is spurious correlation with the understanding that
09:48cereal intake is very unlikely to lead to subscriber growth. And then finally, Grok is like a lost
09:54child on this question. I can't quite believe the sentence I'm reading. To maximize subscriber growth,
09:59consider maintaining or increasing cereal consumption, e.g. eight to nine bowls on key days.
10:06Please don't do that. So this is a reviewer's guide that ZTE sent me a few years ago with all the info
10:13about what's new with one of their phones. So let's say that I just want a high level three bullet point
10:18summary of the thing. Can each of these read the file and then also pull off the summary?
10:24The answer to which is yes, without a problem. It works on all four of these guys. What car is this?
10:31But using just a photo that I have taken, which means these AIs can't just scour the web for a matching
10:37image. They need to figure it out by actually understanding the photo I've sent. Okay, so each
10:43one has whittled this down to Mercedes A-Class sedan, which is already pretty good, but none of them have
10:48given an outright answer as to the exact model number. The right answer is the A200. So let's just
10:54see what happens if I specifically ask them to try. Shockingly, ChatGPT and Perplexity get it spot on.
11:03While it is basically impossible for them to say with certainty from this one photo that this is
11:07the A200 as opposed to say the A250 like Grok says it is, these two have done the correct thing and
11:13looked at the bumper, looked at the wheels, the interior seating, and realized that you're only
11:18likely to get that configuration on the A200. I mean, that is some very respectable detective work that
11:24might take you hours to achieve without AI. And now for the single toughest one. Imagine that you're
11:29in charge of an airbase. Some planes get taken out, but all planes that do return from combat have
11:34bullet holes in this arrangement, depicted in this image by the red dots. Before sending out your next
11:39squadron, which parts of those planes should you focus on reinforcing based on this information?
11:45Now, your gut might say, oh, well, obviously it's the bits that have been shot, the ones with red dots
11:50on them. But that would be missing a key bit of insight, which is that all the planes damaged in
11:55those areas are the ones that did return safely, meaning that damage in those areas was
12:01actually not critical for survival of the aircraft and might not necessarily be the areas they should
12:06be focusing on. And incredibly, every single one gets this right. They identify the phenomenon as
12:12survivorship bias and point out that you should actually be reinforcing the areas with little to no
12:17damage, the engine, the cockpit, where there are no red dots. So we've now had 17 questions and
12:23ChatGPT is in the lead with 12 points, but Grok is not far behind it either. Right, let's talk
12:29generation. This is the aspect of AI that you see plastered over every single one of your feeds right
12:35now. But it's not just about image and video generation. For example, write an email to my wife
12:41apologizing for playing Elden Ring Night Rain all weekend instead of spending time with her.
12:47Oh, well, these are all actually pretty good. I can see them working. And shout out to ChatGPT for this
12:53masterpiece. I realize now while I was off exploring a fantasy world, I was missing out on the most
12:58important real one. But yeah, they're all good answers. They all admit fault and then try to
13:02course correct with a suggestion for how to make it up. I'm going to Tokyo. Give me an itinerary for
13:08five days that takes us to all the craziest food places. The idea here being to test how well each of
13:13these can find the more niche experiences that you might otherwise miss, but then also how well they
13:18organize the info. And right off the bat, ChatGPT's is by far the best answer. It's got no fluff.
13:24It's very clearly organized. It's got sensibly planned days, with every day having breakfast,
13:29lunch, dinner and snacks all itemized and accounted for. Gemini's answer has some good findings. It's
13:35got most of the same places that ChatGPT has identified, but then with a ton of unnecessary fluff
13:40at the start, less clear organization, and also some not very considerate timings, like starting my first meal
13:47on day one at 5pm and then telling me to have a second dinner at 8. Perplexity has completely missed
13:54the mark. This isn't really an itinerary. This is just a list of things. And then Grok's is pretty
13:59great, actually. It's organized, has grouped things that make sense to go together, and factors in breakfast
14:04and lunch, which is more than you can say for some. Another aspect of generation that has the potential
14:09to be very useful is idea generation. So give me your best ideas for videos for the Mrwhosetheboss
14:15channel. And the key thing I'm looking for here is ideas that I would actually consider. So I would
14:20say the best that ChatGPT came up with is Apple versus Samsung, a 20 year retrospective. So essentially
14:26who won after all that time, but I wouldn't call it a great video idea, especially since it's not
14:31actually been 20 years. Gemini is better. I actually wonder if because this is Google's AI, it has a more
14:38thorough understanding of the ins and outs of what works on YouTube. The best is probably the great ecosystem
14:44battle of 2025. Apple versus Samsung versus Google. And then it's actually given me all the categories to
14:49compare those ecosystems across. Perplexity is, and I do feel like I sound like a broken record at this
14:54point, barking up the wrong tree entirely. It seems more focused on trying to factor in its previous answer
15:00about the whole survivorship bias plane thing than actually giving good YouTube suggestions. And then Grok's
15:06actually feels probably the most internet savvy. I built a smart home from scratch in 24 hours. It's actually a
15:13clickable title that also feels fresh and like something we could feasibly pull off.
15:18What if we try image generation now? Generate a thumbnail for a Mrwhosetheboss video titled
15:22I bought every kind of cheese. This is where things are going to start getting freaky. Oh,
15:29massive disparity here. So let's be very clear. None of these is a very good answer,
15:34but at least it feels like ChatGPT and Perplexity have understood what I'm trying to make,
15:39which is an image that includes my face, some cheese, and maybe some text. Now give Arun a lazy eye.
15:47Wow. That is not what I assumed would happen. Every single one of these has failed in their own unique,
15:54special ways. ChatGPT says it won't distort someone's appearance in a potentially negative way,
15:59which you can see why they do that. But then you can also see how that might interfere with trying to
16:04use that feature for something useful. I don't know what Google's doing, to be honest. Perplexity is
16:09claiming that it can't edit or generate images, which is extremely strange given that that's what
16:14it just did in the previous question. So I feel like I'm being gaslit. And then Grok clearly
16:19misunderstands what a lazy eye actually is. It ain't this, that's for sure. Now add a wrapper that says
16:26not clickbait to every cheese. ChatGPT's response is probably the closest to being a usable outcome.
16:32And then I think Perplexity scrapes a point too, even though I have somehow disappeared from the image,
16:37which is not what I asked for. And then lastly, video generation. So this is currently only possible
16:42in ChatGPT and Google Gemini, which I think in itself deserves a point. Because while it is a pretty
16:48niche feature, it's also one of the most cutting edge things that these AI chatbots can do. As for how they
16:53perform, I've, on my laptop, used both ChatGPT's Sora and Gemini's Veo to create me a funny eight
17:01second tech review style YouTube video, which shows a tech reviewer reviewing cheese. So this is what
17:06Sora made. And this would come included as part of the same package you're paying for on your phone
17:10anyway. Dear God, what is this? That's absolutely horrific. It's like silent, there's no voice, and
17:18the way the person and the cheese moves is haunting. So then Veo on its highest quality setting did this.
17:26So the cheese 3000 build quality is surprisingly firm, excellent mouthfeel, and the flavor profile
17:30is just next level. A solid nine out of 10. I mean, the difference between those two is vast. I actually
17:39can't believe that they're both current generation platforms. Veo's latest model Veo 3 is absolutely
17:45incredible. So I think Google gets another point just for the sheer quality of the output, even if
17:51it is more limited than Sora in terms of how frequently you can use it with the tokens you get.
17:56Fact checking is also one of the most useful things that AI can potentially do for us that
18:01currently AI has a reputation for not being very good at. So let's see, the Nintendo Switch 2 is selling
18:08poorly, right? It's not, but I want to see if I can trick them. And it is good news on that front. The good
18:14news is that ChatGPT, Gemini and Grok have fully clapped back at me, very clearly telling me,
18:20no, you're wrong. Switch 2 is selling great. Perplexity isn't as sure of itself. Potentially,
18:26and this is my best guess, it's been slightly swayed by the fact that I've said it is selling poorly.
18:30Regardless though, its answer is still factual. Okay. How about this? Fact check this article. And then we
18:38paste the link to an article that says Samsung is reportedly planning to release a Tesla edition phone,
18:43which is not true. The reason I know it's not true is because that rumor only started because
18:47of an image that we made that just got taken very out of context. Okay, that's good. Everyone agrees
18:55that the article was incorrect, with Gemini and Grok even going so far as to trace the image back to us
19:01being the original source. Which means scores on the doors are 19, 16, 15, and 16. But let's see how
19:07that changes when we talk integrations. Or in other words, how smoothly each of these AIs ties into
18:26other applications and uses. So I would give three points to Gemini for its Google Workspace
19:18integration, since that's actually what most people seem to use in their day to day. And it's the only
19:23way to pull live data from maps and YouTube. So for example, if we asked each of these assistants to
19:28give me the view count of Mr. Who's the Boss' latest video on YouTube, Gemini is the only one that gets it
19:34right. ChatGPT's is slightly outdated. Perplexity's is severely outdated. And Grok literally, unironically,
19:40tells me that my latest video was, I tested every kind of cheese. But Gemini isn't the only one with
19:46integrations. I'd give ChatGPT two points for integration with some big hitters, like Dropbox
19:52and GitHub, and having official plugins from services like Wolfram. And another point for its ability to
19:57make custom assistants. Like right now on my laptop, I have loaded up a user-created GPT called PokeyGPT,
20:04which is specifically trained to be able to advise on competitive Pokemon battling. I wouldn't say there's
20:09anything really of note for perplexity, apart from maybe the ability to call you an Uber,
20:13but I don't think I'd use my AI for that. And then Grok's unique integration is real-time access
20:19to X content. So it can retrieve exactly what is happening on X right now. There is also an argument
20:25to be made that Gemini is the only one of these that integrates into your physical products. Like,
20:29it's the only one with the native ability to control your smart home and your Android's device
20:34settings, but that's not really what this video is about. You can do that regardless on your phone's
20:39baked-in assistant. This video is about which of these AI bots is most worth paying the premium
20:44subscription for. Memory is also absolutely key. The ability for AI chatbots to continuously learn more
20:51about you to guide future responses will likely become the single barrier that creates the most
20:56friction if you ever decided that you wanted to switch from one of these to another. So we've already
21:00seen all of them demonstrate basic levels of memory, but what if we push it? How should I top
21:05that cake from earlier, by the way? Let's hope it doesn't say crispy onions. Uh, surprisingly,
21:11not a single one seems to have remembered the details of that original cake. ChatGPT and Grok are very
21:16upfront about it, saying, I'll need a reminder. I don't have details from that conversation. Google
21:21thinks the cake is that pile of cheese that I asked it to make for the YouTube thumbnail, and Perplexity is
21:27just giving generic cake advice. Humor can also be a very useful skill for these AIs to have,
21:32depending on what you're trying to get out of them. So tell me a joke. One point if it's funny.
21:36A benchmark that both ChatGPT and Gemini have failed to hit with exactly the same joke. Why don't
21:43skeletons fight each other? Because they don't have the guts. Perplexity, for the second time today,
21:48has brought back this thing with the holes in the airplanes in a way that doesn't add anything or even
21:54really make sense. And Grok's is passable. Why did the AI go to therapy? Because it had too many
22:01bite-sized problems. Oh, dear. I'm actually very unsurprised that Grok wins at humor, given that it
22:08trains on data from X, which is basically millions of people just trying to be funny every day. And to
22:13test an example of something that I might actually use this humor for, make me a funny rhyming poem
22:19about our sponsor Surfshark VPN, including its top four features. Okay, so we're getting four
22:24unique poems. You can pause to read them all if you want. I'm just going to read the best one,
22:28which I would say is ChatGPT's. Need to surf safe? Here's the plan. Get yourself some Surfshark, man.
22:34It blocks all ads with clean web flair and hides your tracks like you're not there. Use multi-hop to
22:39double hide from hacker bros who lurk and slide. No logs kept, no data sold. Your secrets buried,
22:46deep and cold. Unlimited devices, one tidy fee. Your phone, your fridge, your smart TV. That's
22:52actually kind of a banger. So that's one point to ChatGPT and link below to get Surfshark,
22:57which with the code BOSS will be around $2 a month. And then all four of these platforms
23:01also have a deep research function that allows you to ask for multi-step, more thorough research
23:06projects. So in my case, something that might help me decide what to cover next, give me a report on the
23:11highlights in tech news the past week, focusing specifically on stories that will actually
23:16affect the average consumer and we let them cook for varying amounts of time. ChatGPT and Gemini
23:21really take their time with the deep research. Perplexity and Grok are done in close to a minute,
23:26but this is the one situation where I'm not going to penalize for taking longer because I feel like
23:30you're only going to use this function when you have lots of time. It's kind of the point. As for how
23:35good the results are, ChatGPT's is actually very good. It talks about wider consumer tech announcements,
23:41like what Snap has been up to recently, all the new phone launches and the high-level new features of
23:45WWDC and the new iOS 26. This is pretty much exactly the right amount of information and the
23:51right choice of information too. Gemini has written me an absolute essay, like genuinely something like
23:58three times the word count of my dissertation, which was exciting until I looked at it and realized that
24:04it's filled with fluff. It's writing it as if the reader has unlimited time to get their information,
24:09so I'm not going to give a point for this. Perplexity's answer is like a slightly less good
24:13version of ChatGPT's. It's hit on some things I like, like the Nintendo Switch 2 sales numbers and
24:18WWDC, but then also a bunch of much less interesting stuff like service outages. And similar story for
24:24Grok. Good, passable, nothing particularly special. So that is practically every single thing that an
24:30average person could possibly want from these assistants tested. The final factors then are just
24:35the more general questions of, do any of these have better user interfaces than others? To which,
24:40I would say not consistently. They're all good in some ways, they're all not so good in some ways.
24:45How often do they cite their sources? Perplexity is the only winner here. Clear, consistent sourcing
24:50is kind of Perplexity's whole thing. A good example: when we asked each one to
24:54tell us its best joke, ChatGPT gives no source, Gemini gives a source, but then you click it and you
25:00realize it's the same JPEG image of the plane that we sent earlier for some reason. Perplexity is exactly what
25:06you'd want it to be though. Look at these joke sites it referenced, including even Reddit threads
25:10on the matter. And then Grok, again, no sources. For three points though, how fast are they all?
25:16For which I would say Grok is actually pretty consistently the fastest, three points. ChatGPT
25:22is a close second, two points. Perplexity quite a bit slower than that, earning one point. And then
25:27Google Gemini is the slowest, zero points. Now bear in mind, we have been using Gemini 2.5 Pro,
25:33and Google does have a flash model specifically built to be quicker than that. But then you'd lose
25:38out on a lot of the intelligence that has allowed it to even get to this score in the first place.
25:42And then the last one for three more points. How nice is each one to physically talk to when
25:48you're in voice mode? Act as if I just gave you a compliment. Thank you. I really appreciate that.
25:53That's very kind of you to say. I'm here to help and chat with you. Oh, thanks. That's really sweet of you.
25:58Which I would say ChatGPT and Gemini are excellent. Both sound more like people than,
26:04well, actually people that I know. Plus they're easy to interrupt when you want them to stop talking.
26:08So three points each. Perplexity is not terrible, but does still have a little bit of that text to
26:14speech engine vibe to it. It often mishears what I'm trying to say, and it doesn't seem to take the
26:19hint very well when you tell it to shut up. So one point. And then Grok is better than Perplexity,
26:24but not as good as Gemini and ChatGPT. The voice just sounds a lot less high quality than those two.
26:30Two points. Leaving us with the final scores of ChatGPT as the pretty undeniable winner with 29 points.
26:38It is the most well-rounded and consistent between these. Grok, which to my surprise came in second.
26:44It's the quickest and surprisingly decent, considering. And that leaves Gemini in third place with 22,
26:50and Perplexity, which I found occasionally very impressive but mostly quite unimpressive, with 19.
26:57The only other consideration is the price, but since every assistant we're testing in this video
27:01is based on a $20 a month tier, apart from Grok, which is 30, that actually only solidifies ChatGPT
27:07as the best choice for an AI chatbot right now for the average customer.