- 7/5/2025
Unlock the power of Generative AI in this dynamic lecture! Learn what generative AI is (creating content like text and images) and explore language models, including LLMs (Large Language Models) and SLMs (Small Language Models). Dive into the Azure OpenAI models and the Azure AI Foundry model catalog to see how these tools transform tech. Perfect for mastering AI innovation!
Explore our other Courses and Additional Resources on: https://skilltech.club/
Stay updated with real-world tech skills
Follow us on other social media
LinkedIn: https://www.linkedin.com/company/skilltechclub
Instagram: https://www.instagram.com/skilltech.club/
Facebook: https://www.facebook.com/profile.php?id=61572039854470
YouTube: https://www.youtube.com/@skilltechclub
Reddit: https://www.reddit.com/user/skilltechclub/
X: https://x.com/skilltechclub
#GenAI #MachineLearning #AI #ChatGPT #SkillTechClub #Copilot #AzureAIFoundry #OpenAI
Category: 🤖 Tech

Transcript
With this, let's move forward to the next and important slide, which talks about generative AI. If we ask what generative AI is, which is the final dot we need to connect, we first need to understand something known as a language model. Let me tell you one thing: "language model" is the common term used for all the machine learning models that power generative AI. Now, some people also ask me: why do we call these models language models even though they can generate natural language text, images, programming-language code, and even videos? Well, the answer is that a language model is called a language model because you ultimately provide the input as natural language text; it always takes its input in natural language.

You can see some examples here. Say I give a natural language input like "Write a cover letter for a job application." When I do that, it understands that the response I'm expecting is a cover letter, so it gives me a response in language, tailored to that request. On the other hand, if I say "Create a logo for a florist business," it gives me an image, a logo for the florist business. Remember, the response can come in various formats, but the input is always natural language text, and that's why we call it a language model. The same goes for programming languages: I can say "Write Python code to add two numbers," and it generates that kind of logic for me. In all these cases, the input text you provide is known as a prompt, and it is always natural language text; that's why these models are known as language models.
Now, at this point, I want to ask you a simple question based on language models. I'm sure you all know ChatGPT, and if you do, does anyone know what GPT stands for? Just pause this video for 5 to 20 seconds and try to think about the answer. Do you know what GPT stands for? I know you've all used ChatGPT, but do you know what the letters mean?

Well, maybe some of you have guessed it. In GPT, G stands for Generative, P stands for Pre-trained, and T stands for Transformer. It is generative because it's a generative AI model; it is pre-trained because it has been trained using machine learning; and then there is the T, Transformer. The obvious next question is: why on earth do we call this thing a transformer? Yes, it's a generative model; yes, it's a pre-trained model; but why a transformer? Because all modern language models are based on something known as the transformer architecture. Let me show you that.

The slide I'm showing you now illustrates how the transformer model architecture works. One warning first: the transformer architecture is one of the most complex topics in generative AI. Normally, when I run a corporate session for my clients, usually three to four days long, the transformer architecture alone is something I cover for more than two to three hours; it's that big. What I'm going to show you here is a high-level overview; in fact, a very high-level, bird's-eye view, because this video is created for a beginner audience. If you want more in-depth details and you're a tech person, just comment on this video asking for a separate video on the transformer architecture, and I'll give you a detailed description of the transformer components there. Right now, on this slide, we need to understand how transformer-based language models allow us to use AI agents and how, with that, we can do prompt engineering.
Step number one: a transformer model has two different parts inside it, an encoder part and a decoder part. The encoder is responsible for processing the input you provide. When you provide a natural language input, which we call a prompt, how does the transformer model understand it? That is done by the encoder. Understanding the intent of the user, what the user is trying to say, and what context is associated with it: all of that is handled by the encoder. Then, based on that understanding, whatever generative response we want to create, whether in the form of language, an image, or anything else, is produced by the decoder part.

If I focus on the encoder first, the encoder works with tokens represented as vectors. Whatever input prompt I provide as natural language text gets converted into something called tokens, and that process is known as tokenization. To understand the encoder, let's look at tokenization first.

As you can see here, I'm providing an input statement like "I heard a dog bark loudly at a cat." When I provide this input text, the statement is converted into tokens, and each word, or part of a word, becomes a separate token. In reality, if you want to understand tokenization properly, you have to understand how the conversion happens: it does not simply number the words one, two, three, four, five, six, seven, eight; it assigns a unique token ID to each token. Also, when a word is repeated, for example the word "a" in this sentence ("I heard a dog ... loudly at a cat"), "a" appears twice, but remember, "a" is not tokenized twice. It is tokenized once, with, say, token number three, and that same token is simply reused.
Now let me give you a real-world view of this, with actual tokenization on the OpenAI website. You can see I have opened one URL, platform.openai.com/tokenizer. I hope you all know that OpenAI is the company behind ChatGPT, a company initially co-founded by Sam Altman and Elon Musk, among others. Obviously I don't want to go deep into OpenAI itself; I'm just showing you one page on openai.com that shows how language model tokenization actually works.

To see this, let me provide some input text. Say I type: "I am learning AI with Maruti Makwana" (I'm putting my own name there) "and I will love to use AI with Copilot in my daily work." The moment I'm done typing, you can see just below that this statement is converted into 24 tokens and 94 characters, and each token is shown in a different color in the section below. Not only that, if I focus on the token IDs, notice that each word or piece of text gets a unique token ID when it is converted into a token: "I" has token ID 40, "am" has token ID 939, and going further, I think 20837 is the token ID used for the word "AI." Now focus on this: the word "AI" is used twice in this statement; I used it twice intentionally. That token, 20837 for "AI," therefore has to appear again somewhere else, and here it is, later in the array. Because "AI" was used twice, the tokenizer did not create a new token for the second occurrence; it reuses the same token ID. This array of tokens is what gets sent to your language model. You only ever provide input text, but step number one converts that text into tokens, and the token array is sent to the language model to be processed further.
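If you want to try the same thing in code, here is a minimal sketch using OpenAI's open-source tiktoken library; the exact token IDs depend on which encoding you pick, so they may not match the website screenshot exactly.

    # pip install tiktoken
    import tiktoken

    # "cl100k_base" is the encoding used by the GPT-3.5 / GPT-4 family of models.
    enc = tiktoken.get_encoding("cl100k_base")

    text = ("I am learning AI with Maruti Makwana and "
            "I will love to use AI with Copilot in my daily work")
    token_ids = enc.encode(text)

    print(len(text), "characters ->", len(token_ids), "tokens")

    # Show which piece of text each token ID maps back to.
    for tid in token_ids:
        print(tid, repr(enc.decode([tid])))

    # Repeated words reuse the same token ID instead of getting a new one,
    # so the token for " AI" should appear twice in token_ids with an identical value.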
Now, what does that further processing involve? To understand it, we have to go back to the transformer model architecture diagram, to the point just after the tokens are created. Once your text is converted into tokens, those tokens are placed into a kind of multi-dimensional vector space. I say three-dimensional here, but in reality there are far more than three dimensions; to keep it easy to visualize, we'll just use a three-dimensional cube. What I'm trying to show is that, based on these vectors, the model can see how one token is related to the other tokens. Say we have a token for the word "dog": the model places a point for that word somewhere in this three-dimensional space. Then you have another word, "cat." Cats and dogs are both animals, so the dot for "cat" ends up nearby, because both are animals. Then we have another word, "puppy." A puppy and a dog are essentially the same thing, so they may sit along the same direction, but a puppy is a younger dog, so it gets its own dot close by. This kind of representation in a multi-dimensional space is what gives the language model an understanding of how one token is connected to another; connecting all these dots is how it understands the context of your input prompt. That is how your language model understands. And remember, if you have a word like "skateboard," a skateboard isn't even a living thing, and that's why it ends up in a completely different region of the space. Whenever you provide an input prompt, whether it's a one-liner or something more complex with multiple lines and more detail, the model places it into this multi-dimensional space, and from that it works out your context and your intent; based on that, it does the generation part using the decoder.
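To make the "dots in a multi-dimensional space" idea concrete, here is a tiny sketch with made-up 3-dimensional vectors; real embeddings are learned by the model and have hundreds or thousands of dimensions, so the numbers below are purely illustrative. Words with similar meanings end up with a higher cosine similarity, which is how the model "sees" that dog and puppy are related while skateboard is not.

    import numpy as np

    # Hand-made, purely illustrative 3-D "embeddings" (not real model values).
    embeddings = {
        "dog":        np.array([0.90, 0.80, 0.10]),
        "puppy":      np.array([0.85, 0.75, 0.20]),
        "cat":        np.array([0.80, 0.30, 0.10]),
        "skateboard": np.array([0.10, 0.10, 0.90]),
    }

    def cosine_similarity(a, b):
        # Close to 1.0 means "pointing the same way" (closely related); near 0 means unrelated.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    for word in ["puppy", "cat", "skateboard"]:
        score = cosine_similarity(embeddings["dog"], embeddings[word])
        print(f"dog vs {word}: {score:.3f}")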
Now, there is one more question my students have asked me a couple of times: "Maruti, sometimes if I just rephrase my question in Copilot or in other generative AI tools, it gives me a better answer, or a different answer. Why does that happen?" Well, the answer is right here on this screen. When you rephrase your question, or use different words to make the same point, you are actually giving more context, more tokens, in that prompt, and based on those the model is better able to understand your input. Unless your transformer model understands the input properly, it cannot generate the desired output well, and that, guys, is why prompt engineering is such an important skill for everyone who wants to use AI. I'm not going into the decoder in much depth right now; it also has something called attention blocks and a couple of other components, and I don't want to get into those in this particular video. But I hope the encoder and decoder parts are clear.
With this, let's move forward to the next part of generative AI. Tokenization is something you now understand, and I hope you also understand what a token is. I haven't stated it as a formal definition, but as you can see here: a token is a fundamental unit of text representing a sequence of characters, often a word or part of a word, which models use to process and generate text. If you want, you can also go to this URL and try different text with the different generative AI models available with ChatGPT; this is a screenshot of what I showed you there. Now let's move forward to the next thing about language models.
We've established one thing: we have language models based on the transformer model architecture, and almost all modern generative AI systems have this kind of language model running in the background. But when you want to use one, you actually have two different options. The first option: yes, you can train your own language model from scratch. Say you have a team of data scientists in your organization, you have your own data, a huge amount of historical data collected from past observations, and you want to train a model from scratch. You can do that, but it is time consuming, there are a lot of things you have to take care of, and it is expensive as well. That's why most organizations that want to use generative AI use something known as foundation models. Most generative AI solutions use this kind of foundation model as a base, and then, if they want, they can fine-tune those models with their own data. Remember, fine-tuning is another concept: you provide your own data on top of these language models and customize them. If you don't know how to do fine-tuning with language models, check out the video on this same YouTube channel, SkillTech Club; I'm sure you're going to love it.
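Fine-tuning itself is covered in that other video, but just so you can picture what "providing your own data" looks like: for the OpenAI and Azure OpenAI chat models, the training data is typically a JSON Lines file where every line is one example conversation. Here is a minimal sketch that builds such a file; the file name and the example content are, of course, made up for illustration.

    import json

    # Each training example is one conversation: the prompt plus the answer
    # you want the fine-tuned model to learn to give.
    examples = [
        {
            "messages": [
                {"role": "system", "content": "You are SkillTech Club's support assistant."},
                {"role": "user", "content": "Which course should I start with for generative AI?"},
                {"role": "assistant", "content": "Start with the generative AI fundamentals course, then move on to Azure AI Foundry."},
            ]
        },
        # ... add many more examples; a handful is not enough for a useful fine-tune.
    ]

    # Fine-tuning services usually expect JSON Lines: one JSON object per line.
    with open("training_data.jsonl", "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")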
Now, we don't want to fine-tune a model right now, but if you do want to use a foundation model, the first thing to keep in mind is which foundation model you're going to use. Obviously, Azure OpenAI, a combined service from the Microsoft Azure cloud and OpenAI, provides multiple foundation models from OpenAI. You can use the GPT models, that is, GPT-3, GPT-3.5, GPT-4, GPT-4o, and GPT-4.5; all of these are available with Azure OpenAI. You can use embedding models too; some people don't know this, but embedding models existed even before ChatGPT, models such as ada, babbage, and curie, and those are available as well. You can also use the DALL-E models, which generate images, and the Whisper models, which are for speech recognition. Which Azure OpenAI models you use obviously depends on your requirements and on the generative AI tool you want to pair them with, but you can use any of those OpenAI models.
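Just to give you a feel for what this looks like in practice, here is a minimal sketch of calling a GPT model deployed through Azure OpenAI using the official openai Python package; the endpoint, API key, API version, and deployment name are placeholders you would replace with your own resource's values.

    # pip install openai
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://YOUR-RESOURCE-NAME.openai.azure.com",  # placeholder
        api_key="YOUR-API-KEY",                                        # placeholder
        api_version="2024-02-01",                                      # use a version your resource supports
    )

    response = client.chat.completions.create(
        model="my-gpt4o-deployment",  # the name of YOUR deployment, not the model family name
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write a cover letter for a job application."},
        ],
    )

    print(response.choices[0].message.content)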
Not only that; suppose you say, "I don't want to use OpenAI models for some reason. Can I use other models?" Yes. As of now, Microsoft has a service called Azure AI Foundry, and I always describe it as a portal: Azure AI Foundry is one portal under which you have hundreds of language models available. For your information, guys, it is a lot more than that: you can use language models from Microsoft, from OpenAI, from Hugging Face, Mistral, Meta (the new name of Facebook), and Databricks, and even the latest DeepSeek models are also available on the Azure AI Foundry portal.

Now let me give you a sneak peek. I'm not going to go in depth, I promise, because this video is for a beginner audience, but just for your information: if you go to this URL, ai.azure.com, you will find a section known as the model catalog, and that's the page you can see right now. At the time of recording this video it shows around 1,900 models available on this one page. You have the GPT models, you have models from Meta (the Llama models), you have the Microsoft Phi models, you have models from Gretel, from Stability AI, from Mistral, and even the DeepSeek models, DeepSeek-R1 and DeepSeek-V3; there are many models available here. Not only that, you can actually compare models here, to see which model will be better for you in terms of cost, latency, and the capabilities it provides; you can compare these hundreds of models and deploy one within a few seconds. On top of that, they show you something known as model leaderboards, which rank the different models according to which are best for which kinds of scenarios, whether you want to focus on quality, on cost, on throughput, or on the many other factors they compare, so you can see which models will be better for you. Obviously, finding the right model in a list like this is a tough job, but once you get past that, everything else becomes easy, because a foundation model saves you a lot of time and a lot of cost. For now we are not going very deep into this, but if you are interested in a particular model, or you want me to record a special video on exactly how to compare models and decide which one is the right choice for a requirement, just let me know; I'll create a separate video for you on that.
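Once you deploy a model from the Azure AI Foundry model catalog, calling it typically looks something like the sketch below, which uses the azure-ai-inference Python package; the endpoint URL and key are placeholders for whatever your own deployment gives you, and the same pattern works across many catalog models (Phi, Llama, Mistral, DeepSeek, and so on).

    # pip install azure-ai-inference
    from azure.ai.inference import ChatCompletionsClient
    from azure.ai.inference.models import SystemMessage, UserMessage
    from azure.core.credentials import AzureKeyCredential

    client = ChatCompletionsClient(
        endpoint="https://YOUR-DEPLOYMENT-ENDPOINT.models.ai.azure.com",  # placeholder
        credential=AzureKeyCredential("YOUR-API-KEY"),                    # placeholder
    )

    response = client.complete(
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="Explain in one sentence what a token is."),
        ],
    )

    print(response.choices[0].message.content)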
Now, after all these things, let's understand one more point: there are different types of models as well. We have large language models, the buzzword you have probably heard as LLM, and we also have small language models, or SLMs. In the catalog I was showing you earlier, not all of those hundreds of models are large language models. Large language models are trained on very large volumes of data and end up with billions or even trillions of parameters. Small language models, on the other hand, are trained on a much more focused amount of data and have far fewer parameters. An example of a large language model is GPT-4, one of the largest, reportedly trained with around 1.7 trillion parameters. On the other hand, Microsoft Phi-3, Microsoft Phi-4, or GPT-Neo on Hugging Face are small language models, created with a smaller amount of data and mostly focused on one specialized context. Basically, small language models concentrate on a specialized context, and their generation capabilities are limited to that context, while large language models have comprehensive language generation capabilities and can handle multiple contexts as well. Both come with pros and cons. Maybe you will go with a large language model, but there are some issues with that: because of its large size, it can impact performance and portability, and even fine-tuning a large language model is expensive because it takes a lot of time; fine-tuning one can take a few hours, sometimes even a few days. Small language models, on the other hand, are faster, more portable, and less expensive to fine-tune with your own data. So the choice is yours; you have to figure out which one to use based on your requirement.
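As a contrast to calling a large hosted model over an API, here is a minimal sketch of running a small, openly available model such as GPT-Neo locally with the Hugging Face transformers library; the specific model name is just one example of a small model, and its output quality will be nowhere near a large model like GPT-4, which is exactly the trade-off we just discussed.

    # pip install transformers torch
    from transformers import pipeline

    # GPT-Neo 125M is small enough to run on an ordinary laptop CPU.
    generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")

    result = generator(
        "Python code to add two numbers:",
        max_new_tokens=40,
        do_sample=False,
    )

    print(result[0]["generated_text"])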