  • 5/23/2025
Welcome to this hands-on AI-900 lab session, where we dive into Knowledge Mining with Azure AI Search! 💡 In today's data-driven world, extracting valuable insights from unstructured data is crucial. Azure AI Search helps businesses unlock hidden knowledge from documents, images, and databases using AI-powered search, OCR, natural language processing (NLP), and machine learning.

This tutorial provides a step-by-step walkthrough on implementing knowledge mining using Azure AI Search to enhance searchability, data discovery, and enterprise applications.

🔍 What You’ll Learn in This Video:
1️⃣ Introduction to Knowledge Mining & Its Use Cases
2️⃣ Understanding Azure AI Search & Its Capabilities
3️⃣ Building an AI-Powered Search Index in Azure
4️⃣ Integrating OCR, NLP, and Cognitive Skills for Advanced Insights
5️⃣ Enhancing Enterprise Search with AI-Driven Relevance Ranking
6️⃣ Deploying & Optimizing Knowledge Mining Solutions

🛠️ Who Is This For?
AI & ML Enthusiasts exploring AI-powered search & analytics
Developers & data scientists working on knowledge discovery solutions
Professionals preparing for the Microsoft AI-900 Certification
Businesses looking to enhance search & document intelligence

📌 Key Highlights:
✅ Hands-on demo of Azure AI Search & Knowledge Mining
✅ How to extract valuable insights from large data sources
✅ Using AI-driven search to improve business decision-making
✅ Best practices for optimizing search relevance & performance

💡 Learn how to implement AI-powered knowledge mining with Azure AI Search today!

Explore Our Other Courses and Additional Resources on: https://www.youtube.com/@skilltechclub

Transcript
00:00when we are looking for some information the most common thing which we do is we go into one of the
00:15search engines like Google and Bing and we try to search the information well because we have a
00:21huge amount of data all the time getting a particular data in a specific format is something
00:27which is most easily done with the help of searches and I'm sure you all agree with me on this now today
00:34we are going to see one of the services which does exactly the same but with your organizational
00:39data hi guys my name is Maruti and I'm back with another Azure AI service and today we are going
00:47to focus on Azure AI search service basically this concept is connected with something called
00:53knowledge mining but that concept of knowledge mining is something which is not very popular
00:59and Azure AI search service is one of the most underrated services because sooner or later
01:04most of the organizations are going to need it but they just don't realize it right now
01:09so let's explore what this Azure AI search service is
01:15but before that we need to understand the concept of knowledge mining so what is knowledge mining well
01:22the scenario comes from this that let's say you have an organization where you always have a lot of
01:27content so again we have information overload we have a lot of content and most of the time in
01:33organization specific case this data is actually locked away in documents pdfs sometimes even handwritten
01:40notes your data can be digital or maybe the data is a physical document well when you have this kind
01:47of data getting proper information from these kinds of documents is time consuming labor intensive and
01:54sometimes it's even impossible without a proper mechanism that's the reason why we have a concept
02:00called knowledge mining this knowledge mining is going to make sure that not only you can combine
02:05the whole bunch of data from your organizational knowledge base but you can also search them easily
02:11you can generate dashboards kpis where you can actually get the insight from this data you can
02:17apply this data into your business applications you can actually connect this data with your ai search
02:24bots and then it can give answers based on your questions well there are many use cases where
02:31knowledge mining can be useful but if you want to do knowledge mining you have to understand how exactly this
02:37is going to work and how azure ai search service is going to help you in that so let's check this out
02:43azure ai search solutions are actually divided into three different steps obviously the first step is
02:49you have to ingest your data in case of ingestion of the data you have to provide your data in a digital
02:55format maybe you have physical documents and you have to scan them and you have to upload them and you
03:01have to ingest the data into some kind of a data store because your data can be structured
03:07unstructured or semi-structured in any format the most suitable data storage for this can be a
03:12database like cosmos db or it can be your blob storage account or it can be a data lake where
03:19you have a huge amount of data which is stored whichever data ingestion or data storage mechanism
03:24your organization is choosing the first step is always going to be data ingestion once data
03:30ingestion is done the next step is going to be ai enrichment and indexing and this is where
03:36your azure ai search service is going to play a very important role in this case this is going
03:42to allow ai to enable a deeper understanding of whatever information is stored inside those
03:48documents if your documents are multilingual your ai skill sets can help you to translate those languages
03:55into your own language it is also going to help you to extract information from charts graphs images
04:02and then with the help of technologies like ocr you can take that image content and then you can
04:08actually derive some meaningful insights from that it's going to help you to apply all your azure ai
04:14services like azure ai vision service natural language processing service or maybe your translation
04:20services all of these are going to help you to do this thing called ai enrichment this ai
04:27enrichment pipeline is what we are going to set up today in our step-by-step lab but once you have
04:33this kind of ai enrichment pipeline which is going to generate some kind of data that data we have to
04:39convert into a proper index i hope you know the concept of index because anytime when you want to
04:45search the content searchable content is going to be retrieved very quickly with the help of indexes
04:52and that's the reason azure ai search service has an indexer which is going to help you to configure
04:58your index configuration and then your data will be organized in a proper index which you can easily
05:04search once you have indexed data the next thing is you have to explore the data with the help of a
05:11searching mechanism yes we have a search which will be performed on indexes you will get the result in
05:17the json format and then once you have the data in json you can actually associate that with your
05:23further analytical services or you can actually associate that with the dashboard kind of services
05:28like power bi or some other services now all of these things look complex but when you're going to
05:35do this thing you have to focus on these three steps first one is data ingestion where we need some
05:40kind of a storage second one is ai enrichment and indexing where we need azure ai service with
05:47obviously azure ai search service because azure ai search service is going to use azure ai services
05:54and then finally we need to associate the search which is going to help you to explore your data
06:00now we are going to see this whole process in today's lab so let's get started with that
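The three steps just described (ingest, enrich and index, explore) can be sketched in plain Python with no Azure calls. This is only a toy illustration: the sample reviews, the `enrich` helper and the word-level inverted index are assumptions made for the example, and Azure AI Search's real enrichment and indexing are far more sophisticated.

```python
from collections import defaultdict

# Step 1: ingestion. Hypothetical sample reviews standing in for uploaded documents.
documents = {
    "review-1": "Great coffee in Chicago and friendly staff",
    "review-2": "Terrible wait for pastries",
    "review-3": "Seasonal drinks in Chicago were great",
}

NEGATIVE_WORDS = {"terrible", "disappointed"}  # toy stand-in for a sentiment skill


def enrich(text):
    """Step 2a: toy 'AI enrichment' that adds a sentiment field to a document."""
    words = text.lower().split()
    sentiment = "negative" if NEGATIVE_WORDS & set(words) else "positive"
    return {"content": text, "words": words, "sentiment": sentiment}


def build_index(docs):
    """Step 2b: build an inverted index mapping each word to the ids containing it."""
    enriched = {doc_id: enrich(text) for doc_id, text in docs.items()}
    index = defaultdict(set)
    for doc_id, fields in enriched.items():
        for word in fields["words"]:
            index[word].add(doc_id)
    return enriched, index


def search(index, term):
    """Step 3: look the term up in the index instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))


enriched_docs, inverted_index = build_index(documents)
print(search(inverted_index, "Chicago"))       # ['review-1', 'review-3']
print(enriched_docs["review-2"]["sentiment"])  # negative
```

The point of the index is that a lookup touches only the entry for the search term, which is why indexed content can be retrieved quickly even when the document set is large.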
06:06i am in my azure portal and the first thing which i'm going to do is as usual i'm going to click on
06:11create resource i have to create three different services right now the first service which i'm going
06:16to create is azure ai search service so i'm going to select azure ai search
06:25and when i search for ai search it's going to take me to the marketplace screen
06:29i'm getting azure ai search service here
06:35which is by microsoft i'll click on create i'm going to choose a new resource group which is ai
06:42900 rg this is going to be the name of my resource group in the search service name i'm going to use
06:49a unique number for all the resources which i'm going to create today so i'm going to say this is
06:55maruti srch search service and then i'm going to put a number which is going to be 2025 because this is
07:032025 so i'm choosing this in the location i have to very specifically choose east us 2 because all the services
07:11which i want to provision today i want to use same region for that so this is the first service which
07:17is a search service i'm choosing east us2 in the pricing tier i do not want to go with the standard
07:23because it's going to be an expensive one so i will go with basic and i'm going to click on select
07:29basic search is going to give me 15 gigabytes per partition kind of a thing and maximum three
07:34replicas and three partitions and maximum nine search units i think this is more than enough
07:39for this particular lab i'm going to click on next i think i don't need to change anything in the scale
07:45it's showing me the per month cost but anyhow i'm not going to keep this thing permanently for
07:51one month so i'm okay with all this configuration i'll click on review plus create
07:56and if the validation is passed we'll click on create which is going to create our first
08:04search service now while this deployment is going on i'm going to open my azure portal in the separate
08:11tab yes in this particular demo i'm going to use multiple tabs so please do not get confused
08:16this is our first tab where we have just created azure ai search service its deployment is still in
08:22progress now i'm in my second tab and in this case i'm going to search for azure ai service now if you
08:29check here when i click on ai plus machine learning we have azure ai services this is a multi-service
08:35account which i'm going to create right now so i'll click on create resource group will be same
08:41location is also going to be same name i'm giving maruti ai svc ai service with the number
08:512025 and the pricing tier for this is going to be standard yes i agree with the responsible ai
08:57guidelines and i'm going to click on review plus create so these are all basic provisioning which
09:02we are doing we are not doing much customization inside this right now
09:08and i'll click on create my first deployment is actually completed right now which is my search
09:15service my second deployment is going on which is actually ai service and if this one is going on
09:22i'll open the third tab in which i'm going to open azure portal and this is where i'm going to create my
09:28storage account okay in the third tab i'm going to search for storage account i'm choosing a storage
09:36account service we'll again go with the same resource group storage account name is going to be
09:42maruti storage svc with some number and east us 2 will be location primary service we are just going
09:51to go with gen 2 kind of a storage and redundancy i want to go for lrs i'm okay with all the configurations
09:59i'll click on review plus create and if the validations are passed we'll click on create the storage
10:06account deployment will be done very soon once this is done as we have discussed we are going to do step
10:12number one which is data ingestion so obviously i need to have some kind of data which should be
10:17available well i have some data which is available from microsoft learn hands-on lab document so you
10:23can find the lab document link in the description of this particular video so you can just go through
10:28that and you'll get all the steps inside that including this link from where you can get this
10:32particular data my storage account deployment is completed i'll click on go to resource under data
10:39storage i'm going to click on containers i am going to create a new container inside this
10:45let's say i'm giving a name of this coffee reviews the anonymous access level right now is private
10:51now i do not want private access level so i need to change this first let me just cancel this i'll click
10:58on configuration which is under settings and then when i go there just make sure you have an option here
11:08which is allow blob anonymous access if you enable this then you'll have public access enabled on
11:14your containers and i want to do this thing as of now so i'm just enabling this property which is allow
11:20blob anonymous access then going back to containers creating a new container with the name coffee reviews
11:28and it's still not showing me the anonymous access level this is a very common issue which happens
11:34if this is still not showing you all you have to do is refresh this browser page
11:39now because azure portal is a browser-based website there are chances some changes will not reflect
11:44immediately this is a very common issue and at that time you have to make sure that you're just
11:50going to refresh it once so this is how it is i'm going to click on create container the third time and
11:56i think this time i can not only give the name i can also change the anonymous access level to container
12:02remember container access level is very similar to public access so any file which is available
12:07inside this will be accessible through the public url as of now i do not want to take care of role-based
12:13access control or some other policy management for accessing this data and that's the reason i'm
12:18just keeping it container level coffee reviews container is created but i do not have any data inside
12:25this so let's click on upload i have a review folder with some nine reviews in the word document format
12:33so you can see all these are docx file and i'm just going to click on upload all these reviews will be
12:38available inside my coffee reviews container in the blob format now obviously you can use any kind of
12:45data whichever you have but you can see right now we have multiple reviews here the size of the file is also
12:51mentioned so you can see that and anytime if you want to review this particular data you can just
12:56click on that particular data you can click on edit section now most of the time if it is a text document
13:03the edit section is going to show you the data associated with that but because this is a word
13:08document it's not showing you the data inside this but you can just download this or you can just check
13:13it when it is downloaded so i'm not able to edit right now i'm not able to see that but yes we have a lot
13:19of reviews and as i said i am using an official content which is available on the step-by-step lab
13:24from microsoft learn website so you should also try this particular data first and then you can try with
13:30your own data now once our data ingestion is done let's talk about the second step which is ai
13:36enrichment and indexing for that i have to go back to my search service so far in this particular video
13:43we have not configured azure ai search service i am back in my tab where it is showing me that
13:49search service is successfully deployed and now we are going to configure that in this azure ai
13:54search service i just want you to see a few things right now like on the left side where we have search
13:59management we have a section for indexes and indexers now when i click on any of these right now because
14:06this is a newly created service i won't have any indexes and i won't have any kind of indexers right now
14:13so all these things are not configured because we do not have any kind of connections right now it's
14:18a newly created azure ai search service what i'm going to do is i'm going to click on import data so
14:25i'm clicking on overview tab and the first thing which i'm doing is i'm going to click on import data
14:29i want to import data from my storage account connect to your data is the first step
14:36of this wizard and they're asking me where is your data source is it some existing data source you want
14:41to use or do you have some data coming from samples or cosmos db or blob storage there are
14:47plenty of options which are available because we have a data in blob storage i'm going to choose azure
14:52blob storage which is giving me more configurable options they're asking me what would be the data
14:58source name i'm giving the name coffee customer data the data to extract and parsing mode and
15:06subscription i do not want to change in the connection string i'm going to choose choose
15:11an existing connection string and when we do this thing it's going to help me to choose my connection
15:16string for my east us2 base storage account which is this in this storage account this is the only
15:23container which we have so i'm going to select this so this is going to use my connection string
15:29coffee reviews is my container name and if i want to specifically give some particular blob folder
15:34name here i can do that but we do not want that because we do not have any folders in that container
15:40in the description i'm just going to put simply reviews for fourth coffee shops now this is something
15:46which is a simple description i'll click on next and this is a very important part of this wizard
15:52the second step is the one which is going to help you to do ai enrichment basically this is the one
15:58where you are going to add on your skills configuration in this you can see if you do
16:03not want to add on it manually this step is showing you optional so you can actually
16:09keep it optional and then it's going to be taken care automatically but you have a section here which
16:14is attach azure ai service add enrichment and then save enrichment to the knowledge store these three
16:20steps are very very important i'm going to expand this azure ai service this is my service which i have
16:28selected so i'm just going to choose this then i'm expanding ai enrichments under ai enrichments
16:35it is asking me what kind of a skill set name you want to give i am going to give the name of my skill set
16:40as coffee skill set i want to select the check box that yes enable ocr and merge all the text
16:46into merged content field this is actually going to extract the text information from the images
16:52whichever are available in those word documents in the enrichment granularity level i'm going to
16:57change to pages 5000 characters chunk so it's going to be helping me with multiple
17:04pages with that and now we have a section where they are asking us to check items below required
17:09a field name so you can see right now here we have a cognitive skills which are available in this list
17:15so you can extract people names organization names location names basically this is something which is
17:22extracting the information from my existing content so what kind of cognitive skills i have to choose
17:28that's what i have to select i am going to select that yes i want to extract location names key phrases
17:34i want to detect the sentiment i want to generate tags from the images and i also want to generate captions
17:40from the images because all these things are going to help me to search the relevant data
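The wizard choices described above roughly correspond to a skillset definition like the one sketched here. This is a hedged outline, not the JSON the portal actually generates (the video never shows it): the skill type names are Azure AI Search built-in skills, but the skillset name is an assumed spelling, and the `inputs`, `outputs` and `context` entries a real definition needs are omitted for brevity.

```python
import json

# Hypothetical skillset outline mirroring the wizard selections in the walkthrough.
# Not deployable as-is: each real skill also needs "inputs", "outputs" and "context".
skillset = {
    "name": "coffee-skillset",  # assumed spelling of the "coffee skill set" name
    "skills": [
        # "Enable OCR and merge all text into merged_content"
        {"@odata.type": "#Microsoft.Skills.Vision.OcrSkill"},
        {"@odata.type": "#Microsoft.Skills.Text.MergeSkill"},
        # Enrichment granularity: pages (5000 character chunks)
        {
            "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
            "textSplitMode": "pages",
            "maximumPageLength": 5000,
        },
        # Extract location names
        {
            "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
            "categories": ["Location"],
        },
        # Extract key phrases and detect sentiment
        {"@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill"},
        {"@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill"},
        # Generate tags and captions from images
        {
            "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
            "visualFeatures": ["tags", "description"],
        },
    ],
}

# List the short skill names to check the outline against the wizard checkboxes.
skill_names = [skill["@odata.type"].split(".")[-1] for skill in skillset["skills"]]
print(skill_names)
print(json.dumps(skillset, indent=2)[:80])
```

Comparing this outline with the skillset JSON that appears under the search service after the wizard finishes is a good way to see what the portal filled in on your behalf.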
17:45now the third and the final step which is here is save enrichment to a knowledge store now this is
17:51going to help you to configure the knowledge store configuration with image projections or maybe document
17:57specific table projections or maybe blob projections now these are the separate projection kind of
18:02configurations which you can do here basically this is going to be your knowledge store which can store
18:08data in a form of file or table or maybe blob so i am going to select image projections
18:14documents as well as blob projections also with this so i am going to select image projections and
18:22documents where i have pages key phrases entities image details and image references i do not want
18:29blob projections as of now so these are the first two checkboxes which i'm selecting and when i'm doing this
18:33it is asking me where is your storage account which storage account connection string you want to
18:38associate here again we are going to choose an existing connection i'll choose my
18:43storage account which is in east us 2 i can use a different account also i'm going to create a
18:49new container here giving a name of this knowledge store so this is my same storage account but it's
18:55a new container i want to keep this container data private i'll click on create and this newly created
19:02container which is empty i'm going to select that as my knowledge store so this is a place where they are
19:09actually going to store the knowledge store specific data remember coffee reviews is the container which
19:14is providing data so that is the data which we have ingested and this is going to be my knowledge store
19:19where my data which is generated based on the ai enrichment is going to be stored i hope you're
19:25not getting confused with this let me click on next which is going to take me to custom target index if
19:32i want to specify in that now at the end of this document we have one section which is azure blob
19:38projections i want to tick mark this also and then you can see automatically it is detecting that my
19:44knowledge store kind of a container will be selected with this so that is coming with this and we are
19:49actually providing all three projections type here so file projection table projections and now blob
19:56projections are also going to be done in this after this all we can do is let's click on next which is
20:02customized target index first i want to change the name of the index to coffee index because i'm
20:08giving similar kind of naming convention everywhere and then below that they are giving me an option
20:13where i can choose what kind of a field name i want for retrieval for filtering for sorting for faceting
20:21so i have to tick mark those things whichever i want to use for my searching or filtering kind of a
20:27thing alternatively if i want to change the search mode i can change it but i'm not doing it right now
20:34all i want to do is i just want to scroll down into this section where i can control what kind of
20:39things i want to use for sorting filtering and searching i am going to select filterable tick mark
20:45in this content and then same thing i'm going to do for all the other fields which are locations
20:51key phrases sentiment merged content text layout text image tags and image caption because i want
20:59filtering also for all this everything else i'm not changing i'm happy with this so i'm going to click
21:04on next create an indexer as usual the name of the indexer will be coffee indexer i want to schedule this
21:11indexer once only now suppose if you have a data which is continuously flowing in and then maybe you
21:16have to rerun the indexer with a specific schedule then you can decide this kind of a schedule whether
21:22you want hourly daily or some kind of a custom schedule you want for that i do not want any
21:27schedule because i'm not going to have continuously flowing data in this ingestion cycle so i'm
21:33okay with once let's expand the advanced options and make sure that it is actually showing you that base 64
21:39encode keys is available for this these encoded keys are going to make your indexing more efficient so
21:46make sure this is selected here and if everything else is fine let's click on submit now this is
21:52going to run an indexer pipeline automatically so your ai enrichment is actually going to happen
21:58when this particular pipeline execution is going on you can see right now in the status of this
22:03service is showing you running so basically ai search service is running if i go into indexers right now
22:11it's going to show me that my coffee indexer is actually available there
22:15and the last run happened 26 seconds ago it's showing me right now that 9 out of 9 documents
22:21are successfully indexed with this and there are no errors as of now status of the indexer is success
22:28if i go to indexes section it's showing me that my coffee index is successfully created
22:33but the document count inside that is showing me zero because maybe my process is still not completed
22:39it's still going on this actually takes a little time and then it's
22:44going to reflect here so you have to wait till this is showing you
22:48the proper document count with the proper data which is associated with that so let's just wait for
22:53some time now while this is going on you can optionally go to indexers you can click on your indexer name
23:00which is this one coffee indexer and you can see it's showing you right now
23:05the execution which is going on the duration for this is 12 seconds you can click on refresh
23:10and it's going to show you whatever execution which is happening inside this now obviously this is
23:16something which is going to take time so as i said it's going to take time and you have to wait for
23:21this to complete and reflect it there but yes if you want to observe it you can come here and you can
23:26check this now let me go back to my search service i'll go back to indexes now the coffee index is showing
23:33me the document count is nine which means that all my nine documents are successfully associated with this index now
23:39if that is the case i am going to click on overview tab we have something called search explorer the
23:46search explorer is a tool which is going to help you to search this content which is part of your
23:51knowledge base in the search explorer i can directly type something in the search box and i can click on
23:56the search button same like your search engine searching but instead of that we have an advanced
24:02section here which is a view option and i'll change the view to json view json view is basically going to
24:08help me to customize this with my own parameters inside this let's say right now the default search
24:13is showing star count equals to true so i want to search the whole content which is available there
24:19if i click on search it's showing me right now that we got some kind of a data which is in the form of
24:25odata now for your information odata is an open data protocol and using that this data is actually
24:32searched with this we have a lengthy json document which is maybe having few hundred lines inside that
24:38and it's showing me that the data results is coming we have total count of the document which is nine
24:44so it means that all nine documents are successfully indexed with this now let me try a few things here
24:49in this search so that we can understand how exactly we can deep dive into this particular data
24:55and we can get some desired search results from that i'm changing this json and i'm saying that search
25:02locations chicago and then in that also i want to see the count true now in all the documents it's
25:08going to search for the location chicago and you can see right now they're showing me that okay there are
25:13three documents which are found which are actually having a location chicago the locations are fourth
25:18coffee chicago and illinois now these are the three things which we got they're also showing me key
25:24phrases which are extracted with this so new drink trim seasonal baked goods now all these are those key
25:29phrases the sentiment which is available on those reviews the image tags which are available everything
25:35is available this time it's not going through all the documents it is going through those three documents
25:39which are matching with my search and let's say i want to do another search where i'm saying that
25:45i want you to search sentiment which is negative and if i click on search out of those they are saying
25:51okay there is only one document we got one count where we have a sentiment which is negative
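The three Search explorer queries described above can be sketched as JSON request bodies, together with a helper that builds (but does not send) the equivalent REST call. The exact query strings typed in the video are not visible in the transcript, so the field-scoped forms below are an assumption; the service name, index name, key placeholder and API version are also hypothetical. Depending on the query parser in use, field-scoped terms may additionally need `"queryType": "full"`.

```python
import json
import urllib.request

API_VERSION = "2023-11-01"  # assumed; use whichever version your service supports


def build_search_request(service, index, api_key, body):
    """Build (but do not send) a POST request for the Search Documents endpoint."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/search?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# The three queries from the walkthrough, written as JSON view bodies.
all_documents = {"search": "*", "count": True}
chicago_reviews = {"search": "locations:'Chicago'", "count": True}
negative_reviews = {"search": "sentiment:'negative'", "count": True}

request = build_search_request(
    "example-search-service",  # hypothetical service name
    "coffee-index",            # assumed index name
    "<query-api-key>",         # placeholder; never hard-code a real key
    chicago_reviews,
)
print(request.full_url)
print(request.data.decode("utf-8"))
# To actually run the query, pass the request to urllib.request.urlopen and
# parse the JSON response; that step is left out so the sketch stays offline.
```

Setting `count` to true is what makes the response report how many documents matched, which is where the nine, three and one in the walkthrough come from.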
25:56and why because it's showing you the sentiment and it's showing you the key phrases
26:01which are associated with that sentiment it's showing you terrible experience reviews pastries time
26:07so maybe someone is actually talking about some negative reviews associated with that restaurant
26:13and he's saying that today i was truly disappointed with how long i had to wait for the pastries so
26:18this is a review which they have counted as a negative review which is correct so your sentiment analysis
26:25your extraction of the named entities all the things are happening and working because of the ai
26:30enrichment right now now once we are done with this let's do one final check on our knowledge store
26:35because we know our knowledge store is created in that storage account so let's go back to our storage
26:40account and in this storage account i'll go back to containers i'll refresh it we have a new knowledge
26:48store kind of a container which is available which we have created as a part of this process
26:54now inside this you can see all these multiple folders which are created here are actually also
26:59created with the help of this process when i go inside this mostly you're going to get your object
27:04projections and json files with that you can click on this files you can click on the edit option and
27:09this data you'll be able to see here so whatever is there in that particular file you'll be able to see
27:14that thing here and you can see it's actually showing you when exactly it got modified whatever things which got
27:19used in this and whatever content which is stored with that so this is something which is your
27:25extracted knowledge store which is used while you are searching that thing also we got one additional
27:31container here this is something which we have not created it got created automatically with the help
27:36of this process and this is something which is coffee skill set image projections all the image
27:41projections are actually stored inside this if i go inside one of the document folder inside this
27:46this is showing me the data which is available with jpg extensions basically these are those images
27:53which got extracted from your data and if i click on edit option in this image i can see those images
28:00which are available in this document so this is how it is you can click on the file you can click on the
28:06edit and you'll see that what image is actually available inside that now all these images were actually
28:11part of this word document but they have extracted this and obviously somewhere in some images if i
28:17have some text that is also going to be extracted with the help of ocr this whole process is super smooth
28:23and this is what we call knowledge mining because it's going to help you to extract the
28:29desired information and then it's going to serve it to you in a much more
28:36searchable manner not only that as i told you earlier if you want to associate this data with your
28:44dashboard kind of tools like power bi or maybe you want to just associate this with some analytical
28:50service like microsoft fabric or azure synapse analytics you can easily do that with this i think
28:56we are done with this particular lab so thank you so much if you are watching this video till this
29:01particular end it means that you're really interested in this particular topic i'm glad that you like
29:07this thing thank you so much this is your friend maruti i'll see you tomorrow bye
