DP-900: Understanding Non-Relational Data Concepts | Azure Data Fundamentals

SkillTech Club

In this DP-900 lecture, we explore the fundamentals of non-relational data, a key topic for understanding modern database technologies in Microsoft Azure. Whether you're preparing for the DP-900 certification or looking to understand NoSQL databases, this session will cover how non-relational data is structured, stored, and managed in Azure.

🔍 What You’ll Learn in This Session:
✅ What is Non-Relational Data? – Key differences from relational databases
✅ Types of Non-Relational Databases – Document, Key-Value, Column-Family, Graph
✅ Understanding JSON, BSON, and NoSQL Data Models
✅ Exploring Azure Cosmos DB – A Multi-Model Database
✅ Use Cases for Non-Relational Data in Big Data & AI
✅ Comparing SQL vs. NoSQL – When to Use Each?

🛠️ Who Should Watch This?
Beginners & IT professionals learning NoSQL database concepts
Students & aspiring data professionals preparing for DP-900
Developers & cloud architects working with Azure Cosmos DB & NoSQL solutions
Business analysts & decision-makers exploring modern database strategies

📌 Key Highlights:
✅ Simple explanations of non-relational data models
✅ Real-world use cases & practical examples
✅ Live demos of Azure Cosmos DB & NoSQL queries
✅ DP-900 exam-focused insights

💡 Master non-relational data concepts & get ahead in cloud database management!

Explore Our Other Courses and Additional Resources on: https://skilltech.club/

Transcript

00:00Now after relational data, this is the time that in this video we are going to focus on

00:13concept of non-relational data.

00:16Obviously this is not going to store data in a tabular format like relational data, but

00:22then what kind of formats are there, and what kind of differences are there compared to

00:26relational data? That's what we are going to see in this video.

00:30We will mainly explore the characteristics of this non-relational data, we will see what

00:34kind of formats and file extensions are going to be useful in that.

00:38We will define the non-relational data types and then we will see that what is this NoSQL

00:45and what types of non-relational databases are available in the cloud nowadays.

00:51Now because we are dealing with non-relational data, we have to keep in mind that non-relational

00:56data can be unstructured or semi-structured.

01:00You can see on the screen, the data which I am showing you is a semi-structured data because

01:05in this case the key value pairs are going to be there.

01:09We have a customer whose keys are going to be like name, telephone, address, we maybe have

01:15another customer who is actually having name with title like Mr. and then we have telephone

01:22addresses.

01:23This is the data which is neither fully structured nor unstructured.

01:27This is a semi-structured data and then obviously this is a non-relational data so we have to

01:33store the data directly as a field or a key value pair.

01:38Multiple entities in the same collection or container with different fields are possible

01:42in non-relational data and that's the reason this is one of the very flexible and good way

01:48to store the unknown data where you do not know the data types, you do not know that what

01:52kind of flexible data can come into each and every key.
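
To make the idea of different fields in the same collection concrete, here is a minimal Python sketch. The customer names, phone numbers and addresses are invented for illustration and are not taken from the slide.

```python
# Two customer records in the same collection, each with its own set of fields.
# All names and values here are made up for illustration.
customers = [
    {"name": "Asha Rao", "telephone": "555-0100", "address": "12 Lake Road"},
    {
        "title": "Mr.",
        "name": "John Smith",
        "telephone": ["555-0101", "555-0102"],  # more than one number
        "address": {"city": "Pune", "postcode": "411001"},  # nested fields
    },
]

# There is no fixed schema: list the field names each record happens to carry.
for customer in customers:
    print(sorted(customer.keys()))

# A field that is missing from a record is handled at read time, here with a default.
titles = [customer.get("title", "") for customer in customers]
```

The collection accepts both records as they are; it is the reading code that decides what to do when a field such as the title is absent.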

01:57In order to understand non-relational data in a better way we have to understand the semi-structured

02:03data first and then we have to understand unstructured data also.

02:07Most of the time semi-structured data is going to store data by fields and then most of the

02:12time these are the four different formats of the file which are used in the semi-structured

02:18data.

02:19People use JSON files, which mostly allow you to store data in the form

02:24of key-value pairs enclosed in curly brackets.

02:28Each field is going to have a name and the value associated with it after a colon, so

02:33it's like a key, colon, value kind of syntax that is followed.

02:37One field can contain a simple value or sometimes a sub-document, which is again

02:42going to be enclosed in curly brackets.

02:45A field can also have multiple values, so it can hold an array. In the previous

02:49slide, the sample which I showed you was nothing but something like a JSON file.
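
As a small illustration of the JSON syntax described here, the following Python sketch parses a document with simple values, a sub-document and an array, using the standard `json` module. The document content is hypothetical.

```python
import json

# A hypothetical customer document: a simple value, a sub-document, and an array.
text = """
{
  "name": "Asha Rao",
  "address": {"city": "Pune", "country": "India"},
  "telephones": ["555-0100", "555-0199"]
}
"""

customer = json.loads(text)               # parse JSON text into Python objects
city = customer["address"]["city"]        # reach into the sub-document
first_phone = customer["telephones"][0]   # index into the array

# Serialize the Python objects back to JSON text.
round_trip = json.dumps(customer, indent=2)
```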

02:55Same way we have Avro.

02:58Avro is nothing but a row based format.

03:00It was created by Apache initially and then it was used by them.

03:05In Avro each record contains a header that describes the structure of the data in the

03:10record.

03:11So they follow the header based configuration for the data storage while json stores key

03:16value pair kind of thing.

03:19In Avro also the headers are going to be stored in the json format only and the data is going

03:24to be stored in the binary format.

03:26So it's like header will be json, data will be binary and that combination is there inside

03:31Avro.
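
The "JSON header plus binary data" idea can be sketched with the Python standard library alone. This is not the real Avro encoding or the Avro library; the schema, the `encode` and `decode` helpers and the byte layout are assumptions chosen only to show the split between a readable header and compact binary records.

```python
import json
import struct

# Hypothetical schema, written as JSON in the style of an Avro schema.
schema = {
    "type": "record",
    "name": "Reading",
    "fields": [
        {"name": "sensor_id", "type": "int"},
        {"name": "temperature", "type": "double"},
    ],
}

# Little-endian 4-byte int followed by an 8-byte double.
# This fixed layout is an illustration, not Avro's actual binary encoding.
RECORD_FORMAT = "<id"


def encode(records):
    """Write a JSON header followed by the records packed as binary."""
    header = json.dumps(schema).encode("utf-8")
    body = b"".join(
        struct.pack(RECORD_FORMAT, r["sensor_id"], r["temperature"]) for r in records
    )
    # Prefix the header with its length so a reader knows where the data starts.
    return struct.pack("<I", len(header)) + header + body


def decode(blob):
    """Read the JSON header first, then use it to name the binary fields."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    names = [field["name"] for field in header["fields"]]
    body = blob[4 + header_len:]
    return [dict(zip(names, values)) for values in struct.iter_unpack(RECORD_FORMAT, body)]
```

The binary body carries no field names at all, which is why the reader needs the header to interpret it.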

03:32We also have a third one, ORC, which stands for Optimized Row Columnar format.

03:39ORC organizes data into columns rather than rows.

03:43In the first two formats we store and organize data in the form of rows, while ORC is going to focus

03:49on columns.

03:50That's the reason the name is actually Optimized Row Columnar format.
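
The difference between a row layout and a column layout can be shown in a few lines of Python. This is only a sketch of the idea with invented sample data; it does not read or write actual ORC files.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Pune", "amount": 120.0},
    {"id": 2, "city": "Delhi", "amount": 80.0},
    {"id": 3, "city": "Pune", "amount": 45.5},
]

# Column layout: one list per field, with values kept in row order.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An aggregate over one field only needs to touch that one column.
total_amount = sum(columns["amount"])
```

With the column layout, a query that only needs `amount` never has to look at `id` or `city`, which is the main reason columnar formats suit analytical reads.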

03:54It was developed by Hortonworks for optimizing the read and write operations in Apache Hive

04:00and even today we are using this thing inside the Apache Hive.

04:05Hive is a data warehouse system that runs on top of Hadoop,

04:12and it allows you to store and query very large data sets and the huge data associated

04:17with them.

04:18The last but important one is Parquet.

04:22Parquet is another columnar data format.

04:24So like ORC, Parquet is also a columnar data format, but the difference is that this is something

04:30which was created by Cloudera and Twitter.

04:33A Parquet file contains row groups.

04:36Data for each column is stored together in the same row group.

04:40So one row group is going to hold, for each column, multiple values

04:45which are stored together inside it.

04:47Each row group contains one or more chunks of data.

04:51The Parquet file includes metadata that describes the set of rows found in each chunk.
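
A rough way to picture row groups, column chunks and metadata is the pure-Python sketch below. It is not the Parquet file format or any Parquet library. The helper names `to_row_groups` and `groups_that_may_contain`, and the choice of row count plus min/max values as metadata, are assumptions made for illustration.

```python
def to_row_groups(rows, group_size):
    """Split rows into groups; inside each group store values column by column."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        columns = {key: [row[key] for row in chunk] for key in chunk[0]}
        # Per-group metadata: the row count and the min/max of each column.
        metadata = {
            "num_rows": len(chunk),
            "stats": {key: (min(vals), max(vals)) for key, vals in columns.items()},
        }
        groups.append({"columns": columns, "metadata": metadata})
    return groups


def groups_that_may_contain(groups, column, value):
    """Use only the metadata to decide which row groups need to be read."""
    matches = []
    for index, group in enumerate(groups):
        low, high = group["metadata"]["stats"][column]
        if low <= value <= high:
            matches.append(index)
    return matches


rows = [{"id": i, "score": i * 10} for i in range(1, 7)]
groups = to_row_groups(rows, group_size=3)
```

Calling `groups_that_may_contain(groups, "id", 5)` looks only at the metadata, so a reader can skip whole row groups without touching their column data.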

04:58These four different formats are giving me an idea that this kind of semi-structured data

05:03I can store and I can use different formats as per the need or as per the technology which

05:08I am associated with.

05:11Any of these four formats which are widely used can be used as a semi-structured data which

05:15will be non-relational data.

05:18Now let's understand what unstructured data is.

05:21Well, unstructured data is mostly going to be data like video, audio, some media streams

05:27or some documents which do not have any proper structure that needs to be followed.

05:32It is often processed to extract data, organize and categorize it, or identify structures, something

05:39like that.

05:40It is frequently used in combination with machine learning and cognitive services capabilities

05:45because this audio, video and streaming data which you are using here can be generated by

05:50some devices like IoT devices.

05:53It can also be processed by existing APIs like the cognitive APIs or the vision

05:59API, and then this data is going to be used and stored inside your data operations and analytics.

06:06Along with semi-structured and unstructured data formats you also need to understand something which

06:10is known as NoSQL.

06:12Now what is NoSQL? If you ask me, I will say that the simplest kind of NoSQL database is a key-value store.

06:19It has two elements, a key and a value.

06:22The key uniquely identifies the item and the value holds the data for that particular item.
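
A toy in-memory version of a key-value store fits in a few lines of Python. The class name `KeyValueStore`, its methods and the sample keys are hypothetical and do not correspond to any Azure service API.

```python
class KeyValueStore:
    """A toy in-memory key-value store; the value is opaque to the store."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value  # overwrite whatever was stored under this key

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        self._items.pop(key, None)  # deleting a missing key is not an error


store = KeyValueStore()
store.put("customer:1", b'{"name": "Asha Rao"}')   # stored as raw bytes
store.put("customer:2", b'{"name": "John Smith"}')
value = store.get("customer:1")
```

The store never looks inside the value, so every lookup goes through the key. Finding a customer by name would mean reading every value in the application code.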

06:27When you are dealing with NoSQL, it can be a key-value store which you are using, which

06:33is the most common way, but other than a key-value store we can also store data in the

06:37form of documents, we can also use column-family databases,

06:44and we can also use graph databases in NoSQL.

06:47When you are using document-based databases, these sit at the

06:53opposite end of the NoSQL spectrum from a key-value store.

06:54In this case a document database is going to have each document with a unique ID, and

07:01the document is going to store the fields of one record of your data.

07:05The fields inside the documents are going to be transparent to the database management system, so it can query and filter on them.

07:10Documents are often stored in JSON format, and you can

07:15retrieve them by ID or by the key-value pairs which are stored inside them.

07:19If you are not willing to store data in JSON format, you can also use some other format

07:24like XML, YAML or BSON, or you can even store documents as plain text.
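
The contrast with a key-value store can be sketched in Python: each document gets a unique ID, but the store can also look inside the documents and filter on their fields. The `insert` and `find` helpers and the sample documents are hypothetical, not the API of any real document database.

```python
import json
import uuid

documents = {}  # document ID -> document


def insert(document):
    """Store a document under a newly generated unique ID and return that ID."""
    doc_id = str(uuid.uuid4())
    documents[doc_id] = document
    return doc_id


def find(field, value):
    """Return documents whose field matches; the store can see inside documents."""
    return [doc for doc in documents.values() if doc.get(field) == value]


first_id = insert({"name": "Asha Rao", "city": "Pune"})
insert({"name": "John Smith", "city": "Delhi", "title": "Mr."})

in_pune = find("city", "Pune")             # query by a field inside the document
as_json = json.dumps(documents[first_id])  # fetch by ID and serialize as JSON
```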

07:33A column-family database organizes data into rows and columns, with related columns grouped into families. The ORC

07:38and Parquet files I showed earlier are column-oriented file formats, which is a related but separate idea.

07:42And then we have a fourth one, the graph database, which is comparatively newer than the

07:48other three, but it is widely used in modern social media and e-commerce

07:54websites where we want to store non-relational data which still has some kind

08:01of logical relationship with other data.

08:03I will explain this in the next slide.

08:07The graph database enables you to store entities but the main focus is on the relationship that

08:13these entities are going to have with each other.

08:15Like you can see I am showing you one diagram in this where we have employee and there are

08:21chances that this employee whose name is Sara is actually manager of another group of employees.

08:27We have employee called Alok, Inesa and Max, they are actually going to report to Sara and

08:34when we have this kind of relationship between one employee and another employee, to maintain

08:39the hierarchy and the relation between multiple employees or to maintain the hierarchy with

08:45the departments and employee, you have to make sure that the relationship should be maintained

08:50even in this kind of non-relational data structure.

08:53When you are dealing with graph databases, fast retrieval of related data is going to

08:59be a base necessity, and that's why graph databases are built to follow relationships directly,

09:04which can be much faster than joining many tables in a traditional RDBMS.

09:10A graph database is going to store two types of information.

09:14Nodes, which you can think of as instances of an entity, like here employees and

09:20departments are the nodes, and then you also need to specify the edges.

09:25The edges are going to define the relationships between nodes.

09:29So each square box here in this diagram is a node, and each arrow which is connecting one

09:35node with another node is an edge, which defines the relationship between nodes.

09:41This kind of n number of relationships can be there and this kind of n number of nodes

09:45can be there inside one graph database.
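
The employee diagram can be written down as nodes and edges in plain Python. This is a minimal sketch of the data model only, not a graph database. The relationship name `REPORTS_TO` and the helper functions are assumptions, while the employee names come from the diagram described above.

```python
# Nodes: entities with a label.
nodes = {
    "Sara": {"label": "Employee"},
    "Alok": {"label": "Employee"},
    "Inesa": {"label": "Employee"},
    "Max": {"label": "Employee"},
}

# Edges: (source node, relationship, target node).
edges = [
    ("Alok", "REPORTS_TO", "Sara"),
    ("Inesa", "REPORTS_TO", "Sara"),
    ("Max", "REPORTS_TO", "Sara"),
]


def direct_reports(manager):
    """Follow REPORTS_TO edges backwards to find who reports to the manager."""
    return sorted(src for src, rel, dst in edges if rel == "REPORTS_TO" and dst == manager)


def manager_of(employee):
    """Follow the REPORTS_TO edge forwards; return None if there is no manager."""
    for src, rel, dst in edges:
        if rel == "REPORTS_TO" and src == employee:
            return dst
    return None
```

Both questions are answered by walking edges rather than by matching key columns across tables, which is the kind of query a graph database is designed around.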

09:48A graph database can store hierarchical data structures and it's really flexible in

09:54data modeling.

09:55It can also store this data in different formats including JSON, and then it can be retrieved

10:01in your application using any programming language.

10:05So in this video we have seen and explored the concept of non-relational data.

10:09We have seen the file format, the way we store the data and we have also seen the

10:14varieties which are available in that from different companies and technologies.
