Transcript
00:00it sometimes seems we're being deluged with data wave upon wave of news and messages
00:17submerged by step counts constantly bailing out to make room for more
00:24we buy it surf it occasionally drown in it and with modern technology quantify ourselves
00:35and everything else with it data is the new currency of our time data has become almost
00:44a magic word for anything crime and lunacy and literacy and religion and drunkenness
00:54you name it somebody was gathering information about it it offers the ability to be
01:00transformationally positive it's in one sense just the reduction in uncertainty so what exactly is data
01:10how is it captured stored shared and made sense of the engineers of the data age are people that most
01:20of us have never heard of despite the fact that they brought about a technological and philosophical
01:28revolution and created a digital world that the mind boggles to comprehend this is the story of the word
01:38of our time how the constant flow of more and better data has transformed society
01:49and is even changing our sense of ourselves i can't believe this is my life now
01:55so come on in because the water's lovely
02:13my name is hannah fry i'm a mathematician and i'd like to begin with a confession
02:19i haven't always loved data truth is mathematicians just don't really like data that much and for
02:27most of my professional life i was quite happy sitting in a windowless room with my equations
02:32describing the world around me you can capture the arc of a perfect free kick or the beautiful aerodynamics
02:41of a race car the mathematics of the real world is clean and ordered and elegant
02:48everything that data absolutely isn't there was one moment that helped to change my mind
02:56it was in 2011 when i came across a little game that a teenage wikipedia user called mark j
03:03had invented now mark noticed if you hit the first link in the main text of any wikipedia page
03:10and then do the same for the next page a pattern emerges so the page for data for example
03:17links from set to maths to quantity to property and then philosophy which after a few more links will
03:26loop back onto itself now the page egg ends up in the same place and even that famously philosophical
03:34boy band one direction will take you all the way through to philosophy although you have to go through
03:40science to get there the same goes for fungi or hairspray marmalade even mice dust and socks it was a
03:55very strange finding and it called for some statistics
04:02another wikipedia user ilmari wrote a computer program to try and investigate this phenomenon
04:09now he discovered amazingly that for almost 95 percent of wikipedia pages you will end up getting to
04:18philosophy eventually now that's pretty cool but how did it change my mind about data well the pattern
04:27that mark j discovered and the data that was captured and analyzed it revealed a hidden mathematical
04:35structure because wikipedia is just a network with loops and chains hidden all over the place
04:42and it's something that can be described beautifully using mathematics
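The first-link game described above can be sketched as a walk over a map from each page to its first link. This is only an illustrative sketch: the function name `follow_first_links` and the toy `FIRST_LINK` map are assumptions. The first entries mirror the chain mentioned in the programme, while `page_x` and `page_y` are placeholders for the unnamed "few more links" that loop back.

```python
def follow_first_links(start, first_link):
    """Follow first links from `start` until a page repeats or there is no link.

    Returns (path, loop_start). `path` lists the pages visited in order and
    `loop_start` is the page where the walk re-enters its own trail, or None
    if the walk hit a page with no recorded first link.
    """
    path = []
    seen = set()
    page = start
    while page is not None and page not in seen:
        seen.add(page)
        path.append(page)
        page = first_link.get(page)  # None when no link is recorded
    return path, page


# Hypothetical toy map. Only data -> ... -> philosophy comes from the transcript;
# page_x and page_y are invented placeholders for the loop after philosophy.
FIRST_LINK = {
    "data": "set",
    "set": "maths",
    "maths": "quantity",
    "quantity": "property",
    "property": "philosophy",
    "philosophy": "page_x",
    "page_x": "page_y",
    "page_y": "philosophy",
}

path, loop_start = follow_first_links("data", FIRST_LINK)
print(" -> ".join(path))
print("loops back to:", loop_start)
```

A real experiment would need the actual first link of every page, which is the part ilmari's program had to gather; the walk itself is this simple.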
04:49for me this was the perfect example of how there are two parallel universes there's the tangible noisy messy
04:58one the one that you can see and touch and experience but there's also the mathematical one
05:05where i think the key to our understanding lies and data is the bridge between those two universes
05:15our understanding of everything from cities to crime global trade migration and even disease
05:25it's all underpinned by data
05:28take this for example rural wiltshire and a dairy farm gathering data from its cows wearing pedometers
05:44we can't be out here 24/7 the pedometers help us to have our eyes and ears everywhere it turns out when
05:52cows go into heat they move around a lot more than normal constant monitoring of their steps and some
05:59background mathematics reveal the prime time for insemination we'll be able to look at the data and
06:07within 24 hours there'll be greater chance of getting her in calf data-driven farming is now big business
06:15turning a centuries-old way of life into precision science
06:23pretty much every industry you can think of now relies on data
06:32we all agree that we are undergoing a major revolution in human history
06:46the digital world replacing the analog world a world based on data uh that are made of uh
06:52codes rather than a world made of biological and physical data that is extraordinary why philosophy
06:58at this stage because when you face extraordinary challenges the worst thing you can do is to get
07:04close to it you need to take a long run-up the bigger the gap the longer the run-up and the run-up
07:10is called philosophy in the spirit of taking a long run-up we'll start with the word itself
07:20data is originally from the latin datum meaning that which is given data can be descriptions
07:28counts or measures
07:35of anything
07:39in any format
07:40it's anything that when analyzed becomes information which in turn is the raw material
07:49for knowledge the only true path to wisdom look at the data on data
07:56and before the scientific and industrial revolution the word barely gets a look in in english
08:04but then it starts to appear in print as scientists and the state gather observe and create more and
08:12more of it this arrival of the age of data would change everything
08:24industrial revolution britain for victorians booming industry and the growth of major cities
08:31were changing both the landscape and daily life beyond recognition
08:37into this scene stepped an unlikely man of numbers william farr one of the first people to manage data
08:45on an industrial scale william farr had a quite unusual upbringing in that he was actually the son of
08:51a farm laborer but who had managed to get a medical education was really very unusual for someone of his class
09:01in public health life expectancy and about causes of death
09:15for anyone interested in statistics there was only one place to be
09:21somerset house in london was home to the general register office where in 1839
09:28farr found his dream job from up there in the north wing william farr the apothecary medical
09:35journalist and top statistician would really rule the roost now this place was almost like a factory
09:41here they would collect process and analyze vast amounts of data so in would come the census returns
09:49the records of every single birth death and marriage in the country and out would come the big
09:54picture the usable information that could help inform policy and reform society i think it's
10:02sometimes difficult for us to remember just how little people knew in the early 19th century about
10:06the changes that britain was going through so when farr did an analysis of population density and death
10:12rate he was able to show that life expectancy in liverpool was absolutely atrocious it was far far
10:17worse than the surrounding areas this came as a surprise to a lot of people who believed that
10:21liverpool a coastal town was actually quite a salubrious place to live at somerset house farr
10:29spearheaded a revolution in the systematic collection of data to uncover the real picture of this changing
10:37society its scale and ambition was described in a newspaper at the time in arched chambers of immense
10:46strength and extent are in many volumes the genuine certificates of upwards of 28 million persons
10:54born into life married or passed into the grave here every person was recorded equally a revolutionary idea
11:04here are to be found the records of non-entities side by side with those once learned in the law
11:15or distinguished in literature art or science but what really motivated william farr was not just data
11:24collection it was the possibility that data gathered could be analyzed to help overcome society's greatest ill
11:34history of all of the victorian diseases the terrifying thing was that you could wake up in the morning
11:42and feel absolutely fine and then be dead by the evening between the 1830s and the 1860s tens of thousands
11:52died in london alone the control of infectious diseases like cholera which no one fully understood
12:01became the greatest public health issue of the time however great london might have looked back then it
12:09would have smelled absolutely terrible at that point the victorians didn't have really a great way of
12:15disposing of human waste so it would have flowed down the gutters into open sewers and out into the thames
12:22now the city smelt so bad that it was pretty plausible that the foul air was responsible for
12:30carrying the disease farr collected a huge range of data during each cholera outbreak to try to identify
12:39what put people most at risk from the bad air he used income tax data to try and measure the affluence of
12:47the different boroughs that were affected by cholera he asked his friends at the royal observatory to
12:52provide data on the temperature and climatic conditions but the one that he thought was
12:57most convincing was about the topography is about the elevation above the thames using the data farr
13:04suggested a mathematical law of elevation its equations described how cholera mortality falls
13:11the higher you live above the thames now he published his report in 1852 which the lancet described
13:20as one of the most remarkable productions of type and pen in any age and country
13:28the only problem was that farr's work although elegant and meticulous was fundamentally flawed
13:35farr stuck to the prevailing theory that cholera was spread by air such is the power of the status quo
13:46but in 1866 five and a half thousand people died in just one square mile of london's east end
13:53and that data made farr change his mind when farr came to write his next report the data told a different
14:03story which proved the turning point in combating the disease the common factor among those who died
14:10was not elevation or air but sewage contaminated drinking water
14:19with this new report farr may seem to have contradicted much of his own work
14:25but i think that this is the perfect example of what data can do
14:29it provides that bridge essential to scientific discovery from theory to proof problem to solution
14:38good data even in huge volumes does not guarantee that you will arrive at the truth
14:45but eventually when the weight of the data tips the balance even the strongest held beliefs can be overcome
14:52of course it was the weight of the data itself which with the dawn of the 20th century was becoming
15:02increasingly hard to manage data stored long form in things like census ledgers could take the best
15:11part of a decade to process meaning the stats were often out of date
15:16when you are dealing with figures like these it's one thing
15:22but when you are counting the population like this it's quite a different matter a deceptively simple
15:28solution got what's now called the information revolution underway encoding data as holes punched in cards
15:38these cards are passed over sorting machines each of which handles 22 000 cards a minute
15:46by the 1950s data processing and simple calculations were routinely mechanized
15:53laying the groundwork for the next generation of data processing machines
15:58they would be put to pioneering work in a rather unlikely place
16:08in a grand london dining hall a group of men and women many in their 80s and 90s have gathered for a
16:15special work reunion
16:19at its peak their employer j lyons purveyor of fine british tea and cakes had hundreds of tea shops
16:26nationwide there are hundreds of items of food all these in a varying quantity each day are delivered to a precise
16:34timetable to the tea shops
16:38these people aren't former j lyons bakers or tea shop managers they were hired for their mathematical skills
16:47lyons had a huge amount of data which has to be processed
16:51often very low value data so for example a transaction from a tea shop would be a cup of tea
16:59but each one had a voucher and had to be recorded and had to go into the accounts
17:05for business reasons and for management reasons every calculation you did not only you had to do it twice
17:12but you had to get it checked by someone else as well the handling of these millions and millions of
17:18pieces of data the storage of that data are the key of the business problem
17:26the lyons team took the world by surprise when in 1951 they unveiled the lyons electronic office or leo for
17:35short at this point only a handful of computers existed and they were used solely for scientific and
17:45military research so a business computer was a radical reimagining of what this brand new technology could be for
17:55each manager has a standing order depending on the day of the week
17:59the program is fed first laying down the sequence for the multiplicity of calculations leo will perform
18:15it was the first opportunity to process large volumes of clerical work take all the hard
18:21work out of it and put it on an automatic system before leo working out an employee's pay
18:28took an experienced clerk eight minutes but with leo that dropped to an astonishing one and a half
18:35seconds it was all so exciting because we were breaking new ground the whole time absolutely
18:44everything which we did has never been done before by anybody anywhere i don't think we realized the kind
18:54of transformation we were part of the post-war years saw a boom in the application of this new computing
19:04technology leo ran on paper tape and cards but soon machines with magnetic tape and discs were developed
19:13allowing for greater data storage and faster calculations
19:17as more businesses and institutions adopted these new machines application of mathematics to a whole host
19:26of new real world challenges took off and the word data went from relatively obscure to ubiquitous
19:36data has become almost a magic word for anything the truth is that is a kind of interface
19:52today between us and the rest of the world in fact between us and ourselves we understand our bodies
20:00in terms of data we understand society in terms of data we understand the physics of the universe in
20:05terms of data economy social science we play with data so essentially is what we interact with most regularly every day
20:17data underpins all human communication regardless of the format and it was the desire to communicate
20:25effectively and efficiently that led to one of the most important academic papers of the 20th century
20:36a mathematical theory of communication has justifiably been called the magna carta for the information age
20:45now it was written by a very young and bright employee of bell laboratories that's the american center for
20:52telecoms research that was founded by one of the inventors of the telephone alexander graham bell
20:58now this paper was written by claude shannon in 1948 and it would effectively lay out the theoretical
21:07framework for the data revolution that was just beginning those that knew him described shannon as a
21:15lifelong puzzle solver and inventor as he finds the correct path he registers the information in his
21:22memory later i can put him down in any part of the maze that he's already explored and he'll be able
21:27to go directly to the goal without making a single false turn during world war ii he worked on data
21:35encryption systems including one used by churchill and roosevelt
21:40but at bell labs claude shannon was trying to solve the very civilian problem of noisy telephone lines
21:56in the analog world of 20th century phones your speech was converted into an electrical signal
22:03using a handset like this and then transmitted down a series of wires the voice signals would travel
22:11along the wire be detected by the receiver at the other end and then be converted back into sound waves
22:18to reach the ear of whoever had picked up problem was the further the electrical signal traveled down
22:24the line the weaker it would get eventually you couldn't even hear the conversation for the amount of noise
22:30on the line and you could boost the signal but that will mean boosting the noise too shannon's genius idea
22:40was just as simple as it was beautiful the breakthrough was converting speech into an incredibly simple code
22:49hello first the audio wave is detected then sampled each point is assigned a code of ones and zeros
23:00and the resulting long string of digits can then be sent down the wire with the zeros as brief low voltage
23:07signals and ones as brief bursts of high voltage from this code the original audio can be cleanly
23:15reconstructed and regenerated at the other end hello shannon was the first person to publish the name for
23:24these ones and zeros the smallest possible pieces of information and they're called bits or binary digits
23:31and the real power of the bits and the mathematics behind it applies way beyond telephones
23:37they offered a new way for everything including text and pictures to be encoded as ones and zeros
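The sample-then-encode idea described above can be sketched in a few lines. This is a simplified illustration rather than the scheme any real telephone network used: the 4-bit code width, the eight samples of a sine wave, and the function names `encode` and `decode` are all assumptions made for the example.

```python
import math


def encode(samples, bits=4):
    """Turn samples in the range [-1, 1] into one long string of ones and zeros."""
    levels = 2 ** bits
    codes = []
    for sample in samples:
        # Map the sample onto one of `levels` evenly spaced steps.
        level = round((sample + 1) / 2 * (levels - 1))
        codes.append(format(level, f"0{bits}b"))
    return "".join(codes)


def decode(bitstring, bits=4):
    """Rebuild approximate sample values from the string of ones and zeros."""
    levels = 2 ** bits
    samples = []
    for start in range(0, len(bitstring), bits):
        level = int(bitstring[start:start + bits], 2)
        samples.append(level / (levels - 1) * 2 - 1)
    return samples


# Eight samples of one cycle of a sine wave stand in for the audio signal.
wave = [math.sin(2 * math.pi * t / 8) for t in range(8)]
bitstream = encode(wave)
rebuilt = decode(bitstream)

print(bitstream)
print([round(value, 2) for value in rebuilt])
```

The rebuilt values differ from the originals by at most half a quantisation step; using more bits per sample shrinks that error at the cost of a longer bitstream.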
23:51the possibility to store and share data digitally in the form of bits was clearly going to transform the
24:00world if anyone has to be identified as uh the genius who uh developed uh the foundational uh mathematics
24:10and the foundation of science for our age that is certainly uh claude shannon now one thing has to be
24:17clarified um the theory developed by shannon uh is about data transmission has got nothing to do with
24:25meaning meaning truth relevance importance of the data transmitted so it doesn't matter whether the zero
24:32and one represent an answer to heads or tails or to the question will you marry me for a theory of
24:42information is data anyway and if it is a 50 50 chance that you will will not marry me or that is heads or tails
24:51the amount of information the shannon information communicated is the same shannon information is not
25:00information like you or i might think about it encoding any and every signal using just ones and zeros
25:08is a pretty remarkable breakthrough however shannon also came up with a revolutionary bit of mathematics
25:16now that equation there is the reason you can fit an entire hd movie on a flimsy bit of plastic
25:23or the reason why you can stream films online now i'll admit it might not look too nice but
25:32don't get put off yet because i'm going to explain how this equation works using scrabble
25:38imagine that i created a new alphabet containing only the letter a this bag would only have a tiles
25:48inside it and my chances of pulling out an a tile would be one i'd be completely certain of what was going
25:54to happen using shannon's math the letter a contains zero bits of what's called shannon information
26:03let's say then i got a little bit more creative but not much and had an alphabet with two letters
26:08a and b and equal numbers of both in this bag now my chances of pulling out an a are going to be a
26:15half and each letter contains one bit of shannon information of course when transmitting real messages
26:25you'll use the full alphabet but english as with every other language has some letters that are used
26:33more frequently than others if you take a quite common letter like h which appears about 5.9 percent
26:41of the time this will have a shannon information of 4.1 bits and incidentally a scrabble score of four
26:52but of course there are some much more exotic and rare letters like z for instance which appears about
26:590.07 percent of the time that gives it 10.5 bits and a scrabble score of 10.
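The figures quoted above follow from the standard definition of Shannon information, the base-2 logarithm of one over the probability. A minimal sketch, where the function name `shannon_information` is an assumption and the probabilities are the ones given in the programme:

```python
import math


def shannon_information(probability):
    """Bits of Shannon information in an outcome with the given probability."""
    return math.log2(1 / probability)


# Probabilities as quoted in the transcript.
examples = {
    "a (one-letter alphabet)": 1.0,
    "a (two-letter alphabet)": 0.5,
    "h": 0.059,
    "z": 0.0007,
}

for label, probability in examples.items():
    print(label, round(shannon_information(probability), 1), "bits")
```

The rarer the letter, the larger the number of bits, which is the same ordering the scrabble scores happen to reflect.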
27:10bits measure our uncertainty if you're guessing a three-letter word and you know this letter is z
27:17it gives you a lot of information about what the word could be
27:20but if you know it's h because it's a more common letter with less information you're more uncertain
27:29about the answer
27:33now if you wrap up all that uncertainty together you end up with this the shannon entropy
27:40it's the sum of the probability of each symbol turning up times the number of bits in each symbol
27:46and this very clever bit of insight and mathematics means that the code for any message can be
27:54quantified not every letter or any other signal for that matter needs to be encoded equally
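The entropy described above, the probability of each symbol times its number of bits, summed over all symbols, can be sketched as follows. The function names `shannon_entropy` and `entropy_of_text` are assumptions, and estimating probabilities from the letter counts of a single short message is a simplification made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Average bits per symbol: sum of p * log2(1 / p) over all symbols."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


def entropy_of_text(text):
    """Estimate entropy from how often each character appears in `text`."""
    counts = Counter(text)
    total = len(text)
    return shannon_entropy(count / total for count in counts.values())


print(shannon_entropy([1.0]))        # one-letter alphabet: no uncertainty
print(shannon_entropy([0.5, 0.5]))   # two equally likely letters
print(shannon_entropy([0.9, 0.1]))   # two letters, one much more common
print(entropy_of_text("abab"))
```

When one letter is far more common than the other the average drops below one bit per letter, which is why unevenly used symbols leave room for shorter codes.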
28:03the digital code behind a movie like this one of my dog molly for example can usually be compressed
28:10by up to 50 percent without losing any information but there's a limit
28:19compressing more might make it easier to share or download but the quality can never be the same as the
28:27original you can't really overstate the impact that shannon's work has had because without it we
28:38wouldn't have jpegs or zip files or hd movies or digital communications but it doesn't just stop
28:45there because while the mathematics of information theory doesn't tell you anything about the meaning
28:50of data it does begin to open up a possibility of how we can understand ourselves and our society
28:58because pretty much anything and everything can be measured and encoded as data
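The point about compressing without losing any information can be tried with Python's standard `zlib` module. This is a stand-in chosen for convenience, not the method used for video or jpegs, and the repeated sample sentence is an assumption picked because repetitive text compresses well.

```python
import zlib

# A deliberately repetitive message, encoded as bytes.
text = ("data is the new currency of our time " * 50).encode("utf-8")

packed = zlib.compress(text)        # lossless compression
restored = zlib.decompress(packed)  # exact reconstruction

print(len(text), "bytes before,", len(packed), "bytes after")
print("identical after round trip:", restored == text)
```

Lossless schemes like this one always return the original bytes exactly; the quality loss mentioned in the programme belongs to lossy compression, which discards detail to go below that limit.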
29:11we say that signals flow through human society that people use signals to get things done that
29:16our social life is in many ways the sending back and forth of signals so what is a signal it's in one
29:23sense just the reduction in uncertainty what it means to receive a signal is to be less uncertain than
29:39you were before and so another way to think of measuring or quantifying a signal is in that change
29:45in uncertainty using shannon's mathematics to quantify signals is common in the world of complexity science
29:54it's rather less familiar to historians i love maths i love its precision i love its beauty
30:00i absolutely love its certainty and that simon can bring that mathematical worldview that mathematical
30:20certainty to what i work with the reason behind this remarkable marriage between history and science
30:28is the analysis of the largest single body of digital text ever collated about ordinary people
30:36it's the proceedings of london's old bailey the central criminal court of england and wales which
30:42hosted close to 200 000 trials between 1674 and 1913 there are 127 million words of everyday speech
30:53um in the mouths of orphans and women and servants and ne'er-do-wells of criminals certainly but also people
31:03from every rank and station in society and that made them unique what's exciting about the old bailey and
31:12the size of the data set the length the magnitude of it is that not only can we detect a signal but we're
31:19able to look at that signal's emergence over time shannon's mathematics can be used to capture the amount
31:27of information in every single word and like the alphabet the less you expect a word the more bits of
31:35information it carries imagine that you walk into a courtroom uh at the time and you hear a single word
31:43the question we ask is how much information does that word carry about the nature of the crime being
31:50tried you hear the word the it's common across all trials and so it gives you no bits of information
32:01most words you hear are poor signals of what's going on
32:04but then you hear purse it conveys real information then comes coin grab and struck the more rare a word
32:19the more bits of information it carries the stronger this signal becomes one of the clearest signals that
32:28we see in the old bailey one of the clearest processes that comes out is something that uh is known as the
32:33civilizing process it's an increasing sensitivity to and attention to the uh distinction between
32:44violence and non-violent crime if for example um somebody hit you and stole your handkerchief
32:52in the 18th century context in 1780 you would concentrate on the handkerchief more worried about
32:58a few pence worth of dirty linen than the fact that somebody just broke your nose
33:03or cracked a rib the fact that a hundred years later by 1880 every concern every focus both in terms of
33:13the words used in court but also in terms of what people were brought to court for focused on that
33:19broken nose and that cracked rib speaks to a fundamental change in how we think about the world
33:26and how we think about how social relations work look at the strongest word signals for violent crime
33:34across the period in the 18th century the age of the highwayman words relating to property theft dominate
33:43but by the 20th century it's physical violence itself and the impact on the victim that carry the most weight
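One way to make "how much information does that word carry about the nature of the crime" concrete is to measure how far hearing the word shifts a listener's belief about the type of trial. The sketch below is a simplified stand-in and not the researchers' actual method: the function name `bits_about_crime`, the two crime categories, the equal prior over them, and every count in `TOY_COUNTS` are invented for illustration.

```python
import math


def bits_about_crime(word, counts):
    """Bits gained about the trial type from hearing `word` once.

    `counts` maps each crime type to a dict of word counts. The result is the
    Kullback-Leibler divergence from an equal prior over crime types to the
    posterior after hearing the word.
    """
    types = list(counts)
    prior = 1 / len(types)
    likelihood = {
        t: counts[t].get(word, 0) / sum(counts[t].values()) for t in types
    }
    evidence = sum(likelihood[t] * prior for t in types)
    if evidence == 0:
        return 0.0  # word never heard, so nothing is learned
    bits = 0.0
    for t in types:
        posterior = likelihood[t] * prior / evidence
        if posterior > 0:
            bits += posterior * math.log2(posterior / prior)
    return bits


# Invented toy counts, not taken from the old bailey data.
TOY_COUNTS = {
    "theft": {"the": 1000, "purse": 90, "struck": 10},
    "assault": {"the": 1000, "purse": 10, "struck": 90},
}

for word in ("the", "purse", "struck"):
    print(word, round(bits_about_crime(word, TOY_COUNTS), 3))
```

A word used equally in both kinds of trial scores zero, while a word concentrated in one kind scores more, matching the contrast drawn between "the" and "purse".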
33:51that notion that one can trace change over time by looking at language and how it's used who deploys
33:59it in what context that i think gives this kind of work its real power there are billions of words
34:06there's all of google books there is every printed newspaper there is um every speech made in parliament
34:13every sermon given at most churches all of it is suddenly data and capable of being analyzed
34:24the rapid development of computers in the mid-20th century transformed our ability to encode store
34:30and analyze data it took a little longer for us to work out how to share it
34:37this place is home to one of the most important uk scientific institutions although it's one you've
34:47probably never heard of before but since the 1900s this place has advanced all areas of physics radio
34:55communications engineering material science aeronautics even ship design npl the national physical
35:04laboratory in southwest london is where the first atomic clock was built and where radar and the
35:10automatic computing engine or ace were invented the ace computer was the brainchild of alan turing who
35:19came to work here right after the second world war now turing's contributions to the story of data are
35:25undoubtedly vast but more important for our story is another person who worked here with turing
35:32someone who arguably is even less well known than this place donald davies
35:39davies worked on secret british nuclear weapons research during the war later joining turing at npl
35:48climbing the ranks to be put in charge of computing in 1966 as well as the new digital computers davies had
35:58a lifelong fascination with telephones and communication his mother had worked in the post
36:04office telephone exchange so even when he was a kid he had a real understanding of how these phone
36:08calls were routed and rerouted through this growing network and that was the perfect training for what was
36:15what was donald davies like then he was a super boss because he was very approachable um everybody realized
36:34he'd got huge intellect i mean but not not difficult with it very nice guy davies's innovation was to develop
36:43with his team a way of sharing data between computers a prototype network donald had spotted that
36:51there was a need to connect computers together and to connect people to computers not by punch cards or
36:57paper tape on a motorcycle but over the wires where you can move files or programs or run a program
37:04remotely on another computer and the telephone network is not really suited for that
37:09in the pre-digital era sending an encoded file along a telephone line meant that the line was engaged for
37:17as long as the transmission took so the opportunity here was because we owned the site 78 acres with some
37:2550 buildings we could build a network davies's team sidestepped the telephone problem by laying high
37:32bandwidth data cables before instituting a new way of moving data around the network
37:41and the technique he came up with was packet switching the idea being that you take whatever
37:47it is you're going to send you chop it up into uniform pieces uh like having a standard envelope and
37:53you put the pieces into the envelope and you post them off and they go separately through the network and
37:57get reassembled at the far end to demonstrate this idea roger and i are convening npl's first ever
38:05packet switching data dash which is a bit more complicated than your average sports day event
38:13the course is a data network there are two computers represented here as the start and finish signs
38:22those computers are connected via a series of network cables and nodes in our case cables are lines of
38:29cones and the connecting nodes are hula hoops
38:35having built it all we need now are some willing volunteers and here they are npl's very own apprentices
38:44so welcome to our packet switching sports day we've got two teams uh red and blue both teams are
38:54pretending to be data and they're going to have to race you're going to start over there where it says
39:00start kind of obvious and you're trying to get through to the end as quickly as you possibly can
39:06you can't just go anywhere you have to go through these hoops to get to the finish line these these little
39:12nodes in our network you're only allowed to travel along the lines of the cones but only if there's
39:18nobody else along that line all clear okay there is one catch all of you who are in the red team
39:26we're going to tie your feet together so you've got to travel around our network as one big chunk of
39:34data those of you who are in blue you're allowed to travel on your own right it's slightly easier the
39:40objective is for both teams to deposit their bean bags in the goal in the right order one to five
39:52get in the hoop get in the hoop yeah bring out your competitive spirit here we've got packets
39:56versus big chunks of data i'm going to time you everyone ready okay over to you roger
40:02remember you can't go down the route until it's clear the red and blue teams are exactly the same
40:13size let's say five megabytes each but their progress through the network is clearly very different
40:30very good okay blues you took 13 seconds pretty impressive reds 20 seconds that is a victory for
40:39the packet switchers well done you guys well done you guys the impact that packet switching has had on
40:46the world i mean it sort of came from here and then spread out elsewhere it did indeed we uh gave the
40:52world packet switching and the world of course being america they took it on and ran with it
41:01this little race donald davies packet switching was adopted by the people that would go on to
41:07build the internet and today the whole thing still runs on this idea
41:13let's say i want to email you a picture of molly first it will be broken up into over a thousand
41:22data packets each one is stamped with the address of where it's from and where it's going to which
41:29routers check to keep the packets moving regardless of the order they arrive the image is reassembled and
41:37there she is this is quite a cool thing right that you've got one of the original creators of packet
41:45switching right here and you can ask them i mean every time you're like well do anything really
41:51why is my internet running so slowly don't ask me
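The packet idea described above, chop the data into uniform pieces, stamp each with where it is from and where it is going, and reassemble at the far end whatever order they arrive in, can be sketched like this. The field names, the 8-byte packet size, the sequence number used for reordering, and the stand-in "picture" bytes are all assumptions for the example; real internet packets carry far more detail.

```python
import random


def split_into_packets(payload, source, destination, size=8):
    """Chop `payload` into uniform pieces, each stamped with its addresses."""
    return [
        {
            "from": source,
            "to": destination,
            "seq": index,  # assumed sequence number so the order can be restored
            "data": payload[start:start + size],
        }
        for index, start in enumerate(range(0, len(payload), size))
    ]


def reassemble(packets):
    """Put the pieces back in order, regardless of the order they arrived in."""
    ordered = sorted(packets, key=lambda packet: packet["seq"])
    return b"".join(packet["data"] for packet in ordered)


picture = b"pretend these bytes are a picture of molly"
packets = split_into_packets(picture, "hannah", "you")

# Simulate packets taking different routes and arriving out of order.
random.Random(0).shuffle(packets)

received = reassemble(packets)
print(len(packets), "packets, reassembled correctly:", received == picture)
```

Because every packet carries its own addresses and position, no single line has to stay engaged for the whole transfer, which is the advantage the blue team had in the race.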
41:55we've come a very long way in just a few decades around 3.4 billion people now have access to the
42:09internet at home and there are around four times the number of phones and other data sharing devices
42:15online the so-called internet of things
42:19just by being alive in the 21st century with our phones our tablets our smart devices
42:28all of us are familiar with data i really embrace your inner nerd here because every time you wander
42:34around looking at your screen you are gobbling up and churning out absolutely tons of the stuff
42:40and our relationship with data has really changed it's no longer just for specialists it's for everyone
42:47there's one city in the uk that's putting the sharing and real-time analysis of data
42:54at the heart of everything it does bristol using digital technology we take the city's pulse
43:04this data is the route to an open smart livable city a city where optical wireless and mesh networks
43:14combine to create an open urban canopy of connectivity taking the pulse of the city under a canopy of
43:25connectivity might sound a bit sci-fi or like something from a broadband advert but if you just
43:31hold on to your cynicism for a second because bristol are trying to build a new type of data sharing network
43:39for its citizens there's a city center area which now has next generation or maybe the generation after
43:46next of super fast broadband and then that's coupled to a wi-fi network as well the question is what can
43:52you do with it we would have a wide area network of very very simple internet of things uh sensing devices
44:07that just monitor a simple signal like air quality or traffic queued in a traffic jam once you've got
44:12all this network infrastructure you can get an awful lot of a really huge amount of data arriving to you
44:19in real time
44:23what's happening here is a city scale experiment to try and develop and test what's going to be called
44:29the programmable city of the future it relies on bristol's futuristic network vast amounts of data from
44:38as many sensors as possible and a computer system that can simulate and effectively reprogram the city
44:47the computer system can intervene it could reroute traffic and we can actually radio out to individuals
44:53so maybe they'll get a message on their smartphone or perhaps a wrist mounted device saying if you have
44:58asthma perhaps you should get indoors once you create that capacity for anything and everything
45:05in the city to be connected together you can really start to reimagine how a city might operate we are
45:12starting to experiment with driverless cars and in order for driverless cars to work they have to be
45:18able to communicate with the city infrastructure so your car needs to speak to the traffic lights the
45:24traffic lights need to speak to the car the cars need to speak to each other all of that requires a
45:29completely different set of infrastructure of course as the amount of data a city can share
45:37grows the computing power needed to do something useful with it must grow too
45:45and for that we have the cloud
45:48for example imagine trying to analyze all of bristol's traffic data weather and pollution
45:55data on your home computer it could take a year
46:01well you could reduce that to a day by getting 364 more computers but that's expensive
46:09a cheaper option is sharing the analysis with other computers over the internet which google worked out
46:15first but they published the basics and now free software exists to help anyone do the same
46:24big online companies rent their spare computers for a few pence an hour
46:29so now anyone like me or you can do big data analytics quickly for a few quid
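The arithmetic behind a year becoming a day, and the idea of sharing one analysis between many computers, can be sketched roughly as follows. The programme doesn't name the technique, so this assumes the common pattern of giving each worker its own slice of the data and then combining the partial answers. The road names and function names are invented for the example, and threads on one machine stand in for rented computers.

```python
from __future__ import annotations

from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def ideal_days(days_on_one_machine: float, machines: int) -> float:
    """Best case only: assumes the work divides perfectly with no overhead."""
    return days_on_one_machine / machines


def map_chunk(readings: list[tuple[str, int]]) -> Counter:
    """One worker totals the vehicle counts per road for its own slice."""
    totals: Counter = Counter()
    for road, vehicles in readings:
        totals[road] += vehicles
    return totals


def reduce_totals(partials: list[Counter]) -> Counter:
    """Combine the partial totals from every worker into one answer."""
    combined: Counter = Counter()
    for partial in partials:
        combined.update(partial)  # Counter.update adds counts together
    return combined


def analyse(readings: list[tuple[str, int]], workers: int = 4) -> Counter:
    """Split the readings between workers, then merge what they report back."""
    chunks = [readings[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_totals(partials)


if __name__ == "__main__":
    print(ideal_days(365, 365))  # one computer plus 364 more
    traffic = [("park street", 12), ("gloucester road", 30), ("park street", 8)]
    print(analyse(traffic, workers=2))
```

In practice the speed-up is smaller than the ideal figure, because moving data between machines and merging results takes time too.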
46:36such computing power is something we could never have dreamt of just a few years ago but it will only
46:48fulfill its potential if we can share our own data in a safe and transparent way if bristol council wanted
46:57to know where your car was uh at all times but could use that information to sort of minimize traffic
47:04jams how would you feel about something like that uh i'm not sure if i particularly like it i think
47:09it's up to me where i leave my car i know i understand the idea of justifying it with all
47:14these great other ideas but i still probably wouldn't like it very much if they are using it for a better
47:19purpose then yeah but one should know that how they are using it and why they'll be using it for what
47:24purpose i'd like to um imagine a world in which all the data that was retained was used for the greater
47:31good of mankind but i can't imagine a circumstance like that in a world that we have today we live in
47:37a modern society where if you don't let your data out there not in the public domain but sort of in the
47:44in the secure business domain then you can't take part in society really unsurprisingly people are
47:51pretty wary about what happens to their data we need to be careful that civil liberties are not eroded
47:59because otherwise the technology is likely to be rejected i think it's an area where we as a
48:04society have yet to sort of fully understand what the what the correct way forward is and therefore
48:11it is very much a discussion it's not a lecture it's not a code it's one where we are co-producing and
48:18co-forming these sorts of rules with people in the city in order to sort of help us work out what the
48:24right and wrong things to do are it will be intriguing to watch bristol grapple with the
48:30technological and ethical challenges of being our first data-centric city
48:37you know in this context internet of things uh new uh forms of healthcare um smart cities what we're seeing
48:46is an increase in transparency you can see through the body you can see through the house you can see
48:52through the city and the square you can see through society now transparency may be
48:58good uh it's something that we may need uh to handle carefully in order to extract the value from those
49:04data to improve your uh lifestyle uh your social interactions uh the way in which a city works and
49:12so on but it also needs to be carefully handled because it's touching the ultimate nerve of what it means to
49:19be human so how much data should you give away traffic management is one thing but when it comes
49:27to healthcare the stakes the risks and benefits are even higher and in bristol with a project called
49:37sphere they're pushing the boundaries here too the population is getting older and an aging population
49:45need more intense healthcare but it's very difficult to pay for that healthcare in institutions paying
49:52for nurses and doctors so the key insight of the sphere team was that it's now possible to arrange in
49:58a house lots of small devices where each device is just monitoring a simple set of signals about what's
50:04going on in that house there might be monitors for your heart rate or your temperature but there might
50:09also be monitors that notice as you're going up and down stairs whether you're limping or not
50:15they've invited me to go and spend a night in this very experimental house but unfortunately i'm not
50:22allowed to tell you where it is the project is a live-in experiment and will soon roll out to 100
50:29homes across bristol it's a gigantic data challenge overseen by professor ian craddock so that's one up there
50:36then yes that's one of the video uh sensors and we have some more sensors in the kitchen we have another
50:41video camera in the hall and some environmental sensors and a few more in here okay so you can tell when
50:47you're passing through the house can generate 3d video body position location and movement data from
50:55a special wearable how much data are you collecting then so when we scale from this house to 100 houses in
51:02bristol in total we'll be storing over two petabytes of data for the project lord so i don't
51:09on my computer at home i don't even have a terabyte hard drive and you're talking about 2 000 of those
51:14yes i mean you know the interaction of people with their environment and and with each other is a very
51:19very complicated and very variable thing and that's why it's a very challenging area especially for
51:25you know data analysts machine learners to make sense of this big mass of data
51:29i'm happy to find out that the research doesn't call for cameras in the bedroom or bathroom
51:37but i do have to be left entirely on my own for the night
51:42the very first thing that i'm going to do is pour myself a nice bloody big glass of wine
51:52so that nice glass of wine that i'm enjoying now isn't completely guilt-free because i've got to admit to
51:57it to uh the university of bristol i have to keep a log of everything i do so that the data from my
52:05stay can be labeled with what i actually got up to in this way i'll be helping the process of machine
52:11learning teaching the team's computers how to automatically monitor things like cooking washing
52:17and sleeping signals in the data of normal behavior
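Keeping a log so the sensor data can be labeled is what makes this kind of machine learning possible: the computer is shown readings alongside what was really happening, then asked to name new readings itself. A minimal sketch of that idea follows. It is not the sphere team's method. The nearest-average classifier, the two features (room temperature and motion events per minute) and all the numbers are assumptions chosen to keep the example small.

```python
from __future__ import annotations

from collections import defaultdict
from math import dist


def train(labelled: list[tuple[tuple[float, ...], str]]) -> dict[str, tuple[float, ...]]:
    """Average the sensor readings logged under each activity label."""
    groups: dict[str, list[tuple[float, ...]]] = defaultdict(list)
    for features, label in labelled:
        groups[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in groups.items()
    }


def predict(centroids: dict[str, tuple[float, ...]], features: tuple[float, ...]) -> str:
    """Name the activity whose average reading is closest to the new one."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


if __name__ == "__main__":
    # hypothetical log entries: (temperature in celsius, motion events per minute)
    log = [
        ((24.0, 12.0), "cooking"),
        ((25.0, 10.0), "cooking"),
        ((18.0, 0.0), "sleeping"),
        ((17.0, 1.0), "sleeping"),
    ]
    model = train(log)
    print(predict(model, (24.0, 11.0)))
```

The more honestly the log is kept, the better the averages describe each activity, which is why even the glass of wine has to be admitted to.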
52:27in the interest of science i was also asked to do some things that are less expected
52:37the team need to learn to detect out of the ordinary behavior too if they want to one day spot
52:43specific signs of ill health right i'm gonna run this back to the kitchen now
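One simple way to think about spotting out of the ordinary behavior is to measure how far a new reading sits from what is normal for that house. The sketch below assumes a single signal (motion events per minute, with made-up values) and a hypothetical cut-off of three standard deviations. The programme doesn't say how the team actually does it, so treat this as an illustration of the idea only.

```python
from __future__ import annotations

from statistics import mean, stdev


def unusual_moments(baseline: list[float], observed: list[float], threshold: float = 3.0) -> list[int]:
    """Return the positions in observed that sit far from the baseline's normal range."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation, needs at least two values
    if sigma == 0:
        return [i for i, value in enumerate(observed) if value != mu]
    return [i for i, value in enumerate(observed) if abs(value - mu) / sigma > threshold]


if __name__ == "__main__":
    normal_evening = [2, 3, 2, 3, 2, 3, 2, 3]  # hypothetical motion counts while on the sofa
    tonight = [2, 3, 12, 3]                    # a sudden dash to the kitchen and back
    print(unusual_moments(normal_evening, tonight))
```

A rule like this only says that something is unusual, not what it means, so a person (or a better model) still has to interpret the flagged moment.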
52:52it's a fairly strange experience i think the temperature sensors the humidity sensors
52:58the motion sensors even the wearable and you know i don't have a problem with it at all
53:04for some reason the body position is the one that's getting me on the flip side though i would go
53:10absolutely crazy to have this data set this is the most wonderful my goodness me everything you
53:16could learn about humans it'd be so brilliant
53:26one thing i wanted to do was to do something completely crazy just to see if they can spot it
53:31in the data just to kind of test them i can't believe this is my life now
53:48anyone can get the data from my stay online if they fancy trying to find my below the radar escapade
53:56the man in charge of machine learning professor peter flach has the first look
54:02between nine and ten you were cooking correct then uh you went into the lounge you had your meal in the
54:09lounge i don't know you know what i ate on the sofa and you were watching crap television i was watching
54:13crap television yeah we found out we didn't switch the crap television sensor on so that's not on here but
54:19um okay so you were in the lounge um sort of until 11 30 correct then you went upstairs there's a very
54:28clear signal here and then from then on there isn't a lot of movements i was in bed i guess you were in
54:35bed sleeping normal activities like cooking or being in bed are relatively straightforward to spot but what
54:43about the weird stuff this is yesterday again so i can see it i can see the moment you can see the
54:51moment i can see it yeah um there's something happening here which is sort of rather quick
54:59you've been in the lounge for quite a while and then suddenly there's a change a brief move to the kitchen
55:05here and then very quick cleaning up in the lounge i wasted good wine on this experiment humans are
55:15extraordinarily good at spotting most patterns for machines the task is much more challenging but once
55:24they've learned what to look for they can do it tirelessly i suppose in the long run if you are going
55:33to scale this up to more houses you can't have people sifting through these graphs trying to find
55:40i mean you have to train computers to do them you have to train computers to do them one challenge that
55:44we are facing is that our models our machine learning classifiers and models need to be robust
55:51against changes in layout changes in personal behavior changes in the number of people that are in
55:58a house and maybe we are wildly optimistic about what it can do but we are we are in the process of
56:04trying to find out what it can do at what cost at what uh invasion into privacy and then we can have
56:13a discussion about whether as a society we want this or not if this type of technology rolls out machines
56:21will be modeling us in mathematical terms and intervening to help keep us healthy in real time
56:28and that's completely new it's true that our fascination with machine or artificial intelligence
56:38is as old as computers themselves claude shannon and alan turing both explored the possibilities of
56:44machines that could learn but it's only today with torrents of data and pattern finding algorithms
56:53that intelligent machines will realize their potential
57:00you'll hear a lot of heady stuff about what's going to happen when we mix big data with artificial
57:05intelligence a lot of people understandably very anxious about it but for me despite how much the
57:12world has changed the core challenge is the same as it always was it doesn't matter if you are william
57:18farr in victorian london trying to understand cholera or in one of bristol's wired up houses all you're
57:25trying to do is to understand patterns in the data using the language of mathematics and machines can
57:33certainly help us to find those patterns but it takes us to find the meaning in them we should be
57:40worried about what we're going to do with these smart technologies not about the smart technologies in
57:46themselves they are in our hands to shape our future they will not shape our futures for us
58:00in the blink of an eye we have gone from a world where data information and knowledge belonged only to
58:07the privileged few to what we have now where it doesn't matter if you're trying to work out where to
58:12go on holiday next or researching the best cancer treatments data has really empowered all of us
58:20now of course there are some concerns about big corporations hoovering up the data traces that we
58:26all leave behind in our everyday lives but i for one am an optimist as well as a rationalist and i think
58:34if we can marshal together the power of data then the future lies in the hands of the many
58:41and not just the few and that for me is the real joy of data