Showing posts with label Big Data. Show all posts

Sunday, June 30, 2019

How Big Data Analytics Can Help You Improve And Grow Your Business?

Big Data Analytics

Certain problems can only be solved with big data. Here we discuss the field known as big data analytics. Before big data came into the picture, we never thought about how commodity hardware could be used to store and manage data in a way that is reliable and affordable compared with costly specialized systems. Now let us discuss a few examples of how big data analytics is useful today.
When you visit websites like Amazon, YouTube, or Netflix, they show a section that recommends products, videos, movies, or songs for you. How do you think they do it? These websites generate a great deal of data, and they make sure to analyze it properly. The data is not small; it is big data. By analyzing it, they learn what you like and what your preferences are, and they generate recommendations accordingly.
If you use YouTube, you will have noticed that it knows what kind of songs or videos you want to watch next. Similarly, Netflix knows what kind of movies you like, and Amazon knows what kind of products you would prefer to buy. All of this happens because of big data analytics. Another example is Walmart, which uses big data analytics to increase its profit. How did they do it? They studied the purchase patterns of different customers when a hurricane was about to strike a particular area. Their analysis found that people tend to buy emergency items such as flashlights and life jackets, and that a lot of people also buy chocolate. This example shows how big data analytics can help you improve or grow your business and find better insights in the data you have.
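To make the idea of studying purchase patterns a little more concrete, here is a minimal sketch in Python. It is not how Amazon, Netflix, or Walmart actually work; it only counts which items show up together in the same basket. The basket contents and the function names `co_purchase_counts` and `recommend` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets; real systems analyze far larger logs.
baskets = [
    {"flashlight", "life jacket", "chocolate"},
    {"flashlight", "chocolate"},
    {"flashlight", "batteries"},
    {"chocolate", "milk"},
]


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items appears in one basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a single canonical (a, b) order.
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts


def recommend(item, baskets, top_n=1):
    """Return the items most often bought together with `item`."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [name for name, _ in scores.most_common(top_n)]


print(recommend("flashlight", baskets))
```

With these sample baskets, chocolate is bought alongside a flashlight more often than anything else, so it becomes the suggestion.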

Big data Analytics

Big Data Collected by Smart Meter

Earlier, the meter in our home that measures the electricity we consume was read only once a month. Nowadays IBM has introduced a smart meter solution, and a smart meter collects data every 15 minutes. Whatever energy we consume in each fifteen-minute interval is sent out, and this is how big data is generated. As the picture below shows, that is 96 million reads per day for every million meters. The amount of data generated by smart meters is huge. "Managing the large volume and velocity of information generated by short interval read of smart meter data can overwhelm existing IT resources."
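The 96 million figure follows directly from the 15-minute interval, and a few lines of Python confirm the arithmetic. The function name `reads_per_day` is just an illustrative choice.

```python
MINUTES_PER_DAY = 24 * 60


def reads_per_day(interval_minutes, meters=1):
    """Number of meter reads per day for a given read interval."""
    return (MINUTES_PER_DAY // interval_minutes) * meters


print(reads_per_day(15))                    # one meter
print(reads_per_day(15, meters=1_000_000))  # one million meters
```

One meter read every 15 minutes gives 96 reads a day, so a million meters give 96 million reads a day, compared with a single read per meter per month before.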
Smart Meter

Problem With Smart Meter Big Data

IBM realized that because the meters generate such a huge amount of data, it is important to gain something from that data. To do that, they need to analyze it. They realized that big data analytics can solve a lot of problems and give them better business insight. Let's move forward and see what type of analysis they do on that data.

How Smart Meter Big Data Is Analyzed

Before analyzing the data, all they knew was that energy utilization and billing were only increasing. After analyzing the big data, they learned that users require more energy during peak load times and less energy during off-peak times. What advantage can they get from this analysis? One thing we can think of right away is that they can tell industries to run their machinery only during off-peak times, so that the load is better balanced. In other words, time-of-use pricing encourages cost-conscious users, such as industries with heavy machinery, to shift their usage to off-peak times. It saves them money as well, because off-peak prices are lower than peak prices. And this is just one analysis.
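Here is a small sketch of why time-of-use pricing rewards off-peak usage. The rates and the peak window below are assumptions chosen only for illustration; they do not come from IBM or from any real tariff.

```python
# Hypothetical rates per kWh and a hypothetical peak window.
PEAK_RATE = 0.20
OFF_PEAK_RATE = 0.10
PEAK_HOURS = range(8, 20)  # 08:00 up to, but not including, 20:00


def energy_cost(usage_by_hour):
    """Cost of a usage profile mapping hour of day (0-23) to kWh used."""
    total = 0.0
    for hour, kwh in usage_by_hour.items():
        rate = PEAK_RATE if hour in PEAK_HOURS else OFF_PEAK_RATE
        total += kwh * rate
    return total


# The same 100 kWh machine run, scheduled at two different times.
daytime_run = {10: 50.0, 11: 50.0}
night_run = {22: 50.0, 23: 50.0}

print(energy_cost(daytime_run), energy_cost(night_run))
```

Under these assumed rates, moving the same job from the daytime to the night halves its cost, which is exactly the incentive described above.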

IBM Smart Meter Solution

Here we first dump all the data we receive into a data warehouse. After that, it is very important to make sure that our user data is secure. Next we need to clean the data because, as we discussed earlier, there might be many fields that we don't require. We need to make sure our dataset contains only useful material, and then we perform the analysis.
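The cleaning step can be sketched as follows. The field names (`meter_id`, `timestamp`, `kwh`) and the sample records are hypothetical; an actual smart meter feed will have its own schema.

```python
# Hypothetical set of fields the analysis actually needs.
REQUIRED_FIELDS = ("meter_id", "timestamp", "kwh")


def clean_reads(raw_reads):
    """Drop incomplete records and keep only the required fields."""
    cleaned = []
    for read in raw_reads:
        # Skip records where a required field is absent or empty.
        if any(read.get(field) is None for field in REQUIRED_FIELDS):
            continue
        cleaned.append({field: read[field] for field in REQUIRED_FIELDS})
    return cleaned


raw = [
    {"meter_id": "M1", "timestamp": "2019-06-30T14:00:00", "kwh": 0.4,
     "firmware": "1.2", "signal": -70},
    {"meter_id": "M2", "timestamp": "2019-06-30T14:00:00", "kwh": None},
]

print(clean_reads(raw))
```

Only the first record survives, and it keeps just the three fields we said we need.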
To use the suite that IBM offers efficiently, we have to take care of a few things.
  • Manage the smart meter data. A lot of data comes in from millions of smart meters, so we must be able to manage that large volume and also retain it, because we might need it later for regulatory requirements.
  • Monitor the distribution grid so that we can improve and optimize overall grid reliability and identify abnormal conditions that are causing problems.
  • Optimize unit commitment. By optimizing unit commitment, companies can satisfy their customers even more: they can reduce power outages so that customers do not get angry, and they can identify more problems and then reduce them.
  • Optimize energy trading, which means we can advise customers when they should use their appliances in order to keep the power load balanced.
  • Forecast and schedule loads. Companies must be able to predict when they can profitably sell excess power and when they need to hedge the supply.

ONCOR Using IBM Smart Meter Solution

Now let's discuss how ONCOR has made use of the IBM solution. ONCOR is an electric delivery company. It is the largest electrical distribution and transmission company in Texas and one of the six largest in the United States. It has more than three million customers, and its service area covers almost 117 thousand square miles. ONCOR began its advanced meter program in 2008 and has deployed almost 3.25 million meters serving customers in North and South Texas. While implementing the solution, they kept three things in mind.
  • It should be "instrumented". The solution uses smart electricity meters to accurately measure each household's electricity usage every 15 minutes, since, as we already discussed, smart meters send out data every 15 minutes. This provides the data inputs that are essential for consumer insights.
  • It should be "interconnected". Customers now have detailed information about the electricity they are consuming, and the company gets an enterprise-wide view of all its meter assets. It also helps them improve service delivery.
  • It should make your customer "intelligent". The company already monitors how each household or customer consumes power, so it can advise customers, for example, to wash their clothes at night. Customers use a lot of appliances during the daytime, so they could spread the usage out and run some appliances at off-peak hours to save money. This benefits both the customers and the company.
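As a rough sketch of how 15-minute reads could turn into advice for a customer, the snippet below adds the reads up per hour and picks out the busiest hour. The sample readings and the function names are invented for illustration; this is not ONCOR's or IBM's actual method.

```python
from collections import defaultdict
from datetime import datetime


def hourly_usage(reads):
    """Sum 15-minute reads, given as (ISO timestamp, kWh), per hour of day."""
    totals = defaultdict(float)
    for stamp, kwh in reads:
        totals[datetime.fromisoformat(stamp).hour] += kwh
    return dict(totals)


def peak_hour(reads):
    """Hour of day with the highest total consumption."""
    usage = hourly_usage(reads)
    return max(usage, key=usage.get)


# Hypothetical reads for one household.
reads = [
    ("2019-06-30T14:00:00", 0.5),
    ("2019-06-30T14:15:00", 0.5),
    ("2019-06-30T14:30:00", 0.5),
    ("2019-06-30T14:45:00", 0.5),
    ("2019-06-30T22:00:00", 0.2),
    ("2019-06-30T22:15:00", 0.2),
]

print(peak_hour(reads))
```

Once the busiest hour is known, the company can suggest moving flexible chores, such as laundry, away from it.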

Saturday, June 15, 2019

Introduction to Data Science: What is Big Data?

What Is Big Data

First, we will discuss how big data evolved, step by step.

Evolution of Data

Let us see how data evolved and how big data came about.
Nowadays data is created day by day from many different sources: the evolution of technology, the IoT (Internet of Things), social media like Facebook, Instagram, Twitter, and YouTube, and many others.

1. Evolution of Technology

Let us see how technology has evolved. As the image below shows, in the early days we had landline phones, but now we have smartphones running Android, iOS, and HongMeng OS (Huawei) that make our lives smarter as well as our phones.
Apart from that, we used bulky desktops to process megabytes of data, and we stored it on floppy disks; you will remember how little data a floppy could hold. After that, hard disks were introduced, which can store terabytes of data. Now, thanks to modern technology, we can store data in the cloud as well.
Similarly, self-driving cars have now arrived. You must be wondering why we are telling you all this: with every enhancement in technology, we generate a lot more data. Take your phone as an example. Have you ever noticed how much data your fancy smartphone generates with every action? Even one video sent through WhatsApp or any other messenger app generates data. This is just one example; you have no idea how much data you generate through everything you do. This data is not in a format that relational databases can handle, and on top of that, the volume of data has increased exponentially.
Coming back to self-driving cars: such a car has sensors that record every minor detail, like the size of an obstacle, the distance to it, and much more, and then it decides how to respond. You can imagine how much data is generated for each kilometer driven in that car. Let's move on to the next driver of data growth.
Evolution of Technology. 

2. IoT

You must have heard about IoT. Recall the self-driving car from the previous section: it is an example of IoT. Let me explain what exactly it is. IoT connects physical devices to the internet and makes them smarter. Nowadays we see smart ACs, smart TVs, and so on. Take the smart air conditioner as an example: this device monitors your body temperature and the outside temperature, and decides accordingly what the room temperature should be.
To do this, it first accumulates data, both from the internet and through sensors that monitor your body temperature and surroundings. It fetches data from various sources, some of which you might not even know about, and decides the room temperature accordingly. So we can see that IoT generates a huge amount of data. As the image below shows, there are already a lot of IoT devices, and by 2020 there are expected to be 50 billion of them. We will not discuss here how IoT will produce such a huge number of smart devices. Let us move forward and discuss another factor that generates big data.

3. Social Media

Social media is one of the most important factors in the evolution of big data. Nowadays everyone uses Facebook, Instagram, YouTube, Twitter, and many other social media websites, and these websites hold a great deal of data. For example, they have our personal details such as name and age, and apart from that, every picture we like, react to, or comment on also generates data. Even liking Facebook pages generates data. A lot of people also share videos on Facebook, which generates a huge amount of data. The most challenging part is that this data is not structured, and at the same time it is huge in size. Data is not only generated in huge amounts but also in different formats. For example, videos are in an unstructured format, and the same goes for images. So there are millions of ways in which data is generated nowadays, and all of them contribute to big data.

4. Other Factors

All of us visit websites like Amazon and Flipkart. Suppose we want to buy a T-shirt or jeans: we search through a lot of them, and our search history is stored somewhere. Once we buy something, our purchase history is stored as well, along with our personal details. There are numerous ways in which we generate data without even knowing it. Amazon did not exist earlier, so at that time there was no way such a huge amount of data could be generated. Similarly, data is growing in other sectors as well, such as banking and finance, media and entertainment, healthcare, and transportation.
So now the main question is what exactly big data is, and how we decide that some data counts as big data.
Other Factors

What is Big Data 

Now look at a proper definition: big data "is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database system tools or traditional database applications".
What do we understand from this? Can our traditional, older systems process this data?
No, there is too much data to process. When traditional systems were invented, we never anticipated that we would have to deal with such an enormous amount of data.
So how do we decide whether some data should be classified as big data? For that we have the 5 V's of big data.
Big Data

5 V's of Big Data

Some people write that there are only 3 V's, but here we will discuss 5 V's. Look at the discussion below to understand how data becomes big data because of these five characteristics.

1. Volume

The first V of big data is volume: the amount of data is tremendously large. As the diagram shows, the volume of data is increasing exponentially. We were dealing with 4.4 zettabytes of data in 2017, and this is expected to increase to 44 zettabytes in 2020, which is equal to 44 trillion gigabytes. That is a really huge amount of data.
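The conversion from zettabytes to gigabytes is easy to check, assuming decimal (SI) units where one gigabyte is 10^9 bytes and one zettabyte is 10^21 bytes. The function name is only illustrative.

```python
BYTES_PER_GB = 10**9   # SI gigabyte
BYTES_PER_ZB = 10**21  # SI zettabyte


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes (SI units)."""
    return zettabytes * (BYTES_PER_ZB // BYTES_PER_GB)


print(zettabytes_to_gigabytes(44))
```

One zettabyte is a trillion gigabytes, so 44 zettabytes is indeed 44 trillion gigabytes.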

2. Variety

All this humongous data comes from multiple sources, which brings us to the second V, variety. We deal with many different kinds of files all at once: MP3 files, videos, JSON, CSV, TSV, and many more. This data is structured, unstructured, and semi-structured all together. Let us explain using the diagram below. We have audio files, video, PNG, JSON, log files, emails, and various other formats of data. This data is classified into three forms.

I. Structured Format

In the structured format, our data has a proper schema: we know what columns will be there. Structured data is therefore in tabular form.

II. Semi-Structured Format

The second is the semi-structured format. As the diagram shows, this covers JSON, XML, CSV, TSV, and email, where the schema is not properly defined.

III. Unstructured Format

In the unstructured format we have log files, audio files, video files, and all types of image files.
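The grouping above can be written down as a simple lookup by file extension. This sketch follows the classification used in this article; the specific extensions (for example `.eml` for email) are assumptions added for illustration, and structured data is left out because it normally lives in database tables rather than in files of a particular type.

```python
from pathlib import Path

# Extension lists are illustrative and follow this article's grouping.
SEMI_STRUCTURED = {".json", ".xml", ".csv", ".tsv", ".eml"}
UNSTRUCTURED = {".log", ".mp3", ".mp4", ".png", ".jpg"}


def classify(filename):
    """Return the data format category for a file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in SEMI_STRUCTURED:
        return "semi-structured"
    if suffix in UNSTRUCTURED:
        return "unstructured"
    return "unknown"


for name in ["orders.json", "song.MP3", "server.log", "notes.txt"]:
    print(name, "->", classify(name))
```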

3. Velocity

It is also the speed at which this variety of data accumulates that brings us to our third V, velocity. Let us explain using the diagram. Earlier we used mainframe systems: huge computers, but with less data, because fewer people were working with computers at that time. Then computers evolved to the client-server model, and then came web applications and the internet boom. Day by day, the number of web applications on the internet grew, and now everyone uses these applications from their computers as well as from their mobile devices. More users, more appliances, more apps, and more mobile devices mean a lot more data.
When we talk about people generating data, the first thing that comes to mind is social media. Just think how much data Instagram alone generates from your posts and stories.
Now consider all the social media applications together. As the diagram below shows, every 60 seconds there are about 100 thousand tweets on Twitter, 695,000 status updates on Facebook, 11 million Instagram messages, 698,445 Google searches, and 168 million emails sent, which adds up to almost 1,820 terabytes of data generated. The number of mobile users is also increasing every minute: 217 new mobile users are added each minute. That is a lot of data to process and arrange in a proper manner, and so it becomes big data.
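To get a feel for velocity, we can scale some of the per-minute figures quoted above to a full day. This is plain arithmetic on the numbers from the diagram, and the dictionary keys are just labels chosen for this sketch.

```python
MINUTES_PER_DAY = 60 * 24

# Per-minute figures as quoted in this article.
per_minute = {
    "facebook_status_updates": 695_000,
    "google_searches": 698_445,
    "emails_sent": 168_000_000,
    "terabytes_generated": 1_820,
}


def per_day(count_per_minute):
    """Scale a per-minute count to a per-day count."""
    return count_per_minute * MINUTES_PER_DAY


for name, count in per_minute.items():
    print(f"{name}: {per_day(count):,} per day")
```

At that pace, 1,820 terabytes a minute becomes more than 2.6 million terabytes a day.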


4. Value

Now the bigger problem is extracting useful data, which brings us to the next V, value. First we need to mine the useful content from our data, that is, make sure our dataset contains useful fields. Then we clean the data and perform analytics on it. After analysis, the dataset has value: it yields insights that help the business grow and that were not possible earlier. Whatever data has been generated should make sense, help us grow our business, and have some value.

5. Veracity

Getting value from the data is a big challenge, and that brings us to the next V, veracity.
Big data has a lot of uncertainty and inconsistency. When we dump such a huge amount of data, some data packets are bound to be lost in processing. So we need to fill in the missing data, start mining again, process it, and then come up with the best insights possible. In the diagram below, some of the data is missing, some values are very small, and some values are very large.
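One simple way to handle these problems is sketched below: missing values are filled with the median of the known values, and values that are far smaller or larger than the median are flagged for review. Filling with the median and the threshold factor of 3 are choices made for this example, not the only options, and the sketch assumes positive readings.

```python
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def flag_outliers(values, factor=3.0):
    """Flag values more than `factor` times above or below the median."""
    mid = median(values)
    return [v > mid * factor or v < mid / factor for v in values]


# Hypothetical readings: one missing, one very small, one very large.
readings = [10, None, 12, 11, 0.5, 90]

filled = fill_missing(readings)
print(filled)
print(flag_outliers(filled))
```

The missing reading is replaced with 11, and the readings 0.5 and 90 are flagged as suspicious.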

Big data brings a lot of problems and a lot of opportunities, which we will discuss in the next article.