What is Big Data?
“Big data” is high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
- It refers to a massive amount of data that keeps on growing exponentially with time.
- It is so voluminous that it cannot be processed or analyzed using conventional data processing techniques.
- It includes data mining, data storage, data analysis, data sharing, and data visualization.
- The term is an all-encompassing one: it covers the data itself, data frameworks, and the tools and techniques used to process and analyze the data.
What is the history of Big Data?
Although the concept of big data itself is relatively new, the origins of large data sets go back to the 1960s and ’70s when the world of data was just getting started with the first data centres and the development of the relational database.
Around 2005, people began to realize just how much data users generated through Facebook, YouTube, and other online services. Hadoop (an open-source framework created specifically to store and analyze big data sets) was developed that same year. NoSQL also began to gain popularity during this time.
So what are the types of Big Data?
Structured data is the first type of big data. By structured data, we mean data that can be processed, stored, and retrieved in a fixed format. It refers to highly organized information that can be readily stored in and accessed from a database by simple search algorithms. For instance, the employee table in a company database is structured: employee details, job positions, salaries, and so on are all present in an organized manner.
Unstructured data is the second type. It refers to data that lacks any specific form or structure. This makes it difficult and time-consuming to process and analyze. The free-text body of an email is an example of unstructured data.
Semi-structured data is the third type of big data. It combines both of the formats mentioned above, structured and unstructured. To be precise, it refers to data that has not been classified under a particular repository (database) but still contains vital information or tags that separate the individual elements within the data. JSON and XML documents are common examples. That covers the three types of Big Data.
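To make the three types concrete, here is a minimal Python sketch. The employee rows, the JSON record, and the message text are all made-up sample values, and the helper names `salaries_above` and `word_count` are hypothetical, chosen only for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields
# (hypothetical employee table, made-up values).
structured_csv = (
    "id,name,position,salary\n"
    "1,Asha,Engineer,85000\n"
    "2,Ben,Analyst,62000\n"
)
employees = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: no fixed table schema, but keys (tags) label each element.
semi_structured = json.loads(
    '{"user": "asha", "tags": ["hadoop", "nosql"], "profile": {"city": "Pune"}}'
)

# Unstructured: free text with no fields at all; any structure must be inferred.
unstructured = "Hi team, the cluster was slow again yesterday. Can we add two more nodes?"


def salaries_above(rows, threshold):
    """Query structured rows directly by column name."""
    return [row["name"] for row in rows if int(row["salary"]) > threshold]


def word_count(text):
    """Crude processing step for unstructured text: count whitespace-separated tokens."""
    return len(text.split())


print(salaries_above(employees, 70000))      # names of employees above the threshold
print(semi_structured["profile"]["city"])    # navigate by key, no table schema needed
print(word_count(unstructured))              # only coarse statistics come for free
```

Notice that the structured rows can be queried by column name right away, the semi-structured record can be navigated by its keys, and the unstructured text offers nothing to query until some extra processing is applied.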
So what is the use of big data and who uses it?
The amount of data available to companies is growing rapidly. With the increase in the volume, variety, and velocity of data, common analysis techniques are no longer adequate. This is where Big Data comes in. Big Data analytics makes it possible to analyze this huge amount of data and bring out insights that were previously out of reach. All our online activities (the sites we visit, the posts we like, the things we share, the purchases we make, the videos we watch) are recorded, monitored, and analyzed.
This huge amount of data brings a host of advantages, and a host of complexities as well. All industries are trying to leverage the opportunities this data offers, and in the course of figuring out how to use Big Data, many companies have moved miles ahead of their competitors. In practice, the uses of Big Data still lag behind what has been imagined in theory, but we are moving forward.
One use that applies across industries is also the most familiar: companies like Facebook and Google store the data we create every day and earn revenue by filtering it and extracting useful information from it.
So what is the problem with storing Big Data?
As we all know, there is data, lots of it: historical data, sure, but also new data generated from social media apps, clickstream data from web applications, IoT sensor data, and on and on. The amount of data is larger than ever, coming in at ever-increasing rates, and in many different formats.
Two main problems come up when storing Big Data:
- Data Volume: People are more connected than ever before, and this interconnection leads to more and more data sources, resulting in an amount of data that is larger than ever before (and constantly growing). The increased volume of data requires ever-increasing computing power to derive value (meaning) from the data. Traditional computing methods simply don’t work on the volume of data accumulating today!
- Data velocity: The speed at which data comes into the enterprise, and the number of directions it comes from, are increasing due to interconnection and advances in network technology, so data is arriving faster than we can make sense of it. The faster the data comes in and the more varied the sources, the harder it is to derive value (meaning) from it. Traditional computing methods don’t work on data coming in at today’s speeds!
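To get a feel for why volume alone defeats a single machine, here is a back-of-the-envelope calculation in Python. The numbers are assumptions picked for illustration, not measurements: a 1 PB dataset and a disk that sustains 200 MB/s of sequential reads. The function name `scan_time_seconds` is hypothetical.

```python
def scan_time_seconds(dataset_bytes, throughput_bytes_per_s, num_disks=1):
    """Time to read the whole dataset once, assuming perfectly parallel disks."""
    return dataset_bytes / (throughput_bytes_per_s * num_disks)


# Assumed figures for illustration only.
DATASET_BYTES = 10**15           # 1 PB
DISK_THROUGHPUT = 200 * 10**6    # 200 MB/s sequential read per disk

single = scan_time_seconds(DATASET_BYTES, DISK_THROUGHPUT)
parallel = scan_time_seconds(DATASET_BYTES, DISK_THROUGHPUT, num_disks=1000)

print(f"one disk:   {single / 86400:.1f} days")     # about 57.9 days
print(f"1000 disks: {parallel / 60:.1f} minutes")   # about 83.3 minutes
```

Under these assumptions, simply reading the data once takes about two months on one disk but well under two hours when the same data is spread over a thousand disks read in parallel. Real systems never reach perfect parallelism, so treat the second figure as a lower bound.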
So what is the solution for Big Data?
Most companies use a concept called distributed storage to counter this problem.
Nowadays, a wide range of systems and applications, especially in high-performance computing, depend on distributed environments to process and analyze huge amounts of data. As the amount of data grows enormously, providing efficient, scalable, and reliable storage has become one of the major issues in scientific computing. The storage solution used by big data systems is the distributed file system (DFS), which builds a hierarchical, unified view of multiple file servers and shares on the network. The Hadoop Distributed File System (HDFS) is one such DFS for big data systems.
Such a system can also be modeled formally with Event-B, a mature formal method that has been widely used in industrial projects in many domains, such as automotive, transportation, space, business information, and medical devices. Rodin can serve as the modeling tool for Event-B: it integrates modeling and proving, and because the platform is open source, it supports a large number of plug-in tools.
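The core idea behind distributed storage can be sketched in a few lines of Python: split a file into fixed-size blocks, store several copies of each block on different machines, and read the file back as long as every block still has one live copy. This is a toy model, not HDFS code. HDFS defaults to 128 MB blocks and a replication factor of 3 and uses rack-aware placement; the sketch below uses a tiny block size and simple round-robin placement, and the node names and function names are hypothetical.

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    """Cut the file into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes: list, replication: int) -> dict:
    """Round-robin placement: block b is stored on `replication` distinct nodes.

    Simplification: real HDFS placement is rack-aware, not round-robin.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(num_blocks)
    }


def read_file(blocks: list, placement: dict, failed_nodes: set) -> bytes:
    """Reassemble the file, requiring at least one live replica per block."""
    parts = []
    for index, block in enumerate(blocks):
        live = [node for node in placement[index] if node not in failed_nodes]
        if not live:
            raise IOError(f"block {index} has no live replica")
        parts.append(block)
    return b"".join(parts)


# Toy values: 10-byte "file", 4-byte blocks, 4 hypothetical nodes, 3 replicas.
data = b"0123456789"
nodes = ["node0", "node1", "node2", "node3"]
blocks = split_into_blocks(data, block_size=4)
placement = place_blocks(len(blocks), nodes, replication=3)

# Two of the four nodes fail, yet every block still has a surviving copy.
recovered = read_file(blocks, placement, failed_nodes={"node1", "node2"})
print(recovered == data)
```

Spreading blocks across machines addresses volume, because no single disk has to hold the whole file, and replication is what lets the system keep serving data when individual machines fail.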
Thanks for reading this article! Leave a comment below if you have any questions.