semweb – Semantic Web and Expert Systems – Big Data

Big Data


Big data is the term for collections of records so large and complex that they become difficult to process with traditional database management systems or tools. The challenge lies in the capture, storage, search, publication, transfer, analysis and visualization of these records. Instead of conventional processing systems, big data necessitates the parallel use of dozens to thousands of computers, together with appropriate tools for load balancing.
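The split-process-merge pattern behind such parallel processing can be sketched in a few lines. The following is a minimal, single-machine illustration using Python's multiprocessing module; a real big data cluster distributes the same map and reduce phases across many nodes. The word-count task and all names are illustrative assumptions, not part of the article.

```python
from collections import Counter
from multiprocessing import Pool

def count_words(chunk):
    """Map phase: count word frequencies within one chunk of records."""
    return Counter(word for line in chunk for word in line.split())

def parallel_word_count(lines, workers=2):
    """Split the records, process chunks in parallel, then merge (reduce) the partial results."""
    size = max(1, len(lines) // workers)
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with Pool(workers) as pool:
        partials = pool.map(count_words, chunks)  # one chunk per worker process
    total = Counter()
    for part in partials:  # reduce phase: merge the partial counts
        total.update(part)
    return total

if __name__ == "__main__":
    data = ["big data big", "data systems"]
    print(parallel_word_count(data, workers=2))
```

On a cluster, the same structure applies, but chunks live on different machines and the merge step happens over the network.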

The size and definition of big data change on a yearly basis. In 2012, it ranged from a few terabytes to a few dozen petabytes. Big data thus drives the constant improvement of conventional data management systems and the development of new technologies, for example /bigdata-db/, /bigdata-hd/ or NoSQL /nosql/.

Examples of uses
Worth mentioning are big science /bigs1/, /bigs2/, RFID /rfid/, sensor networks, social networks, big social data analysis /bsda/, internet documents, the indexing of internet content for search, as well as astronomy, meteorology, genomics, biochemistry, biology and other multi-disciplinary research, military intelligence, medical records and other fields.

Big data and semantic web
Big data technologies deliver superior performance for ordinary data operations; they are therefore particularly well suited as a substrate for implementing semantic web operations (LODifying, logical linking, exporting, importing, synchronizing). Choosing a suitable big data substrate can secure success for the next 5 to 10 years. As a rule, a large amount of data (in the gigabyte to terabyte range and beyond) provides a broader basis for algorithms and heuristics, and a lean, scalable substrate increases precision and significance compared to conventional methods /bdsw/.
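To make "LODifying" and "logical linking" concrete, the sketch below maps flat records to RDF-style subject-predicate-object triples and links a local identifier to an external Linked Open Data resource via owl:sameAs. It is a plain-Python illustration under assumed names (the `example.org` namespace, the record layout); a production system would use an RDF library and a triple store instead of Python tuples.

```python
# Illustrative sketch: turning flat records into RDF-style triples ("LODifying").
# The local namespace and record layout are assumptions for this example.
EX = "http://example.org/"                       # hypothetical local namespace
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def lodify(records):
    """Map each flat record to subject-predicate-object triples."""
    triples = []
    for rec in records:
        subject = EX + "person/" + rec["id"]
        triples.append((subject, FOAF_NAME, rec["name"]))
        if "same_as" in rec:                     # logical link to an external LOD resource
            triples.append((subject, OWL_SAME_AS, rec["same_as"]))
    return triples

records = [{"id": "42", "name": "Ada Lovelace",
            "same_as": "http://dbpedia.org/resource/Ada_Lovelace"}]
for triple in lodify(records):
    print(triple)
```

Once records are expressed as triples, the remaining operations from the paragraph above (exporting, importing, synchronizing) reduce to set operations over triple collections, which is exactly the kind of bulk workload big data substrates handle well.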

Big data and legacy systems
In the context of computing, a legacy system refers to an established company application that still runs on mainframes or older operating systems. Legacy systems often constitute the heart of a company: they can only be replaced after careful consideration, or must be replaced once the vendor withdraws support.

As ever-larger volumes of data are created and released, a legacy system can soon reach its limits, become inadequate for data processing and turn into a bottleneck for the company. The company can, however, regain stable data management through careful evaluation of an appropriate replacement system and careful planning of the data migration and the decommissioning of the old system.
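The careful planning mentioned above typically means migrating data in verifiable batches rather than in one large transfer. The following is a minimal sketch of that idea, with in-memory dictionaries standing in for the legacy and replacement systems; all names and the batch size are assumptions for illustration.

```python
# Illustrative sketch of a batched legacy-data migration with per-batch verification.
# 'legacy' and 'target' are in-memory stand-ins for the real systems.
def migrate(legacy, target, batch_size=1000):
    """Copy records batch by batch and verify each batch before moving on."""
    keys = sorted(legacy)
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        for key in batch:
            target[key] = legacy[key]            # copy step
        # verification step: every record of the batch must match the source
        if any(target.get(key) != legacy[key] for key in batch):
            raise RuntimeError(f"verification failed in batch starting at {batch[0]}")
    return len(keys)

legacy_db = {f"rec-{n}": {"value": n} for n in range(2500)}
target_db = {}
migrated = migrate(legacy_db, target_db, batch_size=1000)
print(migrated)  # → 2500
```

Batching keeps each verification step small, so a failure interrupts the migration early instead of being discovered after the legacy system has already been switched off.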