6 Most Terrible Big Data Practices To Avoid


2. Using RDBMS schema as files

An RDBMS (relational database management system) stores data in the form of related tables. Relational databases are best at handling structured data and offer high, though not continuous, availability, but they are poor at ingesting, distributing and synchronizing data that is widely dispersed geographically. They also make simple read access more complicated than it needs to be.

3. Creating data ponds

A data lake is a large object-based storage repository used to collect new data sources like weblogs, messaging, media, social and sensor data and holds the data in its native format until it is needed.

But hold on. When each team slices and dices the data into its own mini repository, the same question can get different answers. The idea of creating mini repositories therefore fails, because it ends in conflicting views of the same data.
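To make the divergence concrete, here is a minimal sketch in plain Python. The order records, the two team names and the filtering rule are all hypothetical, invented only for illustration. It shows two copies of the same raw data answering the question "what is total revenue?" differently.

```python
# Hypothetical order events as they might land in the shared raw store.
raw_orders = [
    {"id": 1, "amount": 100.0, "status": "completed"},
    {"id": 2, "amount": 40.0, "status": "refunded"},
    {"id": 3, "amount": 60.0, "status": "completed"},
]


def total_revenue(orders):
    """Sum the amount field over whatever orders a team has kept."""
    return sum(order["amount"] for order in orders)


# Each team keeps its own copy (a "pond") with its own filtering rule.
marketing_pond = list(raw_orders)  # keeps every order
finance_pond = [o for o in raw_orders if o["status"] != "refunded"]  # drops refunds

marketing_answer = total_revenue(marketing_pond)
finance_answer = total_revenue(finance_pond)

# Same question, two different answers.
print(marketing_answer, finance_answer)
```

Neither number is wrong by its own rule. The trouble is that the rule lives inside each copy, so nobody can tell which answer to trust without tracing how each pond was built.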

4. Treating HBase as RDBMS

HBase and other column-oriented databases are often compared to RDBMSs, but they differ noticeably: HBase is a column-oriented data storage system, whereas an RDBMS is typically a row-oriented database.

The only real commonality between HBase and your RDBMS is that both have something resembling a table. If you try to reproduce your whole RDBMS schema in HBase, you will surely end up in a mess.
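A rough way to see why a relational schema does not carry over is to model an HBase-style table as nested maps: row key, then column family, then column qualifier, then value. The sketch below uses plain Python dictionaries only. It is not the HBase client API, and the table name, row-key format and helper functions are assumptions made up for illustration.

```python
from collections import defaultdict


def make_table():
    """Toy table: row key -> column family -> qualifier -> value."""
    return defaultdict(lambda: defaultdict(dict))


def put(table, row_key, family, qualifier, value):
    """Write one cell; rows need not share the same set of qualifiers."""
    table[row_key][family][qualifier] = value


def scan_prefix(table, prefix):
    """Return (row key, row) pairs whose key starts with prefix, in key order."""
    return [(key, table[key]) for key in sorted(table) if key.startswith(prefix)]


orders = make_table()

# Hypothetical row key: customer id + order id, so that one customer's
# orders sort next to each other and can be read with a single prefix scan.
put(orders, "cust42#order001", "info", "total", "19.99")
put(orders, "cust42#order002", "info", "total", "5.00")
put(orders, "cust42#order002", "info", "coupon", "SPRING")  # only this row has it
put(orders, "cust77#order001", "info", "total", "12.50")

cust42_orders = scan_prefix(orders, "cust42#")
print([key for key, _ in cust42_orders])
```

In this model there are no joins and no foreign keys. The access pattern (here, "all orders for one customer") is baked into the row key, which is why copying a normalized multi-table schema one-to-one tends to go badly.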
