A columnar database management system (CDBMS) is a type of database management system (DBMS) that writes and reads table data by column instead of by row. Conventional databases store data row by row. A typical table has far more rows than columns.
The goal of a columnar database management system is to write and read data to and from a hard disk more efficiently, ultimately reducing query time. It achieves this by reading only the columns a query needs, rather than scanning large numbers of complete rows.
Rows vs. columns
In row-oriented databases, data is stored sequentially in rows. These databases are best suited for online transaction processing (OLTP) systems that are primarily used to query user-specific values. For example, a customer’s name, address and email would be stored as fields within the same row, so a single read returns the whole customer record.
In a columnar database, customer information would be split by column: all customer names would be stored together in one column, and all customer emails in another. Because an analytical query usually touches only a few columns across many rows, this layout makes columnar databases well suited to online analytical processing (OLAP) systems.
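The difference between the two layouts can be sketched in a few lines of Python. This is only an in-memory illustration, not how any particular database engine stores data; the customer records and the `to_columns` helper are hypothetical names invented for the example.

```python
# Hypothetical customer records, used only for illustration.
rows = [
    {"name": "Ana", "address": "1 Main St", "email": "ana@example.com"},
    {"name": "Ben", "address": "2 Oak Ave", "email": "ben@example.com"},
    {"name": "Cy", "address": "3 Elm Rd", "email": "cy@example.com"},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict mapping field name to a list of values."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: one lookup returns every field for a single customer.
first_customer = rows[0]

# Column layout: one lookup returns a single field for every customer.
all_emails = columns["email"]
```

In the row layout, everything about one customer sits together, which suits the lookup-by-customer pattern of OLTP. In the column layout, all values of one field sit together, which suits queries that scan one field across many customers.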
Use and benefits of CDBMS
Columnar databases are often used in data warehousing and big data processing, where businesses draw on large amounts of data from various sources for business intelligence (BI) purposes. Some consider columnar database management systems to be the future of business intelligence.
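Organizations need to be able to quickly and easily query relevant information to inform data-driven business decisions. Storing data in columns makes it easier to skip irrelevant data, because a query reads only the columns it references. This is particularly beneficial for aggregation queries, which typically compute a result such as a sum or an average over one or a few columns.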
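As a rough illustration of why aggregation benefits, the sketch below sums one field under each layout and counts how many stored values are touched along the way. It is a simplified model that assumes reading a row means reading all of its fields; real systems read data in disk pages or blocks, so the numbers only indicate the trend. The orders table and both function names are hypothetical.

```python
def sum_from_rows(rows, field):
    """Row layout: every field of every row is read to pick out one field."""
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)  # the whole row is read
        total += row[field]
    return total, values_read


def sum_from_columns(columns, field):
    """Column layout: only the requested column is read."""
    column = columns[field]
    return sum(column), len(column)


# The same hypothetical orders table in both layouts.
orders_rows = [
    {"order_id": 1, "customer_id": 10, "quantity": 2, "amount": 40},
    {"order_id": 2, "customer_id": 11, "quantity": 1, "amount": 15},
    {"order_id": 3, "customer_id": 10, "quantity": 5, "amount": 95},
]
orders_columns = {
    "order_id": [1, 2, 3],
    "customer_id": [10, 11, 10],
    "quantity": [2, 1, 5],
    "amount": [40, 15, 95],
}

row_total, row_reads = sum_from_rows(orders_rows, "amount")
col_total, col_reads = sum_from_columns(orders_columns, "amount")

print(row_total, row_reads)  # 150 12
print(col_total, col_reads)  # 150 3
```

Both layouts return the same total, but the column layout touches 3 values instead of 12. The gap widens as the table gains more columns that the query does not reference.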
The largest benefit of using a CDBMS is improved query time. Values within a column share a data type and often repeat, so the data compresses well and fewer bytes have to be read for each query. Columnar database systems are also self-indexing, so they use less disk space than other databases.
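One simple technique that shows why columns compress well is run-length encoding, which stores each run of repeated values as a single (value, count) pair. The sketch below is a minimal illustration and is not presented as the scheme any specific CDBMS uses; the `country` column and the function names are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# A hypothetical column in which equal values sit next to each other.
country = ["DE", "DE", "DE", "FR", "FR", "US", "US", "US", "US"]

encoded = rle_encode(country)
print(encoded)  # [('DE', 3), ('FR', 2), ('US', 4)]
```

Nine stored values shrink to three pairs, and `rle_decode(encoded)` restores the original column. The same values interleaved with names and addresses in a row layout would rarely form runs this long, which is one reason column storage tends to compress better.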
The future of CDBMS
The benefits of columnar database management systems over row-oriented systems may soon be overshadowed by newer technologies, specifically in-memory analytics. This technology shifts the focus away from writing and reading data on a hard disk and instead queries data held in random access memory (RAM). This further increases speed, performance and reliability when querying data.