How Data Marts Lead to Better Business (2024)

A significant hurdle for many companies is effective and useful data management. It can bedifficult to comb through vast amounts of data to find meaningful information, it’softensiloed, queries can be time consuming and it’s commonly outdated or inaccurate. A datawarehouse can easily contain hundreds of terabytes, growing day by day. And only a fractionof the data stored is relevant to any given business division, department or product line.Enter the data mart.

In the data mart model, marketing, sales, finance and each operational business unit hasaccess to the segment of the data warehouse and the information that is most relevant tothem. Instead of live queries run against mammoth tables, queries run through a data martare tight and targeted — and fast. Just as important, data marts provide frameworks todeliver actionable reports more rapidly.

The result: Groups within a business can make more efficient use of the data that helps themrun.

What Is a Data Mart?

At its core, a data mart is a subject-oriented subset of a data warehouse. While stillmaintaining the value of a data warehouse, a data mart enables a company to serve businessdivisions and product lines with access to the data that is relevant to their individualoperations.

However, a data mart is not simply access to raw data. Instead, data marts help turn raw datainto actionable information quickly to help business units succeed. They accomplish this byproviding pre-built summaries and queries specifically designed for the internal businessleaders in the department or business unit to which the data in the mart applies.

Key Takeaways

  • A data mart is a focused subset of a data warehouse designed to present actionableinformation quickly to a specific department, business unit or product line.
  • Data marts blend data from a variety of sources — owned and licensed — toanswerspecific business questions.
  • Performance is critical with data marts. Slow queries not only impact user experiencebut also cost more resources.

Data Marts Defined

Data marts are like little “storefronts of data” that align with a departmentwithin a largerbusiness. The human resources department might have a data mart that draws from thecompany’s data warehouse, but because it includes only information about employees—salaries, locations, performance, chosen health and retirement benefits — ad hocqueries,reports and analytics all happen much faster than if run against the full data warehouse. Afinance department data mart might contain all financial information for the entire firm;sales might have a data mart that enables lightning-fast pipeline analysis, showing whichproducts are selling best and which are selling worst.

Data marts maintain the integrity and value of an overall data warehouse strategy whileexposing business divisions and product lines to only the data relevant to their operations.And as the cost of database technologies continues to decline, data-driven business insightsfrom data warehouses and data marts are becoming more accessible to small and midsizecompanies.

Reasons to Use a Data Mart

Any company that is managing a large enough set of data that they use the term “datawarehouse” will likely benefit from deploying a data mart strategy. Here are three keyreasons many companies choose to use data marts:

  • Regional offices can be more responsive to their local market by having access to datarelevant to local customers. Yet, data they enter into records — sales callsummariesand marketing data — is fed into the central data warehouse.
  • Specific departments can have rapid access to critical information with a data mart.Instead of relying on queries against all data, their data mart can combine live andscheduled queries against only the relevant subset of data.
  • Ad hoc queries will generate different results depending on the parameters used. Thiscan lead to confusion for critical parts of the business. Data marts can make sure theentire division is looking at the same set of data based on the same parameters.

Data Mart Use Cases

Since data marts are so efficient at providing department-level data, they can be useful formany different departments around an organization.

A shipping department, for example, can use a data mart and a dashboard to keep track of thetime it takes for an order to go from placement to delivery. In this case, the data martinteracts with internal sales data (order placed) and combines it with carrier data (orderdelivered) to show how long the process takes. Similarly, the shipping department’sdatamart would delve into process data collected by the department to help pinpoint ways to makeshipping more efficient.

Sales might be the most data-driven part of any organization. A sales data mart could combineweek-over-week, month-over-month and year-over-year data for an easy and efficient view,say, of the performance of call centers vs. an internal sales team. A sales data mart canalso tap into results from the shipping department data mart to show order deliveries andreturns.

Meanwhile, the marketing department data mart can track the flow of inbound leads by channeland product, help manage a nurturing program and report on how many leads ultimatelyconverted to customers.

Who Uses Data Marts? And How?

Any company that uses key performance indicators (KPIs) tomeasure success or that has already collected enough data to be using a data warehouse willlikely find that a data mart strategy can help them be more competitive. Some of thecharacteristics of the types of companies that would benefit from a data mart include thefollowing.

  • Companies with long sales cycles can use a data mart to help different departments alignon the needs of the sales process.
  • Companies selling complex, bespoke solutions can use data marts to manage complexproposals so when the business is won they can more quickly move into implementation.
  • Companies with many years of data can use a data mart to help departments focus oncurrent data while still maintaining the historical information.
  • Companies with tight profit margins can use a data mart to reveal efficiencies that willlead to more profit, while also analyzing current processes to preserve existing profit.
  • Companies with large product sets will find that data marts enable them to manage allthose products better by focusing each product group on its individual mission.

3 Types of Data Marts

There are fundamentally three types of data marts: dependent, independent and a combinationof the two called hybrid. What they have in common is that all three present asubject-specific set of data to the business teams that will benefit most from thatparticular dataset.

Data warehouse pioneers Ralph Kimball and Bill Immon defined the first two. The Immondependent model advocates that data should flow into a central data warehouse and then besegmented into individual data marts. In contrast, Kimball proposed the independent model,in which data flows directly into individual data marts and then is aggregated into a datawarehouse. With hybrid data marts, organizations take the best of each model and apply it todifferent departments according to their requirements.

Dependent data mart:

A Dependent data mart is built on top of a central data warehouse.Practically speaking, the data warehouse controls all the data. All data sources, includinglicensed third-party data, is loaded first into the central data warehouse and then thesubset of the data that is needed is propagated out to the data mart.

An advantage of this model is that the bulk of data administrationhappens in the central repository, requiring less technical know-how at the data mart level.Also, critical issues like technology, data and storage (including backups) are handledcentrally. Conversely, if the data warehouse goes offline for any reason — planned orunplanned — dependent data marts can go down with them.

Independent data mart:

Independent data marts do not rely on a central data warehouse. In fact,this model proposes that a central data warehouse does not necessarily even need to exist.Instead, independent data marts can be set up to act as standalone entities — minidatawarehouses focused on the needs of specific divisions. An advantage to this model is thatindividual business units can run the data mart that suits them best.

Of course, with this independence comes the need for technicaladministrative expertise at each data mart. Plus, if data will need to be aggregated acrossdata marts — for executive-level reporting, for instance — you will need toconstructqueries that access multiple data marts. Therefore, with independent data marts it becomesmore important to institute an organizational taxonomy that provides standard naming fortables and fields — or a thesaurus that maps tables and fields among all thecompany’s datamarts — to make cross-data-mart reports easier to generate.

Hybrid data mart:

Some organizations find it practical to consider a hybrid model where somedata marts are dependent on a central warehouse and some exist on their own. For example, itmight be more efficient to use this model as a transitionary step for new data marts. Newsubject-specific data sources might be easier to deploy as independent data marts. Once theyhave proven their value, they can be deployed through to the central data warehouse, ifneeded. Or the hybrid model might be a good path to integrate acquisitions. The acquiringcompany could treat the subsidiary’s data as an independent data mart while planningto makeit conform to the business’s own data policies later, and then integrate it.

A significant consideration is the human resources required to run thehybrid model. It requires technical administrative expertise at both the central datawarehouse and at the data mart level.

Which type of data mart a business chooses depends on a lot of factors, including how thecompany is organized. These four questions can help an organization determine which typesuits it best:

  • Are business units managed as standalone entities or are they considered part of awhole?
  • Does company culture encourage information sharing or do individual business units hoardknowledge?
  • Are individual business units staffed to handle the intricacies of data management?
  • Similarly, is the central database team committed to respond rapidly to the needs ofdiscreet business units or do bureaucracy and egos interfere?

Data Mart vs. Data Warehouse

While a data warehouse is a repository for all the data that helps a business run, a datamart is a condensed subset of business data designed for a specific purpose, business unitor department. Data marts draw on fewer, more specialized data sources. A data mart strategymight not need to include a data warehouse. Instead, the data warehouse might be theaggregate of all your data marts.

Data Mart vs. Data Lake

Typically, the raw data in data lakes has a lot less structure and has yet to be cleaned andnormalized. Data marts, on the other hand, are the result of highly structured, cleaned andnormalized data. More importantly, though, is that data marts are designed to providespecific solutions to individual groups while data lakes are meant for more open-endedanalyses — even unanticipated ones.

Data Mart vs. Database

A database is a foundational element of an organization’s data management technology.Itstores data the organization owns and often third-party licensed data in a way that can beretrieved via queries. Structured query language (SQL) is the most prevalent way that datais output from a database. A database might feed multiple data marts and, depending on thesize of your data set and your data strategy, a data mart might draw from more than onedatabase.

Comparing 4 Data Management Elements

Data MartData LakeData WarehouseDatabase
Query-based reportsYNYY
Cleaned and normalized dataYNYY
Business-facingYNNN
Answers specific questionsYNNN
Data ScopeTargetedBroadBroadBroad
SearchableMaybeMaybeYY

Structure of a Data Mart

There are three schema-level and interrelated data architectures for data marts: star,snowflake and denormalized tables.

Star:

The star structure is the most straightforward of the three, and thusreduces the complexity of deploying data marts. In the star structure, business-level datais broken out into tables of facts (for example, sales data). These tables interact withrelevant dimensions. For example, a sales facts table may relate directly to a dimensiontable that lists products.

To visualize the schema and understand where the star label comes from,look at a representation of a sales facts table with at least four attributes: date,location, product and quantity. A “location” dimension lists all availablestores andconnects to the facts table via a store identifier. Similarly, a dimension of availableproducts provides a centralized and official list of all products. These are connected tothe sales facts table via a product identifier. As these dimensions blossom out, you startto see a star pattern where a central table interacts with a single one-dimensional layer ofrelated tables.

Snowflake:

Alternatively, and perhaps more realistically, consider the case where astar-structured data mart contains dimension tables that are themselves subject to furtherdimensions. Using the “location” example, it seems reasonable that a locationtable might besubject to a geography dimension that connects stores to regions via a geography identifier.If this is the way the data warehouse is structured and the data mart’s requirement istoprovide a report showing store location and region information, then you’re using asnowflake model.

Denormalized tables:

The star and snowflake structures require report queries to go throughsometimes extensive “joins” to connect data in multiple tables or even multipledatabases.Depending on how much data is involved, these joins will reduce the responsiveness of thereports. For reports based on scheduled queries, this isn’t a user-facing problem butthelength of time it takes for live queries may frustrate business leaders. It could also be asignificant resource hog.

An alternative is to use denormalized tables, eliminating the joins andthus performing queries more efficiently. A denormalized tables structure brings togetherall the data needed for a data mart report into one table which will produce faster queriesand will likely generate redundant data. While this redundant data makes inserts and updatesmore expensive, the bet with denormalized tables is that the efficiencies of queriesoutweigh those costs.

Advantages of Data Marts

The biggest advantage of data marts is efficiency, both in terms of costs and data access.Data marts cost much less to deploy than a data warehouse and access to data is much fasterbecause data marts refer to smaller datasets. Queries into a central data warehouse can belong and arcane as they negotiate with irrelevant data. A well-constructed data martstrategy can provide business unit and departmental leaders with very fast access to thedata they need.

What’s more, frequently accessed data is not necessarily frequently updated. Somequeriesthat were previously conducted live can be presented as scheduled queries in a data mart.This gives team members access to the information they need while using only a smallfraction of the computing resources previously used. But even live-updated data is deliveredmore efficiently through a data mart simply because it is drawn from a more focused set ofdata.

Another advantage of data marts is they can be independent of each other, so an outage at thecentral data warehouse does not have to effect individual data marts.

And when a data mart includes licensed third-party data, a key advantage is that the licensecost should be lower because the user base for the data is smaller than if it were in a datawarehouse.

Disadvantages of Data Marts

Since a data mart contains only the data needed by a single business group, it does not onits own provide visibility into the broader set of data a business might need. Similarly, inan independent data mart model that excludes a central data warehouse, the business may nothave ready access to cross-data-mart reporting for certain kinds of high-level reports.

Some additional disadvantages:

  • Data marts are not necessarily the right solution for every group and too many datamarts will become difficult to manage.
  • A data mart strategy that automatically propagates data might result in unbudgetedexpenses. Data and technology licensing fees are a large part of the cost for any datastrategy; if a company licenses marketing data for the 10 people in its marketingdepartment, but data is propagated out to other data marts, it might be on the hook foradditional fees.
  • If deploying independent data marts, often-overlooked details like field name syntaxcould suddenly become critically important to align at a high level. Otherwise, youcould run into trouble when building reports to run across data marts.

6 Steps to Implement a Data Mart

In broad strokes, there are six specific phases to deploying a data mart strategy.

  1. Requirement-gathering

    Comprehensive requirement-gathering helps to plan out the design that will result inthe logical, physical and technical characteristics of a strong data martdeployment. Work with the teams who will use the data mart in theorganization’sbusiness units, departments and product lines to determine what specific kinds ofdata they need to access and the terms under which they need access. Use thisinformation to plan out what kinds of data will appear in which data marts.

  2. Designing data mart strategy

    The most important step in creating a data mart strategy is to determine theorganization’s business goals and strategy, which will be manifested in thedatamart design. During this phase, most organizations decide on their data martarchitecture and make other decisions that will have a long-lasting impact on howthe data marts are used. If the organization has a data warehouse, review theexisting data warehouse schema as well as licensed third-party data to determinecriteria that will enable data to flow properly in the new data mart plan. At thesame time, consider how the existing data warehouse technology and architecture cansupport your data mart needs. Some modifications will likely need to be made, bothin the system itself as well as the licensing. When designing data marts forfar-flung regional branches, consider potential service interruptions. Not all partsof the world have excellent connectivity; the need to keep people working mightdictate key elements of data mart design.

  3. Constructing data mart architecture

    This phase is where you make decisions and purchases to create and deploy thephysical and logical structures of the data mart architecture. Determine whatdatabase to use. If the data mart strategy is based on existing data warehousetechnology, plan out and deploy needed modifications. Pay special attention toexisting licenses, users’ data access needs and speeds and service reliabilitytoavoid unplanned expenses down the road. An important consideration at this phase isthe interface that business teams will use to access data — user experience isimportant to make sure business teams don’t struggle to access theinformation.Though it may seem early, this is also a good time to consider future administrativeactivities. Make a specific plan for logging and analyzing user activity and accessstatistics (including load and response times). This will be enormously helpful whenfielding support questions. Backups and redundancy are also critical to build intoyour system.

  4. Populating data mart architecture

    During this phase you execute the data-flow plan between the data warehouse, ifyou’re using one, and individual data marts. If there’s no datawarehouse, thenexecute data flow from the appropriate sources. Questions to consider include: Whereis the data that you need to include (owned as well as licensed)? What are the terms(both legal and technical) in which that data is available? What fields will be usedto join and connect disparately sourced datasets? How will data be cleaned andnormalized? Look at what metadata and indices will help make queries more efficientand help less technical users speak intelligently about that they’re asking.Also,identify what components of your data mart will require live querying versus whichcan be scheduled out.

  5. Accessing the data marts

    This is the phase when the planned subsets of your overall data warehouse can firstbe accessed. Set up specific queries and reports so that they can be accessedthrough the data mart interface. Some reports will be the product of a scheduledtask, others will be based on live queries and some will be ad-hoc. Run a limitedpilot to a defined set of users. This is a good opportunity to make sure users canaccess the information they need and that back-end systems and feeds are workingaccording to plan. Consider how the data mart structure responds to failure, such asif access to the data warehouse fails. Make sure business leaders get informativeerror messages. At this stage it’s important to document everything to helpadministrators know how different data mart components are generated so that futuremodifications can be more efficient. Keep the documentation up to date.

  6. Managing data marts

    As with any technology deployment, managing data marts is an ongoing process, fromuser-access issues to support queries. Monitor performance and usage statistics foreach data mart. Are all the reports being accessed? If data marts are not beingused, it’s important to figure out why. It might be that business teams needtraining or that their needs were not fully defined during requirements gathering.Establish a launch period during which reports and queries can be tuned to make surethey’re giving the department or business unit the data they need. The lengthofthis period depends on the complexity of the deployment but make sure it’s afiniteperiod — and is widely communicated. Periodically review how third-party dataisflowing into the system to make sure the data appears properly in the individualdata marts. Test backup and failover processes to make sure they will work whenneeded. And, of course, monitor security issues such as access coming from outsideyour virtual private network (VPN).

Best Practices for Implementing Data Marts

There are several important best practices to keep in mind when developing a data martstrategy, many of which may also apply to a company’s broader data strategy.

Define the scope. Spend the time needed to define the core scope andpolicies around which data marts need to exist and what elements they should contain. Thisupfront effort cascades through the entire process and helps ensure that any questions thatcome up are answered according to the same set of rules.

Think about scalability. Experience tells us the volume of data entering thesystem will increase over time — sometimes dramatically. It’s important toaccount for thisin the data mart design.

Remember responsiveness. Nothing abuses resources and hurts user experiencelike slow query execution. Architect for speed and set up systems to identify slow queries.Remember, today’s fast queries may be slow by tomorrow’s standards, so ongoingmonitoring isimportant, too.

Data Mart Example

To illustrate how an effective data mart can help a business unit, consider a marketingdepartment tracking campaign performance. Not only do they want to know what sales aregenerated, but also how campaign exposure led prospects to become customers.

The marketing department data mart will draw data from the web analytics platform to measureresponsiveness to campaigns and trace activity on the site. It also taps remarketing and addata to identify potential customers’ previous activity. Then it connects that data totop-level sales data such as revenue generated by campaigns and other campaigns to whichprospects and customers reacted.

Instead of drawing this data from the entire history stored in the data warehouse, thismarketing department needs only to access data from the past two years, enough to enablethem to see trends and adjust for seasonal variables.

Meanwhile, the data mart combines arcane marketing codes with a product identifier table tocreate user-readable labels that make the report easier to understand.

Physical vs. Cloud vs. Virtualized Data Marts

There are many ways to execute the construction of a data mart. Data must move fromindependent sources, a central data warehouse or a data lake into discreet and independentdata marts. These data marts, in turn, might exist in on-premises hardware or could beaccessed through a cloud architecture.

An alternative to actually moving data around is to use virtualized (or virtual) data marts.With virtualized data marts, the data is not physically moved but instead accessed throughvirtual tables that have limited access to the central data warehouse. Virtualized datamarts — which can be cloud-based or housed on-premises — don’t require theoverhead orfailure risk of actually moving data.

As with any cloud deployment, a key consideration when evaluating vendors is to make surethat the data you own is not irretrievably locked into that system. Make sure there is aportability plan in place for if or when you need to change vendors.

Future of Data Marts

Organizations are relying more and more on data and, with that, the need to efficiently useand gain insight from that data is also rising. In fact, company spending on big dataproducts (hardware and software combined) will nearly double between 2020 and 2027. And by2027, global spending on software for managing data will reach 45% of the data budget(opens in newtab), up from 20% in 2014.

An issue that will grow with this is data quality. Companies surveyed in 2020 estimated thatthe average per-company cost of poor data quality approaches $13 million per year. And 27% of respondents(opens in newtab)said that new demand for self-service was thebiggest challenge they face in data management.

Artificial intelligence tools and automated systems will help to mitigate the problem of dataquality. Since data marts access only a subject-specific segment of a company’soverall dataassets, data marts will be a key tool to help companies make effective use of the onslaughtof information.

Conclusion

Data marts are a logical and increasingly important step in companies’ effort to turnrawdata into actionable information that specific parts of a business can use to improveperformance. Since data marts require extensive planning and definition, they tend to reducead-hoc querying and help put an entire department on the same page.

#1 Cloud
ERP Software

Free ProductTour(opens in new tab)

Data Mart FAQs

What is meant by data mart?

When people use the term data mart, they are referring to a focused set of data that isdesigned to answer the data needs of a particular part of a larger organization.

What is data mart with example?

A data mart is a subject-specific subset of a company’s data that focuses on the needsof aspecific part of an organization. For example, a marketing data mart would include portionsof the overall data that relate to marketing such as leads gathered, campaign results andconversions. But the data mart for marketing excludes unrelated data such as shippinginformation, company finances, employee salaries, etc.

Does my data mart strategy require having a central data warehouse?

No, data marts can be constructed as standalone, subject-specific, independent data martsthat do not require a central data warehouse. Keep in mind, though, that there will likelybe a need for executive-level reports that run across data marts.

How is a data mart different than a data warehouse?

The key difference between a data mart and a data warehouse is that data marts run queriesagainst subsets of data from the warehouse. For example, a marketing department needs toaccess only specific data and may only need data from the past two years. A data warehouse,on the other hand, is a central repository for all information and could extend back many,many years. Because the data in a mart is a subset, queries can run faster and moreefficiently.

Should every department have their own data mart?

While data marts provide a simplified view of data from a broader data warehouse, it’simportant to be judicious in deploying them. Organizations have to decide how broadly todeploy data marts, but since there is some overhead with every data mart it’s not gooddatapolicy for every department to have their own data mart.

We already use BI tools to report on data. Do we need data marts, too?

Data marts are efficient because they contain only subsets of data required by differentdepartments. Likewise, BI tools also become more efficient when querying a limited set ofdata. If a central data warehouse is not too big, BI tools accessing the data may besufficient. But data warehouses will grow, so having a strategy in place to segment data toincrease query efficiency is still a good idea.

What is data mart and its types?

A data mart is a subject-specific set of data designed for a particular department. There arethree types of data marts. “Dependent” data marts are populated from a centraldatarepository. “Independent” data marts are standalone entities and might or mightnot beattached to a central data warehouse. “Hybrid” data marts enable an organizationto haveboth dependent and independent data marts.

Why do we need data mart?

A data mart is a good solution for a data-driven organization that has a large central datawarehouse. With data marts, different departments use data and resources more efficientlybecause they only access the data that relates to them.

How Data Marts Lead to Better Business (2024)
Top Articles
Latest Posts
Article information

Author: Sen. Emmett Berge

Last Updated:

Views: 5664

Rating: 5 / 5 (60 voted)

Reviews: 83% of readers found this page helpful

Author information

Name: Sen. Emmett Berge

Birthday: 1993-06-17

Address: 787 Elvis Divide, Port Brice, OH 24507-6802

Phone: +9779049645255

Job: Senior Healthcare Specialist

Hobby: Cycling, Model building, Kitesurfing, Origami, Lapidary, Dance, Basketball

Introduction: My name is Sen. Emmett Berge, I am a funny, vast, charming, courageous, enthusiastic, jolly, famous person who loves writing and wants to share my knowledge and understanding with you.