Data warehousing, as the word “warehouse” suggests, is a method for distributing data from larger systems to smaller, less expensive computer platforms and storing it there.
Companies that plan to downsize or distribute data to departmental LANs will find data warehousing an effective solution. Some basic components must be defined, however, in order to flesh out the warehouse analogy.
The central data center of a corporation is the main collection point for data and is analogous to the manufacturing center of the warehouse scenario. The central data center sends data to the various departmental data warehouses, so that it can be used at a local level.
Just as a manufacturing site does not store too many months’ worth of its products, the central data center stores only data that is needed daily. The warehouses store excess data locally to free the data center for production.
In today’s corporate environment, data is used in making decisions that affect the future health of an organization. Providing that data to management and departmental operations means first overcoming several obstacles. According to Launchscore.com, even small businesses are now experiencing the benefits of data products formerly used only by large corporations.
A large portion of corporate data is used in reports that group, summarize and present information to those who will make decisions based on trends or problems that are apparent through the data. Information systems departments often have a backlog of requests for data from various management organizations throughout the corporation.
Before IS departments can grant users access to corporate data, they must deal with some basic issues. They must be concerned with security of the data, computer resource allocation, the additional cost of making this data available to management, and problems with the data itself, such as integrity and definition of the data.
Problems accessing data
Providing data security is a top priority in most companies’ IS departments. IS managers do not usually grant a user carte blanche access to a system. Rather, security measures restrict access by locking certain users out of fields, files or databases.
Establishing and maintaining security is a never-ending task, so it stands to reason that the fewer people who have access to the system, the easier it is to protect the system’s data.
This is where data warehousing comes into play, moving subsets of data to smaller, local databases. IS staff members can then monitor the extraction of data, so they know that users are accessing only the data they have permission to view.
A data warehouse database also has its own security controls and can restrict users from unauthorized access.
Because the data warehouse is used locally, only a portion of the corporate users will have access to its data, making security easier to enforce.
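The extraction step described above can be sketched in a few lines. This is a minimal illustration, not a production design: it assumes Python's built-in sqlite3 module stands in for both the central system and the departmental warehouse, and the orders table, its columns and the extract_subset function are all hypothetical names chosen for the example.

```python
import sqlite3


def extract_subset(central, local, department):
    """Copy one department's orders into a local warehouse database.

    Only non-sensitive columns are selected, so restricted fields
    never leave the central system.
    """
    rows = central.execute(
        "SELECT order_id, department, amount FROM orders WHERE department = ?",
        (department,),
    ).fetchall()
    local.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, department TEXT, amount REAL)"
    )
    # INSERT OR REPLACE lets the extract be rerun without duplicating rows.
    local.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    local.commit()
    return len(rows)


# Hypothetical central database with one restricted column (credit_card).
central = sqlite3.connect(":memory:")
central.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, department TEXT, "
    "amount REAL, credit_card TEXT)"
)
central.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "sales", 100.0, "4111-0000"),
        (2, "sales", 250.0, "4111-0001"),
        (3, "support", 75.0, "4111-0002"),
    ],
)

local = sqlite3.connect(":memory:")
copied = extract_subset(central, local, "sales")
print(copied)  # number of rows copied to the sales warehouse
```

Because the restricted column is never selected, the local database cannot expose it, whatever permissions are later granted on the warehouse itself.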
The storage aspect of data warehousing solves the problem of archiving historical data. An IS department can become overwhelmed with the amount of data that is generated through daily business transactions, so on-line data is limited to the data needed for daily operations. Other data is archived to a secondary storage medium (usually magnetic tape).
If management needs the archived data, it can be reloaded from the archive onto primary storage (hard disk).
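As a rough sketch of this archive-and-restore cycle, the example below uses a second table as a stand-in for the secondary storage medium. The table names, the txn_date column (stored as ISO "YYYY-MM-DD" text) and both function names are assumptions made for illustration.

```python
import sqlite3


def archive_before(conn, cutoff):
    """Move transactions dated before cutoff (ISO 'YYYY-MM-DD') to the archive."""
    with conn:  # commits both statements together, or rolls both back on error
        conn.execute(
            "INSERT INTO transactions_archive "
            "SELECT * FROM transactions WHERE txn_date < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM transactions WHERE txn_date < ?", (cutoff,)
        ).rowcount
    return moved


def restore_year(conn, year):
    """Copy one archived year back on-line when management asks for it."""
    with conn:
        restored = conn.execute(
            "INSERT INTO transactions "
            "SELECT * FROM transactions_archive WHERE txn_date LIKE ?",
            (f"{year}-%",),
        ).rowcount
    return restored


conn = sqlite3.connect(":memory:")
for name in ("transactions", "transactions_archive"):
    conn.execute(
        f"CREATE TABLE {name} "
        "(txn_id INTEGER PRIMARY KEY, txn_date TEXT, amount REAL)"
    )
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "2022-03-15", 40.0), (2, "2023-11-02", 55.0), (3, "2024-06-30", 20.0)],
)

moved = archive_before(conn, "2024-01-01")
print(moved)  # rows moved out of the on-line table
```

ISO-formatted dates compare correctly as plain text, which is why a simple string comparison is enough here.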
Data integrity
Because important business decisions are based on the data that is analyzed, data integrity is another crucial issue. The data used to populate the data warehouse must be the data selected by the user; be free from errors (misspelled words, missing data, etc.); and contain all the records that should be included within the data subset selected.
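Two of these three checks are easy to automate; the sketch below is one possible approach. The check_extract function and the record layout (dictionaries with an "id" key) are hypothetical. It flags empty required fields, records outside the user's selection and records missing from the extract, but it does not attempt to catch misspellings.

```python
def check_extract(expected_ids, extracted, required_fields):
    """Return a list of integrity problems found in an extract.

    expected_ids: IDs the user's selection should produce.
    extracted: list of dict records, each with an "id" key.
    required_fields: fields that must be present and non-empty.
    """
    expected = set(expected_ids)
    seen = set()
    problems = []
    for record in extracted:
        seen.add(record["id"])
        for field in required_fields:
            if record.get(field) in (None, ""):
                problems.append(f"record {record['id']}: missing {field}")
    # Records that were extracted but never selected.
    for rec_id in sorted(seen - expected):
        problems.append(f"record {rec_id}: not part of the selection")
    # Records that were selected but never arrived.
    for rec_id in sorted(expected - seen):
        problems.append(f"record {rec_id}: absent from the extract")
    return problems


extract = [
    {"id": 1, "name": "Acme", "region": "east"},
    {"id": 2, "name": "", "region": "west"},
    {"id": 4, "name": "Zenith", "region": "east"},
]
problems = check_extract([1, 2, 3], extract, ["name", "region"])
for line in problems:
    print(line)
```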
Simply granting users access to data does not make it available to them: Data may be kept in various files of different formats. Users may not know the location or names of the data they seek, and often they are not sure of exactly what data they need to make a decision. Users should have a method for finding the data they need, so some form of data dictionary should define all the files or tables and their respective field elements.
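A simple data dictionary can often be generated from the database catalog itself. The sketch below assumes an SQLite database and reads its schema through the sqlite_master table and PRAGMA table_info; the customers and orders tables and both helper functions are invented for the example.

```python
import sqlite3


def build_data_dictionary(conn):
    """Map each table name to a list of (field name, declared type) pairs."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    dictionary = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        dictionary[table] = [(col[1], col[2]) for col in info]
    return dictionary


def find_field(dictionary, keyword):
    """List (table, field) pairs whose field name contains the keyword."""
    return [
        (table, name)
        for table, fields in dictionary.items()
        for name, _type in fields
        if keyword.lower() in name.lower()
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)

dictionary = build_data_dictionary(conn)
print(find_field(dictionary, "customer"))
```

A user who knows only that the data involves customers can search the dictionary by keyword instead of needing to know file or table names in advance.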
Data warehousing can extract data from a variety of sources and create a new consolidated view of that data for use by management. This will make it much easier for the user to query the data, but will add the job of building extraction procedures to the IS department’s workload.
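To make the idea of consolidation concrete, the following sketch merges two sources with different formats and field names into one warehouse table. Everything in it is assumed for illustration: a CSV export with region and amount columns, a regional_sales table with region_name and total columns, and a consolidate function that maps both onto a single sales table.

```python
import csv
import io
import sqlite3


def consolidate(csv_text, source_db, warehouse):
    """Load a CSV export and a database table into one warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL, source TEXT)"
    )
    # Source 1: a flat file, where every value arrives as text.
    for rec in csv.DictReader(io.StringIO(csv_text)):
        warehouse.execute(
            "INSERT INTO sales VALUES (?, ?, 'csv')",
            (rec["region"], float(rec["amount"])),
        )
    # Source 2: a table that uses different column names for the same facts.
    rows = source_db.execute(
        "SELECT region_name, total FROM regional_sales"
    ).fetchall()
    warehouse.executemany("INSERT INTO sales VALUES (?, ?, 'db')", rows)
    warehouse.commit()


csv_text = "region,amount\neast,100\nwest,200\n"

source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE regional_sales (region_name TEXT, total REAL)")
source_db.execute("INSERT INTO regional_sales VALUES ('east', 50.0)")

warehouse = sqlite3.connect(":memory:")
consolidate(csv_text, source_db, warehouse)

# Users now query one table instead of two differently shaped sources.
totals = warehouse.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The source column is kept so that a questionable figure can be traced back to the system it came from.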
Once an extraction of the main data is made, the IS staff must be concerned with how to distribute the data to warehouses, how to automate the extraction process, and how to notify the users that new data has been added to the warehouse.
Critical issues
Thus far, we have mentioned only the savings to the information systems department, but users of the data warehouse enjoy cost savings as well. Development of executive information systems applications is faster when the data is consolidated. Reports and queries are easier to produce, so the users can build their own. If the warehouse data structure is carefully planned, it will be easier to use and maintain.
Business changes over time, so it is important for a data warehouse to be flexible, in order to continue to be useful. If new data is needed in the subset of data, the system should make it convenient to incorporate those new data elements into the data extraction process.
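One way to get this flexibility is to drive the extraction from a field list rather than a hand-written query, so that adding a data element means editing one list. The sketch below assumes SQLite; the EXTRACT_FIELDS list, the orders table and the build_extract_query function are hypothetical.

```python
import sqlite3

# The only thing that changes when the business needs a new data element.
EXTRACT_FIELDS = ["order_id", "amount"]


def build_extract_query(conn, table, fields):
    """Build a SELECT for the listed fields, rejecting unknown names."""
    available = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    unknown = [f for f in fields if f not in available]
    if unknown:
        raise ValueError(f"fields not in {table}: {unknown}")
    columns = ", ".join(f'"{f}"' for f in fields)
    return f'SELECT {columns} FROM "{table}"'


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, region TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 100.0, 'east')")

query = build_extract_query(conn, "orders", EXTRACT_FIELDS)
print(conn.execute(query).fetchall())

# Incorporating a new element is a one-line change to the field list.
wider = build_extract_query(conn, "orders", EXTRACT_FIELDS + ["region"])
print(conn.execute(wider).fetchall())
```

Checking each field against the table's actual columns catches a mistyped name before the extract runs, rather than after users notice missing data.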
Multiple front-end software applications should be able to access the data, so users are not required to learn new applications in order to use the data.
Maintenance of the data warehouse is also critical. The more data to be managed, the more difficult the job of managing that data becomes. If users of the data contained within the data warehouse need only five years’ worth of historical data, for example, then the warehouse should contain only five years of historical data.
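A retention rule like the five-year example can be enforced with a scheduled pruning job. The sketch below is one possible form, assuming an SQLite table named history with an entry_date column stored as ISO "YYYY-MM-DD" text; both function names are invented for the example.

```python
import sqlite3
from datetime import date


def retention_cutoff(today, years):
    """Return the date `years` years before `today`."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year.
        return today.replace(year=today.year - years, day=28)


def prune(conn, today, years=5):
    """Delete history rows older than the retention window; return the count."""
    cutoff = retention_cutoff(today, years).isoformat()
    with conn:  # commit on success, roll back on error
        return conn.execute(
            "DELETE FROM history WHERE entry_date < ?", (cutoff,)
        ).rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (entry_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [("2018-12-31", 10.0), ("2019-06-15", 20.0), ("2023-01-10", 30.0)],
)

removed = prune(conn, date(2024, 6, 15))
print(removed)  # rows that fell outside the five-year window
```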