Projects that use personal data should have a disciplined approach to the use of that data. They should account for:
Information architecture is the process of creating structure for a body of information or content. It includes the following components:
Which of the following is NOT a preventative action for creating high quality data?
Technical Metadata provides data about the technical details of data, the systems that store data, and the processes that move data between systems.
In the context of big data the Three V’s refer to: Volume, Velocity and Validity
What data architecture designs represent should be clearly documented. Examples include:
Which Data Architecture artefact contains the names of key business entities, their
relationships, critical guiding business rules and critical attributes?
Common understanding of the core business concepts and terminology is the objective of which deliverable?
A limitation of the centralized approach is: Maintenance of a decentralized repository is costly.
The term data quality refers to only the characteristics associated with high quality data.
Controlling data availability requires management of user entitlements and of structures that technically control access based on entitlements.
'Planning, implementation and control activities for lifecycle management of data and
information, found in any form or medium', pertains to which knowledge area?
An advantage of a centralized repository is: Quick metadata retrieval, since the repository and the query reside together.
When reviewing data access plans, sequential searching is slowing the database. One
way to fix this is:
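One common remedy is to add an index on the searched column so the optimizer can seek rather than scan. A minimal sketch in SQLite (the table and column names are hypothetical, not from the question):

```python
# Illustration only: replacing a sequential (full-table) scan with an
# index seek. Table, column, and index names are assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, surname TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(i, f"name{i}") for i in range(1000)])

query = "SELECT * FROM customer WHERE surname = 'name500'"

# Without an index, the optimizer must read every row sequentially.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# An index on the searched column lets it seek directly to matching rows.
conn.execute("CREATE INDEX idx_customer_surname ON customer (surname)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

print(plan_before[0][-1])  # a SCAN step
print(plan_after[0][-1])   # a SEARCH ... USING INDEX step
```

The same principle applies in any relational database, though each optimizer reports its access path differently.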
Which of the following is a directive that codifies principles and management intent
into fundamental rules governing the creation, acquisition, integrity, security, quality,
and use of data and information?
Achieving near-real-time data replication, using a source accumulation technique,
triggers on:
Organizations should evaluate several maturity assessment models for data management, and for Data Governance, before adopting one or before developing a custom maturity assessment model because:
Please select the correct general cost and benefit categories that can be applied consistently within an organization.
Media monitoring and text analysis are automated methods for retrieving insights from large unstructured or semi-structured data, such as transaction data, social media, blogs, and web news sites.
Organizations are legally required to protect privacy by identifying and protecting sensitive data. Who usually identifies the confidentiality schemes and identifies which assets are confidential or restricted?
Access to data in multidimensional databases uses a variant of SQL called MDX, or Multidimensional Expressions.
Once the most critical business needs and the data that supports them have been identified, the most important part of the data quality assessment is actually looking at the data, querying it to understand data content and relationships, and comparing actual data to rules and expectations.
You are a reporting Data Analyst. A new Management Report has been requested. What is the most effective way to ensure you receive the appropriate data at the correct level of accuracy to meet the business need?
Which of the following provides the strongest tangible reason for driving initiation of a Data Governance process in an enterprise?
Several global regulations have significant implications on data management practices. Examples include:
The roles associated with enterprise data architecture are data architects, data modellers and data stewards.
Data parsing is the process of analysing data using pre-determined rules to define its content or value.
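As a hedged illustration of that definition, a pre-determined rule (here a regular expression, chosen for this sketch) can classify a raw value and break it into components:

```python
# Sketch: parsing a raw value against a pre-determined rule to define
# its content. The phone-number format and rule are assumptions.
import re

# Rule: North American phone numbers like 212-555-0123 or (212) 555-0123.
PHONE_RULE = re.compile(r"^\(?(\d{3})\)?[ -](\d{3})-(\d{4})$")

def parse_phone(value: str):
    """Return (area_code, exchange, line) if the rule matches, else None."""
    match = PHONE_RULE.match(value.strip())
    return match.groups() if match else None

print(parse_phone("212-555-0123"))  # ('212', '555', '0123')
print(parse_phone("not a number"))  # None
```

Real data quality tools apply libraries of such rules to name, address, and identifier fields.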
The first two steps of the Reference Data Change Request process, as prescribed by DMBOK2, include:
Data Integration and Interoperability is dependent on these other areas of data management:
Operational Metadata describes details of the processing and accessing of data. Which one is not an example:
The Data Governance Council (DGC) manages data governance initiatives, issues, and escalations.
A data governance program must contribute to the organization by identifying and delivering on specific benefits.
A controlled vocabulary is a defined list of explicitly allowed terms used to index, categorize, tag, sort and retrieve content through browsing and searching.
When data is classified as either security data or regulatory data, the result will be:
An organization can enhance its Data Governance program and thereby improve its approach to enterprise data management. This is important for the following reason:
Data modelling tools are software that automate many of the tasks the data modeller performs.
One of the percentages to measure success of a records management system implementation is the percentage of the identified corporate records declared as such and put under records control.
Data governance requires control mechanisms and procedures for, but not limited to, identifying, capturing, logging and updating actions.
If data is not integrated with care it presents risk for unethical data handling. These ethical risks intersect with fundamental problems in data management including: Limited knowledge of data’s origin and lineage; Data of poor quality; Unreliable Metadata; and Documentation of error remediation.
All DMM and Data Governance assessments should identify their objectives and goals for improvement. This is important because:
Factors that have been shown to play a key role in the success of effective data management organizations do not include:
A hacker is a person who finds unknown operations and pathways within complex computer systems. Hackers are only bad.
The percentage of enterprise computers having the most recent security patch
installed is a metric of which knowledge area?
With reliable Metadata an organization does not know what data it has, what the data represents and how it moves through the systems, who has access to it, or what it means for the data to be of high quality.
A sandbox is an alternate environment that allows write-only connections to production data and can be managed by the administrator.
A change management program supporting Data Governance should focus communication on what?
Data management professionals who understand formal change management will be more successful in bringing about changes that will help their organizations get more value from their data. To do so, it is important to understand:
Risk classifications describe the sensitivity of the data and the likelihood that it might be sought after for malicious purposes.
Obfuscating or redacting data is the practice of making information anonymous or removing sensitive information. Risks are present in the following instances:
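A minimal sketch of one obfuscation technique, masking; the identifier format and masking rule below are assumptions for illustration, not from the text:

```python
# Illustrative data masking: hide all but the last four digits of a
# hypothetical US SSN. Format and rule are assumptions.
def mask_ssn(ssn: str) -> str:
    """Replace all but the last four digits with asterisks."""
    return "***-**-" + ssn[-4:]

print(mask_ssn("123-45-6789"))  # ***-**-6789
```

Note the residual risk the question alludes to: masked values can sometimes still be re-identified when combined with other data.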
Please select the answers that correctly describe the set of principles that recognizes salient features of data management and guides data management practice.
The impact of the changes from new volatile data must be isolated from the bulk of the historical, non-volatile DW data. There are three main approaches, including:
Communication should start later in the process as too many inputs will distort the vision.
Business glossary is not merely a list of terms. Each term will be associated with other valuable metadata such as synonyms, metrics, lineage, or:
When measuring the value of data architecture one should be most concerned about
Implementing a BI portfolio is about identifying the right tools for the right user communities within or across business units.
There are several reasons to denormalize data. The first is to improve performance by:
Data professional should not balance the short-term versus long-term business interests.
Data quality management is a key capability of a data management practice and organization.
When doing reference data management, there are many organizations that have standardized data sets that are incredibly valuable and should be subscribed to. Which of these organizations would be least useful?
CMDBs provide the capability to manage and maintain Metadata specifically related to IT assets, the relationships among them, and contractual details of the assets.
All assessments should include a roadmap for phased implementation of the recommendations. This is important because:
In matching, false positives occur when three references that do not represent the same entity are linked with a single identifier.
Those responsible for the data-sharing environment have an obligation to downstream data consumers to provide high quality data.
The categories of the Data Model Scorecard with the highest weightings include:
Data stewardship is the least common label to describe accountability and responsibility for data and processes to ensure effective control and use of data assets.
Data modelling is most infrequently performed in the context of systems development and maintenance efforts, known as the SDLC.
Data Management maturity has many goals for accomplishment including having a positive effect on culture. This is important to a Data Governance program for the following reason:
While the focus of data quality improvement efforts is often on the prevention of errors, data quality can also be improved through some forms of data processing.
Time-based patterns are used when data values must be associated in chronological order and with specific time values.
Integrating data security with document and content management knowledge areas guides the implementation of:
According to the DMBoK2, by creating Data Management Services, IT involves the Data Governance Council:
When recovering from multiple system failures, what is the biggest difficulty faced
by a DBA?
The advantage of a decentralized data governance model over a centralized model is:
The data-vault is an object-orientated, time-based and uniquely linked set of normalized tables that support one or more functional areas of business.
Typically, DW/BI projects have three concurrent development tracks, including:
What are some of the business drivers for the ethical handling of data that Data Governance should satisfy?
Malware refers to any infectious software created to damage, change or improperly access a computer or network.
The accuracy dimension of data quality refers to the degree that data correctly represents ‘real-life’ entities.
Data Storage and Operations: The design, implementation and support of stored data to maximize its value.
Functionality-focused requirements associated with a comprehensive metadata solution, include:
A deliverable in the data modelling and design context diagram is the logical data model.
Content needs to be modular, structured, reusable and device and platform independent.
A goal of data architecture is to identify data storage and processing requirements.
The database administrator (DBA) is the most established and the most widely adopted data professional role.
The need to manage data movement efficiently is a primary driver for Data Integration and Interoperability.
Differentiating between data and information. Please select the correct answers based on the sentence below: Here is a marketing report for the last month [1]. It is based on data from our data warehouse[2]. Next month these results [3] will be used to generate our month-over-month performance measure [4].
A ‘Golden Record’ means that it is always a 100% complete and accurate representation of all entities within the organization.
Data management organizational constructs include the following type of model.
Data architect: A senior analyst responsible for data architecture and data integration.
Content refers to the data and information inside a file, document or website.
ISO 8000 will describe the structure and the organization of data quality management, including:
Release management is critical to batch development processes that grow new capabilities.
To mitigate risks, implement a network-based audit appliance, which can address most of the weaknesses associated with the native audit tools. This kind of appliance has the following benefits:
An effective team is based on two simple foundations: trust and a common goal.
Enterprise service buses (ESB) are the data integration solution for near real-time sharing of data between many systems, where the hub is a virtual concept of the standard format or the canonical model for sharing data in the organization.
A staff member has been detected inappropriately accessing client records via usage logs. The security mechanism being used is an:
Which of the following are must-do for any successful Data Governance programme?
Effectiveness metrics for a data governance programme include: achievement of goals and objectives; the extent to which stewards are using the relevant tools; effectiveness of communication; and effectiveness of education.
To build models, data modellers heavily rely on previous analysis and modelling work.
Please select the incorrect item that does not represent a dimension in the Data Values category in Data Quality for the Information age.
Because Data Governance activities require coordination across functional areas, the DG program must establish an ___________ that defines accountabilities and intersections.
A completely distributed architecture maintains a single access point. The metadata retrieval engine responds to user requests by retrieving data from source systems in real time.
XML provides a language for representing both structured and unstructured data and information.
A weakness or defect in a system that allows it to be successfully attacked and
compromised is a:
Measuring the effects of change management in five key areas, including: Awareness of the need to change; Desire to participate and support the change; Knowledge about how to change; Ability to implement new skills and behaviors; and Reinforcement to keep the change in place.
Repositories facilitate the collection, publishing and distribution of data in a centralized and possibly standardized way. Data is most often used to:
Change Data Capture is a method of reducing bandwidth by filtering to include only data that has been changed within a defined timeframe.
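The timestamp-based flavour of that technique can be sketched as follows; the field names and sample rows are illustrative assumptions:

```python
# Sketch: timestamp-based Change Data Capture — extract only rows whose
# last_modified falls after the previous extract. Data is made up.
from datetime import datetime

rows = [
    {"id": 1, "last_modified": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "last_modified": datetime(2024, 1, 2, 9, 0)},
    {"id": 3, "last_modified": datetime(2024, 1, 3, 9, 0)},
]

def capture_changes(rows, since):
    """Return only the rows changed after the last extract timestamp."""
    return [r for r in rows if r["last_modified"] > since]

last_extract = datetime(2024, 1, 1, 12, 0)
changed = capture_changes(rows, last_extract)
print([r["id"] for r in changed])  # [2, 3]
```

Log-based CDC variants read the database transaction log instead of comparing timestamps, but the bandwidth-saving effect is the same: only changed data moves.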
Document and content management is defined as planning, implementation and control activities for storage management of data and information found in any form or medium.
The load step of the ETL is physically storing or presenting the results of the transformation into the source system.
When developing a Data Governance operating framework, what areas should be considered?
The European Commission Article 29 Data Protection Working Party provides a set of criteria to evaluate anonymization methods. What do they recommend?
The number of entities in a relationship is the arity of the relationship. The most common are:
Record management starts with a vague definition of what constitutes a record.
The business glossary application is structured to meet the functional requirements of the three core audiences:
Data profiling also includes cross-column analysis, which can identify overlapping or duplicate columns and expose embedded value dependencies.
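One simple form of cross-column analysis can be sketched as a check for columns whose values are identical row-for-row; the table and column names are made up for illustration:

```python
# Sketch: cross-column profiling that flags columns with identical
# values, a signal of overlapping or duplicated data. Data is invented.
table = {
    "cust_id":   [1, 2, 3, 4],
    "customer":  [1, 2, 3, 4],   # duplicates cust_id
    "order_amt": [10, 25, 25, 40],
}

def duplicate_columns(table):
    """Return pairs of column names whose values match row-for-row."""
    names = list(table)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if table[a] == table[b]]

print(duplicate_columns(table))  # [('cust_id', 'customer')]
```

Real profiling tools extend this to partial overlaps and functional dependencies between columns.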
The language used in file-based solutions is called MapReduce. This language has three main steps:
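The three steps of the MapReduce model are commonly described as map, shuffle, and reduce. A minimal single-process sketch (a real engine distributes each step across nodes):

```python
# Sketch of the three MapReduce steps on a word-count problem.
# The documents are invented sample input.
from collections import defaultdict

docs = ["big data big value", "data in motion"]

# Map: emit (key, value) pairs from each input record.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle: group all emitted values by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: aggregate the grouped values for each key.
counts = {key: sum(values) for key, values in groups.items()}
print(counts["big"], counts["data"])  # 2 2
```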
A dimensional physical data model is usually a star schema, meaning there is one structure for each dimension.