Main article: Database design
The first task of a database designer is to
produce a conceptual data model that reflects
the structure of the information to be held in the
database. A common approach to this is to
develop an entity-relationship model, often with
the aid of drawing tools. Another popular
approach is the Unified Modeling Language. A
successful data model will accurately reflect the
possible state of the external world being
modeled: for example, if people can have more
than one phone number, it will allow this
information to be captured. Designing a good
conceptual data model requires a good
understanding of the application domain; it
typically involves asking deep questions about
the things of interest to an organisation, like
"can a customer also be a supplier?", or "if a
product is sold with two different forms of
packaging, are those the same product or
different products?", or "if a plane flies from
New York to Dubai via Frankfurt, is that one
flight or two (or maybe even three)?". The
answers to these questions establish definitions
of the terminology used for entities (customers,
products, flights, flight segments) and their
relationships and attributes.
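As an illustration only, one possible answer to the flight question can be sketched as a pair of entities: a flight that is made up of one or more flight segments. The class names, field names, and the flight number below are hypothetical and are not taken from any real system; a different organisation could reasonably answer the question differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class FlightSegment:
    """One takeoff-to-landing hop (hypothetical entity)."""
    origin: str
    destination: str


@dataclass
class Flight:
    """A journey sold under one flight number, made of ordered segments."""
    flight_number: str
    segments: List[FlightSegment] = field(default_factory=list)

    @property
    def origin(self) -> str:
        # The flight starts where its first segment starts.
        return self.segments[0].origin

    @property
    def destination(self) -> str:
        # The flight ends where its last segment ends.
        return self.segments[-1].destination


# Under this modelling choice, New York to Dubai via Frankfurt is
# one flight with two segments.
flight = Flight(
    "XY123",  # made-up flight number
    [
        FlightSegment("New York", "Frankfurt"),
        FlightSegment("Frankfurt", "Dubai"),
    ],
)
```

Here the definition "a flight is a sequence of segments" is made explicit in the model, so later design stages do not have to guess what the term means.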
Producing the conceptual data model
sometimes involves input from business
processes, or the analysis of workflow in the
organization. This can help to establish what
information is needed in the database, and what
can be left out. For example, it can help when
deciding whether the database needs to hold
historic data as well as current data.
Having produced a conceptual data model that
users are happy with, the next stage is to
translate this into a schema that implements the
relevant data structures within the database.
This process is often called logical database
design, and the output is a logical data model
expressed in the form of a schema. Whereas the
conceptual data model is (in theory at least)
independent of the choice of database
technology, the logical data model will be
expressed in terms of a particular database
model supported by the chosen DBMS. (The
terms data model and database model are often
used interchangeably, but in this article we use
data model for the design of a specific
database, and database model for the modelling
notation used to express that design.)
The most popular database model for general-
purpose databases is the relational model, or
more precisely, the relational model as
represented by the SQL language. The process
of creating a logical database design using this
model uses a methodical approach known as
normalization. The goal of normalization is to
ensure that each elementary "fact" is recorded
in only one place, so that insertions,
updates, and deletions automatically maintain
consistency.
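A minimal sketch of this idea, using Python's built-in sqlite3 module and the earlier example of people with more than one phone number. The table names, column names, and sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Each person's name is stored once; phone numbers live in their own
# table, so a person can have any number of them.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE phone_number (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    number    TEXT NOT NULL,
    PRIMARY KEY (person_id, number)
);
""")

conn.execute("INSERT INTO person VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO phone_number VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

# Because the name is recorded in one place, renaming is a single-row
# update and cannot leave stale copies behind.
conn.execute("UPDATE person SET name = 'Ada L.' WHERE person_id = 1")

rows = conn.execute("""
    SELECT p.name, n.number
    FROM person AS p
    JOIN phone_number AS n ON n.person_id = p.person_id
    ORDER BY n.number
""").fetchall()
```

Had the name been repeated on every phone-number row, the same rename would have needed to touch every copy, and missing one would leave the data inconsistent.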
The final stage of database design is to make
the decisions that affect performance,
scalability, recovery, security, and the like. This
is often called physical database design. A key
goal during this stage is data independence,
meaning that the decisions made for
performance optimization purposes should be
invisible to end-users and applications. Physical
design is driven mainly by performance
requirements, and requires a good knowledge of
the expected workload and access patterns, and
a deep understanding of the features offered by
the chosen DBMS.
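The following sketch, again using Python's sqlite3 module, illustrates data independence with one common physical design decision: adding an index. The table, index name, and data are hypothetical. The application's query text and its results are unchanged by the index; only the access path chosen by the DBMS may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("acme", 10.0), ("globex", 25.5), ("acme", 4.5)],
)

# The application only ever issues this query.
QUERY = (
    "SELECT order_id, total FROM orders "
    "WHERE customer = ? ORDER BY order_id"
)

before = conn.execute(QUERY, ("acme",)).fetchall()

# Physical design change: add an index to speed up lookups by customer.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after = conn.execute(QUERY, ("acme",)).fetchall()

# The query plan can be inspected to see which access path is used;
# its exact wording varies between SQLite versions.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("acme",)).fetchall()
```

Whether an index actually helps depends on the workload, which is why physical design requires knowledge of the expected access patterns.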
Another aspect of physical database design is
security. It involves defining access
control to database objects as well as defining
security levels and methods for the data itself.
Models
Collage of five types of database models.
Main article: Database model
A database model is a type of data model that
determines the logical structure of a database
and fundamentally determines how
data can be stored, organized, and manipulated.
The most popular example of a database model
is the relational model (or the SQL
approximation of relational), which uses a table-
based format.
Common logical data models for databases
include:
Hierarchical database model
Network model
Relational model
Entity–relationship model
Enhanced entity–relationship model
Object model
Document model
Entity–attribute–value model
Star schema
An object-relational database combines the
relational model and the object model.
Physical data models include:
Inverted index
Flat file
Other models include:
Associative model
Multidimensional model
Multivalue model
Semantic model
XML database
Named graph
External, conceptual, and internal views
Traditional view of data [25]
A database management system provides three
views of the database data:
The external level defines how each group of
end-users sees the organization of data in the
database. A single database can have any
number of views at the external level.
The conceptual level unifies the various
external views into a compatible global view.
[26] It provides the synthesis of all the external
views. It is out of the scope of the various
database end-users, and is rather of interest to
database application developers and database
administrators.
The internal level (or physical level) is the
internal organization of data inside a DBMS (see
Implementation section below). It is concerned
with cost, performance, scalability and other
operational matters. It deals with storage layout
of the data, using storage structures such as
indexes to enhance performance. Occasionally it
stores data of individual views (materialized
views), computed from generic data, if
performance justification exists for such
redundancy. It balances all the external views'
performance requirements, possibly conflicting,
in an attempt to optimize overall performance
across all activities.
While there is typically only one conceptual (or
logical) and physical (or internal) view of the
data, there can be any number of different
external views. This allows users to see
database information in a more business-
related way rather than from a technical,
processing viewpoint. For example, a financial
department of a company needs the payment
details of all employees as part of the
company's expenses, but does not need details
about employees that are of interest to the
human resources department. Thus different
departments need different views of the
company's database.
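As a hedged illustration of external views, the sketch below defines two SQL views over one employee table using Python's sqlite3 module. The table, view names, columns, and sample rows are all invented for the example. SQLite itself has no per-user access control, so this only shows how each department can be given its own shape of the data, not how access would be enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id       INTEGER PRIMARY KEY,
    name              TEXT NOT NULL,
    salary            REAL NOT NULL,
    hire_date         TEXT NOT NULL,
    emergency_contact TEXT
);
INSERT INTO employee VALUES
    (1, 'Ada', 5200.0, '2019-03-01', 'B. King'),
    (2, 'Lin', 4800.0, '2021-07-15', NULL);

-- External view for the finance department: payment details only.
CREATE VIEW finance_payroll AS
    SELECT employee_id, name, salary FROM employee;

-- External view for human resources: personnel details, no salary.
CREATE VIEW hr_directory AS
    SELECT employee_id, name, hire_date, emergency_contact FROM employee;
""")

# Column names each department sees through its own view.
finance_cols = [
    col[0] for col in conn.execute("SELECT * FROM finance_payroll").description
]
hr_cols = [
    col[0] for col in conn.execute("SELECT * FROM hr_directory").description
]

# Finance can compute expenses without seeing HR-only columns.
total_payroll = conn.execute(
    "SELECT SUM(salary) FROM finance_payroll"
).fetchone()[0]
```

Both views are derived from the same underlying table, which plays the role of the single conceptual view in this small example.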
The three-level database architecture relates to
the concept of data independence which was
one of the major initial driving forces of the
relational model. The idea is that changes made
at a certain level do not affect the view at a
higher level. For example, changes in the
internal level do not affect application programs
written using conceptual level interfaces, which
reduces the impact of making physical changes
to improve performance.
The conceptual view provides a level of
indirection between internal and external. On
one hand it provides a common view of the
database, independent of different external view
structures, and on the other hand it abstracts
away details of how the data is stored or
managed (internal level). In principle every level,
and even every external view, can be presented
by a different data model. In practice usually a
given DBMS uses the same data model for both
the external and the conceptual levels (e.g.,
relational model). The internal level, which is
hidden inside the DBMS and depends on its
implementation (see Implementation section
below), requires a different level of detail and
uses its own types of data structures.
Separating the external, conceptual and internal
levels was a major feature of the relational
database model implementations that dominate
21st century databases. [26]