Week 5 Lectures

Week 5 Lecture Materials

Designing the File or Database

Key Points and Objectives

1. The objectives in the design of data storage organization are:

      A. The data must be available when the user wants to use it.
      B. The data must be accurate and consistent.
      C. Efficient storage of data as well as efficient updating and retrieval
      D. It is necessary that information retrieval be purposeful.
      E. The information obtained from the stored data must be in an integrated form to be useful for
     managing, planning, controlling, or decision making.

     2.   There are two approaches to the storage of data in a computer-based system. The first method is to
     store the data in individual files, each unique to a particular application. The second approach to the
     storage of data in a computer-based system involves building a database, which is a formally defined
     and centrally controlled store of data intended for use in many different applications.

     3.   A file can be designed and built quite rapidly, and the concerns for data availability and security are
     minimized. Also, systems analysts can choose an appropriate file structure according to the required
     processing speed of the particular application system.

4. The effectiveness objectives of the database include:

      A. Ensuring that data can be shared among users for a variety of applications
      B. Maintaining data that are both accurate and consistent
      C. Ensuring all data required for current and future applications will be readily available
      D. Allowing the database to evolve and the needs of the users to grow
      E. Allowing users to construct their personal view of the data without concern for the way the
     data are physically stored

5. Entities are objects or events for which data is collected and stored.

     6.   Relationships are associations between entities. They may be one-to-one, one-to-many or many-to-
     many. Relationships are shown with a either a zero on the relationship line (representing none), a
     small vertical line crossing the relationship line (representing one), or a crowsfoot symbol,
     representing many.

7. Attributes are a characteristic of an entity, sometimes called a field.

8. Records are a collection of data items that have something in common.

9. Keys are data items in a record used to identify the record. Key types are:

      A. Primary key, unique for the record
      B. Secondary key, a key which may not be unique
      C. Concatenated key, a combination of two or more data items for the key
      D. Foreign key, a data item in one record that is the key of another record

     10. A file contains groups of records used to provide information for operations, planning, management,
     and decision making. Files can be used for storing data for an indefinite period of time, or they can
     be used to store data temporarily for a specific purpose. The following types of files are available
     for different purposes: master file, table file, transaction file, work file, and report file.

11. Master files have long records and contain all pertinent information about an entity. Transaction
records are short and contain information used to update master files.

     12. According to the specific characteristics of a particular application system, systems analysts may
     choose different organizational structures for the file design. The available file organizations are:
     sequential organization, linked lists, hashed file organization, indexed organization, and
     indexed-sequential organization. VSAM (Virtual Storage Access Method) is a newer method for
     storing both sequential and indexed-sequential files.

     13. A database, unlike a file, is intended to be shared by many users. It is clear that each user sees the
     data in different ways. Systems analysts may design a database following a specific logic to satisfy
     users of their application systems. The three main types of database structures are: hierarchical
     database structures, network database structures, and relational database structures.

14. Normalization is the transformation of complex user views and data to a set of smaller, stable, and
easily maintainable data structures. The three steps of data normalization are:

      A. Remove all repeating groups and identify the primary key.
      B. Ensure that all nonkey attributes are fully dependent on the primary key.
      C. Remove any transitive dependencies, attributes which are dependent on other nonkey
     attributes.

15. The entity-relationship diagram may be used to determine record keys.

16. Guidelines for creating master files or database relations are:

      A. Each separate entity should have its own master file or database relation.
      B. A specific, non-key data field should exist on only one master file or relation.
      C. Each master file or relation should have programs to create, read, update and delete records.

17. The following guidelines are available to retrieve and present data in a database effectively:

      A. Choose a relation from the database.
      B. Join two relations together.
      C. Project columns from the relation.
      D. Select rows from the relation.
      E. Derive new attributes.
      F. Index or sort rows.
      G. Calculate totals and performance measures.
      H. Present data.

Object-Oriented Systems Analysis and Design

Key Points and Objectives

1. Object-oriented techniques work well in situations where complicated information systems are
undergoing continuous maintenance, adaptation and design.

2. Six ideas characterize object-oriented programming:

      A. An object, which represents a real-world thing or event
      B. A class, or group of related objects
      C. Messages, sent between objects
      D. Encapsulation, only an object makes changes through its own behavior
      E. Inheritance, a new class created from another class
      F. Polymorphism, meaning that a derived class behavior may be different from the base class

3. Object-oriented analysis and design is based on a five-layer model:

      A. Class/object layer notes the classes and objects.
      B. Structure layer captures various structures of classes and objects, such as one-to-many
     relationships and inheritance.
      C. Attribute layer provides details the attributes of classes.
      D. Service layer notes messages and object behaviors.
      E. Subject layer divides the design into implementation units or team assignments.

4. There are five general types of objects:

      A. Tangible things
      B. Roles
      C. Incidents
      D. Interactions
      E. Specifications details

5. Use the following criteria to determine whether a new class of objects is justified.

      A. There is a need to remember the object.
      B. There is a need for certain behaviors of the object.
      C. An object has multiple attributes.
      D. A class has more than one object instantiation (unless it is a base class).
      E. Attributes have a meaningful value for each object in a class.
      F. Services behave the same for every object in a class.
      G. Objects implement requirements that are derived from the problem setting.
      H. Objects do not duplicate attributes and services that could be derived from other objects in
     the system.

6. There are two basic types of structures that might be imposed on classes and objects:

      A. Generalization-Specialization Structure (Gen-Spec), which connect class-to-class.
      B. Whole-Part Structure, which are collections of different objects that compose another whole
     object.

7. Instance connections are references between objects such as associations or relationships indicated
by a single line between objects using the same cardinality notation as Whole-Part Structures.

8. Services (or methods or procedures) must be analyzed. There are three activities:

      A. Object state analysis, showing changes of state
      B. Service specification: creating, storing, retrieving, connecting, accessing and deleting
     objects
      C. Message specification, consisting of control and data flow

9. Object-oriented design activities are grouped into four major components:

      A. The problem domain component
      B. The human interface component
      C. The data management component
      D. The task management component

10. The problem domain component consists of:

      A. Reuse design
      B. Implementation structures
      C. Language accommodation

11. There are three ways to design the data management component:

      A. Build storage services into each Class-&-Object in the design.
      B. Create an Objectserver Class-&-Object that provides all database services.
      C. Create a Storable Class.

Logical Data Modelling Design

Logical data modelling is the design activity performed by systems analysts that involves representing the data required in a format that is close to the physical implementation requirements. It is a bottom-up activity that generally starts with the actual data requirements on the forms and reports used in an information system.

Logical Data Modelling Design Outputs
Set of Normalized Relations. A set of normalized Relations is produced to reflect the complete data requirements of the system.
Updated Data Dictionary. Each Relation and Attributes within the Relations are described, and the descriptions added to the existing Data Dictionary. There is generally overlap between the Attributes on an E-R Diagram and those in the Relational Data Model.

Logical Data Modelling Process
The process of producing a logical data model for the system involves four basic steps
· a logical data model is created for each user view,
· the logical data models for each view are consolidated into one logical data model,
· the conceptual data model is converted into a logical data model, and
· the consolidated logical data model is compared to and combined with the converted logical data model to produce a single logical data model for the system.

Relations
Relations are the table-like structures (with rows and columns) used to represent the data. The notation for Relations consists of the Relation name followed by the names of the Attributes of the Relation in brackets. The Primary Key of the Relation is underlined and generally appears at the start of the Relation.Relation Properties

Relations have a number of properties that separate them from a table of data including
· each row in the Relation is distinct form all other rows,
· each Attribute in the Relation is distinct from all other Attributes,
· the values for each Attribute come from the same domain, and
· the values for each Attribute are simple and single-valued.

Relation Inconsistencies
There are three operations that can be performed on a Relation that may result in inconsistencies in the data in a Relation. The operations change the data in a Relation through inserting a new row, updating the value of an Attribute in one or more rows, and deleting rows. The logical data modelling activity attempts to eliminate the likelihood of inconsistencies occurring by developing well-structured Relations that have been normalized.

Normalization
Normalization is the process of producing a logical data model for the system that contains well-structured Relations and thereby avoids data inconsistencies as a result of operations being performed on the data. Normalization involves decomposing Relations in a series of steps until the resulting Relations are all well-structured Relations. The steps correspond to a normal form starting with the First Normal Form (1NF) and continuing up to the Fifth Normal Form (5NF). Testing each Relation against the rules for 1NF, 2NF and 3NF is the most common form of Normalization. The latter normal forms are not generally applied in practice.

Normal Form Tests
The test for 1NF is to check that all Attributes have a single value, which is one of the properties of a Relation. The test for 2NF is to check that all non-key Attributes are fully functionally dependent on the primary key of the Relation. The test for 3NF is to check that there are no transitive dependencies between the non-key Attributes. When Relations do not satisfy the normal form test, they are generally decomposed into a number of Relations, which are then individually tested. The normal forms build on each other such that only when Relations have been assessed as 1NF are they are tested for 2NF, and only when they have been assessed as 2NF are they tested for 3NF.

Foreign Keys
An important aspect of decomposing Relations into smaller normalized Relations is to maintain the original links between the Attributes. This is done with the introduction of a Foreign Key, the same Attribute that is a Primary Key in a Relation, into another Relation.

Converting E-R Diagrams
The process to create a Relational Data Model from an E-R Diagram involves a number of steps
· the Entities and their Attributes are represented as Relations,
· the Relationships between Entities including Attributes on the Relationships are represented as Relations, and
· all of the Relations produced are normalized.

Entities and Attributes
Converting an Entity and its Attributes into a Relation is a simple process of using the Entity name for Relation name and using the Primary Key and Attribute names from the Entity as the Primary Key and Attribute names for the Relation.
A Weak Entity has an additional requirement and that is the introduction of the Primary Key of the related Entity in addition to the Primary Key of the Weak Entity as part of the composite Primary Key of the Relation.

Relationships and Attributes
Converting a Relationship and its Attributes is a more complicated process as the degree and cardinality of the Relationship affect the manner in which it is converted. There are two general approaches in the conversion
· the Primary Key of one of the Entities involved in the Relationship is included as a Foreign Key in the Relation for the other Entity, and
· a new Relation is created using the Primary Key of all Entities involved in the Relationship as the composite Primary Key for the new Relation.
Specific combinations of degree and cardinality use one or other of the approaches identified.