An entity is a principal data object that is of significant interest to the user. It is usually a person, place, thing, or event to be recorded in the database. If the data model were a language, entities would be nouns. The demonstration database provided with your software contains the following entities: customer, orders, items, stock, catalog, cust_calls, call_type, manufact, and state.
You can probably list several entities for your database immediately. Make a preliminary list of all the entities you can identify. Interview the potential users of the database for their opinions about what must be recorded in it. Determine basic characteristics for each entity, such as "at least one address must be associated with a name." All the decisions you make about the entities become your business rules. The telephone directory example in Figure 1 provides some of the business rules for the example in this chapter.
Later, when you normalize your data model, some of the entities can expand or become other data objects. For more information, see Normalizing a Data Model.
When the list of entities seems complete, check the list to make sure that each entity has the following qualities:
Significant: List only entities that are important to your database users and that are worth the trouble and expense of computer tabulation.
Generic: List only types of things, not individual instances. For instance, symphony might be an entity, but Beethoven's Fifth would be an entity instance or entity occurrence.
Fundamental: List only entities that exist independently and do not need something else to explain them. Anything you might call a trait, a feature, or a description is not an entity. For example, a part number is a feature of the fundamental entity called part. Also, do not list things that you can derive from other entities; for example, avoid any sum, average, or other quantity that you can calculate in a SELECT expression.
Unitary: Be sure that each entity you name represents a single class, one that cannot be separated into subcategories, each with its own features. In the telephone directory example in Figure 1, the telephone number, an apparently simple entity, actually consists of three categories, each with different features.
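The generic and fundamental criteria can be illustrated with a small sketch. The example below uses Python's standard sqlite3 module purely for convenience; it is not the database product this guide describes, and the part table, its columns, and its rows are hypothetical. The table stands for the entity type, each row is one entity instance, and the total is calculated in a SELECT expression rather than listed as an entity or stored.

```python
import sqlite3

# Hypothetical schema: "part" is the entity (a type of thing);
# part_num is a feature of that entity, not an entity of its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part ("
    " part_num INTEGER PRIMARY KEY,"
    " description TEXT,"
    " unit_price_cents INTEGER,"
    " quantity INTEGER)"
)

# Each inserted row is one instance (occurrence) of the part entity.
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?, ?)",
    [(1, "bolt", 25, 400), (2, "washer", 5, 1000)],
)

# Derived quantities are computed on demand in a SELECT expression.
(instance_count,) = conn.execute("SELECT COUNT(*) FROM part").fetchone()
(total_value_cents,) = conn.execute(
    "SELECT SUM(unit_price_cents * quantity) FROM part"
).fetchone()

print(instance_count, total_value_cents)
```

Prices are held as integer cents here only to keep the arithmetic exact in the illustration.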
These choices are neither simple nor automatic. To discover the best choice of entities, you must think carefully about the nature of the data you want to store. Of course, that is exactly the point of a formal data model. The following section describes the telephone directory example in detail.
Suppose that you create a database for a personal telephone directory. The database model must record the names, addresses, and telephone numbers of people and organizations that the user needs.
First define the entities. Look carefully at a page from a telephone directory to identify the entities that it contains. Figure 1 shows a sample page from a telephone directory.
The physical form of the existing data can be misleading. Do not let the layout of pages and entries in the telephone directory mislead you into trying to specify an entity that represents one entry in the book: an alphabetized record with fields for name, number, and address. You want to model the data, not the medium.
At first glance, the entities that are recorded in a telephone directory are names (of persons and organizations), addresses, and telephone numbers.
Do these entities meet the earlier criteria? They are clearly significant to the model and are generic.
Are they fundamental? A good test is to ask whether an entity can vary in number independently of any other entity. A telephone directory sometimes lists people who have no number or current address (people who move or change jobs) and can also list addresses and numbers that more than one person uses. All three of these entities can vary in number independently; this fact strongly suggests that they are fundamental, not dependent.
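The independence test can be tried on a few sample entries. The following Python sketch uses invented data: the names, addresses, and numbers are hypothetical, as is the choice to hold the links between instances in separate sets. It shows a name with no number or address, and a number and an address that more than one name uses.

```python
from collections import Counter

# Hypothetical instances of each candidate entity.
names = {"Ann Jones", "Bob Smith", "Carol Diaz", "Acme Corp"}
addresses = {"12 Elm St", "9 Main St"}
numbers = {"555-0100", "555-0199"}

# Links between instances, kept apart from the entities themselves.
name_number = {
    ("Ann Jones", "555-0100"),
    ("Bob Smith", "555-0100"),
    ("Acme Corp", "555-0199"),
}
name_address = {
    ("Ann Jones", "12 Elm St"),
    ("Bob Smith", "12 Elm St"),
    ("Acme Corp", "9 Main St"),
}

# A name can exist with no number and no address.
names_without_number = names - {name for name, _ in name_number}
names_without_address = names - {name for name, _ in name_address}

# A number or an address can be used by more than one name.
number_use = Counter(num for _, num in name_number)
address_use = Counter(addr for _, addr in name_address)
shared_numbers = {num for num, uses in number_use.items() if uses > 1}
shared_addresses = {addr for addr, uses in address_use.items() if uses > 1}

# The three counts need not match one another.
print(len(names), len(addresses), len(numbers))
print(names_without_number, shared_numbers, shared_addresses)
```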
Does each entity represent a single class? Names can be split into personal names and corporate names. You decide that all names should have the same features in this model; that is, you do not plan to record different information about a company than you would record about a person. Likewise, you decide that only one kind of address exists; you do not need to treat home addresses differently from business addresses.
However, you also realize that more than one kind of telephone number exists. Voice numbers are answered by a person, fax numbers connect to a fax machine, and modem numbers connect to a computer. You decide that you want to record different information about each kind of number, so these three types are different entities.
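One way to picture why the three kinds of number become separate entities is to write each as its own type. The Python sketch below is only an illustration: the feature names (line_type, hours_from, hours_until, speed_bps) are assumptions invented for this example, not attributes defined in this section.

```python
from dataclasses import dataclass, fields


@dataclass
class VoiceNumber:
    number: str
    line_type: str  # hypothetical feature, for example "home" or "office"


@dataclass
class FaxNumber:
    number: str
    hours_from: str   # hypothetical feature: when the fax machine is on
    hours_until: str


@dataclass
class ModemNumber:
    number: str
    speed_bps: int  # hypothetical feature: connection speed


def feature_names(entity):
    """Return the names of the features recorded for an entity type."""
    return [f.name for f in fields(entity)]


# Each kind of number carries a different set of features,
# which is why a single telephone-number entity would not be a single class.
for entity in (VoiceNumber, FaxNumber, ModemNumber):
    print(entity.__name__, feature_names(entity))
```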
For the personal telephone directory example, you decide that you want to keep track of the following entities: name, address, voice number, fax number, and modem number.
Later in this chapter you learn how to use entity-relationship (E-R) diagrams. For now, create a separate, rectangular box for each entity in the telephone directory example, as Figure 2 shows. Diagramming Data Objects shows how to put the entities together with relationships.