DATA FLOW DIAGRAM (DFD) IN SOFTWARE ENGINEERING

Bharathi. G


A Data Flow Diagram (DFD) is a graphical representation that shows how data moves through a system, using a standardized set of symbols and notations. It illustrates the flow of information and the processes that transform it, making complex systems easier to understand.

DFDs are often used as part of formal methodologies such as the Structured Systems Analysis and Design Method (SSADM). They may resemble flowcharts or Unified Modeling Language (UML) diagrams, but they serve a different purpose: rather than showing detailed program logic, they focus on how data moves and is transformed.




Purpose of Data Flow Diagrams

DFDs simplify the description of business processes by replacing long, textual requirements with clear visual representations of data flow and process sequences.

They are used to:
  • Capture the results of business analysis.
  • Show how data moves through and is transformed by application processes.
  • Represent both automated and manual workflows.
DFDs often start as high-level overviews and are refined throughout the development process to show increasing detail.

A Brief History of DFDs

Data Flow Diagrams emerged in the late 1970s, before UML became standard. They were inspired by the data flow graph computation models of David Martin and Gerald Estrin at UCLA.

The concept was popularized by Larry Constantine and Ed Yourdon in their book Structured Design, which helped establish the structured methods that reshaped software engineering and influenced later approaches such as object-oriented design.

Chris Gane, Trish Sarson, and Tom DeMarco further contributed by standardizing the symbols and notations still used in DFDs today.

Early DFDs transformed the way teams approached software engineering by:
  • Mapping workflows clearly.
  • Identifying where data is stored.
  • Connecting system design directly to business processes.

Rules for Creating a DFD

Most DFDs follow these core guidelines:
  1. Label data flows clearly – Use short, descriptive text to show what data is being transferred.
  2. Name processes with verb phrases – Clearly state the transformation occurring.
  3. Label data stores with noun phrases – Indicate the type of data stored.
  4. Ensure proper inputs and outputs – Each process and data store must have at least one input and at least one output.
  5. No direct connection between external entities and data stores – Data must pass through a process first.
  6. Avoid crossing data flows – This keeps diagrams clean and readable.
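
Several of these rules can be checked mechanically. The following is a minimal sketch in Python, assuming a DFD is described by three sets of element names and a list of (source, target, label) tuples. The function name check_dfd and the purchase example are hypothetical and not taken from any standard tool. Rule 5 is checked in a slightly stricter form (every flow must touch a process), which is a common convention but an assumption here.

```python
def check_dfd(entities, processes, stores, flows):
    """Return a list of rule violations found in a DFD description.

    entities, processes, stores: sets of element names.
    flows: list of (source, target, label) tuples.
    """
    problems = []
    known = entities | processes | stores

    for source, target, label in flows:
        # Rule 1: every data flow carries a label.
        if not label.strip():
            problems.append(f"unlabeled flow: {source} -> {target}")
        if source not in known or target not in known:
            problems.append(f"unknown element in flow: {source} -> {target}")
        # Rule 5 (stricter form, an assumption): at least one end of a
        # flow is a process, so entities and stores never connect directly.
        if source not in processes and target not in processes:
            problems.append(f"flow bypasses a process: {source} -> {target}")

    # Rule 4: each process and data store needs an input and an output.
    for name in sorted(processes | stores):
        if not any(target == name for _, target, _ in flows):
            problems.append(f"no input: {name}")
        if not any(source == name for source, _, _ in flows):
            problems.append(f"no output: {name}")
    return problems


# Hypothetical purchase example.
entities = {"Customer"}
processes = {"Verify payment", "Issue receipt"}
stores = {"Orders"}
flows = [
    ("Customer", "Verify payment", "card details"),
    ("Verify payment", "Orders", "approved order"),
    ("Orders", "Issue receipt", "order record"),
    ("Issue receipt", "Customer", "sales receipt"),
]
print(check_dfd(entities, processes, stores, flows))  # []

# An unlabeled flow straight from an entity to a data store breaks two rules.
bad_flows = flows + [("Customer", "Orders", "")]
for problem in check_dfd(entities, processes, stores, bad_flows):
    print(problem)
```

Rules about naming style (verb phrases, noun phrases) and crossing lines concern wording and layout, so they are left to human review in this sketch.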

Main Components of a DFD

A complete DFD typically includes four core elements:



  1. External Entities – Sources or destinations of data outside the system.
  2. Processes – Activities that transform input data into output data.
  3. Data Stores – Repositories where data is held for later use.
  4. Data Flows – Paths showing how data moves between entities, processes, and stores.
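
The four elements map naturally onto a small data model. The sketch below is one possible way to represent them in Python, not a standard API: the class names, the Diagram helper, and the purchase example are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExternalEntity:
    name: str  # a person, group, or other system outside the boundary


@dataclass(frozen=True)
class Process:
    name: str  # verb phrase describing the transformation


@dataclass(frozen=True)
class DataStore:
    name: str  # noun phrase describing the data held


@dataclass(frozen=True)
class DataFlow:
    source: object
    target: object
    label: str  # short description of the data being transferred


@dataclass
class Diagram:
    title: str
    flows: list = field(default_factory=list)

    def connect(self, source, target, label):
        """Add a labeled data flow between two elements."""
        flow = DataFlow(source, target, label)
        self.flows.append(flow)
        return flow

    def elements(self, kind):
        """Return the distinct elements of one class, in first-seen order."""
        seen = []
        for flow in self.flows:
            for node in (flow.source, flow.target):
                if isinstance(node, kind) and node not in seen:
                    seen.append(node)
        return seen


# Hypothetical purchase example.
customer = ExternalEntity("Customer")
verify = Process("Verify credit card payment")
issue = Process("Issue sales receipt")
sales = DataStore("Sales records")

dfd = Diagram("Customer purchase")
dfd.connect(customer, verify, "payment details")
dfd.connect(verify, sales, "verified sale")
dfd.connect(sales, issue, "sale record")
dfd.connect(issue, customer, "sales receipt")

print([p.name for p in dfd.elements(Process)])
```

Keeping the flows as the only stored collection means every element in the diagram is, by construction, connected to something.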

External Entities

In a Data Flow Diagram (DFD), external entities mark where data enters or exits the system. They are placed at the diagram’s boundaries to show the points of data input and output for the entire process or system. An external entity can be a person, a group, or another system.
For example, in a DFD illustrating the process of making a purchase and receiving a sales receipt, the customer could be an external entity. These entities are also referred to as terminators, actors, sources, or sinks.

Processes

Processes are activities that modify, transform, or act upon data to keep it moving through the system. These may include operations such as calculations, sorting, validation, redirection, or any other data transformation needed for the workflow.
For instance, during a customer purchase, one process could be credit card payment verification.

Data Stores

Data stores represent places where information is held for future use in a DFD. They could be databases, files, documents, or any other storage system.
Examples in a product fulfillment DFD include:
  • A product inventory database
  • A customer address database
  • A delivery schedule spreadsheet

Data Flows

Data flows show the paths along which information moves between external entities, processes, and data stores.
For example, in an e-commerce DFD, the connection between a user entering login details and the authentication gateway represents a data flow.

Difference Between Logical and Physical DFDs

  • Logical DFDs use abstract terms to represent how information moves logically through a system. They outline high-level processes and activities without focusing on technology specifics.
  • Physical DFDs provide more concrete details about the actual information flow, including databases, software applications, and system components. They also often illustrate the exact actions performed on the data and the resources involved.

Generally:
  • Logical DFDs are favored by business analysts, line managers, and enterprise architects for conceptual design.
  • Physical DFDs are preferred by development teams for implementation planning.
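
One way to picture the relationship is to treat a physical DFD as a logical DFD whose elements have been bound to concrete components. The sketch below assumes flows are (source, target, label) tuples. The function to_physical and the component names in the binding are hypothetical and stand in for whatever a real project uses.

```python
def to_physical(logical_flows, binding):
    """Rename the elements of each logical flow using a mapping from
    logical name to physical component. Names without an entry are kept."""
    return [
        (binding.get(source, source), binding.get(target, target), label)
        for source, target, label in logical_flows
    ]


# Logical view: abstract names only.
logical_flows = [
    ("Customer", "Verify payment", "payment details"),
    ("Verify payment", "Orders", "approved order"),
]

# Physical view: hypothetical components chosen for illustration.
binding = {
    "Verify payment": "payment-service (REST API)",
    "Orders": "orders table (SQL database)",
}

for flow in to_physical(logical_flows, binding):
    print(flow)
```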

Notations and Symbols in DFDs

The exact symbols used in DFDs can vary depending on the chosen methodology. While standard conventions exist, some organizations develop their own notations (though this is generally discouraged).







Common DFD Notations

Popular data flow diagram (DFD) notations include:
  • Gane and Sarson
  • DeMarco and Yourdon
  • SSADM
  • UML – often applied to illustrate software architecture.

Meaning of Core DFD Elements:

  • External Entities – Represent parties or systems that send data into, or receive data from, the system.
  • Data Flows – Show how data moves into, out of, and within the system.
  • Data Stores – Indicate where information is held, typically databases or files.
  • Processes – Depict activities that transform or manipulate data.
Each notation uses distinct symbols, which can make DFDs difficult to interpret if you’re unfamiliar with the specific methodology. For example, Gane and Sarson diagrams draw processes as rounded rectangles and entities as square-cornered boxes, whereas Yourdon and DeMarco diagrams draw processes as circles and entities as rectangles. SSADM uses its own variants of these shapes. Data stores differ as well: Yourdon and DeMarco draw them as parallel lines, while other methods use different depictions.
For clarity, organizations should select and consistently follow one notation style.

DFD Layers and Levels

DFDs can be broken down into progressively detailed levels:
  • Level 0 – Also called a context diagram; it gives a high-level overview of the system.
  • Level 1 – Expands the overview to include subprocesses and more detail.
  • Level 2 – Breaks subprocesses into further detailed steps.
  • Level 3 – Rarely used, but can describe highly complex systems.
Additional levels are possible, though generally unnecessary for most applications.
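
The leveling described above can be thought of as a tree of processes, where each level of the diagram shows the nodes at one depth. The following is a minimal sketch under that assumption. The ProcessNode class, the dotted numbering (1, 2, 2.1), and the fulfillment example are illustrative choices rather than part of any particular methodology.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessNode:
    number: str
    name: str
    children: list = field(default_factory=list)


def processes_at_level(root, level):
    """Return the processes at the given depth of the decomposition.

    Level 0 is the single context process; each further level replaces
    the processes of the previous one with their subprocesses.
    """
    current = [root]
    for _ in range(level):
        current = [child for node in current for child in node.children]
    return current


# Hypothetical decomposition of an order fulfillment system.
system = ProcessNode("0", "Order fulfillment system", [
    ProcessNode("1", "Take order"),
    ProcessNode("2", "Ship order", [
        ProcessNode("2.1", "Pick items"),
        ProcessNode("2.2", "Schedule delivery"),
    ]),
])

print([p.name for p in processes_at_level(system, 0)])  # ['Order fulfillment system']
print([p.name for p in processes_at_level(system, 1)])  # ['Take order', 'Ship order']
print([p.name for p in processes_at_level(system, 2)])  # ['Pick items', 'Schedule delivery']
```

In this sketch a process that is not decomposed further, such as "Take order", simply does not appear at deeper levels.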

Steps to Create a DFD

  1. Define the system or process to model.
  2. Identify and group external entities, data flows, processes, and data stores.
  3. Create a Level 0 context diagram showing basic connections.
  4. Develop Level 1 diagrams with additional processes, flows, and storage branching from the Level 0 elements.
  5. Add deeper levels as needed for detail.
  6. Review each diagram to ensure completeness and accuracy.
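
Step 3 can also be done in text form before opening a drawing tool. The sketch below builds Graphviz DOT text for a Level 0 context diagram from a list of (source, target, label) tuples. The function name context_diagram_dot and the online store example are hypothetical, and the shapes (circle for the system, box for entities) follow the Yourdon and DeMarco convention only as an approximation.

```python
def context_diagram_dot(system, flows):
    """Build Graphviz DOT text for a Level 0 context diagram.

    system: name of the single process that represents the whole system.
    flows: list of (source, target, label) tuples; every name other than
           `system` is treated as an external entity.
    """
    def quote(text):
        # Escape backslashes and double quotes for a DOT quoted string.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    # Collect external entities in first-seen order.
    entities = []
    for source, target, _ in flows:
        for name in (source, target):
            if name != system and name not in entities:
                entities.append(name)

    lines = ["digraph context {", "  rankdir=LR;"]
    lines.append(f"  {quote(system)} [shape=circle];")
    for name in entities:
        lines.append(f"  {quote(name)} [shape=box];")
    for source, target, label in flows:
        lines.append(
            f"  {quote(source)} -> {quote(target)} [label={quote(label)}];"
        )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical online store example.
store_flows = [
    ("Customer", "Online store", "order"),
    ("Online store", "Customer", "sales receipt"),
    ("Online store", "Payment provider", "payment request"),
    ("Payment provider", "Online store", "payment status"),
]
dot_text = context_diagram_dot("Online store", store_flows)
print(dot_text)
```

If Graphviz is installed, saving the output to a file such as context.dot and running dot -Tpng context.dot -o context.png should render the diagram.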

Types of DFDs

Most published DFD examples follow a single methodology, which makes them easier to read within that framework. Unlike UML diagrams or flowcharts, which focus on software structure or control flow, DFDs usually take a business or functional perspective.

Example: A school’s culinary program represented using the Gane and Sarson method.

Tools for Creating DFDs

While DFDs can be drawn by hand, most are created digitally. General-purpose graphics or presentation software can be used, but may impose layout limits. Specialized DFD software often supports specific methodologies, includes symbol libraries, and offers better scalability.
Examples of tools:
  • Canva
  • ConceptDraw
  • Creately
  • Lucidchart
  • Miro
  • SmartDraw
  • Venngage
  • Visual Paradigm
  • Wondershare EdrawMax
Choosing a tool that matches the chosen methodology can improve workflow, especially when importing/exporting diagrams between systems.

Advantages of DFDs

  • Clear visualization – Offers an easy-to-understand view of data movement.
  • Better understanding – Encourages insight into processes and potential improvements.
  • Improved data resource mapping – Helps identify and manage data storage, processes, and interactions.
  • Troubleshooting – Makes it easier to spot bottlenecks or inefficiencies.
  • Enhanced documentation – Facilitates communication about system processes.












